A Survey on Automatic Speech Recognition

Seiichi NAKAGAWA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E85-D   No.3   pp.465-486
Publication Date: 2002/03/01
Print ISSN: 0916-8532
Type of Manuscript: INVITED SURVEY PAPER
Category: Speech and Hearing
Keyword: 
speech recognition, acoustic model, HMM, language model, ngram

Summary: 
This paper surveys recent trends in automatic speech recognition. First, we point out that the current state of the art in machine speech recognition remains inferior to human ability, and we argue in particular that acoustic models need to be improved. Second, we describe feature parameters that are robust in noisy environments, which matter in practical use, and we indicate, from the viewpoints of information theory and pattern recognition, that a large amount of training data collected in the same environment as the recognition stage is useful. Third, we discuss acoustic models and language models, which are the central issues in speech recognition techniques. Here we cover the principle and limitations of the hidden Markov model (HMM) as well as recent extended models. The role of a language model is to eliminate improbable candidate words, that is, to reduce the search space; in other words, language models with smaller entropy are preferable. From this standpoint, we survey stochastic language models. Finally, we note some points that deserve attention when constructing speech recognition systems.
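
The statement that language models with smaller entropy are preferable can be illustrated with a small numerical sketch. The code below is not taken from the paper: it assumes a bigram model with add-one smoothing, and the function names (`train_bigram`, `cross_entropy`, `perplexity`) and the toy sentences are hypothetical. It estimates the per-word cross-entropy of a test text in bits and the corresponding perplexity, which can be read as the average number of equally likely word candidates the recognizer must consider at each position.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts; sentences are lists of words."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]  # sentence boundary markers
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def cross_entropy(sentences, context_counts, bigram_counts, vocab):
    """Per-word cross-entropy in bits under add-one smoothing (an assumption).

    Test words are assumed to be in the training vocabulary.
    """
    v = len(vocab)
    log_prob, n = 0.0, 0
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
            log_prob += math.log2(p)
            n += 1
    return -log_prob / n


def perplexity(entropy_bits):
    """Perplexity is 2 raised to the cross-entropy measured in bits."""
    return 2.0 ** entropy_bits


# Toy data, purely illustrative.
train = [["a", "b"], ["a", "c"]]
test = [["a", "b"]]
counts = train_bigram(train)
h = cross_entropy(test, *counts)
print(f"cross-entropy: {h:.3f} bits, perplexity: {perplexity(h):.3f}")
```

On this toy data the bigram model's cross-entropy falls below log2 of the vocabulary size, the value a model with no knowledge of word order would give, which is the sense in which a lower-entropy model narrows the search space.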