Processing Unknown Words in Continuous Speech Recognition
Kenji KITA Terumasa EHARA Tsuyoshi MORIMOTO
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Publication Date: 1991/07/25
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Issue on Continuous Speech Recognition and Understanding)
Category: Continuous Speech Recognition
Current continuous speech recognition systems essentially ignore unknown words: they are designed to recognize only the words in their lexicon. For real applications such as spoken-language processing, however, handling unknown words is very important. This paper proposes a continuous speech recognition method that accepts any utterance, including utterances that contain unknown words. In this method, words not in the lexicon are transcribed as phone sequences, while words in the lexicon are recognized correctly. The baseline is the HMM-LR speech recognition system, which integrates Hidden Markov Models with generalized LR parsing; it is enhanced with a trigram model of syllables to take the stochastic characteristics of the language into account. In our approach, two kinds of grammars are merged and used in the HMM-LR system: a task grammar, which describes the task, and a phonetic grammar, which describes constraints between phones. The phonetic grammar allows the system to output a phonetic transcription for an unknown word. Experimental results indicate that the approach is very promising.
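As a rough illustration of what a trigram model of syllables computes, the following Python sketch estimates the log-probability of a syllable sequence from trigram counts. It is not the authors' implementation: the class name `SyllableTrigram`, the add-one smoothing, the boundary symbols `<s>` and `</s>`, and the toy syllables are all assumptions made for the example, and the abstract does not say how the model is estimated or how its score is combined with the HMM-LR parser.

```python
import math
from collections import defaultdict


class SyllableTrigram:
    """Toy syllable trigram model with add-one smoothing (an assumption)."""

    def __init__(self, vocab_size):
        # vocab_size: number of distinct symbols that can be predicted,
        # including the end-of-sequence marker.
        self.vocab_size = vocab_size
        self.trigram_counts = defaultdict(int)
        self.context_counts = defaultdict(int)

    @staticmethod
    def _pad(syllables):
        # Two start markers give the first syllable a full two-symbol context.
        return ["<s>", "<s>"] + list(syllables) + ["</s>"]

    def train(self, sequences):
        """Accumulate trigram and context counts from syllable sequences."""
        for seq in sequences:
            padded = self._pad(seq)
            for i in range(2, len(padded)):
                context = (padded[i - 2], padded[i - 1])
                self.context_counts[context] += 1
                self.trigram_counts[context + (padded[i],)] += 1

    def log_prob(self, syllables):
        """Return the smoothed log-probability of a syllable sequence."""
        padded = self._pad(syllables)
        total = 0.0
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            numerator = self.trigram_counts.get(context + (padded[i],), 0) + 1
            denominator = self.context_counts.get(context, 0) + self.vocab_size
            total += math.log(numerator / denominator)
        return total


# Hypothetical training data: three short syllable sequences.
model = SyllableTrigram(vocab_size=4)  # ka, ta, na, </s>
model.train([["ka", "ta"], ["ka", "ta"], ["ka", "na"]])
seen_score = model.log_prob(["ka", "ta"])
unseen_score = model.log_prob(["ta", "ka"])
```

In this toy setup a sequence that follows the training patterns (`seen_score`) scores higher than one that does not (`unseen_score`). A score of this kind could in principle be used to rank candidate phonetic transcriptions of an unknown word, although how the paper applies it is not specified in the abstract.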