Multisegment Multiple VQ Codebooks-Based Speaker Independent Isolated-Word Recognition Using Unbiased Mel Cepstrum

Liang ZHOU  Satoshi IMAI  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E78-D   No.9   pp.1178-1187
Publication Date: 1995/09/25
Online ISSN: 
DOI: 
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech Processing and Acoustics
Keyword: 
multisegment vector quantization (MSVQ),  isolatedword speech recognition,  unbiased estimation of log spectrum,  LPC,  FFT,  

Full Text: PDF>>
Buy this Article




Summary: 
In this paper, we propose a new approach to speaker independent isolated-word speech recognition using multisegment multiple vector quantization (VQ) codebooks. In this approach, words are recognized by means of multisegment multiple VQ codebooks, a separate multisegment multiple VQ codebooks are designed for each word in the recognition vocabulary by dividing equally the word into multiple segments which is correlative with number of syllables or phonemes of the word, and designing two individual VQ codebooks consisting of both instantaneous and transitional speech features for each segment. Using this approach, the influence of the within-word coarticulation can be minimized, the time-sequence information of speech can be used, and the word length differences in the vocabulary or speaking rates variations can be adapted automatically. Moreover, the mel-cepstral coefficients based on unbiased estimation of log spectrum (UELS) are used, and comparison experiment with LPC derived mel cepstral coefficients is made. Recognition experiments Using testing databases consisting of 100 Japanese words (Waseda database) and 216 phonetically balanced words (ATR database), confirmed the effectiveness of the new method and the new speech features. The approach is described, computational complexity as well as memory requirements are analyzed, the experimental results are presented.