N-Gram Modeling Based on Recognized Phonemes in Automatic Language Identification

Hingkeung KWAN, Keikichi HIROSE

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E81-D, No.11, pp.1224-1231
Publication Date: 1998/11/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech Processing and Acoustics
Keyword: language identification, N-gram, phoneme, recognized labels, mixed phoneme recognition (MPR)

Summary: 
Because the phoneme recognition rate for noisy telephone speech is rather low, N-gram models built from recognized phoneme labels can differ substantially from those built from the originally attached phoneme labels, and this mismatch in turn affects the performance of N-gram based language identification methods. N-gram models built from the recognized phoneme labels of the training data were evaluated and shown to be more effective for language identification. The performance of a mixed phoneme recognizer, which includes both language-dependent and language-independent phonemes, was also evaluated. Results showed that it outperformed parallel language-dependent phoneme recognizers, in which a bias arises because the number of phonemes differs among languages.
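
As a rough illustration of the general idea (not the authors' implementation), N-gram based language identification can be sketched as follows: one phoneme bigram model is trained per language on recognizer output, and a test utterance is assigned to the language whose model gives the highest log-likelihood. The bigram order, the add-one smoothing, the shared vocabulary size, the function names, and the toy phoneme sequences are all assumptions made for this sketch; the summary does not specify them.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count phoneme bigrams and their left contexts over training sequences."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab

def log_likelihood(seq, model, vocab_size):
    """Add-one smoothed bigram log-likelihood of one phoneme sequence."""
    bigrams, contexts, _ = model
    padded = ["<s>"] + list(seq) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total += math.log(prob)
    return total

def identify(seq, models):
    """Return the language whose bigram model scores the sequence highest."""
    # A single vocabulary size shared by all models (an assumption of this
    # sketch) keeps the smoothing from favoring languages with fewer phonemes.
    vocab_size = len(set().union(*(m[2] for m in models.values())))
    return max(models, key=lambda lang: log_likelihood(seq, models[lang], vocab_size))

# Hypothetical recognized phoneme label sequences for two toy "languages".
# In the setting of the summary these would be recognizer outputs on the
# training data rather than the originally attached labels.
models = {
    "A": train_bigram([["k", "a", "t", "a"], ["t", "a", "k", "a"]]),
    "B": train_bigram([["s", "i", "n", "i"], ["n", "i", "s", "i"]]),
}

best = identify(["k", "a", "t", "a"], models)
```

Training on recognized labels means the models absorb the recognizer's systematic errors, so training and test statistics are matched, which is one plausible reading of why the summary reports this choice as more effective.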