Effectiveness of Word String Language Models on Noisy Broadcast News Speech Recognition
Kazuyuki TAKAGI, Rei OGURO, Kazuhiko OZEKI
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2002/07/01
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keywords: word string, language model, robustness, broadcast news speech, noisy speech recognition
Experiments were conducted to examine a language-modeling approach to improving noisy speech recognition performance. By adopting appropriate word strings as new units of processing, speech recognition performance was improved through acoustic effects as well as through a reduction in test-set perplexity. Three kinds of word string language models were evaluated. Their additional lexical entries were selected based on combinations of part-of-speech information, word length, occurrence frequency, and the log likelihood ratio of hypotheses about bigram frequency. All three word string models reduced errors in broadcast news speech recognition and also lowered test-set perplexity. The word string model based on the log likelihood ratio gave the largest improvement for noisy speech recognition: in the experiments using speaker-dependent, noise-adapted triphone models, it reduced deletion errors by 26%, substitution errors by 9.3%, and insertion errors by 13%. The error reduction achieved by the word string models was more prominent for noisy speech than for studio-clean speech.
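
The abstract does not state how the log likelihood ratio was computed. As an illustration only, the sketch below assumes the common binomial formulation, which compares the hypothesis that the second word is independent of the first against the hypothesis that it depends on it. The function names, the `min_count` threshold, and the toy corpus are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def _log_likelihood(k, n, p):
    """Log of p**k * (1 - p)**(n - k), treating 0 * log(0) as 0."""
    def xlogy(x, y):
        return 0.0 if x == 0 else x * math.log(y)
    return xlogy(k, p) + xlogy(n - k, 1.0 - p)


def bigram_llr(c12, c1, c2, n):
    """Return -2 log(lambda) for the bigram (w1, w2).

    c12: count of the bigram, c1: count of w1 in first position,
    c2: count of w2 in second position, n: total number of bigrams.
    Assumed formulation, not necessarily the one used in the paper.
    """
    if c1 == n:
        # Every bigram starts with w1, so there is no contrast to test.
        return 0.0
    p = c2 / n                    # independence: P(w2 | w1) = P(w2 | not w1)
    p1 = c12 / c1                 # dependence: P(w2 | w1)
    p2 = (c2 - c12) / (n - c1)    # dependence: P(w2 | not w1)
    log_lambda = (
        _log_likelihood(c12, c1, p)
        + _log_likelihood(c2 - c12, n - c1, p)
        - _log_likelihood(c12, c1, p1)
        - _log_likelihood(c2 - c12, n - c1, p2)
    )
    return -2.0 * log_lambda


def rank_word_strings(tokens, min_count=2):
    """Score adjacent word pairs and return them sorted by decreasing LLR."""
    bigrams = list(zip(tokens, tokens[1:]))
    pair_counts = Counter(bigrams)
    first_counts = Counter(w1 for w1, _ in bigrams)
    second_counts = Counter(w2 for _, w2 in bigrams)
    n = len(bigrams)
    scored = [
        (bigram_llr(c12, first_counts[w1], second_counts[w2], n), (w1, w2))
        for (w1, w2), c12 in pair_counts.items()
        if c12 >= min_count
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    toy = "new york is big new york is old the city is big".split()
    for score, pair in rank_word_strings(toy):
        print(f"{' '.join(pair)}\t{score:.3f}")
```

Under this assumption, high-scoring pairs would be candidates for new lexical entries treated as single units. How the paper combined this score with the other selection criteria is not specified in the abstract.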