Acoustic Model Adaptation for Speech Recognition

Koichi SHINODA  

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E93-D, No.9, pp.2348-2362
Publication Date: 2010/09/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.2348
Print ISSN: 0916-8532
Type of Manuscript: INVITED PAPER (Special Section on Processing Natural Speech Variability for Improved Verbal Human-Computer Interaction)
Keyword: speech recognition, acoustic model adaptation, hidden Markov models

Summary: 
Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.