Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition

Yohei ITAYA  Heiga ZEN  Yoshihiko NANKAKU  Chiyomi MIYAJIMA  Keiichi TOKUDA  Tadashi KITAMURA  

IEICE TRANSACTIONS on Information and Systems   Vol.E88-D   No.3   pp.425-431
Publication Date: 2005/03/01
DOI: 10.1093/ietisy/e88-d.3.425
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Corpus-Based Speech Technologies)
Category: Feature Extraction and Acoustic Modeling
Keywords: DAEM algorithm, acoustic modeling, EM algorithm, GMMs, HMMs

This paper investigates the effectiveness of the deterministic annealing EM (DAEM) algorithm in acoustic modeling for speaker and speech recognition. Although the EM algorithm is widely used to approximate maximum likelihood (ML) estimates, its results depend on the initial parameter values. The DAEM algorithm was proposed to relax this dependence, and its effectiveness has been confirmed on small artificial tasks. In this paper, we apply the DAEM algorithm to practical speech recognition tasks: speaker recognition based on Gaussian mixture models (GMMs) and continuous speech recognition based on hidden Markov models (HMMs). Experimental results show that the DAEM algorithm can improve recognition performance compared with the standard EM algorithm using conventional initialization methods, especially in flat-start training for continuous speech recognition.
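
As a rough illustration of the idea behind DAEM, the following minimal Python sketch fits a one-dimensional GMM with a tempered E-step. It assumes the common formulation in which posteriors are proportional to the joint likelihood raised to a power beta, with beta raised step by step to 1, at which point the update is the standard EM update. The function names (`tempered_posteriors`, `daem_gmm_1d`), the temperature schedule, the iteration counts, the initial means, and the toy data are all hypothetical choices for illustration, not the settings or the implementation used in the paper.

```python
import math
import random


def tempered_posteriors(log_joint, beta):
    """Return posteriors proportional to exp(beta * log_joint[k])."""
    scaled = [beta * lj for lj in log_joint]
    top = max(scaled)  # subtract the max for numerical stability
    unnorm = [math.exp(s - top) for s in scaled]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def daem_gmm_1d(data, init_means, betas=(0.2, 0.5, 0.8, 1.0),
                iters_per_beta=10, var_floor=1e-3):
    """Fit a 1-D GMM by EM with an annealed (tempered) E-step.

    The schedule `betas` is an illustrative assumption; its last value
    should be 1.0 so that the final iterations are standard EM.
    """
    n = len(data)
    k_range = range(len(init_means))
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    weights = [1.0 / len(init_means)] * len(init_means)
    means = list(init_means)
    variances = [max(grand_var, var_floor)] * len(init_means)

    for beta in betas:
        for _ in range(iters_per_beta):
            # E-step: posteriors tempered by the current beta
            resp = [
                tempered_posteriors(
                    [math.log(weights[k]) + log_gauss(x, means[k], variances[k])
                     for k in k_range],
                    beta)
                for x in data
            ]
            # M-step: the usual weighted ML updates
            for k in k_range:
                nk = sum(r[k] for r in resp)
                if nk < 1e-12:  # component received no mass; leave it unchanged
                    continue
                weights[k] = nk / n
                means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
                var = sum(r[k] * (x - means[k]) ** 2
                          for r, x in zip(resp, data)) / nk
                variances[k] = max(var, var_floor)
    return weights, means, variances


# Toy data (hypothetical): two well-separated clusters
random.seed(0)
data = ([random.gauss(-5.0, 1.0) for _ in range(100)]
        + [random.gauss(5.0, 1.0) for _ in range(100)])
weights, means, variances = daem_gmm_1d(data, init_means=[-1.0, 1.0])
print(weights, means, variances)
```

Calling `daem_gmm_1d(data, init_means, betas=(1.0,))` runs the same loop as plain EM, which makes it easy to compare the two from an identical starting point. With beta near 0 the posteriors are close to uniform, so the early iterations depend only weakly on the initial parameters.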