Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification

Hiroyoshi YAMAMOTO  Yoshihiko NANKAKU  Chiyomi MIYAJIMA  Keiichi TOKUDA  Tadashi KITAMURA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E88-D   No.3   pp.418-424
Publication Date: 2005/03/01
Online ISSN: 
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Corpus-Based Speech Technologies)
Category: Feature Extraction and Acoustic Medelings
Keyword: 
speaker identification,  GMM,  mixture of factor analyzers,  parameter sharing,  minimum classification error training,  

Full Text: PDF(545.3KB)
>>Buy this Article


Summary: 
This paper investigates the parameter tying structures of a mixture of factor analyzers (MFA) and discriminative training of MFA for speaker identification. The parameters of factor loading matrices or diagonal matrices are shared in different mixtures of MFA. Then, minimum classification error (MCE) training is applied to the MFA parameters to enhance the discrimination ability. The result of a text-independent speaker identification experiment shows that MFA outperforms the conventional Gaussian mixture model (GMM) with diagonal or full covariance matrices and achieves the best performance when sharing the diagonal matrices, resulting in a relative gain of 26% over the GMM with diagonal covariance matrices. The improvement is more significant especially in sparse training data condition. The recognition performance is further improved by MCE training with an additional gain of 3% error reduction.