distribution size uniformity". An algorithm is introduced for automatically determining the number of mixture components. The performance of this algorithm is shown through recognition experiments involving all Japanese phonemes. Two types of experiments are carried out. One assumes that the number of mixture components for each state is the same within a phonetic model but may vary between states belonging to different phonemes. The other assumes that each state has a variable number of mixture components. These two experiments give better results than the conventional method." />


Automatic Determination of the Number of Mixture Components for Continuous HMMs Based a Uniform Variance Criterion

Tetsuo KOSAKA  Shigeki SAGAYAMA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E78-D   No.6   pp.642-647
Publication Date: 1995/06/25
Online ISSN: 
DOI: 
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Spoken Language Processing)
Category: 
Keyword: 
acoustic model,  continuous mixture density HMM,  speaker-independent speech recognition,  mixture components,  

Full Text: PDF>>
Buy this Article




Summary: 
We discuss how to determine automatically the number of mixture components in continuous mixture density HMMs (CHMMs). A notable trend has been the use of CHMMs in recent years. One of the major problems with a CHMM is how to determine its structure, that is, how many mixture components and states it has and its optimal topology. The number of mixture components has been determined heuristically so far. To solve this problem, we first investigate the influence of the number of mixture components on model parameters and the output log likelihood value. As a result, in contrast to the mixture number uniformity" which is applied in conventional approaches to determine the number of mixture components, we propose the principle of distribution size uniformity". An algorithm is introduced for automatically determining the number of mixture components. The performance of this algorithm is shown through recognition experiments involving all Japanese phonemes. Two types of experiments are carried out. One assumes that the number of mixture components for each state is the same within a phonetic model but may vary between states belonging to different phonemes. The other assumes that each state has a variable number of mixture components. These two experiments give better results than the conventional method.