

An HMM State Duration Control Algorithm Applied to Large-Vocabulary Spontaneous Speech Recognition
Satoshi TAKAHASHI Yasuhiro MINAMI Kiyohiro SHIKANO
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E78-D
No.6
pp.648-653
Publication Date: 1995/06/25
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Spoken Language Processing)
Keyword: HMM, duration control, trellis likelihood calculation, large-vocabulary speech recognition
Summary:
Although Hidden Markov Modeling (HMM) is widely and successfully used in many speech recognition applications, duration control for HMMs remains an important issue in improving recognition accuracy, since an HMM places no constraints on duration. To compensate for this defect, several duration control algorithms that employ precise duration models have been proposed. However, they suffer from greatly increased computational complexity. This paper proposes a new state duration control algorithm that limits both the maximum and the minimum state durations. The algorithm is designed for the HMM trellis likelihood calculation, not for the Viterbi calculation. The amount of computation required by this algorithm is only of order one (O(1)) with respect to the maximum state duration n; that is, the amount of computation is independent of the maximum state duration, whereas many conventional duration control algorithms require computation of order n or order n^{2}. Thus, the algorithm can drastically reduce the computation needed for duration control. The algorithm exploits the property that the trellis likelihood is a summation of many path likelihoods. At each frame, the likelihood of the path that exceeds the maximum state duration is subtracted from the forward probability, and the likelihood of the path that satisfies the minimum state duration is added to it. By iterating this procedure, the algorithm calculates the trellis likelihood efficiently. The algorithm was evaluated using a large-vocabulary, speaker-independent spontaneous speech recognition system for telephone directory assistance. The average reduction in error rate for sentence understanding was about 7% when using context-independent HMMs, and 3% when using context-dependent HMMs. The improvement from the proposed state duration control algorithm was confirmed even though the maximum and the minimum state durations were not optimized for the task (speaker-independent duration settings obtained from a different task were used).
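
The add-and-subtract idea in the summary can be illustrated with a small sketch. This is not the authors' implementation: it only shows, under simplifying assumptions, how a duration-bounded trellis (sum-over-paths) likelihood can be updated with a constant amount of work per state and frame. The assumptions are a left-to-right HMM without skips, transition probabilities omitted (treated as 1), strictly positive emission likelihoods, and plain floating-point arithmetic instead of log-domain or scaled computation. The function names `trellis_bruteforce` and `trellis_sliding` and the argument layout `b[j][t]` are hypothetical.

```python
def trellis_bruteforce(b, dmin, dmax):
    """Reference version: for every state and frame, sum over all allowed
    durations explicitly (cost grows with the maximum duration).
    b[j][t] is the emission likelihood of state j at frame t (assumed layout)."""
    J, T = len(b), len(b[0])
    # end[j][t]: summed likelihood of valid paths in which state j ends at frame t
    end = [[0.0] * T for _ in range(J)]
    for j in range(J):
        for t in range(T):
            total = 0.0
            for d in range(dmin[j], dmax[j] + 1):
                start = t - d + 1          # first frame occupied by state j
                if start < 0:
                    break
                if j == 0:
                    enter = 1.0 if start == 0 else 0.0
                else:
                    enter = end[j - 1][start - 1] if start >= 1 else 0.0
                seg = 1.0
                for s in range(start, t + 1):
                    seg *= b[j][s]
                total += enter * seg
            end[j][t] = total
    return end[J - 1][T - 1]


def trellis_sliding(b, dmin, dmax):
    """Constant work per state and frame: keep a running sum, add the path
    that has just reached the minimum duration and subtract the path that
    has just exceeded the maximum duration. Requires 1 <= dmin[j] <= dmax[j]
    and strictly positive emissions (because of the division below)."""
    J, T = len(b), len(b[0])
    prev_end = None
    for j in range(J):
        # prefix[t] = b[j][0] * ... * b[j][t-1], with prefix[0] = 1
        prefix = [1.0]
        for t in range(T):
            prefix.append(prefix[-1] * b[j][t])
        # g[u]: likelihood of entering state j at frame u, divided by prefix[u]
        g = []
        for u in range(T):
            if j == 0:
                enter = 1.0 if u == 0 else 0.0
            else:
                enter = prev_end[u - 1] if u >= 1 else 0.0
            g.append(enter / prefix[u])
        window = 0.0
        end = [0.0] * T
        for t in range(T):
            u_add = t - dmin[j] + 1        # entry frame that now satisfies dmin
            u_sub = t - dmax[j]            # entry frame that now exceeds dmax
            if u_add >= 0:
                window += g[u_add]
            if u_sub >= 0:
                window -= g[u_sub]
            end[t] = prefix[t + 1] * window
        prev_end = end
    return prev_end[T - 1]


if __name__ == "__main__":
    # Toy emission likelihoods for 2 states over 4 frames (made-up numbers).
    b = [[0.9, 0.8, 0.3, 0.1],
         [0.1, 0.4, 0.7, 0.9]]
    dmin, dmax = [1, 2], [2, 3]
    print(trellis_bruteforce(b, dmin, dmax))
    print(trellis_sliding(b, dmin, dmax))
```

Both functions should agree up to rounding error; the sliding version avoids the inner loop over durations, which is where the independence from the maximum duration comes from. The subtraction can lose precision when the window sum is dominated by the term being removed, so a practical system would need scaling or another safeguard that this sketch does not attempt.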

