A Minimum Error Approach to Spotting-Based Pattern Recognition

Takashi KOMORI  Shigeru KATAGIRI  

IEICE TRANSACTIONS on Information and Systems   Vol.E78-D   No.8   pp.1032-1043
Publication Date: 1995/08/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech Processing and Acoustics
Keyword: pattern recognition, word spotting, MCE/GPD, speech recognition

Keyword spotting is a fundamental approach to recognizing and understanding naturally and spontaneously spoken language. To spot acoustic events such as keywords, an overall spotting system, comprising acoustic models and decision thresholds, needs above all to be optimized to minimize all spotting errors. However, in most conventional spotting systems, the acoustic models and the thresholds are designed separately and heuristically; there has not necessarily been a theoretical basis for designing the overall system consistently. This paper introduces a novel approach to spotting by proposing a new design method called Minimum SPotting Error learning (MSPE). MSPE is conceptually based on a recent discriminative learning theory, namely the Minimum Classification Error learning/Generalized Probabilistic Descent method (MCE/GPD), and it features a rigorous framework for minimizing spotting error objectives. MSPE can be used in a wide range of pattern spotting applications, such as spotting spoken phonemes, written characters, and spoken words. Experimental results for a Japanese consonant spotting task clearly demonstrate the promise of the proposed approach.