Discriminative Training Based on Minimum Classification Error for a Small Amount of Data Enhanced by Vector-Field-Smoothed Bayesian Learning

Jun-ichi TAKAHASHI  Shigeki SAGAYAMA  

IEICE TRANSACTIONS on Information and Systems   Vol.E79-D   No.12   pp.1700-1707
Publication Date: 1996/12/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech Processing and Acoustics
Keywords: speech recognition, hidden Markov model, discriminative training, speaker adaptation


This paper describes how to use discriminative training based on the Minimum Classification Error (MCE) criterion effectively on a small amount of data in order to attain the highest level of recognition performance. The method combines MCE training with vector-field-smoothed Bayesian learning, called MAP/VFS, which couples maximum a posteriori (MAP) estimation with Vector Field Smoothing (VFS). In the proposed method, MAP/VFS significantly enhances the robustness of the acoustic models used by MCE training: MCE training is performed using the MAP/VFS-trained model as its initial model, and the same data are used in both training stages. For speaker adaptation using several dozen training words, the proposed method has been experimentally shown to be very effective. For 50-word training data, recognition errors are reduced by 47%, compared with 16.5% when MCE alone is used. Of this 47% reduction, 39% is attributable to MAP, an additional 4% to VFS, and a further 4% to MCE, confirming that MAP/VFS enhances the capability of MCE training.
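The two-stage procedure described in the abstract can be illustrated with a minimal sketch: first a MAP estimate of a Gaussian mean from a small adaptation set (VFS smoothing of unseen parameters is omitted here), then one MCE/GPD-style gradient step initialized from the MAP estimate, using the same data. The one-dimensional single-Gaussian-per-class model, the negative-squared-distance discriminant, and all parameter values (`tau`, `eta`, `gamma`) are illustrative assumptions, not the paper's actual configuration.

```python
import math

def map_adapt_mean(prior_mean, data, tau=5.0):
    """MAP estimate of a Gaussian mean: a conjugate-prior blend of the
    speaker-independent prior mean and the adaptation data, where tau
    acts as the prior's effective sample count (illustrative value)."""
    return (tau * prior_mean + sum(data)) / (tau + len(data))

def mce_update(means, x, label, eta=0.1, gamma=2.0):
    """One GPD step on the sigmoid-smoothed misclassification measure
    d(x) = g_rival(x) - g_label(x), with discriminant
    g_k(x) = -(x - mu_k)^2 (negative squared distance)."""
    g = [-(x - m) ** 2 for m in means]
    rival = max((k for k in range(len(means)) if k != label),
                key=lambda k: g[k])          # best competing class
    d = g[rival] - g[label]                  # misclassification measure
    s = 1.0 / (1.0 + math.exp(-gamma * d))   # smoothed 0-1 loss
    w = gamma * s * (1.0 - s)                # d(loss)/d(d)
    new = list(means)
    new[label] += eta * w * 2.0 * (x - means[label])  # pull correct mean toward x
    new[rival] -= eta * w * 2.0 * (x - means[rival])  # push rival mean away from x
    return new

# Stage 1: MAP-adapt the class-0 mean with two adaptation samples.
mu0 = map_adapt_mean(0.0, [0.3, 0.4], tau=5.0)
# Stage 2: MCE refinement initialized from the MAP-adapted model,
# reusing one of the same samples (x = 0.4, true class 0).
means = mce_update([mu0, 1.0], x=0.4, label=0)
```

The key point the sketch mirrors is the training order: the discriminative update does not start from the speaker-independent prior but from the Bayesian (MAP) estimate, which is what makes MCE usable with only a few dozen adaptation tokens.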