Word Error Rate Minimization Using an Integrated Confidence Measure

Akio KOBAYASHI  Kazuo ONOE  Shinichi HOMMA  Shoei SATO  Toru IMAI  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E90-D   No.5   pp.835-843
Publication Date: 2007/05/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e90-d.5.835
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword: word error rate minimization, maximum entropy, support vector machines, n-best rescoring


Summary: 
This paper describes a new criterion for speech recognition that uses an integrated confidence measure to minimize the word error rate (WER). Conventional criteria for WER minimization obtain the expected WER of a sentence hypothesis merely by comparing it with the other hypotheses in an n-best list. The proposed criterion instead estimates the expected WER by using an integrated confidence measure together with word posterior probabilities for a given acoustic input. The integrated confidence measure, implemented as a classifier based on maximum entropy (ME) modeling or support vector machines (SVMs), yields probabilities reflecting whether each word hypothesis is correct. The classifier combines a variety of confidence measures and can handle a temporal sequence of them to attain a more reliable confidence. In transcribing Japanese broadcast news in various environments, such as noisy field conditions and spontaneous speech, the proposed criterion achieved a WER of 9.8%, a 3.9% reduction relative to conventional n-best rescoring methods.
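
The two ways of estimating expected word errors contrasted in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy hypotheses, and the posterior values are hypothetical. In particular, `expected_errors_confidence` assumes that the expected number of errors can be approximated by summing one minus a per-word correctness probability, which is a simplification chosen for illustration; the classifier that would produce those probabilities (ME or SVM) is not shown.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance between two word sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1]


def expected_errors_nbest(nbest: List[Tuple[List[str], float]]) -> List[float]:
    """Conventional style: score each hypothesis by its edit distance to the
    other hypotheses in the n-best list, weighted by normalized posteriors."""
    total = sum(p for _, p in nbest)
    return [
        sum((p / total) * edit_distance(other, words) for other, p in nbest)
        for words, _ in nbest
    ]


def expected_errors_confidence(word_confidences: List[List[float]]) -> List[float]:
    """ASSUMPTION (illustrative only): approximate the expected errors of a
    hypothesis as the sum of (1 - P(word is correct)) over its words."""
    return [sum(1.0 - c for c in confs) for confs in word_confidences]


def argmin(values: Sequence[float]) -> int:
    """Index of the smallest value, i.e. the hypothesis to select."""
    return min(range(len(values)), key=values.__getitem__)


# Hypothetical 3-best list: (words, sentence posterior).
nbest = [
    (["the", "cat", "sat"], 0.40),
    (["the", "cat", "sad"], 0.35),
    (["a", "cat", "sat"], 0.25),
]
# Hypothetical per-word correctness probabilities for the same hypotheses.
confidences = [
    [0.9, 0.95, 0.7],
    [0.9, 0.95, 0.3],
    [0.2, 0.95, 0.7],
]

best_conventional = argmin(expected_errors_nbest(nbest))
best_confidence = argmin(expected_errors_confidence(confidences))
```

In this toy setting both criteria select the first hypothesis. The difference is where the evidence comes from: the first score depends only on agreement among the listed hypotheses, while the second depends on per-word correctness estimates that could draw on additional features.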