Utterance Verification Using Word Voiceprint Models Based on Probabilistic Distributions of Phone-Level Log-Likelihood Ratio and Phone Duration
Suk-Bong KWON, HoiRin KIM
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E91-D
No.11
pp.2746-2750
Publication Date: 2008/11/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e91-d.11.2746
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Speech and Hearing
Keyword: utterance verification, confidence measure, likelihood ratio testing, word voiceprint
Summary:
This paper proposes word voiceprint models for verifying the recognition results of a speech recognition system. A word voiceprint model carries word-dependent information based on the distributions of the phone-level log-likelihood ratio and phone duration. Using the voiceprint models of a recognized word therefore yields a more reliable confidence score, because the models capture the characteristics relevant to verifying that particular word. In addition, for computing the log-likelihood ratio-based word voiceprint score, the paper proposes a new log-scale normalization function that uses the distribution of the phone-level log-likelihood ratio, in place of the sigmoid function widely used to normalize phone-level log-likelihood ratios. This function emphasizes a misrecognized phone within a word. This word-specific information helps produce a score that discriminates better against out-of-vocabulary words. The proposed method requires additional memory, but it achieves a 16.9% relative reduction in equal error rate compared with a baseline system that uses simple phone log-likelihood ratios.
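
For orientation, the kind of sigmoid-based phone log-likelihood ratio score that the summary contrasts against can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' baseline or their proposed log-scale function (the summary does not give either formula). The function names, the anti-phone model scores, the per-frame duration normalization, the averaging over phones, and the parameters `alpha` and `beta` are all hypothetical choices made for the example.

```python
import math


def sigmoid(x, alpha=1.0, beta=0.0):
    """Map a log-likelihood ratio to (0, 1); alpha and beta are assumed tuning parameters."""
    return 1.0 / (1.0 + math.exp(-alpha * (x - beta)))


def phone_llr(target_loglik, anti_loglik, n_frames):
    """Duration-normalized log-likelihood ratio for one phone segment.

    target_loglik: log-likelihood of the segment under the recognized phone model.
    anti_loglik:   log-likelihood under a competing (anti-phone) model, assumed available.
    n_frames:      phone duration in frames.
    """
    return (target_loglik - anti_loglik) / n_frames


def word_confidence(phones, alpha=1.0, beta=0.0):
    """Average the sigmoid-normalized phone LLRs of a word (an assumed combination rule).

    phones: list of (target_loglik, anti_loglik, n_frames) tuples, one per phone.
    """
    scores = [sigmoid(phone_llr(t, a, n), alpha, beta) for t, a, n in phones]
    return sum(scores) / len(scores)


# Hypothetical values: the second word contains one poorly matched phone.
good_word = [(-50.0, -60.0, 10), (-40.0, -48.0, 8)]
bad_word = [(-50.0, -60.0, 10), (-64.0, -40.0, 8)]
print(word_confidence(good_word) > word_confidence(bad_word))
```

A word would be accepted when its confidence exceeds a threshold chosen on held-out data. Because the sigmoid saturates, a single badly matched phone can only lower the average by a bounded amount, which is one plausible reading of why the paper looks for a normalization that emphasizes a misrecognized phone more strongly.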