For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech
Fengpei GE Changliang LIU Jian SHAO Fuping PAN Bin DONG Yonghong YAN
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2008/10/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
CALL, speech recognition, HLDA, speaker-dependent CMN, e-learning,
Full Text: PDF(734.2KB)>>
In this paper we present our investigation into improving the performance of our computer-assisted language learning (CALL) system through exploiting the acoustic model and features within the speech recognition framework. First, to alleviate channel distortion, speaker-dependent cepstrum mean normalization (CMN) is adopted and the average correlation coefficient (average CC) between machine and expert scores is improved from 78.00% to 84.14%. Second, heteroscedastic linear discriminant analysis (HLDA) is adopted to enhance the discriminability of the acoustic model, which successfully increases the average CC from 84.14% to 84.62%. Additionally, HLDA causes the scoring accuracy to be more stable at various pronunciation proficiency levels, and thus leads to an increase in the speaker correct-rank rate from 85.59% to 90.99%. Finally, we use maximum a posteriori (MAP) estimation to tune the acoustic model to fit strongly accented test speech. As a result, the average CC is improved from 84.62% to 86.57%. These three novel techniques improve the accuracy of evaluating pronunciation quality.