Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System

Hongcui WANG  Tatsuya KAWAHARA  

IEICE TRANSACTIONS on Information and Systems   Vol.E92-D   No.12   pp.2462-2468
Publication Date: 2009/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E92.D.2462
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword: speech recognition, CALL, grammar network, decision tree

CALL (Computer Assisted Language Learning) systems that use ASR (Automatic Speech Recognition) for second language learning have received increasing interest in recent years. However, achieving high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers, remains a challenge. Conventionally, possible error patterns derived from linguistic knowledge are added to the lexicon and language model, or to the ASR grammar network. However, this approach is caught in a trade-off: covering more errors increases perplexity. To solve this problem, we propose a decision-tree-based method that learns to predict the errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that, given a target sentence, the proposed method can effectively generate an ASR grammar network that achieves both better error coverage and lower perplexity, resulting in a significant improvement in ASR accuracy.
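
The trade-off between error coverage and perplexity can be illustrated with a small sketch. This is not the authors' implementation: the candidate error patterns, the error-type labels, the observed errors, and the function names are all hypothetical, and the fixed rule in `keep_predicted` merely stands in for a learned decision tree. Perplexity is approximated by assuming that every alternative in a slot of the grammar network is equally likely, which reduces to the geometric mean of the slot sizes. The toy data are chosen so that pruning loses no coverage; real data would not be this clean.

```python
import math

# Hypothetical candidate error patterns for each word of a target sentence,
# tagged with an assumed error type. Illustrative only, not data from the paper.
CANDIDATES = {
    "gakkou": [("gakko", "vowel_length"),
               ("gako", "double_consonant"),
               ("kakkou", "voicing")],
    "ni": [("de", "particle"), ("wo", "particle")],
    "ikimasu": [("ikimashita", "tense"), ("iku", "politeness")],
}


def build_network(sentence, keep):
    """Return one list of alternatives per word: the correct word plus kept errors."""
    return [
        [word] + [v for v, t in CANDIDATES.get(word, []) if keep(word, v, t)]
        for word in sentence
    ]


def uniform_perplexity(network):
    """Perplexity when all alternatives in a slot are equally likely
    (the geometric mean of the slot sizes)."""
    return math.exp(sum(math.log(len(slot)) for slot in network) / len(network))


def coverage(network, observed):
    """Fraction of observed (position, word) errors present in the network."""
    hits = sum(1 for i, w in observed if w in network[i])
    return hits / len(observed)


def keep_all(word, variant, err_type):
    # Conventional approach: add every candidate error pattern.
    return True


def keep_predicted(word, variant, err_type):
    # Stand-in for a learned decision tree: a fixed rule on the error type.
    return err_type in {"vowel_length", "particle"}


sentence = ["gakkou", "ni", "ikimasu"]
observed = [(0, "gakko"), (1, "de"), (1, "wo")]  # hypothetical learner errors

full = build_network(sentence, keep_all)
pruned = build_network(sentence, keep_predicted)

print("full:  ", uniform_perplexity(full), coverage(full, observed))
print("pruned:", uniform_perplexity(pruned), coverage(pruned, observed))
```

In this toy setting the pruned network keeps only the error patterns that the rule predicts, so each slot has fewer alternatives and the uniform perplexity drops, while the hypothetical observed errors remain covered.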