Speech Recognition of English by Japanese Using Lexicon Represented by Multiple Reduced Phoneme Sets

Xiaoyun WANG  Seiichi YAMAMOTO  

IEICE TRANSACTIONS on Information and Systems   Vol.E98-D   No.12   pp.2271-2279
Publication Date: 2015/12/01
Publicized: 2015/09/10
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015EDP7061
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword: second language (L2) speech recognition, proficiency of L2 speakers, reduced phoneme set (RPS), multiple reduced phoneme sets, proficiency-dependent reduced phoneme set

Recognition of second language (L2) speech remains a challenging task even for state-of-the-art automatic speech recognition (ASR) systems, partly because the pronunciation of L2 speakers is usually strongly influenced by their mother tongue. The authors previously proposed using a reduced phoneme set (RPS) instead of the canonical phoneme set of the L2 when the speakers' mother tongue is known, and demonstrated through experiments on English utterances spoken by Japanese speakers that this reduced phoneme set improved recognition performance. However, the proficiency of L2 speakers varies widely, as does the influence of the mother tongue on their pronunciation. As a result, the effect of the reduced phoneme set differs depending on the speakers' L2 proficiency. In this paper, the authors examine the relation between the proficiency of speakers and a reduced phoneme set customized for them. The experimental results are then used as the basis of a novel speech recognition method that uses a lexicon in which the pronunciation of each lexical item is represented by multiple reduced phoneme sets, and the implementation of a language model most suitable for that lexicon is described. Experimental results demonstrate the high validity of the proposed method.
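
The idea of a lexicon whose entries are represented by multiple reduced phoneme sets can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the phoneme mergers in RPS_LOW and RPS_HIGH, the two-word canonical dictionary, the set names "low" and "high", and the convention of suffixing each phoneme with its set name are hypothetical and are not taken from the paper, whose reduced phoneme sets are derived from data.

```python
# Hypothetical reduced phoneme sets: each maps a canonical phoneme to the
# merged symbol that replaces it. These mergers are illustrative assumptions,
# not the sets derived in the paper.
RPS_LOW = {"L": "R", "TH": "S", "V": "B", "IY": "IH"}   # assumed: lower proficiency
RPS_HIGH = {"TH": "S"}                                   # assumed: higher proficiency

# Tiny canonical lexicon (ARPAbet-style symbols), also illustrative.
CANONICAL = {
    "light": ["L", "AY", "T"],
    "think": ["TH", "IH", "NG", "K"],
}


def reduce_pron(phones, mapping):
    """Rewrite a canonical phoneme sequence using one reduced phoneme set."""
    return [mapping.get(p, p) for p in phones]


def build_lexicon(canonical, reduced_sets):
    """Give every word one pronunciation per reduced phoneme set.

    Each phoneme is suffixed with the name of its set so that units from
    different sets stay distinct (an assumed convention).
    """
    lexicon = {}
    for word, phones in canonical.items():
        lexicon[word] = [
            [f"{p}_{name}" for p in reduce_pron(phones, mapping)]
            for name, mapping in reduced_sets.items()
        ]
    return lexicon


lexicon = build_lexicon(CANONICAL, {"low": RPS_LOW, "high": RPS_HIGH})

for word, prons in lexicon.items():
    for pron in prons:
        print(word, " ".join(pron))
```

In this sketch a decoder would be free to pick, for each word, whichever variant best matches the speaker's pronunciation; how the variants are weighted and how the language model is adapted to such a lexicon is the subject of the paper and is not modelled here.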