Multi-Task Learning in Deep Neural Networks for Mandarin-English Code-Mixing Speech Recognition

Mengzhe CHEN  Jielin PAN  Qingwei ZHAO  Yonghong YAN  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E99-D   No.10   pp.2554-2557
Publication Date: 2016/10/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2016SLL0004
Type of Manuscript: Special Section LETTER (Special Section on Recent Advances in Machine Learning for Spoken Language Processing)
Category: Acoustic modeling
Keyword:
multi-task learning, deep neural network, Mandarin-English code mixing, speech recognition


Summary: 
Multi-task learning in deep neural networks has proven effective for acoustic modeling in speech recognition. In this paper, the technique is applied to Mandarin-English code-mixing recognition. Alongside the primary task of senone classification, three auxiliary-task schemes are proposed to introduce language information into the network and improve the prediction of language switching. On a real-world Mandarin-English test corpus from mobile voice search, the proposed schemes improved recognition of both languages and reduced the overall error rate by a relative 3.5%, 3.8% and 5.8%, respectively.
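
The general idea of training a shared network on a primary senone task plus an auxiliary language task can be illustrated with a minimal sketch. The summary does not specify the three auxiliary schemes, so everything here is an assumption made for illustration: a frame-level language-identification auxiliary task, a single shared ReLU hidden layer, the weighting factor `aux_weight`, and all function and parameter names. It shows only how the two cross-entropy losses might be combined, not the authors' architecture.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    # weights is a list of rows, one row per output unit.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(probs, target):
    # Negative log-likelihood of the target class.
    return -math.log(probs[target])


def init_params(n_in, n_hidden, n_senones, n_langs, seed=0):
    # Hypothetical parameter layout: one shared layer, two output heads.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "W_h": mat(n_hidden, n_in), "b_h": [0.0] * n_hidden,
        "W_s": mat(n_senones, n_hidden), "b_s": [0.0] * n_senones,
        "W_l": mat(n_langs, n_hidden), "b_l": [0.0] * n_langs,
    }


def multitask_loss(x, senone_target, lang_target, params, aux_weight=0.3):
    # Shared hidden representation used by both tasks (ReLU activation).
    h = [max(0.0, v) for v in linear(params["W_h"], params["b_h"], x)]
    # Primary head: senone classification.
    senone_probs = softmax(linear(params["W_s"], params["b_s"], h))
    # Auxiliary head: language label (assumed task, e.g. Mandarin vs. English).
    lang_probs = softmax(linear(params["W_l"], params["b_l"], h))
    primary = cross_entropy(senone_probs, senone_target)
    auxiliary = cross_entropy(lang_probs, lang_target)
    # The auxiliary loss is down-weighted; 0.3 is an arbitrary example value.
    return primary + aux_weight * auxiliary, primary, auxiliary


# Toy frame: 4 input features, 6 senones, 2 languages (all sizes hypothetical).
params = init_params(n_in=4, n_hidden=8, n_senones=6, n_langs=2)
frame = [0.5, -1.0, 0.2, 0.0]
total, primary, auxiliary = multitask_loss(frame, 3, 1, params)
```

In a setup like this, the auxiliary head would typically be used only during training and discarded at decoding time, so the recognizer still consumes the senone posteriors alone.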