Speech Emotion Recognition Using Transfer Learning
Peng SONG Yun JIN Li ZHAO Minghai XIN
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2014/09/01
Online ISSN: 1745-1361
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: speech emotion recognition, transfer learning, cross-corpus, dimension reduction
A major challenge in speech emotion recognition is that recognition rates drop markedly when the training and deployment conditions do not share the same speech corpus. Transfer learning, which has been applied successfully to cross-domain classification and recognition problems, is presented here for cross-corpus speech emotion recognition. First, maximum mean discrepancy embedding (MMDE) optimization and dimension reduction algorithms are used to obtain two close low-dimensional feature spaces, one for the source speech corpus and one for the target speech corpus. Then, a classifier is trained on the learned low-dimensional features of the labeled source corpus and applied directly to the unlabeled target corpus to recognize emotion labels. Experimental results demonstrate that the transfer learning method significantly outperforms the traditional automatic recognition technique for cross-corpus speech emotion recognition.
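To make the notion of "close" feature spaces concrete, the following is a minimal sketch of the empirical maximum mean discrepancy (MMD), the quantity that an MMDE-style optimization would seek to reduce between source and target corpora. It is not the authors' implementation: the Gaussian kernel, the `gamma` value, the function names, and the synthetic Gaussian "features" are all assumptions made for illustration, and the embedding and classifier steps are not shown.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=0.1):
    """Biased empirical estimate of the squared MMD between two samples."""
    return (mean_kernel(source, source, gamma)
            + mean_kernel(target, target, gamma)
            - 2.0 * mean_kernel(source, target, gamma))


def sample(rng, n, dim, mean):
    """Hypothetical stand-in for per-utterance acoustic feature vectors."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
source = sample(rng, 100, 5, 0.0)       # plays the role of the labeled source corpus
target_near = sample(rng, 100, 5, 0.0)  # target drawn from the same distribution
target_far = sample(rng, 100, 5, 2.0)   # target with a shifted distribution

mmd_near = mmd_squared(source, target_near)
mmd_far = mmd_squared(source, target_far)
print(f"similar corpora: {mmd_near:.4f}, shifted corpora: {mmd_far:.4f}")
```

A smaller value indicates that the two samples are harder to tell apart under the chosen kernel, which is the condition under which a classifier trained on the source features can reasonably be expected to carry over to the target features.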