Learning Corpus-Invariant Discriminant Feature Representations for Speech Emotion Recognition

Shifeng OU
Zhenbin DU
Yanyan GUO
Wenming MA
Jinglei LIU
Wenming ZHENG

IEICE TRANSACTIONS on Information and Systems Vol.E100-D No.5 pp.1136-1139
Publication Date: 2017/05/01
Publicized: 2017/02/02
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2016EDL8222
Type of Manuscript: LETTER
Category: Speech and Hearing
Keyword: speech emotion recognition, transfer learning, dimensionality reduction

Speech emotion recognition is an active topic in speech signal processing, and recognition methods have developed rapidly in recent years, with some satisfactory results. However, most of these methods are trained and evaluated on the same corpus. In practice, the training data and testing data are often collected from different corpora, and their feature distributions often differ. This mismatch can greatly degrade recognition performance. To tackle this problem, a novel corpus-invariant discriminant feature representation algorithm, called transfer discriminant analysis (TDA), is presented for speech emotion recognition. The basic idea of TDA is to integrate the kernel LDA algorithm and a similarity measure between distributions into a single objective function. Experimental results under cross-corpus conditions show that the proposed method significantly improves recognition rates.
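The idea of folding a discriminant criterion and a distribution-similarity term into one objective can be illustrated with a minimal sketch. This is not the authors' TDA formulation: it assumes a linear (not kernel) LDA criterion, assumes the squared distance between source and target means (a linear form of maximum mean discrepancy) as the similarity measure, and the names `mmd_matrix`, `fit_projection`, `mu`, and `eps` are hypothetical. The abstract specifies none of these details.

```python
import numpy as np

def mmd_matrix(n_src, n_tgt):
    # Coefficient matrix M such that, for stacked data X (source rows first),
    # X.T @ M @ X is the outer product of (source mean - target mean).
    # ASSUMPTION: a linear MMD-style term stands in for the unspecified
    # "similarity measure between distributions".
    e = np.concatenate([np.full(n_src, 1.0 / n_src),
                        np.full(n_tgt, -1.0 / n_tgt)])
    return np.outer(e, e)

def fit_projection(X_src, y_src, X_tgt, n_components=2, mu=1.0, eps=1e-3):
    # X_src: (n_src, d) labeled source features; X_tgt: (n_tgt, d) unlabeled
    # target features. Returns W of shape (d, n_components) and the mean
    # used for centering.
    X_all = np.vstack([X_src, X_tgt])
    mean_all = X_all.mean(axis=0)
    Xs = X_src - mean_all
    Xa = X_all - mean_all
    d = X_src.shape[1]

    # Within-class (Sw) and between-class (Sb) scatter from the source labels.
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    src_mean = Xs.mean(axis=0)
    for c in np.unique(y_src):
        Xc = Xs[y_src == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - src_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)

    # Distribution-mismatch term between the two corpora.
    D = Xa.T @ mmd_matrix(len(X_src), len(X_tgt)) @ Xa

    # Single objective: maximize tr(W.T Sb W) subject to W.T A W = I,
    # where A = Sw + mu * D + eps * I penalizes both class spread and
    # corpus mismatch. Solved by whitening A with its Cholesky factor.
    A = Sw + mu * D + eps * np.eye(d)
    L_inv = np.linalg.inv(np.linalg.cholesky(A))
    vals, vecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    order = np.argsort(vals)[::-1][:n_components]
    W = L_inv.T @ vecs[:, order]
    return W, mean_all
```

In this sketch, `mu` trades class separation against corpus alignment: a larger `mu` forces the projected source and target means closer together. A kernel variant would replace the raw features with kernel evaluations, which is omitted here for brevity.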