Spectral Features Based on Local Normalized Center Moments for Speech Emotion Recognition

Huawei TAO  Ruiyu LIANG  Xinran ZHANG  Li ZHAO  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E99-A    No.10    pp.1863-1866
Publication Date: 2016/10/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E99.A.1863
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: normalized center moments, Hu invariant moments, speech emotion recognition, spectral features


To examine whether rotational invariance plays the main role in spectrogram features, new spectral features based on local normalized center moments, denoted LNCMSF, are proposed. First, LNCMSF uses 2nd-order normalized center moments to describe the local energy distribution of the logarithmic energy spectrum, yielding the normalized center moment spectrograms NC1 and NC2. Second, the DCT (Discrete Cosine Transform) is applied to eliminate the correlation of NC1 and NC2, yielding the high-order cepstral coefficients TNC1 and TNC2. Finally, LNCMSF is formed by combining NC1, NC2, TNC1 and TNC2. The rotational invariance test shows that rotational invariance is not a necessary property of partial spectrogram features. The recognition experiment shows that the maximum UA (Unweighted Average of Class-Wise Recall Rate) of LNCMSF is improved by at least 10.7% and 1.2% compared with that of MFCC (Mel Frequency Cepstrum Coefficient) and HuWSF (Weighted Spectral Features Based on Local Hu Moments), respectively.
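
As a rough illustration of the two building blocks named in the abstract, the following Python sketch computes the 2nd-order normalized central moments of one local block and an unnormalized DCT-II of a sequence. It is a sketch under assumptions, not the authors' implementation. The abstract does not say which moments form NC1 and NC2, how the local blocks are sized, or which DCT variant is used. The function names, the requirement that block values be non-negative with a positive sum, and the standard normalization eta_pq = mu_pq / mu_00^(1 + (p+q)/2) are choices made here for illustration.

```python
import math


def normalized_central_moments_2nd(block):
    """Return (eta20, eta02, eta11) for a 2-D block (list of rows).

    Assumption: values are non-negative with a positive sum, e.g. a
    log-energy patch shifted to be non-negative (hypothetical preprocessing).
    """
    m00 = sum(sum(row) for row in block)
    if m00 <= 0:
        raise ValueError("block must have a positive total sum")

    # Centroid of the block (x = column index, y = row index).
    xbar = sum(x * v for row in block for x, v in enumerate(row)) / m00
    ybar = sum(y * v for y, row in enumerate(block) for v in row) / m00

    # 2nd-order central moments about the centroid.
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(block):
        for x, v in enumerate(row):
            dx, dy = x - xbar, y - ybar
            mu20 += dx * dx * v
            mu02 += dy * dy * v
            mu11 += dx * dy * v

    # For p + q = 2 the normalizing exponent is 1 + (p + q) / 2 = 2.
    norm = m00 ** 2
    return mu20 / norm, mu02 / norm, mu11 / norm


def dct2(seq):
    """Unnormalized DCT-II of a 1-D sequence (direct O(n^2) form)."""
    n = len(seq)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(seq))
        for k in range(n)
    ]
```

Unlike Hu's invariant combinations, the individual 2nd-order moments depend on orientation: swapping the two axes of a block swaps eta20 and eta02. This is the kind of property the rotational invariance test in the abstract is concerned with.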