Pitch Estimation and Voicing Classification Using Reconstructed Spectrum from MFCC

JianFeng WU  HuiBin QIN  YongZhu HUA  LingYan FAN  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.2   pp.556-559
Publication Date: 2018/02/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDL8162
Type of Manuscript: LETTER
Category: Speech and Hearing
Keyword: 
pitch estimation, voicing classification, MFCC, GMM


Summary: 
This paper proposes a novel method for pitch estimation and voicing classification using a spectrum reconstructed from Mel-frequency cepstral coefficients (MFCC). The proposed algorithm reconstructs the spectrum from MFCC using the Moore-Penrose pseudo-inverse of the Mel-scale weighting functions. The reconstructed spectrum is then compressed and filtered in the log-frequency domain. Pitch is estimated by modeling the joint density of the pitch frequency and the filtered spectrum with a Gaussian mixture model (GMM). Voicing classification is also performed with a GMM-based model, and test results show that over 99% of frames are correctly classified. The pitch estimation results demonstrate that the proposed GMM-based estimator is highly accurate, with a relative error of 6.68% on the TIMIT database.
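
The spectrum reconstruction step described in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the summary does not give the filterbank design, the number of coefficients, or the sampling rate, so the triangular filters, the orthonormal DCT-II, the 24-filter/512-point/16 kHz settings, and all function names below are assumptions chosen for illustration. The sketch also assumes the MFCC vector is not truncated, so the inverse DCT is exact.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular Mel-scale weighting functions, shape (n_filters, n_fft // 2 + 1).
    Assumed design; the paper's exact filterbank is not stated in the summary."""
    n_bins = n_fft // 2 + 1
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    W = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        W[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return W

def dct_matrix(n_coeffs, n_filters):
    """Orthonormal DCT-II matrix, shape (n_coeffs, n_filters)."""
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_filters)[None, :]
    D = np.sqrt(2.0 / n_filters) * np.cos(np.pi * k * (n + 0.5) / n_filters)
    D[0] /= np.sqrt(2.0)
    return D

def mfcc_from_spectrum(spectrum, W, D):
    """Forward path: Mel weighting, log compression, DCT."""
    return D @ np.log(W @ spectrum + 1e-10)

def spectrum_from_mfcc(mfcc, W, D):
    """Inverse path: inverse DCT, exponentiation, then the Moore-Penrose
    pseudo-inverse of the Mel weighting matrix. Negative values produced by
    the pseudo-inverse are clipped to zero (an assumption of this sketch)."""
    log_mel = D.T @ mfcc                 # exact only when D is square and orthonormal
    mel_energies = np.exp(log_mel)
    spectrum = np.linalg.pinv(W) @ mel_energies
    return np.maximum(spectrum, 0.0)

# Hypothetical settings for illustration only.
W = mel_filterbank(n_filters=24, n_fft=512, sample_rate=16000)
D = dct_matrix(24, 24)
frame_spectrum = np.abs(np.fft.rfft(np.random.default_rng(0).standard_normal(512))) ** 2
reconstructed = spectrum_from_mfcc(mfcc_from_spectrum(frame_spectrum, W, D), W, D)
```

Because the Mel weighting matrix has far fewer rows than frequency bins, the pseudo-inverse returns the minimum-norm spectrum consistent with the Mel energies, which is a smoothed version of the original rather than an exact recovery.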