Speech Enhancement Using Band-Dependent Spectral Estimators


IEICE TRANSACTIONS on Information and Systems   Vol.E86-D   No.5   pp.937-946
Publication Date: 2003/05/01
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keywords: speech enhancement, MAP estimation, signal reconstruction, digital signal processing


We introduce a speech enhancement algorithm that modifies the spectral representation of degraded speech online so that it approximates the spectral coefficients of high-quality speech. The proposed framework applies the Discrete Fourier Transform (DFT) to a large ensemble of clean speech frames and estimates parametric, heavy-tailed, non-Gaussian probability density functions (pdfs) for the spectral magnitude. Each spectral band of clean speech is assigned its own pdf: the candidate heavy-tailed pdf with the smallest Kullback-Leibler divergence from the non-parametric pdf of that band's magnitude in the clean ensemble. The parameters of the distributions are obtained by Maximum Likelihood Estimation (MLE). A maximum a posteriori (MAP) formulation for the degraded spectral bands leads to soft threshold functions that are derived from the statistics of each spectral band and that effectively reduce white and slowly varying coloured Gaussian noise. We evaluate the new algorithm on both perceived speech quality and Automatic Speech Recognition (ASR), and demonstrate its robustness at SNRs as low as 0 dB.
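
The per-band pdf selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not list the candidate families, so the three used here (exponential, log-normal, Rayleigh) are assumptions chosen because their ML estimates have closed forms. The histogram-based divergence, its direction (empirical against model), and all function names are likewise hypothetical.

```python
import numpy as np

# Hypothetical candidate families with closed-form ML estimates.
# Each fit returns (parameters, log-pdf function).

def fit_exponential(x):
    scale = x.mean()                                  # MLE of the mean
    return {"scale": scale}, lambda t: -np.log(scale) - t / scale

def fit_lognormal(x):
    logx = np.log(x)
    mu, sigma = logx.mean(), logx.std()               # MLE (biased std)
    def logpdf(t):
        return (-np.log(t) - 0.5 * np.log(2 * np.pi * sigma**2)
                - (np.log(t) - mu) ** 2 / (2 * sigma**2))
    return {"mu": mu, "sigma": sigma}, logpdf

def fit_rayleigh(x):
    sigma2 = np.mean(x**2) / 2.0                      # MLE of sigma^2
    return {"sigma2": sigma2}, lambda t: (np.log(t) - np.log(sigma2)
                                          - t**2 / (2 * sigma2))

CANDIDATES = {
    "exponential": fit_exponential,
    "lognormal": fit_lognormal,
    "rayleigh": fit_rayleigh,
}

def kl_to_histogram(x, logpdf, bins=64):
    """KL(empirical || model), both discretised on the same histogram bins."""
    counts, edges = np.histogram(x, bins=bins)
    p = counts / counts.sum()                         # non-parametric pdf
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Model mass per bin, approximated as pdf(center) * width, in the log domain.
    log_q = logpdf(centers) + np.log(np.diff(edges))
    peak = log_q.max()
    log_q = log_q - (peak + np.log(np.exp(log_q - peak).sum()))  # normalise
    mask = p > 0
    return float(np.sum(p[mask] * (np.log(p[mask]) - log_q[mask])))

def select_band_pdf(magnitudes, bins=64):
    """Fit every candidate by MLE and keep the one with the smallest KL."""
    x = np.maximum(np.asarray(magnitudes, dtype=float), 1e-12)  # keep log() finite
    best = None
    for name, fit in CANDIDATES.items():
        params, logpdf = fit(x)
        kl = kl_to_histogram(x, logpdf, bins)
        if best is None or kl < best[2]:
            best = (name, params, kl)
    return best
```

Given a hypothetical array `frames` of windowed clean-speech frames (one frame per row), the magnitudes of band `k` would be `np.abs(np.fft.rfft(frames, axis=1))[:, k]`, and calling `select_band_pdf` on each such column gives one pdf per band. How the selected pdf is turned into the soft threshold function is not specified in the abstract and is left out of the sketch.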