Spectral Subtraction Based on Statistical Criteria of the Spectral Distribution

Hidetoshi NAKASHIMA, Yoshifumi CHISAKI, Tsuyoshi USAGAWA, Masanao EBATA

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E85-A   No.10   pp.2283-2292
Publication Date: 2002/10/01
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Digital Signal Processing
Keyword:
speech enhancement, musical noise, spectral subtraction, variance, hypothesis testing

Summary: 
This paper addresses a single-channel speech enhancement method that utilizes the mean and variance of the logarithmic noise power spectra. An important issue for any single-channel speech enhancement algorithm is determining the trade-off point between spectral distortion and residual noise, which requires accurate discrimination between speech and noise spectral components. Conventional methods determine the trade-off point using experimentally obtained parameters. As a result, the spectral discrimination is inadequate, and the enhanced speech is degraded by spectral distortion or residual noise. A criterion for determining the trade-off point is therefore necessary. The proposed method determines the trade-off point between spectral distortion and residual noise level by discriminating between speech and noise spectral components on the basis of statistical criteria. The discrimination is performed by hypothesis testing that utilizes the means and variances of the logarithmic power spectra. The spectral components are thereby divided into speech-dominant components and noise-dominant components. For the speech-dominant components, spectral subtraction is performed so as to minimize spectral distortion. For the noise-dominant components, attenuation is performed to reduce the noise level. The performance of the method is evaluated in terms of waveforms, spectrograms, noise reduction level, and a speech recognition task. The noise reduction level and the speech recognition rate both improve, indicating that the method reduces musical noise effectively and improves the quality of the enhanced speech.
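
The processing described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the test statistic, the significance level, or the attenuation rule, so everything specific here is an assumption. In particular, the sketch assumes that the logarithmic noise power in each frequency bin is roughly Gaussian, uses a one-sided test with the hypothetical threshold `z_crit` (1.645 corresponds to a 5% one-sided level under that assumption), takes `exp(mu)` as the noise power estimate, and applies a fixed hypothetical gain `atten` to noise-dominant bins. The function names are illustrative only.

```python
import math
from statistics import mean, pstdev


def estimate_log_noise_stats(noise_frames):
    """Per-bin mean and standard deviation of the log power spectrum,
    computed over frames assumed to contain noise only."""
    n_bins = len(noise_frames[0])
    mu, sigma = [], []
    for k in range(n_bins):
        logs = [math.log(frame[k]) for frame in noise_frames]
        mu.append(mean(logs))
        sigma.append(pstdev(logs))
    return mu, sigma


def enhance_frame(power, mu, sigma, z_crit=1.645, atten=0.1, floor=1e-12):
    """Classify each bin of one power-spectrum frame and process it.

    A bin is treated as speech-dominant when its log power exceeds
    mu + z_crit * sigma (assumed one-sided test); otherwise it is
    treated as noise-dominant.
    """
    out = []
    for p, m, s in zip(power, mu, sigma):
        log_p = math.log(max(p, floor))
        if log_p > m + z_crit * s:
            # Speech-dominant: power spectral subtraction.
            # exp(m) is an assumed noise power estimate.
            out.append(max(p - math.exp(m), floor))
        else:
            # Noise-dominant: fixed attenuation (assumed gain).
            out.append(atten * p)
    return out


# Toy usage: two noise-only frames, then one noisy frame with two bins.
mu, sigma = estimate_log_noise_stats([[1.0, 1.0], [1.0, 1.0]])
enhanced = enhance_frame([10.0, 0.5], mu, sigma)
```

The sketch operates on power spectra only. In a complete system the enhanced magnitude would typically be recombined with the noisy phase before the inverse transform, a step the summary does not describe.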