Noise Robust Speech Recognition Using Subband-Crosscorrelation Analysis

Shoji KAJITA  Kazuya TAKEDA  Fumitada ITAKURA  

IEICE TRANSACTIONS on Information and Systems   Vol.E81-D   No.10   pp.1079-1086
Publication Date: 1998/10/25
Online ISSN: 
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech Processing and Acoustics
subband processing,  autocorrelation,  crosscorrelation,  noise robustness,  DTW word recognition,  

Full Text: PDF(630.6KB)>>
Buy this Article

This paper describes subband-crosscorrelation analysis (SBXCOR) using two input channel signals. SBXCOR is an extended signal processing technique of subband-autocorrelation analysis (SBCOR) that extracts periodicities associated with the inverse of center frequencies present in speech signals. In addition, to extract more periodicity information associated with the inverse of center frequencies, the multi-delay weighting (MDW) processing is applied to SBXCOR. In experiments, the noise robustness of SBXCOR is evaluated using a DTW word recognizer under (1) a simulated acoustic condition with white noise and (2) a real acoustic condition in a sound proof room with human speech-like noise. As the results, under the simulated acoustic condition, it is shown that SBXCOR is more robust than the conventional one-channel SBCOR, but less robust than SBCOR extracted from the two-channel-summed signal. Furthermore, by applying MDW processing, the performance of SBXCOR improved about 2% at SNR 0 dB. The resultant performance of SBXCOR with MDW processing was much better than those of smoothed group delay spectrum (SGDS) and mel-filterbank cepstral coefficient (MFCC) below SNR 10 dB. The results under the real acoustic condition were almost the same as the simulated acoustic condition.