A Robust Speaker Identification System Based on Wavelet Transform

Ching-Tang HSIEH  You-Chuang WANG  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E84-D   No.7   pp.839-846
Publication Date: 2001/07/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Biometric Person Authentication)
Keyword: wavelet transform, quadrature mirror filters, linear predict coding cepstrum, Mandarin Speech Across Taiwan (MAT)

Summary: 
A new approach is presented for extracting speaker-specific features from speech signals. Exploiting the multiresolution property of the wavelet transform, quadrature mirror filters (QMFs) derived by Daubechies are used to decompose the input signal into different frequency channels. Owing to the uncorrelated nature of the resolutions produced by the QMFs, the linear predictive coding cepstrum (LPCC) of the lower-frequency region and the entropy of the higher-frequency region are calculated at each decomposition step and used as the speech feature vectors. In addition, hard thresholding is applied to the lower resolution at each decomposition step to remove the effect of noise interference. Experimental results show that this mechanism not only effectively reduces the effect of noise interference but also improves the recognition rate. The proposed feature extraction algorithm is evaluated on the MAT telephone speech database for text-independent speaker identification using vector quantization (VQ). Several popular existing methods are also evaluated for comparison. The experimental results show that the proposed method is more effective and robust than the other methods. For 80 speakers and 2-second utterances, the identification rate is 98.52%. The method also performs very satisfactorily even at low SNR.
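
The feature extraction pipeline described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation. The summary does not state the filter length, the number of decomposition levels, the LPC order, the threshold value, or the exact entropy definition, so the 4-tap Daubechies filter, the defaults `levels=3` and `lpc_order=4`, the threshold `thr`, the Shannon entropy of normalized coefficient energy, and all function names below are assumptions chosen only for illustration.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# 4-tap Daubechies low-pass analysis filter (assumed filter length).
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Quadrature mirror relation: g[k] = (-1)^k * h[N-1-k].
HIGH = [((-1) ** k) * LOW[len(LOW) - 1 - k] for k in range(len(LOW))]


def analysis_step(x):
    """One QMF split into low and high bands (periodic extension, downsample by 2)."""
    n = len(x)  # expected to be even and at least 4
    approx, detail = [], []
    for i in range(0, n, 2):
        approx.append(sum(LOW[k] * x[(i + k) % n] for k in range(4)))
        detail.append(sum(HIGH[k] * x[(i + k) % n] for k in range(4)))
    return approx, detail


def hard_threshold(coeffs, thr):
    """Zero every coefficient whose magnitude does not exceed thr."""
    return [c if abs(c) > thr else 0.0 for c in coeffs]


def energy_entropy(coeffs):
    """Shannon entropy of the normalized coefficient energies (assumed definition)."""
    energy = [c * c for c in coeffs]
    total = sum(energy)
    if total == 0.0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0.0)


def lpc(x, order):
    """Predictor coefficients a[1..p] via autocorrelation and Levinson-Durbin."""
    n = len(x)
    r = [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate band: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lpcc(a):
    """LPC cepstrum: c_m = a_m + sum_{k=1}^{m-1} (k/m) * c_k * a_{m-k}."""
    p = len(a)
    c = [0.0] * p
    for m in range(1, p + 1):
        c[m - 1] = a[m - 1] + sum((k / m) * c[k - 1] * a[m - k - 1]
                                  for k in range(1, m))
    return c


def extract_features(frame, levels=3, lpc_order=4, thr=0.0):
    """Per level: threshold the low band, then append its LPCC and the high-band entropy."""
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = analysis_step(approx)
        approx = hard_threshold(approx, thr)
        features.extend(lpcc(lpc(approx, lpc_order)))
        features.append(energy_entropy(detail))
    return features
```

For a frame whose length is divisible by 2 at every level, `extract_features(frame)` returns `levels * (lpc_order + 1)` values. In a VQ-based identification setup such as the one the summary mentions, vectors of this kind would be used to train one codebook per speaker; that stage is not shown here.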