For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
An Approach Using Combination of Multiple Features through Sigmoid Function for Speech-Presence/Absence Discrimination
Kun-Ching WANG Chiun-Li CHIN
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Publication Date: 2011/08/01
Online ISSN: 1745-1337
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Engineering Acoustics
speech detection, combination of multiple features, bark-scale wavelet decomposition, adaptive frequency sub-band extraction, sigmoid function,
Full Text: PDF>>
In this paper, we present an approach of detecting speech presence for which the decision rule is based on a combination of multiple features using a sigmoid function. A minimum classification error (MCE) training is used to update the weights adjustment for the combination. The features, consisting of three parameters: the ratio of ZCR, the spectral energy, and spectral entropy, are combined linearly with weights derived from the sub-band domain. First, the Bark-scale wavelet decomposition (BSWD) is used to split the input speech into 24 critical sub-bands. Next, the feature parameters are derived from the selected frequency sub-band to form robust voice feature parameters. In order to discard the seriously corrupted frequency sub-band, a strategy of adaptive frequency sub-band extraction (AFSE) dependant on the sub-band SNR is then applied to only the frequency sub-band used. Finally, these three feature parameters, which only consider the useful sub-band, are combined through a sigmoid type function incorporating optimal weights based on MSE training to detect either a speech present frame or a speech absent frame. Experimental results show that the performance of the proposed algorithm is superior to the standard methods such as G.729B and AMR2.