An Approach Using Combination of Multiple Features through Sigmoid Function for Speech-Presence/Absence Discrimination
Kun-Ching WANG, Chiun-Li CHIN
Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Vol.E94-A
No.8
pp.1630-1637
Publication Date: 2011/08/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E94.A.1630
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Engineering Acoustics
Keyword: speech detection, combination of multiple features, bark-scale wavelet decomposition, adaptive frequency sub-band extraction, sigmoid function
Summary:
In this paper, we present an approach to detecting speech presence in which the decision rule is based on a combination of multiple features through a sigmoid function. Minimum classification error (MCE) training is used to adjust the weights of the combination. Three features, the zero-crossing rate (ZCR) ratio, the spectral energy, and the spectral entropy, are derived from the sub-band domain and combined linearly with weights. First, the Bark-scale wavelet decomposition (BSWD) is used to split the input speech into 24 critical sub-bands. Next, the feature parameters are derived from the selected frequency sub-bands to form robust voice feature parameters. To discard seriously corrupted frequency sub-bands, an adaptive frequency sub-band extraction (AFSE) strategy that depends on the sub-band SNR is applied, so that only the useful sub-bands are retained. Finally, the three feature parameters, which consider only the useful sub-bands, are combined through a sigmoid-type function incorporating optimal weights obtained by MCE training to classify each frame as speech-present or speech-absent. Experimental results show that the proposed algorithm outperforms standard methods such as G.729B and AMR2.
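The decision rule summarized above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the SNR floor, the bias term, the 0.5 decision threshold, and the example weights are all assumptions made for illustration, and the MCE training that would produce the weights is not shown.

```python
import math

def select_subbands(subband_snr_db, snr_floor_db=0.0):
    """Return indices of sub-bands whose SNR is at or above a floor.

    Stand-in for the AFSE step; the fixed floor is a hypothetical
    simplification, not the adaptive rule used in the paper.
    """
    return [i for i, snr in enumerate(subband_snr_db) if snr >= snr_floor_db]

def speech_presence_score(features, weights, bias=0.0):
    """Combine features linearly, then squash the sum with a sigmoid.

    features: (ZCR ratio, spectral energy, spectral entropy), assumed to be
    computed already from the retained sub-bands.
    weights:  one weight per feature (assumed to come from training).
    """
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def is_speech(features, weights, bias=0.0, threshold=0.5):
    """Label a frame as speech-present when the score reaches the threshold."""
    return speech_presence_score(features, weights, bias) >= threshold
```

For example, `select_subbands([5.0, -3.0, 0.0, 10.0])` keeps sub-bands 0, 2, and 3, and a frame whose weighted feature sum is zero receives a score of exactly 0.5, which sits on the assumed decision boundary.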