Speech Enhancement Combining NMF Weighted by Speech Presence Probability and Statistical Model

Yonggang HU  Xiongwei ZHANG  Xia ZOU  Gang MIN  Meng SUN  Yunfei ZHENG  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E98-A   No.12   pp.2701-2704
Publication Date: 2015/12/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E98.A.2701
Type of Manuscript: LETTER
Category: Speech and Hearing
non-negative matrix factorization,  speech presence probability,  statistical model-based filter,  

Full Text: PDF(436.2KB)>>
Buy this Article

 | Errata[Uploaded on November 1,2016]

The conventional non-negative matrix factorization (NMF)-based speech enhancement is accomplished by updating iteratively with the prior knowledge of the clean speech and noise spectra bases. With the probabilistic estimation of whether the speech is present or not in a certain frame, this letter proposes a speech enhancement algorithm incorporating the speech presence probability (SPP) obtained via noise estimation to the NMF process. To take advantage of both the NMF-based and statistical model-based approaches, the final enhanced speech is achieved by applying a statistical model-based filter to the output of the SPP weighted NMF. Objective evaluations using perceptual evaluation of speech quality (PESQ) on TIMIT with 20 noise types at various signal-to-noise ratio (SNR) levels demonstrate the superiority of the proposed algorithm over the conventional NMF and statistical model-based baselines.