Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability

Wenhua SHI  Xiongwei ZHANG  Xia ZOU  Meng SUN  Wei HAN  Li LI  Gang MIN  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E101-A   No.3   pp.585-589
Publication Date: 2018/03/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E101.A.585
Type of Manuscript: LETTER
Category: Noise and Vibration
Keyword: speech enhancement, deep neural networks, sparse and low-rank decomposition, supervised learning

Summary: 
This letter proposes a monaural speech enhancement method that combines a deep neural network (DNN) with low-rank analysis and speech present probability (SPP). Low-rank and sparse analysis is first applied to the noisy speech spectrogram to obtain an approximate low-rank representation of the noise. A joint feature training strategy for DNN-based speech enhancement is then presented, which helps the DNN better predict the target speech. To reduce the residual noise in highly overlapping regions and at high frequencies, SPP-weighted post-processing is applied to further improve the quality of the speech enhanced by the trained DNN model. Compared with supervised non-negative matrix factorization (NMF) and the conventional DNN method, the proposed method achieves better speech enhancement performance under both stationary and non-stationary noise conditions.
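
The first two steps of the summary can be illustrated with a minimal sketch. The summary does not state which decomposition algorithm or feature definition the authors use, so everything below is an assumption: the alternating truncated-SVD and soft-thresholding scheme, the log-magnitude feature stacking, and the names `low_rank_sparse`, `joint_features`, `rank`, and `tau` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink magnitudes toward zero by tau (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def low_rank_sparse(Y, rank=2, tau=0.1, n_iter=30):
    """Alternate a rank-limited fit L and a sparse residual S so that Y ~ L + S.

    Assumed scheme (not taken from the letter): truncated SVD for L,
    soft-thresholding for S.
    """
    S = np.zeros_like(Y)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` approximation of Y - S.
        U, s, Vt = np.linalg.svd(Y - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep only the part of the residual that exceeds tau.
        S = soft_threshold(Y - L, tau)
    return L, S

def joint_features(Y, L, eps=1e-8):
    """Stack log-magnitude noisy features and low-rank (noise estimate) features.

    Assumed feature layout: frequency bins of Y followed by frequency bins of L,
    one column per frame.
    """
    return np.concatenate(
        [np.log(Y + eps), np.log(np.maximum(L, 0.0) + eps)], axis=0
    )

# Toy magnitude "spectrogram": rank-1 noise floor plus a few sparse peaks.
rng = np.random.default_rng(0)
noise = np.outer(rng.uniform(0.5, 1.0, 64), rng.uniform(0.5, 1.0, 40))
speech = np.zeros((64, 40))
speech[rng.integers(0, 64, 50), rng.integers(0, 40, 50)] = 5.0
Y = noise + speech

L, S = low_rank_sparse(Y, rank=1, tau=0.5)
X = joint_features(Y, L)  # candidate DNN input, shape (2 * bins, frames)
```

In this sketch the low-rank part `L` plays the role of the noise estimate and the stacked matrix `X` is what a DNN would take as input, one column per frame. The SPP-weighted post-processing is not sketched because the summary gives no detail about how the weighting is computed.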