Joint Optimization of Perceptual Gain Function and Deep Neural Networks for Single-Channel Speech Enhancement
Wei HAN, Xiongwei ZHANG, Gang MIN, Xingyu ZHOU, Meng SUN
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Publication Date: 2017/02/01
Online ISSN: 1745-1337
Type of Manuscript: LETTER
Category: Noise and Vibration
Keywords: speech enhancement, deep neural networks, perceptual gain function, joint optimization
In this letter, we explore the joint optimization of a perceptual gain function and deep neural networks (DNNs) for single-channel speech enhancement. We propose a DNN architecture that incorporates the masking properties of the human auditory system to make the residual noise inaudible. The architecture directly trains a perceptual gain function, which is used to estimate the magnitude spectrum of clean speech from noisy speech features. Experimental results show that the proposed approach achieves significant improvements over the baselines when tested on TIMIT sentences corrupted by various types of noise, regardless of whether the noise conditions were included in the training set.
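The idea of training a gain function inside the network, with the loss measured on the enhanced spectrum, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the letter: a single sigmoid layer stands in for the DNN, the data are random toy magnitudes, and the names `enhance` and `train_step` are hypothetical. The auditory masking model that makes the gain "perceptual" is omitted entirely, so this shows only the general gain-then-loss structure.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def enhance(noisy_mag, weights, bias):
    """Predict a per-bin gain in (0, 1) and apply it to the noisy magnitude."""
    gain = sigmoid(noisy_mag @ weights + bias)
    return gain * noisy_mag, gain

def train_step(noisy_mag, clean_mag, weights, bias, lr=0.05):
    """One gradient step on the MSE between gain * noisy and clean magnitudes.

    The loss is taken after the gain is applied, so the gain is learned
    jointly with the layer that produces it (no separate gain target).
    """
    est, gain = enhance(noisy_mag, weights, bias)
    err = est - clean_mag
    loss = np.mean(err ** 2)
    # Chain rule through est = sigmoid(z) * noisy_mag
    grad_z = 2.0 * err * noisy_mag * gain * (1.0 - gain) / err.size
    grad_w = noisy_mag.T @ grad_z
    grad_b = grad_z.sum(axis=0)
    return weights - lr * grad_w, bias - lr * grad_b, loss

# Toy data (assumption): 64 frames, 8 frequency bins, magnitudes add roughly.
rng = np.random.default_rng(0)
clean = np.abs(rng.normal(size=(64, 8)))
noisy = clean + np.abs(rng.normal(scale=0.5, size=(64, 8)))

weights = np.zeros((8, 8))
bias = np.zeros(8)
losses = []
for _ in range(200):
    weights, bias, loss = train_step(noisy, clean, weights, bias)
    losses.append(loss)

enhanced, gain = enhance(noisy, weights, bias)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In a perceptually motivated variant, the gain would additionally be shaped by a masking threshold so that residual noise falls below audibility; how the letter does this is not specified in the abstract.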