Characteristics of Multi-Layer Perceptron Models in Enhancing Degraded Speech

Thanh Tung LE  John MASON  Tadashi KITAMURA 

Publication
IEICE TRANSACTIONS on Information and Systems  Vol.E78-D  No.6  pp.744-750
Publication Date: 1995/06/20
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Spoken Language Processing)
Keyword: 
multi-layer perceptron, speech enhancement, non-linear time-domain filtering, CELP coding

Summary: 
A multi-layer perceptron (MLP) acting directly in the time domain is applied as a speech signal enhancer, and its performance is examined for three common classes of degradation: low bit-rate CELP coding (a non-linear system degradation), additive noise, and convolution by a linear system. The investigation focuses on two topics: (i) the influence of non-linearities within the network and (ii) network topology, comparing single- and multiple-output structures. The objective is to examine how these characteristics influence network performance and whether this influence depends on the class of degradation. Experimental results show the importance of matching the enhancer to the class of degradation. In the case of the CELP coder, the standard MLP with its inherently non-linear characteristics is shown to be consistently better than any equivalent linear structure (up to 3.2 dB compared with 1.6 dB SNR improvement). In contrast, when the degradation is from additive noise, a linear enhancer is always superior.
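
The comparison described in the summary can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a one-hidden-layer network fed with a sliding window of degraded samples, a tanh hidden non-linearity that can be switched off to obtain an "equivalent linear structure", and an output layer whose width selects a single- or multiple-output topology. The window width, hidden size, initialisation, and all function names are hypothetical, and training is omitted; the summary specifies none of these.

```python
import numpy as np

def make_windows(x, width):
    """Stack sliding windows of `width` consecutive time-domain samples."""
    return np.stack([x[i:i + width] for i in range(len(x) - width + 1)])

def init_params(width, hidden, outputs, seed=0):
    """Random weights for a one-hidden-layer network (sizes are assumptions)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (width, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, outputs)),
        "b2": np.zeros(outputs),
    }

def enhance(windows, params, linear=False):
    """Forward pass; linear=True removes the hidden non-linearity."""
    h = windows @ params["W1"] + params["b1"]
    if not linear:
        h = np.tanh(h)  # assumed non-linearity, not stated in the summary
    return h @ params["W2"] + params["b2"]

def snr_db(clean, estimate):
    """Signal-to-noise ratio of `estimate` relative to `clean`, in dB."""
    err = clean - estimate
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(err ** 2))

def snr_improvement_db(clean, degraded, enhanced):
    """SNR gain of the enhanced signal over the degraded input, in dB."""
    return snr_db(clean, enhanced) - snr_db(clean, degraded)
```

With `outputs=1` each window yields one enhanced sample, while `outputs > 1` yields a block of samples per window. Calling `enhance(..., linear=True)` with the same parameters gives the linear counterpart, and `snr_improvement_db` is one way to express the kind of SNR improvement figure quoted in the summary.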