Noise Post-Processing for Low Bit-Rate CELP Coders

Hiroyuki EHARA  Kazutoshi YASUNAGA  Koji YOSHIDA  Yusuke HIWASAKI  Kazunori MANO  Takao KANEKO  

IEICE TRANSACTIONS on Information and Systems   Vol.E87-D   No.6   pp.1507-1516
Publication Date: 2004/06/01
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keywords: CELP, background noise, post-processing, frame erasure, ITU-T

This paper presents a newly developed noise post-processing (NPP) algorithm and the results of several tests demonstrating its subjective performance. The NPP algorithm is designed to improve the subjective quality of low bit-rate code-excited linear prediction (CELP) decoding under background noise conditions, and it is built around a stationary noise generator. A backward-adaptive detector identifies noisy input signal frames from the decoded line spectral frequency (LSF), energy, and pitch parameters. The noise generator estimates and produces stationary noise signals using past LSF and energy parameters. It also includes a frame erasure concealment (FEC) scheme designed for stationary noise signals, which improves the speech decoder's robustness to frame erasure under background noise conditions. The algorithm has been applied to the following CELP decoders: 1) a candidate algorithm for the ITU-T 4-kbit/s speech coding standard and 2) existing ITU-T standards, the G.729 and G.723.1 series. In both cases, NPP improved the subjective performance of the baseline decoders. In our subjective tests, NPP yielded improvements of approximately 0.25 CMOS (CCR MOS: comparison category rating mean opinion score) for the 4-kbit/s decoder and around 0.2-0.8 DMOS (DCR MOS: degradation category rating mean opinion score) for the G.729/G.723.1 decoders. Other test results show that NPP improves the subjective performance of a G.729 decoder by around 0.45 DMOS under both error-free and frame-erasure conditions, and that the FEC scheme in the noise generator provides a further improvement of around 0.2 DMOS.
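The abstract gives no implementation details, so the following Python is only a rough sketch of how the pieces it names could fit together: a backward-adaptive noise-frame decision driven by decoded parameters, a running estimate of stationary-noise LSF and energy, and reuse of that estimate when a frame is erased. The class name, the thresholds, the smoothing factor, and the use of pitch gain as the "pitch parameter" are all illustrative assumptions, not values or methods taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class StationaryNoiseTracker:
    """Hypothetical tracker of stationary-noise parameters (illustration only)."""
    alpha: float = 0.9            # smoothing factor (assumed value)
    lsf_tol: float = 0.05         # max mean LSF change for a stationary frame (assumed)
    energy_tol_db: float = 3.0    # max frame-to-frame energy change in dB (assumed)
    pitch_gain_max: float = 0.4   # frames above this are treated as voiced (assumed)
    avg_lsf: Optional[List[float]] = None
    avg_energy_db: Optional[float] = None
    prev_lsf: Optional[List[float]] = None
    prev_energy_db: Optional[float] = None

    def is_noise_frame(self, lsf: Sequence[float], energy_db: float,
                       pitch_gain: float) -> bool:
        """Backward-adaptive decision: uses only already-decoded parameters."""
        if self.prev_lsf is None or self.prev_energy_db is None:
            return False  # no history yet, so make no noise decision
        lsf_change = sum(abs(a - b) for a, b in zip(lsf, self.prev_lsf)) / len(lsf)
        energy_change = abs(energy_db - self.prev_energy_db)
        return (lsf_change < self.lsf_tol
                and energy_change < self.energy_tol_db
                and pitch_gain < self.pitch_gain_max)

    def update(self, lsf: Optional[Sequence[float]], energy_db: Optional[float],
               pitch_gain: Optional[float],
               erased: bool = False) -> Optional[Tuple[List[float], float]]:
        """Return (lsf, energy_db) for a noise generator, or None for non-noise frames."""
        if erased:
            # Concealment: reuse the smoothed noise estimate for the lost frame.
            if self.avg_lsf is None or self.avg_energy_db is None:
                return None
            return list(self.avg_lsf), self.avg_energy_db

        noise = self.is_noise_frame(lsf, energy_db, pitch_gain)
        self.prev_lsf, self.prev_energy_db = list(lsf), energy_db
        if not noise:
            return None

        if self.avg_lsf is None or self.avg_energy_db is None:
            # First noise frame initialises the estimate directly.
            self.avg_lsf, self.avg_energy_db = list(lsf), energy_db
        else:
            # Exponential smoothing over past noise frames.
            a = self.alpha
            self.avg_lsf = [a * m + (1 - a) * x for m, x in zip(self.avg_lsf, lsf)]
            self.avg_energy_db = a * self.avg_energy_db + (1 - a) * energy_db
        return list(self.avg_lsf), self.avg_energy_db
```

In this sketch, a decoder loop would call `update(...)` once per frame with the decoded parameters, pass `erased=True` when the frame is lost, and feed any returned parameters to a separate noise synthesis stage, which is not shown here.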