Robust ASR Based on ETSI Advanced Front-End Using Complex Speech Analysis

Keita HIGA  Keiichi FUNAKI  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E98-A   No.11   pp.2211-2219
Publication Date: 2015/11/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E98.A.2211
Type of Manuscript: Special Section PAPER (Special Section on Smart Multimedia & Communication Systems)
robust ASR, ETSI AFE, iterative Wiener filter (IWF), complex speech analysis, analytic signal


The advanced front-end (AFE) for automatic speech recognition (ASR) was standardized by the European Telecommunications Standards Institute (ETSI). The AFE provides speech enhancement by means of an iterative Wiener filter (IWF), which is designed from an FFT spectrum smoothed over adjacent frames. We have previously proposed robust time-varying complex auto-regressive (TV-CAR) speech analysis for an analytic signal and evaluated its performance in speech processing tasks such as F0 estimation and speech enhancement. TV-CAR analysis can estimate the spectrum more accurately than the FFT, especially at low frequencies, owing to the nature of the analytic signal. TV-CAR analysis can also estimate the speech spectrum more accurately in the presence of additive noise. In this paper, a time-invariant version of wide-band TV-CAR analysis is introduced into the IWF of the AFE and evaluated using the CENSREC-2 database and its baseline script.
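
The idea summarized above can be illustrated with a minimal sketch. It is neither the ETSI AFE algorithm nor the authors' robust TV-CAR method. It assumes a plain least-squares, time-invariant complex AR fit on an FFT-derived analytic signal, and a single-pass textbook Wiener gain. All function names (analytic_signal, complex_ar_spectrum, wiener_gain) and parameters are hypothetical and chosen for illustration only.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real frame: keep DC, double positive
    frequencies, zero negative frequencies (FFT-based Hilbert method)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin kept once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def complex_ar_spectrum(z, order, nfft):
    """Least-squares complex AR fit of a complex frame z.
    Returns the AR power spectrum evaluated on nfft frequency bins.
    Assumption: ordinary least squares, not the robust TV-CAR criterion."""
    # Each row holds the `order` past samples used to predict z[t].
    past = np.column_stack(
        [z[order - k:len(z) - k] for k in range(1, order + 1)]
    )
    target = z[order:]
    coeffs, *_ = np.linalg.lstsq(past, target, rcond=None)
    residual = target - past @ coeffs
    noise_power = np.mean(np.abs(residual) ** 2)
    # A(e^{jw}) = 1 - sum_k a_k e^{-jwk}
    denominator = np.fft.fft(np.concatenate(([1.0], -coeffs)), nfft)
    return noise_power / np.abs(denominator) ** 2

def wiener_gain(speech_psd, noise_psd):
    """Textbook Wiener gain per frequency bin, always within [0, 1]."""
    return speech_psd / (speech_psd + noise_psd)
```

In such a sketch, the AR spectrum of the analytic signal would stand in for the smoothed FFT spectrum as the speech power estimate passed to wiener_gain. The noise power estimate is assumed to come from elsewhere, for example from frames judged to contain no speech.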