Sparse Time-Varying Complex AR (TV-CAR) Speech Analysis Based on Adaptive LASSO

Keiichi FUNAKI  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E102-A   No.12   pp.1910-1914
Publication Date: 2019/12/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E102.A.1910
Type of Manuscript: Special Section LETTER (Special Section on Smart Multimedia & Communication Systems)
Category: Speech and Hearing
Keywords: sparse LP, time-varying analysis, complex analysis, analytic signal, adaptive LASSO, l1-norm regularization, F0 estimation

Linear Prediction (LP) analysis is commonly used in speech processing. LP is based on the Auto-Regressive (AR) model and estimates the AR model parameters from a signal by l2-norm optimization. Recently, sparse estimation has attracted attention because it can extract significant features from big data. Sparse estimation is realized by l1- or l0-norm optimization or regularization, and sparse LP analysis methods based on l1-norm optimization have been proposed. Since the excitation of speech is not white Gaussian, sparse LP estimation can estimate the parameters more accurately than conventional l2-norm-based LP. These methods, however, are time-invariant and real-valued. We have been studying Time-Varying Complex AR (TV-CAR) analysis for an analytic signal and have evaluated its performance in speech processing. The existing TV-CAR methods are l2-norm methods. In this paper, we propose sparse TV-CAR analysis based on adaptive LASSO (Least Absolute Shrinkage and Selection Operator), an l1-norm regularization method, and evaluate its performance on F0 estimation of speech using IRAPT (Instantaneous RAPT). The experimental results show that the sparse TV-CAR methods perform better under a high level of additive pink noise.
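
To make the idea of l1-norm-regularized LP concrete, the following is a minimal sketch of adaptive LASSO applied to a plain AR model. It is deliberately simplified and is not the method of the paper: it is real-valued and time-invariant, whereas TV-CAR analysis is complex-valued and time-varying. The function names (`lp_design`, `soft_threshold`, `adaptive_lasso`), the coordinate-descent solver, the weight rule w[k] = 1/|a_ls[k]|^gamma, the synthetic signal, and the value of `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def lp_design(x, order):
    """Linear-prediction regression: x[n] ~ sum_k a[k] * x[n-1-k]."""
    N = len(x)
    X = np.column_stack([x[order - 1 - k : N - 1 - k] for k in range(order)])
    y = x[order:]
    return X, y

def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by l1-regularized solvers."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def adaptive_lasso(X, y, lam, gamma=1.0, n_iter=200, eps=1e-8):
    """Minimise 0.5*||y - X a||^2 + lam * sum_k w[k]*|a[k]| by coordinate descent.

    The weights w[k] = 1 / (|a_ls[k]|**gamma + eps) come from a least-squares
    (l2-norm) pilot estimate, so small pilot coefficients are penalised more.
    """
    a_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (np.abs(a_ls) ** gamma + eps)
    a = a_ls.copy()
    col_sq = (X ** 2).sum(axis=0)
    r = y - X @ a                      # current residual
    for _ in range(n_iter):
        for k in range(X.shape[1]):
            r += X[:, k] * a[k]        # remove coordinate k from the fit
            rho = X[:, k] @ r
            a[k] = soft_threshold(rho, lam * w[k]) / col_sq[k]
            r -= X[:, k] * a[k]        # put the updated coordinate back
    return a, a_ls

# Synthetic AR(6) signal with a sparse coefficient vector and a
# non-Gaussian (Laplacian) excitation; all values here are illustrative.
rng = np.random.default_rng(0)
order, N = 6, 2000
true_a = np.array([0.6, 0.0, 0.0, -0.3, 0.0, 0.0])
e = rng.laplace(size=N)
x = np.zeros(N + order)
for n in range(order, N + order):
    x[n] = true_a @ x[n - order:n][::-1] + e[n - order]

X, y = lp_design(x, order)
a, a_ls = adaptive_lasso(X, y, lam=20.0)
print("least squares :", np.round(a_ls, 3))
print("adaptive LASSO:", np.round(a, 3))
```

In this sketch the l2-norm pilot estimate supplies the per-coefficient weights, so coefficients that the pilot fit already found small are penalised more heavily and tend to be driven to exactly zero. Extending the same idea to a complex-valued, time-varying model would require a complex soft-thresholding step and a basis expansion of the coefficients, neither of which is shown here.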