Supervised Single-Channel Speech Separation via Sparse Decomposition Using Periodic Signal Models

Makoto NAKASHIZUKA  Hiroyuki OKUMURA  Youji IIGUNI  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E95-A   No.5   pp.853-866
Publication Date: 2012/05/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E95.A.853
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Engineering Acoustics
Keyword: 
signal representation, signal decomposition, sparse representation, harmonic analysis, signal separation

Summary: 
We propose a method for supervised single-channel speech separation based on sparse decomposition with periodic signal models. The sparse decomposition breaks the mixed signal into a set of periodic signals under a sparsity penalty. To achieve separation, each decomposed periodic signal must then be assigned to its source. For this assignment, we first use the K-means algorithm to group the decomposed periodic signals into as many clusters as there are speakers, and then assign each cluster to its corresponding speaker using codebooks learned in advance. In separation experiments, we compare our method with MaxVQ, which performs separation in the frequency spectrum domain. The results, measured by signal-to-distortion ratio, show that the proposed sparse decomposition method is comparable to the frequency-domain approach and has a lower computational cost for assigning speech components.
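
The assignment stage described in the summary (cluster the decomposed components, then match each cluster to a speaker through a codebook) can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not say which features describe a periodic component, which distance is used, or how ties between clusters are resolved. The sketch assumes that each component is already summarized by a fixed-length feature vector, that distances are Euclidean, and that clusters map one-to-one onto speakers. The function names `kmeans` and `assign_clusters_to_speakers` and the toy data are hypothetical.

```python
import numpy as np
from itertools import permutations


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means on row vectors; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance of every point to every centroid.
        d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j)
            else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    # Final labels consistent with the returned centroids.
    d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), centroids


def assign_clusters_to_speakers(centroids, codebooks):
    """Return a list mapping cluster index -> speaker index.

    Assumption: the cost of pairing a cluster with a speaker is the distance
    from the cluster centroid to the nearest codeword of that speaker, and
    the one-to-one pairing with the lowest total cost is chosen.
    """
    k = len(centroids)
    cost = np.empty((k, len(codebooks)))
    for i, c in enumerate(centroids):
        for s, cb in enumerate(codebooks):
            cost[i, s] = np.min(np.linalg.norm(cb - c, axis=1))
    # Exhaustive search is fine for the small speaker counts assumed here.
    best = min(
        permutations(range(len(codebooks)), k),
        key=lambda p: sum(cost[i, p[i]] for i in range(k)),
    )
    return list(best)


# Toy data (hypothetical): two speakers, 2-D features, two codewords each.
rng = np.random.default_rng(1)
codebooks = [
    np.array([[0.0, 0.0], [0.5, 0.0]]),  # speaker 0
    np.array([[5.0, 5.0], [5.5, 5.0]]),  # speaker 1
]
feats = np.vstack([
    rng.normal([5.2, 5.0], 0.1, size=(10, 2)),  # components near speaker 1
    rng.normal([0.2, 0.0], 0.1, size=(10, 2)),  # components near speaker 0
])

labels, centroids = kmeans(feats, k=2)
cluster_to_speaker = assign_clusters_to_speakers(centroids, codebooks)
speaker_of_component = [cluster_to_speaker[l] for l in labels]
```

In this sketch the codebooks are consulted once per cluster rather than once per component, which is one plausible reading of why assigning clusters instead of individual components could reduce the assignment cost. The summary itself does not give the mechanism.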