Music Source Enhancement Using a Convolutional Denoising Autoencoder and Log-frequency Scale Spectral Features

Kento OHTANI  Kenta NIWA  Takanori NISHINO  Kazuya TAKEDA  

Publication
D - Abstracts of IEICE TRANSACTIONS on Information and Systems (Japanese Edition)   Vol.J101-D   No.3   pp.615-627
Publication Date: 2018/03/01
Online ISSN: 1881-0225
Type of Manuscript: Special Section PAPER (Special Section on Student Research)
Keyword: blind source enhancement, convolutional neural network (CNN), denoising autoencoder (DAE), source-filter model, log-frequency scale amplitude spectrum

Full Text(in Japanese): PDF(1.8MB)

Summary: 
We propose a music source enhancement technique that uses a convolutional denoising autoencoder (CDAE) and log-frequency scale amplitude spectral features of musical instrument signals. The network structure of the CDAE incorporates the amplitude spectral characteristics of musical instrument sounds in order to estimate the amplitude spectra of the target signals. Evaluation results show that the proposed network achieves better signal-to-interference ratios (SIRs) than conventional combinations of network structure and input features. We also propose a complementary CDAE approach, which estimates the target and noise amplitude spectra simultaneously and combines the two estimates. Using the complementary CDAE further improves the SIRs of the estimated music signals.
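
The summary does not give implementation details, so the following is only a minimal sketch of the two ideas it names, not the authors' method. It assumes that a log-frequency scale amplitude spectrum can be approximated by interpolating a linear-frequency (FFT) amplitude spectrum onto geometrically spaced frequencies, and that the target and noise estimates of a complementary model are combined with a ratio mask. The function names, the bin count, the minimum frequency, and the ratio-mask rule are all hypothetical choices made for illustration.

```python
import numpy as np


def log_frequency_spectrum(amp, sample_rate, n_bins=64, f_min=50.0):
    """Resample a linear-frequency amplitude spectrum onto a log-spaced axis.

    amp: 1-D array of amplitudes from 0 Hz to the Nyquist frequency
         (e.g. np.abs(np.fft.rfft(frame))).
    n_bins and f_min are illustrative assumptions, not values from the paper.
    """
    amp = np.asarray(amp, dtype=float)
    nyquist = sample_rate / 2.0
    # Frequencies of the original, linearly spaced FFT bins.
    linear_freqs = np.linspace(0.0, nyquist, amp.shape[0])
    # Geometrically (log) spaced target frequencies.
    log_freqs = np.geomspace(f_min, nyquist, n_bins)
    # Linear interpolation of the amplitudes at the log-spaced frequencies.
    return log_freqs, np.interp(log_freqs, linear_freqs, amp)


def combine_complementary(mixture_amp, target_est, noise_est, eps=1e-8):
    """Combine target and noise amplitude estimates with a ratio mask.

    The ratio mask is an assumed combination rule; the summary only says
    that the two estimates are combined.
    """
    mixture_amp = np.asarray(mixture_amp, dtype=float)
    target_est = np.asarray(target_est, dtype=float)
    noise_est = np.asarray(noise_est, dtype=float)
    mask = target_est / (target_est + noise_est + eps)
    return mask * mixture_amp


if __name__ == "__main__":
    # Example: one Hann-windowed frame of a 440 Hz tone sampled at 16 kHz.
    sr, n_fft = 16000, 1024
    t = np.arange(n_fft) / sr
    frame = np.sin(2 * np.pi * 440.0 * t) * np.hanning(n_fft)
    freqs, log_amp = log_frequency_spectrum(np.abs(np.fft.rfft(frame)), sr)
    print(freqs[np.argmax(log_amp)])  # frequency of the strongest log-scale bin
```

On a log-frequency axis the harmonics of a note keep the same relative spacing when the pitch changes, which is one common motivation for pairing such features with convolutional layers; whether that is the exact rationale in the paper is not stated in the summary.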