A Novel Iterative Speaker Model Alignment Method from Non-Parallel Speech for Voice Conversion
Peng SONG Wenming ZHENG Xinran ZHANG Yun JIN Cheng ZHA Minghai XIN
Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Vol.E98-A
No.10
pp.2178-2181
Publication Date: 2015/10/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E98.A.2178
Type of Manuscript: LETTER
Category: Speech and Hearing
Keyword: non-parallel speech, voice conversion, iterative speaker model alignment, Gaussian mixture model
Summary:
Most current voice conversion methods rely on parallel speech, which is not easily obtained in practice. In this letter, a novel iterative speaker model alignment (ISMA) method is proposed to address this problem. First, the source and target speaker models are each trained from the background model using the maximum a posteriori (MAP) algorithm. Then, the ISMA method is applied to align and transform the spectral features. Finally, the ISMA approach is combined with a Gaussian mixture model (GMM) to further improve the conversion performance. A series of objective and subjective experiments is carried out on the CMU ARCTIC dataset, and the results demonstrate that the proposed method significantly outperforms the state-of-the-art approach.
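The first step in the summary, training each speaker model from the background model by MAP, can be illustrated with a minimal sketch. It assumes the commonly used relevance-factor form of MAP adaptation in which only the means of a diagonal-covariance GMM are updated; the summary does not say which parameters the letter adapts, and the function name `map_adapt_means`, its arguments, and the default `relevance=16.0` are hypothetical choices for illustration. The ISMA update itself is not described in the summary and is not reproduced here.

```python
import numpy as np

def map_adapt_means(ubm_weights, ubm_means, ubm_vars, frames, relevance=16.0):
    """MAP-adapt the means of a diagonal-covariance background GMM.

    Assumed shapes: ubm_weights (K,), ubm_means (K, D), ubm_vars (K, D),
    frames (T, D). Returns adapted means of shape (K, D).
    """
    # Log-density of every frame under every component, shape (T, K).
    diff = frames[:, None, :] - ubm_means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / ubm_vars, axis=2)
                        + np.sum(np.log(2.0 * np.pi * ubm_vars), axis=1))

    # Component posteriors per frame, normalised in the log domain for stability.
    log_post = np.log(ubm_weights) + log_gauss
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Soft counts and posterior-weighted data means per component.
    n_k = post.sum(axis=0)                                   # (K,)
    data_means = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]

    # Interpolate between data mean and background mean; components that
    # see little speaker data stay close to the background model.
    alpha = (n_k / (n_k + relevance))[:, None]
    return alpha * data_means + (1.0 - alpha) * ubm_means
```

Because the source and target models would both be adapted from the same background model, component k of one model keeps the same index as component k of the other, which is one plausible reason such models lend themselves to alignment without parallel utterances.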