Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database

Doo Hwa HONG, June Sig SUNG, Kyung Hwan OH, Nam Soo KIM

IEICE TRANSACTIONS on Information and Systems   Vol.E95-D   No.9   pp.2351-2354
Publication Date: 2012/09/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E95.D.2351
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: HMM-based speech synthesis, decision tree-based clustering, outlier detection, insufficient speech database

Decision tree-based clustering and parameter estimation are essential steps in the training stage of an HMM-based speech synthesis system. Both steps are usually performed under the maximum likelihood (ML) criterion. A drawback of the ML criterion, however, is its sensitivity to outliers, which usually degrade the quality of the synthesized speech. In this letter, we propose an approach to detect and remove outliers for HMM-based speech synthesis. Experimental results show that the proposed approach can improve the quality of the synthesized speech, particularly when the available training speech database is insufficient.
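
The sensitivity of ML estimates to outliers can be illustrated with a toy one-dimensional example. The Python sketch below is not the method proposed in the letter, which the abstract does not describe; it only assumes a single Gaussian fitted to a handful of made-up feature values, and it substitutes a generic median/MAD rule for the detection step. The function names, the data, and the 3.5 threshold are all illustrative assumptions.

```python
import statistics


def ml_gaussian(samples):
    """ML estimates of a 1-D Gaussian: sample mean and biased variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var


def remove_outliers(samples, threshold=3.5):
    """Drop samples whose median/MAD-based robust z-score exceeds threshold.

    This is a generic rule chosen for illustration, not the letter's detector.
    """
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    if mad == 0:
        # All deviations are (nearly) identical; nothing can be flagged.
        return list(samples)
    return [x for x in samples if 0.6745 * abs(x - med) / mad <= threshold]


# Hypothetical feature values; the last one plays the role of an outlier.
features = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 12.0]

raw_mean, raw_var = ml_gaussian(features)
cleaned = remove_outliers(features)
clean_mean, clean_var = ml_gaussian(cleaned)

print("ML estimate with outlier:   ", raw_mean, raw_var)
print("ML estimate after removal:  ", clean_mean, clean_var)
```

With only eight samples, a single deviant value shifts the mean and inflates the variance noticeably, which is consistent with the abstract's remark that the problem matters most when training data is scarce.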