F0 Parameterization of Glottalized Tones in HMM-Based Speech Synthesis for Hanoi Vietnamese

Duy Khanh NINH, Yoichi YAMASHITA

IEICE TRANSACTIONS on Information and Systems, Vol.E98-D, No.12, pp.2280-2289
Publication Date: 2015/12/01
Publicized: 2015/09/07
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015EDP7134
Type of Manuscript: PAPER
Category: Speech and Hearing
Keywords: HMM-based speech synthesis, F0 parameterization, tones, glottalization, pitch marking

A conventional HMM-based speech synthesis system for Hanoi Vietnamese often suffers from hoarse quality due to incomplete F0 parameterization of glottalized tones. Because usual F0 extractors have difficulty estimating F0 from glottalized waveforms, we propose a pitch marking algorithm in which pitch marks are propagated from regular regions of a speech signal to glottalized ones; complete F0 contours for the glottalized tones are then derived from these pitch marks. Both objective and listening tests confirmed that the proposed F0 parameterization scheme significantly reduces the hoarseness while slightly improving the tone naturalness of synthetic speech. The pitch marking algorithm works as a refinement step on the results of an F0 extractor, so the proposed scheme can be combined with any F0 extractor.
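
The abstract does not spell out the propagation rule, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that pitch marks sit on local waveform peaks, that two seed marks from a regular region are already available (for example from an F0 extractor), and that the next mark lies within a tolerance window around one period after the previous mark. The function names, the `tolerance` parameter, and the peak-picking criterion are all hypothetical.

```python
def propagate_pitch_marks(signal, seed_marks, end_index, tolerance=0.3):
    """Extend pitch marks rightward from a regular region (illustrative only).

    signal     : sequence of samples
    seed_marks : at least two pitch-mark sample indices from a regular region
    end_index  : propagation stops before this sample index
    tolerance  : assumed search half-width, as a fraction of the current period
    """
    marks = list(seed_marks)
    if len(marks) < 2:
        raise ValueError("need at least two seed pitch marks")
    period = marks[-1] - marks[-2]
    while True:
        expected = marks[-1] + period            # one period after the last mark
        half_width = int(tolerance * period)
        lo, hi = expected - half_width, expected + half_width
        if hi >= end_index:
            break
        # Assumed criterion: take the largest sample in the search window.
        best = max(range(lo, hi + 1), key=lambda i: signal[i])
        period = best - marks[-1]                # track the local period
        marks.append(best)
    return marks


def f0_from_pitch_marks(marks, sample_rate):
    """Return (time in seconds, F0 in Hz) for each interval between marks."""
    contour = []
    for left, right in zip(marks, marks[1:]):
        time_sec = 0.5 * (left + right) / sample_rate
        contour.append((time_sec, sample_rate / (right - left)))
    return contour


if __name__ == "__main__":
    # Synthetic pulse train: 80-sample period at 16 kHz, weaker pulses later on.
    sig = [0.0] * 1600
    for i in range(0, 1600, 80):
        sig[i] = 1.0 if i < 800 else 0.3
    marks = propagate_pitch_marks(sig, [0, 80], len(sig))
    print(f0_from_pitch_marks(marks, 16000)[:3])
```

Because F0 is read off the spacing of the marks rather than estimated frame by frame, a contour is defined wherever marks can be placed, which is consistent with the abstract's description of the scheme as a refinement step on top of any F0 extractor.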