Two-Band Excitation for HMM-Based Speech Synthesis

Sang-Jin KIM  Minsoo HAHN  

IEICE TRANSACTIONS on Information and Systems   Vol.E90-D   No.1   pp.378-381
Publication Date: 2007/01/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Speech and Hearing
Keywords: HMM-based speech synthesis, excitation model, maximum voiced frequency

This letter describes a two-band excitation model for HMM-based speech synthesis. An HMM-based speech synthesis system generates speech from HMMs trained on spectral and excitation parameters. The synthesized speech typically has a "vocoded" quality, mostly because of the simple excitation model with its binary voiced/unvoiced selection. In this letter, a two-band excitation based on the harmonic plus noise speech model is proposed for generating the mixed excitation source. With this model, the mixed excitation can be generated more accurately, and the memory required for the trained excitation data is reduced as well.
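
The two-band idea can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each frame is described by a fundamental frequency and a maximum voiced frequency (the term listed in the keywords), that the band below this frequency holds harmonics of the fundamental, and that the band above it holds noise. The function name, the unit amplitudes, the 16 kHz sampling rate, and the approximation of noise by random-phase sinusoids on a 100 Hz grid are all assumptions made only for illustration.

```python
import math
import random


def two_band_excitation(f0, mvf, fs=16000, n_samples=400, noise_step=100.0, seed=0):
    """Sketch of one frame of two-band mixed excitation (hypothetical helper).

    f0  : fundamental frequency in Hz; 0 marks an unvoiced frame
    mvf : maximum voiced frequency in Hz, the boundary between the two bands
    Returns (frame, harmonic_freqs, noise_freqs).
    """
    rng = random.Random(seed)
    nyquist = fs / 2.0
    if f0 <= 0:
        mvf = 0.0  # unvoiced frame: the noise band covers the whole spectrum

    # Lower band: harmonics of f0 up to the maximum voiced frequency.
    harmonics = []
    k = 1
    while f0 > 0 and k * f0 <= mvf and k * f0 < nyquist:
        harmonics.append(k * f0)
        k += 1

    # Upper band: noise, approximated here (an assumption) by sinusoids with
    # random phases on a regular frequency grid strictly above mvf.
    noise = []
    i = math.floor(mvf / noise_step) + 1
    while i * noise_step < nyquist:
        noise.append((i * noise_step, rng.uniform(0.0, 2.0 * math.pi)))
        i += 1

    # Sum both bands sample by sample; all components have unit amplitude.
    frame = []
    for n in range(n_samples):
        t = n / fs
        s = sum(math.cos(2.0 * math.pi * h * t) for h in harmonics)
        s += sum(math.cos(2.0 * math.pi * f * t + p) for f, p in noise)
        frame.append(s)
    return frame, harmonics, [f for f, _ in noise]


# A voiced frame with f0 = 200 Hz and a 4 kHz maximum voiced frequency.
frame, harmonic_freqs, noise_freqs = two_band_excitation(200.0, 4000.0)
```

In this sketch the band boundary is a single number per frame, which is one plausible reading of why such a model could need less stored excitation data than a scheme that keeps a separate voicing value for each of several frequency bands. In a complete synthesizer the resulting excitation would still be shaped by the spectral parameters, a step omitted here.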