A Pseudo Glottal Excitation Model for the Linear Prediction Vocoder with Speech Signals Coded at 1.6 kbps

Hwai-Tsu HU  Fang-Jang KUO  Hsin-Jen WANG  

IEICE TRANSACTIONS on Information and Systems   Vol.E83-D   No.8   pp.1654-1661
Publication Date: 2000/08/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keywords: speech coding, vocoder, pseudo glottal excitation, linear prediction

This paper presents a pseudo glottal excitation model for linear prediction vocoders that code speech at 1.6 kbps. Unvoiced speech and silence intervals are processed with a stochastic codebook of 512 entries, while a glottal codebook of 32 entries describes the glottal phase characteristics of voiced excitation. The pseudo glottal excitation for one pitch period is formulated in three steps: 1) applying a polynomial model to simulate the low-frequency constituent of the residual, 2) inserting a magnitude-adjustable pulse sequence to characterize the main excitation, and 3) introducing turbulent noise in series with the resulting excitation. Procedures are described for codebook construction and for analysis and synthesis of the pseudo glottal excitation. Results of a mean opinion score (MOS) test show that the quality produced by the proposed coder is almost as good as that of a 4.8 kbps CELP coder for male utterances, whereas the quality for female utterances remains somewhat inferior.
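
The three formulation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the polynomial order, the pulse positions, or the noise model, so the function names, the quadratic coefficients, the 80-sample pitch period, and the choice to simply add Gaussian noise to the excitation are all assumptions made for illustration.

```python
import random


def polynomial_component(coeffs, n):
    """Step 1 (assumed form): evaluate a polynomial over normalized time
    t in [0, 1) to mimic the low-frequency part of the residual.
    coeffs[k] multiplies t**k."""
    out = []
    for i in range(n):
        t = i / n
        value = 0.0
        for c in reversed(coeffs):  # Horner's rule
            value = value * t + c
        out.append(value)
    return out


def pulse_sequence(n, positions, magnitudes):
    """Step 2 (assumed form): place magnitude-adjustable pulses at given
    sample positions to represent the main excitation."""
    out = [0.0] * n
    for pos, mag in zip(positions, magnitudes):
        out[pos] += mag
    return out


def turbulent_noise(n, gain, seed=0):
    """Step 3 (assumed form): white Gaussian noise scaled by a gain."""
    rng = random.Random(seed)
    return [gain * rng.gauss(0.0, 1.0) for _ in range(n)]


def pseudo_glottal_excitation(n, coeffs, positions, magnitudes,
                              noise_gain, seed=0):
    """Combine the three components for one pitch period of n samples.
    Summation is an assumption; the paper may combine them differently."""
    low = polynomial_component(coeffs, n)
    pulses = pulse_sequence(n, positions, magnitudes)
    noise = turbulent_noise(n, noise_gain, seed)
    return [a + b + c for a, b, c in zip(low, pulses, noise)]


# Hypothetical parameters: one 80-sample pitch period, a quadratic
# low-frequency shape, two pulses near the assumed closure instant,
# and the noise switched off so the output is deterministic.
excitation = pseudo_glottal_excitation(
    n=80,
    coeffs=[0.0, 0.5, -0.5],
    positions=[60, 61],
    magnitudes=[-1.0, 0.4],
    noise_gain=0.0,
)
```

In a vocoder of this kind, such an excitation would then drive the linear prediction synthesis filter; raising noise_gain above zero adds the noisy component to each period.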