Decision Tree-Based Acoustic Models for Speech Recognition with Improved Smoothness

Masami AKAMINE  Jitendra AJMERA  

IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.11   pp.2250-2258
Publication Date: 2011/11/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.2250
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keywords: speech recognition, acoustic modeling, decision trees, probability estimation, likelihood computation

This paper proposes likelihood smoothing techniques to improve decision tree-based (DT-based) acoustic models, in which decision trees replace Gaussian mixture models for computing the observation likelihoods of a given HMM state in a speech recognition system. Decision trees have several advantageous properties: they impose no restrictions on the number or types of features, and they perform feature selection automatically. This paper describes basic configurations of DT-based acoustic models and proposes two methods to improve the robustness of the basic model: DT mixture models and soft decisions for continuous features. Experimental results on the Aurora 2 speech database show that a system using decision trees offers state-of-the-art performance, even without taking advantage of its full potential, and that soft decisions improve the performance of DT-based acoustic models, yielding a 16.8% relative error rate reduction over hard decisions.
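
The contrast between hard and soft decisions on a continuous feature can be illustrated with a minimal sketch. The abstract does not specify how the soft decisions are computed, so everything below is an assumption made for illustration: the `Node` structure, the sigmoid weighting, the `width` parameter, and the leaf values are hypothetical and are not taken from the paper. A hard decision follows exactly one branch at each threshold test, so the returned likelihood jumps when a feature crosses the threshold; the soft variant blends both subtrees, so the likelihood varies smoothly.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Hypothetical tree node: a leaf if `likelihood` is set, otherwise a threshold test."""
    likelihood: Optional[float] = None
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None    # taken when x[feature] <= threshold
    right: Optional["Node"] = None   # taken when x[feature] > threshold


def hard_likelihood(node: Node, x: Sequence[float]) -> float:
    """Follow exactly one branch per test and return the leaf likelihood."""
    while node.likelihood is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.likelihood


def soft_likelihood(node: Node, x: Sequence[float], width: float = 1.0) -> float:
    """Blend both subtrees with a sigmoid weight (an assumed smoothing scheme)."""
    if node.likelihood is not None:
        return node.likelihood
    z = (x[node.feature] - node.threshold) / width
    # Numerically stable sigmoid: weight given to the right branch.
    p_right = 0.5 * (1.0 + math.tanh(0.5 * z))
    return ((1.0 - p_right) * soft_likelihood(node.left, x, width)
            + p_right * soft_likelihood(node.right, x, width))


# Toy one-question tree with made-up leaf likelihoods.
tree = Node(feature=0, threshold=0.0,
            left=Node(likelihood=0.2), right=Node(likelihood=0.8))

for value in (-0.01, 0.01):
    x = [value]
    print(value, hard_likelihood(tree, x), round(soft_likelihood(tree, x), 4))
```

In this toy tree, the hard likelihood switches from 0.2 to 0.8 as the feature crosses the threshold, while the soft likelihood stays near the midpoint on both sides. A smaller `width` makes the soft decision approach the hard one.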