Efficient Parallel Learning of Hidden Markov Chain Models on SMPs

Lei LI  Bin FU  Christos FALOUTSOS  

IEICE TRANSACTIONS on Information and Systems, Vol.E93-D, No.6, pp.1330-1342
Publication Date: 2010/06/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.1330
Print ISSN: 0916-8532
Type of Manuscript: Special Section INVITED PAPER (Special Section on Info-Plosion)
Keywords: linear dynamical systems, hidden Markov models, OpenMP, expectation maximization (EM), optimization, multi-core

Quad-core CPUs are now a common desktop configuration in today's offices. The increasing number of processors on a single chip opens new opportunities for parallel computing. Our goal is to use multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. In particular, we propose two model-specific variants: CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method for handling the data dependencies that arise from the chain structure of the hidden variables, so that the EM-based parameter learning algorithm can be parallelized. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that the parallel algorithms using Cut-And-Stitch achieve accuracy comparable to, and almost linear speedups over, the traditional serial version.
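
To make the chain dependency concrete, the following is a minimal Python sketch of the general idea of cutting a sequence into blocks and passing boundary messages between them. It is not the authors' CAS-HMM algorithm, whose details are not given in this abstract. It covers only the forward filtering pass of a toy two-state HMM (no backward pass, no parameter updates). All names (run_block, cut, blockwise_forward) and all model values are assumptions chosen for illustration.

```python
# Illustrative sketch only: block-wise forward filtering for a toy HMM.
# Function names and parameter values are hypothetical, not from the paper.

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def run_block(pred, A, B, obs):
    """Forward-filter one block of observations.

    pred is the incoming predictive distribution over the hidden state
    at the block's first time step. Returns the filtered distributions
    for the block and the outgoing prediction for the next block.
    """
    n = len(pred)
    alphas = []
    for y in obs:
        # Each step needs the result of the previous step: this is the
        # chain dependency that prevents naive parallelization.
        alpha = normalize([pred[j] * B[j][y] for j in range(n)])
        alphas.append(alpha)
        pred = [sum(alpha[i] * A[i][j] for i in range(n)) for j in range(n)]
    return alphas, pred

def cut(obs, n_blocks):
    """Split the observation sequence into contiguous blocks."""
    size = (len(obs) + n_blocks - 1) // n_blocks
    return [obs[i:i + size] for i in range(0, len(obs), size)]

def blockwise_forward(pi, A, B, obs, n_blocks, n_passes):
    blocks = cut(obs, n_blocks)
    n = len(pi)
    # Block 0 starts from the true prior; the other blocks start from a
    # uniform guess that later passes refine.
    boundaries = [pi] + [[1.0 / n] * n for _ in blocks[1:]]
    results = []
    for _ in range(n_passes):
        # "Cut": every block uses only its own boundary message, so these
        # calls are independent and could run on separate threads.
        results = [run_block(boundaries[b], A, B, blocks[b])
                   for b in range(len(blocks))]
        # "Stitch": hand each block's outgoing message to its successor.
        for b in range(len(blocks) - 1):
            boundaries[b + 1] = results[b][1]
    return [alpha for alphas, _ in results for alpha in alphas]

# Toy model (assumed values): 2 hidden states, 2 observation symbols.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]   # transition probabilities
B = [[0.9, 0.1], [0.3, 0.7]]   # emission probabilities
obs = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]

serial, _ = run_block(pi, A, B, obs)  # one block = ordinary serial pass
approx = blockwise_forward(pi, A, B, obs, n_blocks=4, n_passes=1)
exact = blockwise_forward(pi, A, B, obs, n_blocks=4, n_passes=4)
```

In this sketch, a single pass is exact only for the first block, and the remaining blocks become exact one per additional pass, so it takes as many passes as there are blocks to reproduce the serial result. An iterative learner such as EM already makes repeated passes over the data, which is one plausible reason a block-wise scheme can fit it well; how the paper actually initializes and updates the boundary values is not stated in the abstract.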