A New Hybrid Method for Machine Transliteration

Dong YANG  Paul DIXON  Sadaoki FURUI  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E93-D   No.12   pp.3377-3383
Publication Date: 2010/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.3377
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: machine transliteration, two-step CRF, joint optimization, system combination


Summary: 
This paper proposes a new hybrid method for machine transliteration that combines a newly proposed two-step conditional random field (CRF) method with the well-known joint source channel model (JSCM). The contributions of this paper are as follows: (1) A two-step CRF model for machine transliteration is proposed. The first CRF segments the character string of an input word into chunks, and the second converts each chunk into a character in the target language. (2) A joint optimization method for the two-step CRF model and a fast decoding algorithm are also proposed. Our experiments show that the jointly optimized two-step CRF model performs as well as or better than the JSCM, and that the fast decoding algorithm significantly reduces decoding time. (3) A rapid development method for the JSCM, based on a weighted finite state transducer (WFST) framework, is proposed. (4) The combination of the proposed two-step CRF model and the JSCM outperforms the state-of-the-art result in terms of top-1 accuracy.
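
The two-step idea in contribution (1), and the motivation for joint optimization in contribution (2), can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes segmentation is framed as B/I tagging, uses hand-written log-probabilities in place of trained CRFs, scores each chunk independently, and uses placeholder target symbols. The names `SEG_NBEST`, `CONV`, `decode_pipeline`, and `decode_joint` are hypothetical. Rescoring an n-best list of segmentations is one plausible reading of "joint optimization"; the abstract does not specify the procedure.

```python
import math

def chunks_from_tags(word, tags):
    """Group characters into chunks using B (begin) / I (inside) tags."""
    chunks = []
    for ch, tag in zip(word, tags):
        if tag == "B" or not chunks:
            chunks.append(ch)      # start a new chunk
        else:
            chunks[-1] += ch       # extend the current chunk
    return chunks

# Hypothetical step-1 output: n-best tag sequences with log P(tags | word).
# A trained segmentation CRF would supply these; the values are made up.
SEG_NBEST = [("BIB", math.log(0.6)), ("BBI", math.log(0.4))]

# Hypothetical step-2 scores: chunk -> [(target symbol, log P(symbol | chunk))].
# Target symbols are placeholders, not real target-language characters.
CONV = {
    "ab": [("X", math.log(0.55)), ("Y", math.log(0.45))],
    "c":  [("Z", math.log(0.9)),  ("W", math.log(0.1))],
    "a":  [("P", math.log(0.9)),  ("Q", math.log(0.1))],
    "bc": [("R", math.log(0.9)),  ("S", math.log(0.1))],
}

def convert(chunks):
    """Pick the best target symbol per chunk; return (string, log-prob)."""
    out, lp = [], 0.0
    for chunk in chunks:
        sym, sym_lp = max(CONV[chunk], key=lambda pair: pair[1])
        out.append(sym)
        lp += sym_lp
    return "".join(out), lp

def decode_pipeline(word):
    """Commit to the 1-best segmentation, then convert its chunks."""
    tags, seg_lp = max(SEG_NBEST, key=lambda pair: pair[1])
    target, conv_lp = convert(chunks_from_tags(word, tags))
    return target, seg_lp + conv_lp

def decode_joint(word):
    """Score every n-best segmentation by segmentation + conversion log-prob."""
    candidates = []
    for tags, seg_lp in SEG_NBEST:
        target, conv_lp = convert(chunks_from_tags(word, tags))
        candidates.append((target, seg_lp + conv_lp))
    return max(candidates, key=lambda pair: pair[1])

if __name__ == "__main__":
    print(decode_pipeline("abc"))
    print(decode_joint("abc"))
```

With these toy scores the pipeline commits to the segmentation `ab|c` and returns `XZ` (probability 0.6 × 0.55 × 0.9 = 0.297), while the joint decoder prefers `a|bc` and returns `PR` (0.4 × 0.9 × 0.9 = 0.324). The lower-ranked segmentation wins once conversion scores are taken into account, which is the kind of error a strictly sequential two-step decoder cannot recover from.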