Imposing Constraints from the Source Tree on ITG Constraints for SMT

Hirofumi YAMAMOTO  Hideo OKUMA  Eiichiro SUMITA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E92-D   No.9   pp.1762-1770
Publication Date: 2009/09/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E92.D.1762
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: statistical machine translation, word reordering, distortion model, ITG constraints

Summary: 
In current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. Many word-reordering constraint techniques have been proposed to address it, and inversion transduction grammar (ITG) constraints are among them. Under ITG constraints, the target-side word order is obtained by rotating nodes of a source-side binary tree. These node rotations, however, do not take the actual source binary tree instance into account. Stronger constraints on word reordering can therefore be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. When the source binary tree instance ((a b) (c d)) is given, however, our proposed "imposing source tree on ITG" (IST-ITG) constraints allow only eight word orderings. This reduction in the number of permitted word-order permutations efficiently suppresses erroneous word orderings. In our experiments with IST-ITG on the NIST MT08 English-to-Chinese translation track data, the proposed method yielded a 1.8-point improvement in character BLEU-4 (35.2 to 37.0) and a 6.2-point reduction in CER (74.1% to 67.9%) compared with our baseline condition.
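
The two counts in the example above (twenty-two and eight) can be reproduced with a small enumeration. The following is a minimal sketch, not the authors' decoder: it assumes that "rotating a node" means optionally swapping the two children of a binary node, and that plain ITG constraints permit any binary bracketing of the source sequence. The function names `itg_orderings` and `ist_itg_orderings` and the nested-tuple tree encoding are hypothetical choices made for illustration.

```python
def itg_orderings(words):
    """Orderings reachable under plain ITG constraints (assumed reading):
    try every binary split of the span, and keep each pair of sub-orderings
    both in straight and in inverted order."""
    words = tuple(words)
    if len(words) == 1:
        return {words}
    result = set()
    for split in range(1, len(words)):
        for left in itg_orderings(words[:split]):
            for right in itg_orderings(words[split:]):
                result.add(left + right)   # straight node
                result.add(right + left)   # inverted (rotated) node
    return result


def ist_itg_orderings(tree):
    """Orderings reachable when the source binary tree is fixed:
    only the nodes of the given tree may be rotated.
    A tree is either a word (str) or a pair (left_subtree, right_subtree)."""
    if isinstance(tree, str):
        return {(tree,)}
    left_tree, right_tree = tree
    result = set()
    for left in ist_itg_orderings(left_tree):
        for right in ist_itg_orderings(right_tree):
            result.add(left + right)       # keep child order
            result.add(right + left)       # rotate this node
    return result


if __name__ == "__main__":
    itg = itg_orderings("abcd")
    ist = ist_itg_orderings((("a", "b"), ("c", "d")))
    print(len(itg))   # 22
    print(len(ist))   # 8
    print(sorted("".join(order) for order in ist))
```

Under these assumptions, the fixed tree ((a b) (c d)) has three internal nodes, each of which is either rotated or not, giving 2^3 = 8 orderings, all of which are also among the 22 ITG orderings. The two orderings of four words that ITG itself excludes are "b d a c" and "c a d b".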