Paraphrase Lattice for Statistical Machine Translation

Takashi ONISHI  Masao UTIYAMA  Eiichiro SUMITA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.6   pp.1299-1305
Publication Date: 2011/06/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.1299
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: statistical machine translation, lattice decoding, paraphrasing, paraphrase lattice

Summary: 
Lattice decoding in statistical machine translation (SMT) is useful in speech translation and in the translation of German because it can handle input ambiguities, such as speech recognition ambiguities and German word segmentation ambiguities. In this paper, we show that lattice decoding is also useful for handling input variations, that is, differences among input texts that have the same meaning. Given an input sentence, we build a lattice that represents paraphrases of the input sentence. We call this a paraphrase lattice. We then give the paraphrase lattice as input to a lattice decoder, which searches for the best path through the paraphrase lattice and outputs the best translation. Experimental results on the IWSLT dataset and the Europarl dataset show that our proposed method obtains significant gains in BLEU scores.
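
The idea of a paraphrase lattice can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the toy paraphrase table, and the edge scores are all hypothetical, and each edge carries a whole phrase rather than a single word to keep the example short. A real lattice decoder scores paths with its translation and language models while searching over translations; here a fixed per-edge score stands in for that, and the search only selects the best source-side path.

```python
from collections import defaultdict


def build_paraphrase_lattice(tokens, paraphrases, original_score=0.0):
    """Build lattice edges (start, end, phrase, score).

    Nodes are token positions 0..len(tokens). The original words form one
    path; every span found in the (hypothetical) paraphrase table adds an
    alternative edge over the same span.
    """
    edges = [(i, i + 1, tok, original_score) for i, tok in enumerate(tokens)]
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens) + 1):
            for phrase, score in paraphrases.get(tuple(tokens[i:j]), []):
                edges.append((i, j, phrase, score))
    return edges


def best_path(edges, n_nodes):
    """Return (score, phrases) of the highest-scoring path from node 0 to
    node n_nodes - 1, using dynamic programming over the lattice."""
    incoming = defaultdict(list)
    for start, end, phrase, score in edges:
        incoming[end].append((start, phrase, score))

    # node -> (best score so far, backpointer (previous node, phrase))
    best = {0: (0.0, None)}
    # Every edge goes forward, so increasing node order is topological.
    for node in range(1, n_nodes):
        candidates = [
            (best[start][0] + score, (start, phrase))
            for start, phrase, score in incoming[node]
            if start in best
        ]
        if candidates:
            best[node] = max(candidates, key=lambda c: c[0])

    last = n_nodes - 1
    if last not in best:
        raise ValueError("final node is not reachable")

    # Follow backpointers from the final node to recover the path.
    phrases = []
    node = last
    while best[node][1] is not None:
        prev, phrase = best[node][1]
        phrases.append(phrase)
        node = prev
    return best[last][0], phrases[::-1]


# Toy input and paraphrase table; the scores are made up for illustration.
tokens = "is there a beauty salon ?".split()
paraphrases = {
    ("beauty", "salon"): [("beauty parlor", 0.5)],
    ("is", "there"): [("do you have", -0.3)],
}
edges = build_paraphrase_lattice(tokens, paraphrases)
score, path = best_path(edges, len(tokens) + 1)
print(score, path)
```

With these toy scores, the selected path keeps the original "is there" and replaces "beauty salon" with the higher-scoring paraphrase edge.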