Automatic Alignment of Japanese-Chinese Bilingual Texts

Chew Lim TAN  Makoto NAGAO  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E78-D   No.1   pp.68-76
Publication Date: 1995/01/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Artificial Intelligence and Cognitive Science
Keyword: 
automatic alignment, example-based machine translation, lexical approach, statistical approach, heuristic function

Summary: 
Automatic alignment of bilingual texts is useful for example-based machine translation because it facilitates the creation of translation example pairs for the machine. Two main approaches to automatic alignment have been reported in the literature: the lexical approach and the statistical approach. The former looks for relationships between the lexical contents of the bilingual texts in order to find alignment pairs, while the latter uses the statistical correlation between sentence lengths of the bilingual texts as the basis for matching. This paper describes a combination of the two approaches for aligning Japanese-Chinese bilingual texts, in which the kanji contents and the sentence lengths of the texts work together in the alignment process. Because of the differences in sentence structure between Japanese and Chinese, matching at the sentence level may frequently result in a number of sentences being matched en masse. In view of this, the current work also attempts to create shorter alignment pairs by permitting sentences to be matched with clauses or phrases of the other text where possible. While such matching is more difficult and error-prone, the reliance on kanji contents has proven to be very useful in minimizing the errors. The current research has thus found solutions to problems that are unique to the present work.
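
The idea of letting kanji contents and sentence lengths work together can be illustrated with a small scoring sketch. This is not the heuristic function of the paper, whose details are not given in the summary. The function names, the Dice-style overlap measure, the `expected_ratio` of 0.7 and the `weight` of 0.7 are all assumptions chosen for illustration. The sketch also compares characters by code point only, so a Japanese kanji and a simplified Chinese character that differ in form (for example 語 and 语) count as a mismatch.

```python
import math


def han_chars(text):
    """Return the set of CJK Unified Ideographs (U+4E00..U+9FFF) in text."""
    return {ch for ch in text if "\u4e00" <= ch <= "\u9fff"}


def kanji_overlap(ja, zh):
    """Lexical cue: Dice coefficient over the Han characters of both segments."""
    a, b = han_chars(ja), han_chars(zh)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def length_score(ja, zh, expected_ratio=0.7):
    """Statistical cue: 1.0 when len(zh) / len(ja) equals expected_ratio,
    falling toward 0 as the ratio deviates. expected_ratio is an assumed value."""
    if not ja or not zh:
        return 0.0
    deviation = len(zh) / (len(ja) * expected_ratio)
    return math.exp(-abs(math.log(deviation)))


def pair_score(ja, zh, weight=0.7):
    """Weighted combination of the two cues; weight is an assumed value."""
    return weight * kanji_overlap(ja, zh) + (1 - weight) * length_score(ja, zh)


# Hypothetical example data: one Japanese sentence, two Chinese candidates.
ja = "私は大学で言語学を研究している。"
candidates = ["我在大学研究语言学。", "今天天气很好。"]
best = max(candidates, key=lambda zh: pair_score(ja, zh))
```

Here `best` is the first candidate, because it shares the characters 大, 学, 言, 研 and 究 with the Japanese sentence while the second shares none. A full aligner would use such a score inside a search over sentence, clause and phrase boundaries, which this sketch leaves out.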