Adapting a Bilingual Dictionary to Domains

Hiroyuki KAJI  

IEICE TRANSACTIONS on Information and Systems   Vol.E88-D   No.2   pp.302-312
Publication Date: 2005/02/01
DOI: 10.1093/ietisy/e88-d.2.302
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: bilingual dictionary, domain specificity, comparable corpora

Two methods that use comparable corpora to select translation equivalents appropriate to a domain were devised and evaluated. The first method ranks the translation equivalents of a target word according to the similarity of their contexts to that of the target word. The second method ranks translation equivalents according to the ratio of associated words that suggest them. An experiment using the EDR bilingual dictionary together with Wall Street Journal and Nihon Keizai Shimbun corpora showed that the method using the ratio of associated words outperforms the method based on contextual similarity. Specifically, in a quantitative evaluation using pseudo words, the maximum F-measure of the method using the ratio of associated words was 86%, while that of the method based on contextual similarity was 82%. The key feature of the method using the ratio of associated words is that it outputs the selected translation equivalents together with representative associated words, enabling the translation equivalents to be validated.
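
As a rough illustration of the second method, the Python sketch below ranks candidate translation equivalents by the share of the target word's associated words that suggest each one, and returns those supporting words alongside the score. The abstract does not define how associated words are extracted or what it means for one to "suggest" an equivalent, so the rule used here (an associated word suggests a candidate when one of its dictionary translations is among the candidate's own associated words), the function name, and the toy data are all assumptions made for illustration, not the paper's procedure.

```python
from typing import Dict, List, Set, Tuple


def rank_by_associated_word_ratio(
    target_assoc: Set[str],
    candidate_assoc: Dict[str, Set[str]],
    dictionary: Dict[str, Set[str]],
) -> List[Tuple[str, float, List[str]]]:
    """Rank candidates by the ratio of associated words that suggest them.

    target_assoc:    words associated with the target word (source-language corpus).
    candidate_assoc: for each candidate equivalent, the words associated with it
                     (target-language corpus).
    dictionary:      source word -> set of target-language translations.

    Assumed rule (not from the paper): a source-side associated word "suggests"
    a candidate if any of its translations is associated with that candidate.
    """
    results = []
    for candidate, assoc in candidate_assoc.items():
        # Associated words of the target word that support this candidate.
        supporters = sorted(
            w for w in target_assoc if dictionary.get(w, set()) & assoc
        )
        ratio = len(supporters) / len(target_assoc) if target_assoc else 0.0
        results.append((candidate, ratio, supporters))
    # Highest ratio first; ties broken alphabetically for determinism.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


# Hypothetical toy data: English "interest" in a financial corpus, with two
# romanized Japanese candidates ("rishi" = interest on money, "kyoumi" = curiosity).
target_assoc = {"rate", "bank", "loan", "public"}
dictionary = {
    "rate": {"ritsu", "reto"},
    "bank": {"ginkou"},
    "loan": {"yuushi"},
    "public": {"koushuu"},
}
candidate_assoc = {
    "rishi": {"ritsu", "ginkou", "yuushi"},
    "kyoumi": {"koushuu"},
}

ranking = rank_by_associated_word_ratio(target_assoc, candidate_assoc, dictionary)
for candidate, ratio, supporters in ranking:
    print(candidate, ratio, supporters)
```

With this toy data, "rishi" ranks first because three of the four associated words support it. Returning the supporting words with each score mirrors the feature the abstract highlights: a person reviewing the output can check why an equivalent was selected.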