Machine Learning Based English-to-Korean Transliteration Using Grapheme and Phoneme Information

Jong-Hoon OH, Key-Sun CHOI

IEICE TRANSACTIONS on Information and Systems, Vol.E88-D, No.7, pp.1737-1748
Publication Date: 2005/07/01
DOI: 10.1093/ietisy/e88-d.7.1737
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Natural Language Processing
Keywords: machine transliteration, machine learning, information retrieval, machine translation, natural language processing

Machine transliteration is an automatic method for generating characters or words in one alphabetical system from the corresponding characters in another alphabetical system. It can play an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Previous work has focused on either a grapheme-based or a phoneme-based method. However, transliteration is both an orthographic and a phonetic conversion process, so both grapheme and phoneme information should be considered. In this paper, we propose a transliteration model based on both grapheme and phoneme information and compare it with previous grapheme-based and phoneme-based models using several machine learning techniques. Our method shows a performance improvement of about 13% to 78%.