Using MathML Parallel Markup Corpora for Semantic Enrichment of Mathematical Expressions
Minh-Quoc NGHIEM, Giovanni YOKO KRISTIANTO, Akiko AIZAWA
Publication: IEICE TRANSACTIONS on Information and Systems, Vol.E96-D, No.8, pp.1707-1715
Publication Date: 2013/08/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
DOI: 10.1587/transinf.E96.D.1707
Type of Manuscript: PAPER
Category: Data Engineering, Web Information Systems
Keyword: semantic enrichment, MathML markup, statistical machine translation
Summary:
This paper explores the problem of semantic enrichment of mathematical expressions. We formulate this task as the translation of mathematical expressions from presentation markup to content markup. We use MathML, an application of XML, to describe both the structure and content of mathematical notations. We apply a method based on statistical machine translation to extract translation rules automatically. This approach contrasts with previous research, which tends to rely on manually encoded rules. We also introduce segmentation rules used to segment mathematical expressions. Combining segmentation rules and translation rules strengthens the translation system and achieves significant improvements over a prior rule-based system.
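
To make the task concrete, the following is a minimal sketch of what translating presentation markup into content markup means for a single flat expression. It is not the method of the paper: the rule table `OPERATOR_RULES` and the function `translate_infix` are hypothetical, and the rules are written by hand here, whereas the paper extracts its translation rules automatically from a parallel corpus.

```python
import xml.etree.ElementTree as ET

# Presentation MathML for "a + b": it records layout, not meaning.
PRESENTATION = "<mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>"

# Hypothetical, hand-written rule table (operator symbol -> content element).
# Assumption for illustration only; the paper learns its rules from data.
OPERATOR_RULES = {"+": "plus", "-": "minus"}


def translate_infix(presentation_xml: str) -> str:
    """Translate a flat 'operand operator operand' mrow into content MathML."""
    row = ET.fromstring(presentation_xml)
    children = list(row)
    if len(children) != 3:
        raise ValueError("expected exactly three child elements")
    left, op, right = children
    if op.tag != "mo" or op.text not in OPERATOR_RULES:
        raise ValueError("no rule for this expression")

    # Content MathML uses prefix form: <apply><operator/> operands... </apply>
    apply = ET.Element("apply")
    ET.SubElement(apply, OPERATOR_RULES[op.text])
    for operand in (left, right):
        # Identifiers (<mi>) become <ci>; anything else is treated as a number.
        tag = "ci" if operand.tag == "mi" else "cn"
        ET.SubElement(apply, tag).text = operand.text
    return ET.tostring(apply, encoding="unicode")


print(translate_infix(PRESENTATION))
```

The result is an `apply` element whose first child names the operation and whose remaining children are the operands. Expressions with nesting or ambiguous notation do not fit a three-token pattern like this one, which is the kind of case the segmentation rules and learned translation rules described above are meant to address.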