Leveraging Unannotated Texts for Scientific Relation Extraction

Qin DAI  Naoya INOUE  Paul REISERT  Kentaro INUI  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.12   pp.3209-3217
Publication Date: 2018/12/01
Publicized: 2018/09/14
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDP7180
Type of Manuscript: PAPER
Category: Natural Language Processing
Keyword: relation extraction, scientific document, word embedding, semantically related word

A tremendous amount of knowledge is present in the ever-growing scientific literature. To grasp this knowledge efficiently, various computational tasks have been proposed that train machines to read and analyze scientific documents. One of these tasks, Scientific Relation Extraction, aims to automatically capture scientific semantic relationships among entities in scientific documents. Conventionally, only a limited number of commonly used knowledge bases, such as Wikipedia, are used as sources of background knowledge for relation extraction. In this work, we hypothesize that unannotated scientific papers can also serve as a source of external background information for relation extraction. Based on this hypothesis, we propose a model that extracts background information from unannotated scientific papers. Our experiments on the RANIS corpus [1] demonstrate the effectiveness of the proposed model for relation extraction from scientific articles.