Triple Prediction from Texts by Using Distributed Representations of Words

Takuma EBISU  Ryutaro ICHISE  

IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.12   pp.3001-3009
Publication Date: 2017/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDP7112
Type of Manuscript: PAPER
Category: Natural Language Processing
distributed representations of words,  knowledge extraction,  knowledge graph completion,  

Full Text: PDF(420.7KB)
>>Buy this Article

Knowledge graphs have been shown to be useful to many tasks in artificial intelligence. Triples of knowledge graphs are traditionally structured by human editors or extracted from semi-structured information; however, editing is expensive, and semi-structured information is not common. On the other hand, most such information is stored as text. Hence, it is necessary to develop a method that can extract knowledge from texts and then construct or populate a knowledge graph; this has been attempted in various ways. Currently, there are two approaches to constructing a knowledge graph. One is open information extraction (Open IE), and the other is knowledge graph embedding; however, neither is without problems. Stanford Open IE, the current best such system, requires labeled sentences as training data, and knowledge graph embedding systems require numerous triples. Recently, distributed representations of words have become a hot topic in the field of natural language processing, since this approach does not require labeled data for training. These require only plain text, but Mikolov showed that it can perform well with the word analogy task, answering questions such as, “a is to b as c is to __?.” This can be considered as a knowledge extraction task from a text for finding the missing entity of a triple. However, the accuracy is not sufficiently high when applied in a straightforward manner to relations in knowledge graphs, since the method uses only one triple as a positive example. In this paper, we analyze why distributed representations perform such tasks well; we also propose a new method for extracting knowledge from texts that requires much less annotated data. Experiments show that the proposed method achieves considerable improvement compared with the baseline; in particular, the improvement in HITS@10 was more than doubled for some relations.