Linked Data Entity Resolution System Enhanced by Configuration Learning Algorithm
Khai NGUYEN, Ryutaro ICHISE
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2016/06/01
Online ISSN: 1745-1361
Type of Manuscript: PAPER
Category: Data Engineering, Web Information Systems
Keywords: linked data, entity resolution, schema-independent, supervised, heuristic
Linked data entity resolution is the detection of instances that reside in different repositories but co-describe the same topic. The quality of the resolution result depends on the appropriateness of the configuration, including the selected matching properties and the similarity measures. Because such configuration details are currently set differently for each domain and repository, a general resolution approach that works for every repository is necessary. In this paper, we present cLink, a system that can perform entity resolution effectively on any input by using a learning algorithm to find the optimal configuration. Experiments show that cLink achieves high performance even when given only a small amount of training data. cLink also outperforms recent systems, including those that use a supervised learning approach.
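To make the notion of a learned configuration concrete, the following is a minimal Python sketch of a configuration search over matching properties and similarity measures. It is not cLink's algorithm, which the abstract does not detail: it is a plain exhaustive search that scores each candidate by F1 on a small labeled set. The function names, the three similarity measures, the threshold grid, and the toy instances are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import product


def exact(a, b):
    """1.0 when the two values are equal ignoring case and outer spaces."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def edit_ratio(a, b):
    """Character-level similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Jaccard overlap of whitespace-separated, lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


# Hypothetical pool of candidate similarity measures.
MEASURES = {"exact": exact, "edit_ratio": edit_ratio, "token_jaccard": token_jaccard}


def f1_score(predicted, truth):
    """F1 of a predicted set of (source_id, target_id) pairs."""
    tp = len(predicted & truth)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(truth)
    return 2 * precision * recall / (precision + recall)


def learn_configuration(source, target, labeled_matches, thresholds=(0.5, 0.7, 0.9)):
    """Return the best (src_prop, tgt_prop, measure, threshold) and its F1.

    Assumption: a configuration is a single property pair, one measure,
    and one threshold; ties keep the first candidate encountered.
    """
    src_props = sorted({p for inst in source.values() for p in inst})
    tgt_props = sorted({p for inst in target.values() for p in inst})
    best_config, best_score = None, -1.0
    for sp, tp, name, th in product(src_props, tgt_props, MEASURES, thresholds):
        sim = MEASURES[name]
        predicted = {
            (s, t)
            for s, s_inst in source.items()
            for t, t_inst in target.items()
            if sp in s_inst and tp in t_inst and sim(s_inst[sp], t_inst[tp]) >= th
        }
        score = f1_score(predicted, labeled_matches)
        if score > best_score:
            best_config, best_score = (sp, tp, name, th), score
    return best_config, best_score


# Toy repositories with different schemas (invented data).
source = {
    "a1": {"label": "Tokyo Tower", "country": "Japan"},
    "a2": {"label": "Kyoto Station", "country": "Japan"},
}
target = {
    "b1": {"name": "tokyo tower", "nation": "Japan"},
    "b2": {"name": "Kyoto Station Building", "nation": "Japan"},
}
labeled = {("a1", "b1"), ("a2", "b2")}

config, score = learn_configuration(source, target, labeled)
print(config, score)
```

On this toy input the search prefers the label/name property pair over country/nation, because the latter matches every pair of instances and so loses precision. A practical system would need blocking and a heuristic search rather than enumeration, since the candidate space grows with the number of properties, measures, and thresholds.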