Personal Data Retrieval and Disambiguation in Web Person Search

Yuliang WEI  Guodong XIN  Wei WANG  Fang LV  Bailing WANG  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.2   pp.392-395
Publication Date: 2019/02/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDL8172
Type of Manuscript: LETTER
Category: Data Engineering, Web Information Systems
Keyword: sequential block model, deep learning, web extraction, name disambiguation


Summary: 
Web person search often returns web pages related to several distinct namesakes. This paper proposes a new web page model for template-free extraction of personal data and uses a Dirichlet process mixture model to perform name disambiguation. The results show that our method works best on web pages with complex structure.
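
The summary does not describe the model or its inference procedure, so the following is only a minimal sketch of one standard way to view a Dirichlet process prior, the Chinese restaurant process. It illustrates why such a model suits name disambiguation: the number of namesakes need not be fixed in advance, because each new page may either join an existing person cluster or open a new one. The function name `crp_assignment_probs` and the concentration parameter `alpha` are illustrative assumptions, not taken from the paper, and the sketch covers the prior only (no page-content likelihood).

```python
from fractions import Fraction


def crp_assignment_probs(cluster_sizes, alpha):
    """Chinese restaurant process prior for the next page (hypothetical helper).

    cluster_sizes: number of pages already assigned to each person cluster.
    alpha: concentration parameter (> 0); larger values favour new clusters.
    Returns one probability per existing cluster, followed by the
    probability of opening a new cluster.
    """
    alpha = Fraction(alpha)
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    denom = sum(cluster_sizes) + alpha
    # An existing cluster attracts the page in proportion to its size.
    probs = [Fraction(size) / denom for size in cluster_sizes]
    # A new cluster (a previously unseen namesake) gets weight alpha.
    probs.append(alpha / denom)
    return probs


# Two namesakes seen so far, with 3 pages and 1 page respectively.
probs = crp_assignment_probs([3, 1], alpha=1)
print(probs)  # [Fraction(3, 5), Fraction(1, 5), Fraction(1, 5)]
```

In a full mixture model these prior weights would be multiplied by how well the page's extracted personal data fits each cluster before an assignment is made.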