TL-Rank: A Blend of Text and Link Information for Measuring Similarity in Scientific Literature Databases
Seok-Ho YOON Ji-Su KIM Sang-Wook KIM Choonhwa LEE
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2012/10/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Artificial Intelligence, Data Mining
Keywords: similarity score, text-based measure, link-based measure, keyword set expansion
This paper presents a novel similarity measure for computing similarity scores among scientific research papers. The text of a paper in an online scientific literature database is often incomplete as a basis for comparison with other papers, which is likely to yield inaccurate similarity scores. Our solution uses both the text and the link information of the paper in question: the paper's comparison text is strengthened by adding the text of papers related to it. More accurate similarity scores can be computed by reinforcing the input with the papers that cite the paper as well as the papers cited within it. The efficacy of the proposed measure is validated through an extensive performance evaluation, which demonstrates a substantial gain.
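The idea of strengthening a paper's text with the text of linked papers can be illustrated with a minimal sketch. This is not the authors' TL-Rank formulation, which the abstract does not specify. It assumes a plain bag-of-words representation, cosine similarity as the text-based measure, and a hypothetical `neighbor_weight` parameter that down-weights terms borrowed from citing and cited papers. The function names and toy data are illustrative only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems would do more)."""
    return text.lower().split()


def expanded_terms(paper_id, texts, cites, neighbor_weight=0.5):
    """Term weights from a paper's own text plus down-weighted text of linked papers.

    `cites` maps a paper id to the set of paper ids it cites.
    """
    vec = Counter()
    for term in tokenize(texts[paper_id]):
        vec[term] += 1.0
    # Out-links: papers cited by this paper. In-links: papers that cite it.
    out_links = cites.get(paper_id, set())
    in_links = {p for p, refs in cites.items() if paper_id in refs}
    for neighbor in out_links | in_links:
        for term in tokenize(texts.get(neighbor, "")):
            vec[term] += neighbor_weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(p, q, texts, cites, neighbor_weight=0.5):
    """Similarity of two papers after expanding each with its linked papers' text."""
    return cosine(
        expanded_terms(p, texts, cites, neighbor_weight),
        expanded_terms(q, texts, cites, neighbor_weight),
    )


# Toy data: A and B share no terms, but both cite C.
texts = {
    "A": "graph ranking",
    "B": "link analysis",
    "C": "graph ranking link analysis web search",
}
cites = {"A": {"C"}, "B": {"C"}}

text_only = similarity("A", "B", texts, cites, neighbor_weight=0.0)
text_and_link = similarity("A", "B", texts, cites, neighbor_weight=0.5)
print(text_only, text_and_link)
```

In this toy example, papers A and B have no terms in common, so a purely text-based comparison scores them as unrelated. Once the text of their shared reference C is folded in, the score becomes positive. How much weight linked text should receive is a design choice that the sketch leaves as a free parameter.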