For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Simple Weighting Techniques for Query Expansion in Biomedical Document Retrieval
Young-In SONG Kyoung-Soo HAN So-Young PARK Sang-Bum KIM Hae-Chang RIM
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2007/11/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Contents Technology and Web Information Systems
query expansion, biomedical terminology, biomedical document retrieval, biomedical terminology weighting,
Full Text: PDF>>
In this paper, we propose two weighting techniques to improve performances of query expansion in biomedical document retrieval, especially when a short biomedical term in a query is expanded with its synonymous multi-word terms. When a query contains synonymous terms of different lengths, a traditional IR model highly ranks a document containing a longer terminology because a longer terminology has more chance to be matched with a query. However, such preference is clearly inappropriate and it often yields an unsatisfactory result. To alleviate the bias weighting problem, we devise a method of normalizing the weights of query terms in a long multi-word biomedical term, and a method of discriminating terms by using inverse terminology frequency which is a novel statistics estimated in a query domain. The experiment results on MEDLINE corpus show that our two simple techniques improve the retrieval performance by adjusting the inadequate preference for long multi-word terminologies in an expanded query.