Effectiveness of Passage-Based Document Retrieval for Short Queries

Koichi KISE, Markus JUNKER, Andreas DENGEL, Keinosuke MATSUMOTO

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E86-D, No.9, pp.1753-1761
Publication Date: 2003/09/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Text Processing for Information Access)
Keyword: document retrieval, passage retrieval, document length, query length, density distribution

Summary: 
Document retrieval is a fundamental task for intelligent access to the huge amount of information stored in documents. Although it has a long research history, it remains difficult, especially when lengthy documents must be retrieved with very short queries (a few keywords). For retrieving long documents, passage-based document retrieval methods have proven effective. In this paper, we show experimentally that a passage-based method using window passages is also effective for short queries, provided that the documents are not too short. As the window-passage method we employ "density distributions", and we compare it with three conventional methods: the simple vector space model, pseudo relevance feedback, and latent semantic indexing. We also compare it with a passage-based method that uses discourse passages.
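
The summary does not spell out how a window-passage score is computed, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes that query-term occurrences are smoothed over term positions with a Hanning window, and that a document is scored by the peak of the resulting density. The function names, the window width, and the uniform query weights are all hypothetical choices made for this example.

```python
import math


def hanning(width):
    """Hanning window sampled at integer offsets -width//2 .. width//2 (assumed shape)."""
    half = width // 2
    return [0.5 * (1.0 + math.cos(2.0 * math.pi * x / width))
            for x in range(-half, half + 1)]


def density_distribution(doc_terms, query_weights, width=5):
    """Smooth query-term weights over term positions with the window.

    doc_terms: list of terms in document order.
    query_weights: dict mapping query term -> weight (hypothetical weighting).
    Returns one density value per term position.
    """
    # Weight at each position: the query weight if the term is a query term, else 0.
    hits = [query_weights.get(term, 0.0) for term in doc_terms]
    window = hanning(width)
    half = width // 2
    density = []
    for center in range(len(hits)):
        total = 0.0
        for k, w in enumerate(window):
            pos = center + k - half
            if 0 <= pos < len(hits):
                total += w * hits[pos]
        density.append(total)
    return density


def document_score(doc_terms, query_weights, width=5):
    """Score a document by the peak of its density distribution (assumed scoring rule)."""
    density = density_distribution(doc_terms, query_weights, width)
    return max(density) if density else 0.0


if __name__ == "__main__":
    query = {"x": 1.0, "y": 1.0}
    clustered = ["a", "x", "y", "b", "c", "d", "e"]   # query terms adjacent
    scattered = ["x", "a", "b", "c", "d", "e", "y"]   # query terms far apart
    print(document_score(clustered, query))
    print(document_score(scattered, query))
```

Under these assumptions, a document in which the few query keywords occur close together receives a higher score than one of the same length in which they are spread out, which is the intuition behind using window passages for short queries.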