Proof: A Novel DHT-Based Peer-to-Peer Search Engine

Kai-Hsiang YANG  Jan-Ming HO  

IEICE TRANSACTIONS on Communications   Vol.E90-B   No.4   pp.817-825
Publication Date: 2007/04/01
Online ISSN: 1745-1345
DOI: 10.1093/ietcom/e90-b.4.817
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Networks Software)
distributed search engine,  inverted list,  Pagerank,  Peer-to-Peer,  

Full Text: PDF>>
Buy this Article

In this paper we focus on building a large scale keyword search service over structured Peer-to-Peer (P2P) networks. Current state-of-the-art keyword search approaches for structured P2P systems are based on inverted list intersection. However, the biggest challenge in those approaches is that when the indices are distributed over peers, a simple query may cause a large amount of data to be transmitted over the network. We propose in this paper a new P2P keyword search scheme, called "Proof," which aims to reduce the network traffic generated during the intersection process. We applied three main ideas in Proof to reduce network traffic, including (1) using a sorted query flow, (2) storing content summaries in the inverted lists, and (3) setting a stop condition for the checking of content summaries. We also discuss the advantages and limitations of Proof, and conducted extensive experiments to evaluate the search performance and the quality of search results. Our simulation results showed that, compared with previous solutions, Proof can dramatically reduce network traffic while providing 100% precision and high recall of search results, at some additional storage overhead.