Predicting DataSpace Retrieval Using Probabilistic Hidden Information

Gile Narcisse FANZOU TCHUISSANG  Ning WANG  Nathalie Cindy KUICHEU  Francois SIEWE  De XU  Shuoyan LIU  

IEICE TRANSACTIONS on Information and Systems   Vol.E93-D   No.7   pp.1991-1994
Publication Date: 2010/07/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.1991
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Data Engineering, Web Information Systems
information retrieval, probabilistic algorithm, DataSpace

This paper discusses the issues involved in designing a complete information retrieval system for DataSpace based on probabilistic schemes of user relevance. First, an Information Hidden Model (IHM) is constructed that takes into account the users' perception of similarity between documents. The system accumulates feedback from the users and uses it to construct user-oriented clusters. The IHM integrates uncertainty over multiple, interdependent classifications and collectively determines the most likely global assignment. Second, three different query-related learning strategies are proposed, namely User Hidden Habit (UHH), User Hidden Background (UHB), and User Hidden keyword Semantics (UHS), to closely represent the user's mind. Finally, the probability ranking principle shows that optimum retrieval quality can be achieved under certain assumptions, and an optimization algorithm is developed to improve the effectiveness of the probabilistic process. We first predict the data sources where the query results could be found. Therefore, compared with existing approaches, our retrieval precision is better and does not depend on the size or the heterogeneity of the DataSpace.
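
As a rough illustration of ranking by estimated relevance probability, the following Python sketch scores data sources from accumulated user feedback and orders them before any retrieval takes place. It is not the paper's IHM or its optimization algorithm: the function names, the (source, relevant) feedback format, and the Laplace smoothing are all assumptions made for illustration only.

```python
from collections import defaultdict


def estimate_source_relevance(feedback, smoothing=1.0):
    """Estimate P(relevant | source) from user feedback.

    `feedback` is an iterable of (source, relevant) pairs, where `relevant`
    is a bool. This format is hypothetical, not taken from the paper.
    Laplace smoothing keeps estimates away from 0 and 1 for sparse sources.
    """
    counts = defaultdict(lambda: [0, 0])  # source -> [relevant, total]
    for source, relevant in feedback:
        counts[source][1] += 1
        if relevant:
            counts[source][0] += 1
    return {
        source: (rel + smoothing) / (total + 2 * smoothing)
        for source, (rel, total) in counts.items()
    }


def rank_sources(probabilities):
    """Order sources by decreasing estimated relevance probability.

    Ties are broken by source name so the ordering is deterministic.
    """
    return sorted(probabilities, key=lambda s: (-probabilities[s], s))


# Illustrative feedback: two relevant hits for A, one of two for B, none for C.
feedback = [("A", True), ("A", True), ("B", False), ("B", True), ("C", False)]
probabilities = estimate_source_relevance(feedback)
ranking = rank_sources(probabilities)
```

With this toy feedback, source A receives the highest estimate, followed by B and then C, so a query would be sent to A first. Ranking by descending probability of relevance is the ordering that the probability ranking principle identifies as optimal under its assumptions.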