User Feedback-Driven Document Clustering Technique for Information Organization

Han-joon KIM  Sang-goo LEE  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E85-D   No.6   pp.1043-1048
Publication Date: 2002/06/01
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Databases
Keyword: semi-supervised clustering, hierarchical agglomerative clustering, relevance feedback, fuzzy information retrieval

Summary: 
This paper discusses a new type of semi-supervised document clustering that uses partial supervision to partition a large set of documents. Most clustering methods organize documents into groups based only on similarity measures. In this paper, we attempt to isolate more semantically coherent clusters by employing the domain-specific knowledge provided by a document analyst. By using this external human knowledge to guide the clustering mechanism, while leaving some flexibility in how the clusters are formed, clustering efficiency can be considerably enhanced. Experimental results show that even a small amount of external knowledge can considerably enhance the quality of clustering results that satisfy users' constraints.
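
The summary does not spell out how the analyst's feedback enters the clustering, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that feedback arrives as must-link and cannot-link document pairs (a common stand-in for partial supervision) and applies them to plain average-link agglomerative clustering over cosine similarity. The function name `constrained_hac`, its parameters, and the toy vectors are all hypothetical.

```python
from itertools import combinations
import math


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def constrained_hac(vectors, k, must_link=(), cannot_link=()):
    """Average-link agglomerative clustering with pairwise user constraints.

    Assumption (not from the paper): feedback is given as index pairs.
    must_link pairs are merged up front; cannot_link pairs may never
    end up in the same cluster.
    """
    clusters = [{i} for i in range(len(vectors))]  # start from singletons

    def find(i):
        return next(c for c in clusters if i in c)

    # Honor must-link feedback before any similarity-based merging.
    for a, b in must_link:
        ca, cb = find(a), find(b)
        if ca is not cb:
            clusters.remove(cb)
            ca |= cb

    forbidden = {frozenset(p) for p in cannot_link}

    def violates(c1, c2):
        return any(frozenset((i, j)) in forbidden for i in c1 for j in c2)

    def avg_sim(c1, c2):
        total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Consider only merges that respect every cannot-link pair.
        candidates = [
            (avg_sim(c1, c2), x, y)
            for (x, c1), (y, c2) in combinations(enumerate(clusters), 2)
            if not violates(c1, c2)
        ]
        if not candidates:
            break  # no admissible merge is left
        _, x, y = max(candidates)
        clusters[x] |= clusters[y]
        del clusters[y]  # y > x, so index x is unaffected

    return sorted(sorted(c) for c in clusters)


# Toy example: documents 0 and 1 look alike, as do 2 and 3.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
unconstrained = constrained_hac(docs, 2)
with_feedback = constrained_hac(docs, 2, cannot_link=[(0, 1)])
```

In this toy run a single cannot-link pair is enough to change the final partition, which is the kind of effect the summary attributes to a small amount of external knowledge. A real system would need to decide how feedback is collected and how strictly it is enforced; this sketch treats every constraint as hard.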