Publication: IEICE TRANSACTIONS on Information and Systems
Publication Date: 2004/12/01
Vol. E87-D No. 12 pp. 2678-2688
Type of Manuscript: Special Section PAPER (Special Section on New Technologies and their Applications of the Internet)
Category: Internet Systems
Keyword: spam, unsupervised learning, document space density, direct-mapped cache,