Density-Based Spam Detector
Kenichi YOSHIDA, Fuminori ADACHI, Takashi WASHIO, Hiroshi MOTODA, Teruaki HOMMA, Akihiro NAKASHIMA, Hiromitsu FUJIKAWA, Katsuyuki YAMAZAKI
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2004/12/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on New Technologies and their Applications of the Internet)
Category: Internet Systems
Keyword: spam, unsupervised learning, document space density, direct-mapped cache
The volume of mass unsolicited electronic mail, often known as spam, has recently increased enormously and has become a serious threat not only to the Internet but also to society. This paper proposes a new spam detection method that uses document space density information. Although the proposed method requires extensive e-mail traffic to acquire the necessary information, it can achieve perfect detection (i.e., both recall and precision are 100%) under practical conditions. A direct-mapped cache method contributes to handling over 13,000 e-mail messages per second. Results of experiments conducted on over 50 million actual e-mail messages are also reported in this paper.
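
The abstract does not describe the data structure in detail, so the following is only a minimal sketch of how a direct-mapped cache might be used to count repeated messages in a traffic stream. The class name `DirectMappedCounter`, the helper `is_spam`, the slot count, the threshold, and the use of an exact SHA-256 hash as the document key are all assumptions made for illustration and are not taken from the paper. In particular, an exact hash only captures identical messages, which is a much cruder notion of density than a measure over similar documents.

```python
import hashlib


class DirectMappedCounter:
    """Hypothetical direct-mapped cache: each key maps to exactly one slot."""

    def __init__(self, num_slots=1024):
        self.num_slots = num_slots
        self.tags = [None] * num_slots   # key currently stored in each slot
        self.counts = [0] * num_slots    # how often that key has been seen

    def observe(self, message):
        """Record one message and return its current count in the cache."""
        digest = hashlib.sha256(message.encode("utf-8")).digest()
        key = int.from_bytes(digest[:8], "big")
        slot = key % self.num_slots      # a single candidate slot, no probing
        if self.tags[slot] == key:
            self.counts[slot] += 1
        else:
            # Empty slot or collision: overwrite the previous occupant.
            self.tags[slot] = key
            self.counts[slot] = 1
        return self.counts[slot]


def is_spam(counter, message, threshold=3):
    """Assumed rule: flag a message once it has been seen `threshold` times."""
    return counter.observe(message) >= threshold
```

Because each key has exactly one candidate slot, a lookup is a single hash and a single array access, which is the kind of constant-time operation that suits high message rates. The trade-off is that a colliding key evicts the previous entry and resets its count, so rarely repeated messages tend to be forgotten while heavily repeated ones keep refreshing their slot.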