A Malicious Web Site Identification Technique Using Web Structure Clustering
Tatsuya NAGAI Masaki KAMIZONO Yoshiaki SHIRAISHI Kelin XIA Masami MOHRI Yasuhiro TAKANO Masakatu MORII
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2019/09/01
Online ISSN: 1745-1361
Type of Manuscript: Special Section PAPER (Special Section on Log Data Usage Technology and Office Information Systems)
website structure, malicious website, exploit kit, clustering
Epidemic cyber incidents are caused by malicious websites that use exploit kits. Exploit kits make it easier for attackers to perform drive-by download (DBD) attacks. However, malicious websites using an exploit kit are reported to have similar website structure (WS)-trees. Hence, malicious website identification techniques leveraging WS-trees have been studied, where the WS-trees can be estimated from HTTP traffic data. Nevertheless, the defensive component of the exploit kit prevents the WS-tree from being captured completely. This paper therefore presents a new WS-tree construction procedure that exploits the fact that a DBD attack takes place within a certain duration. Moreover, it proposes a new malicious website identification technique based on clustering the WS-trees of exploit kits. Experimental results on the D3M dataset verify that the proposed technique identifies exploit kits with reasonable accuracy even when HTTP traffic from the malicious sites is partially lost.
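
The abstract does not spell out the construction procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: each HTTP request is linked to its Referer when that link was captured, and otherwise attached to the most recent request if it arrived within a fixed time window. The names `Request`, `build_ws_tree`, `window`, and `depth_profile` are hypothetical and do not come from the paper. The depth profile is a simple stand-in for a tree-shape feature that a clustering step could compare; it is not the paper's similarity measure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Request:
    """One observed HTTP request (hypothetical record layout)."""
    ts: float                      # seconds since the start of the capture
    url: str
    referer: Optional[str] = None  # None when the link was not captured


def build_ws_tree(requests: Iterable[Request],
                  window: float = 10.0) -> Dict[str, Optional[str]]:
    """Return a child -> parent map; root nodes map to None.

    Assumption: a request with no usable Referer belongs to the same
    attack chain as the previous request if it arrives within `window`
    seconds of it.
    """
    parent: Dict[str, Optional[str]] = {}
    last_seen: Optional[Tuple[float, str]] = None
    for r in sorted(requests, key=lambda r: r.ts):
        if r.url in parent:
            continue  # keep the first placement of each URL
        if r.referer is not None and r.referer in parent:
            parent[r.url] = r.referer          # captured link
        elif last_seen is not None and r.ts - last_seen[0] <= window:
            parent[r.url] = last_seen[1]       # time-window fallback
        else:
            parent[r.url] = None               # start of a new tree
        last_seen = (r.ts, r.url)
    return parent


def depth_profile(parent: Dict[str, Optional[str]]) -> Tuple[int, ...]:
    """Count nodes at each depth, as a crude tree-shape signature."""
    counts: Dict[int, int] = {}
    for node in parent:
        depth = 0
        while parent[node] is not None:
            node = parent[node]
            depth += 1
        counts[depth] = counts.get(depth, 0) + 1
    return tuple(counts[d] for d in sorted(counts))


reqs = [
    Request(0.0, "a"),
    Request(1.0, "b", referer="a"),
    Request(2.0, "c"),    # Referer lost, but close in time to "b"
    Request(60.0, "d"),   # too late to join the chain: new root
]
tree = build_ws_tree(reqs, window=10.0)
profile = depth_profile(tree)
```

In this toy trace the request for "c" has no Referer but is still attached under "b" because it falls inside the window, while "d" starts a new tree. Trees with matching or nearby profiles could then be grouped by any standard clustering method.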