Fine-Grained Analysis of Compromised Websites with Redirection Graphs and JavaScript Traces

Yuta TAKATA  Mitsuaki AKIYAMA  Takeshi YAGI  Takeshi YADA  Shigeki GOTO  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E100-D   No.8   pp.1714-1728
Publication Date: 2017/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2016ICP0011
Type of Manuscript: Special Section PAPER (Special Section on Information and Communication System Security)
Category: Internet Security
Keyword: 
compromised website, drive-by download, redirection graph, program trace

Summary: 
An incident response organization such as a CSIRT helps prevent the spread of malware infection by analyzing compromised websites and sending abuse reports containing the detected URLs to webmasters. However, abuse reports that contain only URLs are not sufficient for cleaning up the websites. In addition, it is difficult to analyze malicious websites across different client environments because these websites change their behavior depending on the client environment. To expedite the clean-up of compromised websites, it is important to provide fine-grained information such as the relations among malicious URLs, the precise position of compromised web content, and the target range of client environments. In this paper, we propose a new method of constructing a redirection graph with context, such as which web content redirects to malicious websites. The proposed method analyzes a website in a multi-client environment to identify which client environments are exposed to threats. We evaluated our system using crawling datasets of approximately 2,000 compromised websites. The results show that our system successfully identified malicious URL relations and compromised web content, and that it was sufficient for incident responders to analyze only 15.0% of the URLs and 0.8% of the web content. Furthermore, by leveraging target information, the system identified the target range of client environments in 30.4% of the websites, as well as a vulnerability that has been used in malicious websites. This fine-grained analysis would contribute to improving the daily work of incident responders.
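
As a rough illustration of the idea of a redirection graph with context, the following Python sketch records, for each observed redirect, the web content that triggered it and the client environment in which it was seen, then reports which client environments reach a known malicious URL. This is not the authors' implementation: the `Redirect` record, the function names, the example URLs, and the client labels are all hypothetical, and the set of malicious URLs is assumed to be given.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Redirect:
    src: str      # URL whose content triggered the redirect
    dst: str      # URL loaded as a result
    context: str  # web content responsible (hypothetical free-text label)
    client: str   # client environment in which the redirect was observed


def build_graph(redirects):
    """Map each source URL to the redirects that leave it."""
    graph = defaultdict(list)
    for r in redirects:
        graph[r.src].append(r)
    return graph


def paths_to(graph, start, malicious, client):
    """Return every redirect chain from start to a malicious URL for one client."""
    found = []
    stack = [(start, [])]
    while stack:
        url, chain = stack.pop()
        if chain and url in malicious:
            found.append(chain)
            continue
        visited = {r.src for r in chain}
        for r in graph.get(url, []):
            # Follow only edges seen in this client and avoid revisiting URLs.
            if r.client == client and r.dst != url and r.dst not in visited:
                stack.append((r.dst, chain + [r]))
    return found


def targeted_clients(graph, start, malicious, clients):
    """Return the client environments that reach a malicious URL from start."""
    return {c for c in clients if paths_to(graph, start, malicious, c)}


# Hypothetical crawl of one compromised site with two client environments.
observed = [
    Redirect("http://site.example/", "http://hop.example/a.js",
             "script tag injected into index.html", "IE8/WinXP"),
    Redirect("http://hop.example/a.js", "http://exploit.example/land",
             "iframe written by document.write", "IE8/WinXP"),
    Redirect("http://site.example/", "http://hop.example/a.js",
             "script tag injected into index.html", "Firefox/Linux"),
]
graph = build_graph(observed)
malicious = {"http://exploit.example/land"}
clients = ["IE8/WinXP", "Firefox/Linux"]

targets = targeted_clients(graph, "http://site.example/", malicious, clients)
chains = paths_to(graph, "http://site.example/", malicious, "IE8/WinXP")
```

In this toy data only the first client is redirected all the way to the malicious URL, so `targets` contains only `"IE8/WinXP"`, and the `context` field of the first edge in `chains[0]` points at the injected content a webmaster would need to remove.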