A Novel Malware Clustering Method Using Frequency of Function Call Traces in Parallel Threads

Junji NAKAZATO  Jungsuk SONG  Masashi ETO  Daisuke INOUE  Koji NAKAO  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.11   pp.2150-2158
Publication Date: 2011/11/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.2150
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Information and Communication System Security)
Category: 
Keyword: malware analysis, behavior of malware, clustering

Summary: 
With the rapid development and proliferation of the Internet, cyber attacks continue to emerge and evolve. Malware, a generic term for computer viruses, worms, trojan horses, spyware, adware, and bots, is a particularly serious security threat. To cope with this threat appropriately, we need to identify the tendencies and characteristics of malware and to analyze and classify its behavior. Previous classification work has used data from dynamic analysis or code analysis, but it has not succeeded in achieving efficient classification with high accuracy. In this paper, we propose a new classification method that clusters malware more effectively and more accurately. We first perform dynamic analysis to automatically obtain the execution traces of malware. We then classify the malware into clusters using behavioral characteristics derived from Windows API calls in parallel threads. We evaluated our classification method using 2,312 malware samples with different hash values. The samples, which fell into 1,221 groups according to the results of three types of antivirus software, were classified into 93 clusters by our method. 90% of the samples used in the experiment were classified into at most 20 clusters. Moreover, 39 malware samples had characteristics different from those of the other samples, suggesting that these may be new types of malware. The kinds of Windows API calls confirmed that samples classified into the same cluster had the same characteristics. We also showed that antivirus software assigns different names to malware that exhibits the same behavior.
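
The summary does not spell out how call frequencies are turned into clusters, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes each sample is given as a list of per-thread Windows API call sequences, counts call bigrams within each thread, and groups samples whose frequency profiles are similar. The bigram features, the cosine similarity measure, the 0.8 threshold, the greedy grouping, and all function and sample names are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def ngram_frequencies(thread_traces, n=2):
    """Count API-call n-grams inside each thread and sum the counts over threads.

    Counting per thread (rather than over one interleaved trace) keeps the
    profile independent of how the scheduler interleaved the threads.
    """
    counts = Counter()
    for trace in thread_traces:
        for i in range(len(trace) - n + 1):
            counts[tuple(trace[i:i + n])] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_samples(samples, threshold=0.8, n=2):
    """Greedy grouping (an assumption): a sample joins the first cluster whose
    representative profile is at least `threshold` similar, else starts a new one."""
    clusters = []  # list of (representative_profile, member_names)
    for name, traces in samples.items():
        profile = ngram_frequencies(traces, n)
        for representative, members in clusters:
            if cosine_similarity(representative, profile) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((profile, [name]))
    return [members for _, members in clusters]


# Hypothetical traces: sample_a and sample_b run the same two threads in a
# different order; sample_c only touches the registry.
samples = {
    "sample_a": [["CreateFileW", "WriteFile", "CloseHandle"],
                 ["socket", "connect", "send"]],
    "sample_b": [["socket", "connect", "send"],
                 ["CreateFileW", "WriteFile", "CloseHandle"]],
    "sample_c": [["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"]],
}
clusters = cluster_samples(samples)
print(clusters)
```

In this toy input, the first two samples have identical per-thread profiles and end up together, while the registry-only sample forms its own cluster. A real evaluation would need traces from a dynamic analysis environment and a clustering procedure chosen and tuned against labeled data.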