Graph Similarity Metric Using Graph Convolutional Network: Application to Malware Similarity Match

Bing-lin ZHAO  Fu-dong LIU  Zheng SHAN  Yi-hang CHEN  Jian LIU  

IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.8   pp.1581-1585
Publication Date: 2019/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDL8259
Type of Manuscript: LETTER
Category: Information Network
Keyword: graph convolutional network, graph similarity computation, malware similarity metric

Malware is a serious threat to the Internet. Traditional signature-based malware detection methods can be easily evaded by code obfuscation. Therefore, many researchers use high-level structures of malware, such as the function call graph, which are less affected by obfuscation, to find malware variants. However, existing graph matching methods rely on approximate calculation, which is inefficient and cannot effectively guarantee accuracy. Inspired by the successful application of graph convolutional networks to node classification and graph classification, we propose a novel malware similarity metric based on a graph convolutional network. We use the graph convolutional network to compute graph embedding vectors, and then calculate the similarity of two graphs from the distance between their embedding vectors. Experimental results on the Kaggle dataset show that our method can be applied to graph-based malware similarity measurement, and that clustering with our method reaches 97% accuracy with high time efficiency.
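
The embed-then-compare idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the network architecture, node features, pooling, or distance function, so everything below is an assumption made for illustration: the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W), call graphs treated as undirected, mean pooling over nodes, Euclidean distance, and the mapping 1 / (1 + distance). The function names, toy graphs, and fixed weight matrix are hypothetical, and no training step is shown.

```python
import math

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense, symmetric adjacency matrix."""
    n = len(adj)
    # Self-loops let each node keep its own features and keep every degree >= 1.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(norm_adj, feats, weights):
    """One propagation step: ReLU(norm_adj @ feats @ weights)."""
    h = matmul(matmul(norm_adj, feats), weights)
    return [[max(0.0, v) for v in row] for row in h]

def graph_embedding(adj, feats, weight_stack):
    """Apply the layers in order, then mean-pool node vectors into one vector."""
    norm_adj = normalize_adjacency(adj)
    h = feats
    for w in weight_stack:
        h = gcn_layer(norm_adj, h, w)
    return [sum(row[j] for row in h) / len(h) for j in range(len(h[0]))]

def similarity(emb_a, emb_b):
    """Map Euclidean distance to (0, 1]; identical embeddings score 1.0."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))
    return 1.0 / (1.0 + dist)

# Hypothetical toy call graphs: a 3-node path and a 3-node triangle.
# Assumed node features: [constant 1.0, node degree].
adj_path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats_path = [[1.0, 1.0], [1.0, 2.0], [1.0, 1.0]]
adj_tri = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
feats_tri = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]

# Fixed illustrative weights; a real model would learn these from data.
weights = [[[0.5, -0.2], [0.3, 0.8]]]

emb_path = graph_embedding(adj_path, feats_path, weights)
emb_tri = graph_embedding(adj_tri, feats_tri, weights)
print(similarity(emb_path, emb_path))  # a graph compared with itself
print(similarity(emb_path, emb_tri))   # two structurally different graphs
```

Because every graph is reduced to a fixed-length vector once, comparing two samples costs only a vector distance, which is one plausible reading of the time-efficiency claim; pairwise similarities computed this way could then feed a standard clustering algorithm.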