Identifying DNS Anomalous User by Using Hierarchical Aggregate Entropy

Keisuke ISHIBASHI, Kazumichi SATO

IEICE TRANSACTIONS on Communications   Vol.E100-B   No.1   pp.140-147
Publication Date: 2017/01/01
Online ISSN: 1745-1345
DOI: 10.1587/transcom.2016EBP3075
Type of Manuscript: PAPER
Category: Internet
Keywords: DNS, entropy, heavy-hitter

We introduce the notion of hierarchical aggregate entropy and apply it to identify DNS client hosts that wastefully consume server resources. The entropy of DNS query traffic can capture client query patterns, e.g., the concentration of queries on a specific domain or their dispersion over a large domain name space. However, entropy alone cannot capture the spatial structure of the traffic. That is, when queries are dispersed over various domains that all fall under the same upper domain, the entropy over domain names provides no information on that upper-domain structure, which is an important characteristic of DNS traffic. On the other hand, entropies of aggregated upper domains carry no detailed information on individual domains. To overcome this difficulty, we introduce hierarchical aggregate entropy, in which queries are recursively aggregated into upper domains along the DNS domain tree and the entropies of the aggregated queries are calculated. This method thus enables us to analyze the spatial characteristics of DNS traffic in a multi-resolution manner. We calculated the hierarchical aggregate entropies for actual DNS heavy-hitters and observed that the entropies of normal heavy-hitters were concentrated in a specific range. On the basis of this observation, we adopt the support vector machine method to identify the range and to classify DNS heavy-hitters as anomalous or normal. We show that hierarchical aggregate entropy can halve the classification error compared with non-hierarchical entropies.
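
The recursive aggregation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact definition, so the sketch assumes Shannon entropy (in bits) over the empirical distribution of queried names, and assumes that aggregating to an upper domain means keeping only the last few labels of each name. The function names (shannon_entropy, aggregate, hierarchical_entropies), the max_depth parameter, and the sample queries are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy in bits of a collection of non-negative counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    if total == 0:
        return 0.0
    # sum of p * log2(1/p) over the observed names
    return sum((c / total) * math.log2(total / c) for c in counts)


def aggregate(name, depth):
    """Assumed aggregation rule: keep only the last `depth` labels of a name."""
    labels = name.rstrip(".").lower().split(".")
    return ".".join(labels[-depth:])


def hierarchical_entropies(queries, max_depth=3):
    """Entropy of one client's queries at each aggregation depth.

    Depth 1 aggregates to top-level domains, depth 2 to second-level
    domains, and so on (an assumption made for this illustration).
    """
    result = {}
    for depth in range(1, max_depth + 1):
        counts = Counter(aggregate(q, depth) for q in queries)
        result[depth] = shannon_entropy(counts.values())
    return result


# Hypothetical client whose queries are dispersed over many names
# that all sit under the same upper domain.
queries = ["a.example.com", "b.example.com", "c.example.com", "d.example.com"]
entropies = hierarchical_entropies(queries)
print(entropies)
```

For this sample the entropy is 2 bits at depth 3 (four equally likely names) and 0 bits at depths 1 and 2, since every query falls under example.com. The flat entropy over full names alone would not reveal that concentration, which is the gap the hierarchical view is meant to close. The resulting vector of per-depth entropies is the kind of feature that could then be passed to a classifier such as a support vector machine.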