Identifying High-Rate Flows Based on Sequential Sampling

Yu ZHANG  Binxing FANG  Hao LUO  

IEICE TRANSACTIONS on Information and Systems   Vol.E93-D   No.5   pp.1162-1174
Publication Date: 2010/05/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.1162
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Information Network
traffic monitoring,  high-rate flow,  identification,  sequential sampling,  

Full Text: PDF>>
Buy this Article

We consider the problem of fast identification of high-rate flows in backbone links with possibly millions of flows. Accurate identification of high-rate flows is important for active queue management, traffic measurement and network security such as detection of distributed denial of service attacks. It is difficult to directly identify high-rate flows in backbone links because tracking the possible millions of flows needs correspondingly large high speed memories. To reduce the measurement overhead, the deterministic 1-out-of-k sampling technique is adopted which is also implemented in Cisco routers (NetFlow). Ideally, a high-rate flow identification method should have short identification time, low memory cost and processing cost. Most importantly, it should be able to specify the identification accuracy. We develop two such methods. The first method is based on fixed sample size test (FSST) which is able to identify high-rate flows with user-specified identification accuracy. However, since FSST has to record every sampled flow during the measurement period, it is not memory efficient. Therefore the second novel method based on truncated sequential probability ratio test (TSPRT) is proposed. Through sequential sampling, TSPRT is able to remove the low-rate flows and identify the high-rate flows at the early stage which can reduce the memory cost and identification time respectively. According to the way to determine the parameters in TSPRT, two versions of TSPRT are proposed: TSPRT-M which is suitable when low memory cost is preferred and TSPRT-T which is suitable when short identification time is preferred. The experimental results show that TSPRT requires less memory and identification time in identifying high-rate flows while satisfying the accuracy requirement as compared to previously proposed methods.