Identifying Heavy-Hitter Flows from Sampled Flow Statistics

Tatsuya MORI  Tetsuya TAKINE  Jianping PAN  Ryoichi KAWAHARA  Masato UCHIDA  Shigeki GOTO  

IEICE TRANSACTIONS on Communications   Vol.E90-B   No.11   pp.3061-3072
Publication Date: 2007/11/01
Online ISSN: 1745-1345
DOI: 10.1093/ietcom/e90-b.11.3061
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Next Generation Network Management)
Keywords: network measurement, packet sampling, flow statistics, a priori distribution, Bayes' theorem


With the rapid increase in link speeds in recent years, packet sampling has become an attractive and scalable means of collecting flow statistics; however, it also makes inferring the characteristics of the original flows much more difficult. In this paper, we develop techniques and schemes to identify flows with a very large number of packets (also known as heavy-hitter flows) from sampled flow statistics. Our approach follows a two-stage strategy. We first parametrically estimate the original flow length distribution from sampled flows. We then identify heavy-hitter flows using Bayes' theorem, with the flow length distribution estimated in the first stage serving as the a priori distribution. Our approach is validated and evaluated with publicly available packet traces. We show that it provides a flexible framework for striking an appropriate balance between false positives and false negatives at a given sampling frequency.
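The second stage can be illustrated with a minimal Python sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: packets are sampled independently with probability f (so the sampled count Y of a flow with X packets is binomial), the a priori distribution is a truncated power law P(X = x) proportional to x^-(alpha+1) on 1..x_max, and the shape parameter alpha is taken as already estimated (the first stage is not shown). The function names and the min_posterior parameter are hypothetical.

```python
import math


def log_binomial_pmf(y, x, f):
    """log P(Y = y | X = x): each of x packets is kept independently with probability f (0 < f < 1)."""
    return (math.lgamma(x + 1) - math.lgamma(y + 1) - math.lgamma(x - y + 1)
            + y * math.log(f) + (x - y) * math.log1p(-f))


def heavy_hitter_posterior(y, f, threshold, alpha, x_max):
    """P(X >= threshold | Y = y) by Bayes' theorem.

    Assumed prior (illustrative only): P(X = x) proportional to x^-(alpha+1), x = 1..x_max.
    """
    if not 1 <= y <= x_max:
        raise ValueError("y must satisfy 1 <= y <= x_max")
    sizes = range(y, x_max + 1)  # the original flow has at least y packets
    # Unnormalized log posterior: log likelihood + log prior.
    log_w = [log_binomial_pmf(y, x, f) - (alpha + 1.0) * math.log(x) for x in sizes]
    peak = max(log_w)  # subtract the maximum before exponentiating to avoid overflow
    weights = [math.exp(lw - peak) for lw in log_w]
    total = sum(weights)
    tail = sum(w for x, w in zip(sizes, weights) if x >= threshold)
    return tail / total


def is_heavy_hitter(y, f, threshold, alpha, x_max, min_posterior=0.5):
    """Flag a sampled flow when its posterior probability of being heavy is high enough."""
    return heavy_hitter_posterior(y, f, threshold, alpha, x_max) >= min_posterior
```

In this sketch, min_posterior is the knob that trades false positives against false negatives: raising it flags fewer flows, which lowers false positives at the cost of more missed heavy hitters, and lowering it does the opposite.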