Traffic Data Analysis Based on Extreme Value Theory and Its Applications to Predicting Unknown Serious Deterioration

Masato UCHIDA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E87-D   No.12   pp.2654-2664
Publication Date: 2004/12/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on New Technologies and their Applications of the Internet)
Category: Traffic Measurement and Analysis
Keyword: serious deterioration of the telecommunication quality, prediction, tail distribution, extreme value theory

Summary: 
Predicting serious deterioration of telecommunication quality is important. This paper investigates predicting such events by analyzing only a "short" period (i.e., a "small" amount) of teletraffic data. To this end, it presents a method for analyzing the tail distributions of teletraffic state variables, because tail distributions are well suited to representing serious events. The method is based on Extreme Value Theory (EVT), which provides a firm theoretical foundation for the analysis. More precisely, we use throughput data measured for 15 minutes (900 seconds) on an actual network during daily busy hours, and use the first 10 seconds (known data) to estimate the tail distribution. We then evaluate how well the estimated tail distribution predicts the tail distribution of the remaining 890 seconds (unknown data). The results indicate that the EVT-based tail distribution, estimated from the small amount of known data, predicts the tail distribution of the unknown data much better than methods based on empirical or log-normal distributions. Furthermore, we apply the estimated tail distribution to predict the peak throughput in the unknown data. These results make it possible to predict serious deterioration events at lower measurement cost.
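
The known-data/unknown-data procedure described in the summary can be illustrated with a minimal sketch. The summary does not say which EVT estimator the paper uses, so the code below assumes a peaks-over-threshold approach with a generalized Pareto distribution (GPD) fitted by the method of moments, which may differ from the paper's method. The data are synthetic (exponentially distributed), not the measured throughput, and the names `fit_gpd_moments`, `fit_tail`, and `tail_prob` are hypothetical.

```python
import math
import random
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (an assumed estimator, not the paper's).

    Returns (xi, sigma), the shape and scale. The moment equations
    only hold when the true shape is below 1/2.
    """
    m = statistics.mean(excesses)
    s2 = statistics.variance(excesses)
    ratio = m * m / s2
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (ratio + 1.0)
    return xi, sigma


def fit_tail(samples, threshold_quantile=0.9):
    """Fit a GPD to the excesses over an empirical quantile threshold."""
    ordered = sorted(samples)
    u = ordered[int(threshold_quantile * len(ordered))]
    excesses = [x - u for x in samples if x > u]
    xi, sigma = fit_gpd_moments(excesses)
    # p_u is the observed fraction of samples above the threshold.
    return {"u": u, "xi": xi, "sigma": sigma,
            "p_u": len(excesses) / len(samples)}


def tail_prob(model, x):
    """Estimated P(X > x) for x at or above the threshold u."""
    z = (x - model["u"]) / model["sigma"]
    if abs(model["xi"]) < 1e-9:
        return model["p_u"] * math.exp(-z)  # exponential limit as xi -> 0
    base = 1.0 + model["xi"] * z
    if base <= 0.0:
        return 0.0  # beyond the finite upper end point (xi < 0)
    return model["p_u"] * base ** (-1.0 / model["xi"])


# Synthetic stand-in for the measurements: a short "known" window
# and a longer "unknown" remainder drawn from the same distribution.
rng = random.Random(0)
known = [rng.expovariate(1.0) for _ in range(1000)]
unknown = [rng.expovariate(1.0) for _ in range(10000)]

model = fit_tail(known)
level = model["u"] + 1.0
predicted = tail_prob(model, level)
observed = sum(x > level for x in unknown) / len(unknown)
print(f"predicted P(X > {level:.2f}) = {predicted:.4f}")
print(f"observed in unknown data  = {observed:.4f}")
```

Comparing `predicted` with `observed` mirrors, in a simplified form, the evaluation the summary describes: a tail model estimated on a small known sample is checked against the tail of the held-out data. The 0.9 threshold quantile is an arbitrary choice for the sketch.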