Tuning GridFTP Pipelining, Concurrency and Parallelism Based on Historical Data
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2014/11/01
Online ISSN: 1745-1361
Type of Manuscript: LETTER
Category: Information Network
Keywords: big data, throughput optimization, throughput estimation, pipelining, concurrency, parallelism
This paper presents a prediction model that uses historical data to choose optimal values of pipelining, concurrency and parallelism (PCP) for GridFTP data transfers in Cloud systems. Setting these three parameters correctly is crucial to achieving high throughput in end-to-end data movement. However, predicting and setting their optimal values is challenging, especially under shared and unpredictable network conditions. Several factors affect the optimal values, including background network traffic, available bandwidth, Round-Trip Time (RTT), TCP buffer size, and file size. Existing models either fail to provide accurate predictions or incur very high prediction overheads. The author shows that the new model, based on historical data, can achieve high accuracy with low overhead.
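To make the idea of history-based tuning concrete, the following is a minimal sketch and not the model from the paper, whose details are not given in this abstract. It assumes a hypothetical log of past transfers (`TransferRecord`) and a hypothetical helper `suggest_pcp` that finds the most similar past transfers by file size, RTT and bandwidth, then returns the PCP setting with the best average throughput among them. The field names, the log-ratio distance and the default `k=3` are all illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import log, sqrt


@dataclass(frozen=True)
class TransferRecord:
    """One past transfer (hypothetical schema, not taken from the paper)."""
    file_size_mb: float
    rtt_ms: float
    bandwidth_mbps: float
    pipelining: int
    concurrency: int
    parallelism: int
    throughput_mbps: float


def _distance(rec, file_size_mb, rtt_ms, bandwidth_mbps):
    # Log ratios keep features with very different scales comparable.
    return sqrt(
        log(rec.file_size_mb / file_size_mb) ** 2
        + log(rec.rtt_ms / rtt_ms) ** 2
        + log(rec.bandwidth_mbps / bandwidth_mbps) ** 2
    )


def suggest_pcp(history, file_size_mb, rtt_ms, bandwidth_mbps, k=3):
    """Return (pipelining, concurrency, parallelism) from similar past transfers."""
    if not history:
        raise ValueError("history is empty")
    # Keep the k past transfers closest to the requested conditions.
    nearest = sorted(
        history,
        key=lambda r: _distance(r, file_size_mb, rtt_ms, bandwidth_mbps),
    )[:k]
    # Group observed throughputs by the PCP setting that produced them.
    by_setting = defaultdict(list)
    for rec in nearest:
        key = (rec.pipelining, rec.concurrency, rec.parallelism)
        by_setting[key].append(rec.throughput_mbps)
    # Pick the setting with the highest mean throughput among the neighbours.
    return max(by_setting, key=lambda s: sum(by_setting[s]) / len(by_setting[s]))
```

A lookup of this kind costs only a sort over the log, which illustrates why a history-based approach can have low prediction overhead compared with probing the network before each transfer. It does not account for background traffic or TCP buffer size, which the abstract also names as relevant factors.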