Avoiding Performance Impacts by Re-Replication Workload Shifting in HDFS Based Cloud Storage

Thanda SHWE  Masayoshi ARITSUGI  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.12   pp.2958-2967
Publication Date: 2018/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018PAP0017
Type of Manuscript: Special Section PAPER (Special Section on Parallel and Distributed Computing and Networking)
Category: Cloud Computing
Keyword: re-replication, fault tolerance, data reliability, HDFS

Summary: 
Data replication in cloud storage systems brings many benefits from both reliability and performance perspectives, such as fault tolerance, data availability, data locality, and load balancing. However, each time a datanode fails, the data blocks stored on the failed datanode must be restored to maintain the replication level. This can be a heavy burden for a system whose resources are already highly utilized by users' application workloads. Although there have been many proposals for replication, re-replication has not yet been properly addressed. In this paper, we present a deferred re-replication algorithm that dynamically shifts the re-replication workload based on the current resource utilization of the system. Because workload patterns vary with the time of day, simulation results with a synthetic workload demonstrate a large opportunity for minimizing the impact on users' application workloads with a simple algorithm that adjusts re-replication based on current resource utilization. Our approach can reduce the performance impact on users' application workloads while ensuring the same reliability level that default HDFS provides.
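
The summary does not give the algorithm itself, so the following Python sketch is only an illustration of the general idea of deferring re-replication while the cluster is busy. Everything in it is an assumption made for the example, not the authors' method or an HDFS API: the names `PendingBlock` and `schedule_round`, the utilization threshold `busy_threshold`, the deferral cap `max_wait`, and the rule that a block with a single remaining replica is never deferred.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PendingBlock:
    """A block that lost a replica when a datanode failed (hypothetical type)."""
    block_id: str
    live_replicas: int      # replicas still available after the failure
    waited_rounds: int = 0  # scheduling rounds this block has been deferred


def schedule_round(
    pending: List[PendingBlock],
    utilization: float,
    busy_threshold: float = 0.8,  # assumed cutoff for "cluster is busy"
    max_wait: int = 3,            # assumed cap on how long a block may wait
) -> Tuple[List[PendingBlock], List[PendingBlock]]:
    """Split pending blocks into (re-replicate now, defer to a later round)."""
    now, later = [], []
    for block in pending:
        # Assumed safety rule: blocks at high risk, or deferred for too long,
        # are re-replicated regardless of the current load.
        urgent = block.live_replicas <= 1 or block.waited_rounds >= max_wait
        if utilization < busy_threshold or urgent:
            now.append(block)
        else:
            block.waited_rounds += 1
            later.append(block)
    return now, later


if __name__ == "__main__":
    queue = [PendingBlock("blk_1", live_replicas=2),
             PendingBlock("blk_2", live_replicas=1)]
    now, queue = schedule_round(queue, utilization=0.95)  # busy period
    print([b.block_id for b in now], [b.block_id for b in queue])
    now, queue = schedule_round(queue, utilization=0.40)  # quiet period
    print([b.block_id for b in now], [b.block_id for b in queue])
```

In this sketch, a busy round re-replicates only the block that is down to one replica and pushes the other to a later round; once utilization drops below the threshold, the deferred block is scheduled. The deferral cap is one possible way to bound the added exposure, and how the paper preserves the reliability level of default HDFS may differ.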