An Efficient Parallel Coding Scheme in Erasure-Coded Storage Systems

Wenrui DONG  Guangming LIU  

IEICE TRANSACTIONS on Information and Systems   Vol.E101-D   No.3   pp.627-643
Publication Date: 2018/03/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2017EDP7218
Type of Manuscript: PAPER
Category: Computer System
erasure codes,  storage,  parallelism,  data allocation,  multiple threads,  thread affinity,  performance evaluation,  

Full Text: PDF(3MB)
>>Buy this Article

Erasure codes have been considered as one of the most promising techniques for data reliability enhancement and storage efficiency in modern distributed storage systems. However, erasure codes often suffer from a time-consuming coding process which makes them nearly impractical. The opportunity to solve this problem probably rely on the parallelization of erasure-code-based application on the modern multi-/many-core processors to fully take advantage of the adequate hardware resources on those platforms. However, the complicated data allocation and limited I/O throughput pose a great challenge on the parallelization. To address this challenge, we propose a general multi-threaded parallel coding approach in this work. The approach consists of a general multi-threaded parallel coding model named as MTPerasure, and two detailed parallel coding algorithms, named as sdaParallel and ddaParallel, respectively, adapting to different I/O circumstances. MTPerasure is a general parallel coding model focusing on the high level data allocation, and it is applicable for all erasure codes and can be implemented without any modifications of the low level coding algorithms. The sdaParallel divides the data into several parts and the data parts are allocated to different threads statically in order to eliminate synchronization latency among multiple threads, which improves the parallel coding performance under the dummy I/O mode. The ddaParallel employs two threads to execute the I/O reading and writing on the basis of small pieces independently, which increases the I/O throughput. Furthermore, the data pieces are assigned to the coding thread dynamically. A special thread scheduling algorithm is also proposed to reduce thread migration latency. To evaluate our proposal, we parallelize the popular open source library jerasure based on our approach. And a detailed performance comparison with the original sequential coding program indicates that the proposed parallel approach outperforms the original sequential program by an extraordinary speedups from 1.4x up to 7x, and achieves better utilization of the computation and I/O resources.