Reliability Analysis of Disk Array Organizations by Considering Uncorrectable Bit Errors

Xuefeng WU  Jie LI  Hisao KAMEDA  

IEICE TRANSACTIONS on Information and Systems   Vol.E81-D   No.1   pp.73-80
Publication Date: 1998/01/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Fault Tolerant Computing
Keywords: uncorrectable bit errors, reliability analysis, disk arrays, RAID, sparing methods

In this paper, we present an analytic model for studying the reliability of several important disk array organizations proposed in the literature. These organizations combine two options for data layout (regular RAID-5 and block designs) with three alternatives for sparing (hot sparing, distributed sparing, and parity sparing). Uncorrectable bit errors significantly affect reliability but are ignored in traditional reliability analyses of disk arrays. Our model considers both disk failures and uncorrectable bit errors. The reliability of a disk array is measured in terms of MTTDL (Mean Time To Data Loss). We derive a unified MTTDL formula for these disk array organizations and use the analytic model to compare their MTTDLs. Numerical experiments show that data losses caused by uncorrectable bit errors may dominate the data losses of disk array systems, even though traditionally only data losses caused by disk failures are considered. Considering uncorrectable bit errors therefore gives a more realistic picture of the reliability of disk array systems.
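
To make the claim about uncorrectable bit errors concrete, the following is a minimal sketch using a common first-order approximation for a single RAID-5 group without sparing. It is not the unified MTTDL formula derived in the paper, which the abstract does not reproduce. The function names and all parameter values (10 disks, 200,000-hour disk MTTF, 1-hour repair time, 1 GB disks, a bit error rate of 1e-14) are illustrative assumptions only.

```python
import math

def mttdl_double_failure(n_disks, mttf_h, mttr_h):
    """Approximate MTTDL (hours) when data loss requires a second disk
    failure while the first failed disk is being rebuilt."""
    return mttf_h ** 2 / (n_disks * (n_disks - 1) * mttr_h)

def prob_bit_error_during_rebuild(n_disks, disk_bytes, ber):
    """Probability of at least one uncorrectable bit error while reading
    the n-1 surviving disks in full, assuming independent bit errors."""
    bits_read = (n_disks - 1) * disk_bytes * 8
    # 1 - (1 - ber)**bits_read, evaluated in a numerically stable way.
    return -math.expm1(bits_read * math.log1p(-ber))

def mttdl_failure_plus_bit_error(n_disks, mttf_h, p_bit_error):
    """Approximate MTTDL (hours) when data loss is a single disk failure
    followed by an uncorrectable bit error during the rebuild."""
    return mttf_h / (n_disks * p_bit_error)

def mttdl_combined(*mttdls):
    """Combine independent data-loss causes by adding their rates."""
    return 1.0 / sum(1.0 / m for m in mttdls)

# Illustrative (assumed) parameters, not taken from the paper.
N_DISKS = 10
MTTF_H = 200_000.0      # disk mean time to failure, hours
MTTR_H = 1.0            # mean time to repair, hours
DISK_BYTES = 1e9        # 1 GB per disk
BER = 1e-14             # uncorrectable bit errors per bit read

p_ube = prob_bit_error_during_rebuild(N_DISKS, DISK_BYTES, BER)
mttdl_df = mttdl_double_failure(N_DISKS, MTTF_H, MTTR_H)
mttdl_ube = mttdl_failure_plus_bit_error(N_DISKS, MTTF_H, p_ube)
mttdl_all = mttdl_combined(mttdl_df, mttdl_ube)

print(f"P(bit error during rebuild): {p_ube:.3e}")
print(f"MTTDL, disk failures only:   {mttdl_df:.3e} h")
print(f"MTTDL, failure + bit error:  {mttdl_ube:.3e} h")
print(f"MTTDL, both causes combined: {mttdl_all:.3e} h")
```

Under these assumed values, the MTTDL due to a disk failure followed by a bit error is shorter than the MTTDL due to double disk failures, so the combined MTTDL is governed mainly by bit errors. This is the qualitative effect the abstract describes, although the paper's own model also accounts for block-design layouts and sparing, which this sketch omits.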