Efficient Techniques for Adaptive Independent Checkpointing in Distributed Systems

Cheng-Min LIN, Chyi-Ren DOW

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E83-D   No.8   pp.1642-1653
Publication Date: 2000/08/25
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Fault Tolerance
Keyword:
distributed systems, fault tolerance, checkpointing, failure recovery

Summary: 
This work presents two novel algorithms that prevent rollback propagation in independent checkpointing: an efficient adaptive independent checkpointing algorithm and an optimized adaptive independent checkpointing algorithm. The last opportunity strategy, which yields better performance than the conservation strategy, is also employed to prevent useless checkpoints on both causal and non-causal rewinding paths. The two proposed methods are domino effect-free and require only a limited amount of control information. They also take fewer unnecessary adaptive checkpoints than other algorithms. Furthermore, experimental results indicate that, for service-providing applications, the checkpoint overhead of our techniques is lower than that of coordinated checkpointing and domino effect-free algorithms.
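
The rollback propagation mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic textbook-style model, and the function name `recovery_line`, the local-time encoding, and the example traces are all assumptions made for illustration. A checkpoint at local time c is assumed to capture every event with local time below c, and a message is treated as an orphan when its send is undone but its receipt is retained.

```python
from bisect import bisect_right

def recovery_line(checkpoints, messages):
    """Roll processes back until no orphan message remains.

    checkpoints: {proc: sorted list of local checkpoint times, starting at 0}
    messages: list of (sender, send_t, receiver, recv_t) in local times
    A checkpoint at time c is assumed to capture events with time < c.
    """
    # Hypothetical starting point: every process restarts from its latest checkpoint.
    line = {p: cps[-1] for p, cps in checkpoints.items()}
    changed = True
    while changed:
        changed = False
        for sender, send_t, receiver, recv_t in messages:
            send_undone = send_t >= line[sender]
            recv_kept = recv_t < line[receiver]
            if send_undone and recv_kept:
                # Orphan message: move the receiver to its latest
                # checkpoint taken at or before the receive event.
                cps = checkpoints[receiver]
                line[receiver] = cps[bisect_right(cps, recv_t) - 1]
                changed = True
    return line

# Illustrative trace: messages zigzag between two processes.
messages = [(0, 1, 1, 1), (1, 3, 0, 3), (0, 5, 1, 5)]

# Purely independent checkpoints: the rollback cascades to the initial states.
independent = {0: [0, 4], 1: [0, 2, 6]}
domino_line = recovery_line(independent, messages)

# Same trace, but process 1 takes one extra checkpoint just before the
# receive at local time 5; the cascade then stops immediately.
with_forced = {0: [0, 4], 1: [0, 2, 5, 6]}
bounded_line = recovery_line(with_forced, messages)
```

In the first trace both processes fall back to time 0, which is the domino effect. In the second, a single additional checkpoint placed before a receive bounds the rollback. Adaptive schemes such as those summarized above aim to take such checkpoints only when needed; how they decide is not modeled here.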