Determining Consistent Global Checkpoints of a Distributed Computation

Dakshnamoorthy MANIVANNAN  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E87-D   No.1   pp.164-174
Publication Date: 2004/01/01
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Computer Systems
Keywords: causality, distributed checkpointing, global snapshot collection, failure recovery, fault-tolerance, zigzag paths, z-paths

Summary: 
Determining consistent global checkpoints of a distributed computation has applications in areas such as rollback recovery, distributed debugging, and output commit. Netzer and Xu introduced the notion of zigzag paths and presented necessary and sufficient conditions for a set of checkpoints to be part of a consistent global checkpoint. Their result shows that determining whether zigzag paths exist between checkpoints is crucial for determining consistent global checkpoints. Recent research has also shown that zigzag paths cannot be determined on-line. In this paper, we present an off-line method for determining the existence of zigzag paths between checkpoints.
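
To make the notion of a zigzag path concrete, the following is a minimal sketch that checks the Netzer and Xu definition directly against a recorded message log. It is not the off-line method presented in the paper; it is a brute-force search written only for illustration. The `Message` record, the function name `has_zigzag_path`, and the interval numbering (an event has interval k if it occurs after the k-th checkpoint of its process and before the next one, with checkpoint 0 as the initial state) are all assumptions introduced here.

```python
from collections import deque
from typing import NamedTuple


class Message(NamedTuple):
    # Hypothetical log record: which process sent/received the message and
    # in which checkpoint interval each event happened.
    sender: int
    send_interval: int
    receiver: int
    recv_interval: int


def has_zigzag_path(messages, src, dst):
    """Return True if a zigzag path runs from checkpoint src to checkpoint dst.

    src and dst are (process, checkpoint_index) pairs. Following the
    Netzer-Xu definition, a message sequence m1..mn qualifies when
      1. m1 is sent by src's process after the src checkpoint,
      2. each m(k+1) is sent by the receiver of m(k) in the same or a
         later checkpoint interval than the one in which m(k) was received,
      3. mn is received by dst's process before the dst checkpoint.
    """
    p, i = src
    q, j = dst
    # Condition 1: candidate first messages.
    queue = deque(
        k for k, m in enumerate(messages)
        if m.sender == p and m.send_interval >= i
    )
    seen = set(queue)
    while queue:
        m = messages[queue.popleft()]
        # Condition 3: the path may end at this message.
        if m.receiver == q and m.recv_interval < j:
            return True
        # Condition 2: extend the path with any compatible next message.
        for n, nxt in enumerate(messages):
            if (n not in seen and nxt.sender == m.receiver
                    and nxt.send_interval >= m.recv_interval):
                seen.add(n)
                queue.append(n)
    return False


# P0 sends to P1 after its checkpoint 1; P1 sends to P2 in the same interval
# in which it receives that message (possibly before receiving it).
messages = [Message(0, 1, 1, 1), Message(1, 1, 2, 0)]
print(has_zigzag_path(messages, (0, 1), (2, 1)))  # True
```

Because condition 2 compares only interval numbers, the second message may be sent before the first is received, which is what distinguishes a zigzag path from a causal path. Calling the function with the same checkpoint as both `src` and `dst` tests for a zigzag cycle; by the Netzer and Xu result, a checkpoint on such a cycle cannot belong to any consistent global checkpoint.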