For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Hyochang NAM Jong KIM Sung Je HONG Sunggu LEE
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2002/07/01
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Fault Tolerance
checkpointing, encoding, rollback recovery, fault tolerance, aliasing,
Full Text: PDF>>
For checkpointing to be practical, it has to introduce low overhead for the targeted application. As a means of reducing the overhead of checkpointing, this paper proposes a probabilistic checkpointing method, which uses block encoding to detect the modified memory area between two consecutive checkpoints. Since the proposed technique uses block encoding to detect the modified area, the possibility of aliasing exists in encoded words. However, this paper shows that the aliasing probability is near zero when an 8-byte encoded word is used. The performance of the proposed technique is analyzed and measured by using experiments. An analytic model which predicts the checkpointing overhead is first constructed. By using this model, the block size that produces the best performance for a given target program is estimated. In most cases, medium block sizes, i.e., 128 or 256 bytes, show the best performance. The proposed technique has also been implemented on Unix based systems, and its performance has been measured in real environments. According to the experimental results, the proposed technique reduces the overhead by 11.7% in the best case and increases the overhead by 0.5% in the worst case in comparison with page-based incremental checkpointing.