Analytical Model on Hybrid State Saving with a Limited Number of Checkpoints and Bound Rollbacks
Mamoru OHARA Ryo SUZUKI Masayuki ARAI Satoshi FUKUMOTO Kazuhiko IWASAKI
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Publication Date: 2006/09/01
Online ISSN: 1745-1337
Print ISSN: 0916-8508
Type of Manuscript: PAPER
Category: Reliability, Maintainability and Safety Analysis
Keywords: reliability, distributed systems, hybrid state saving, Time Warp simulation, evaluation model
This paper discusses distributed checkpointing with logging for practical applications running with limited resources. We present a discrete-time model that evaluates the total expected overhead per event when the number of checkpoints each process can hold is finite. In many actual applications, the rollback distance is also bounded by some finite interval. The recovery overhead of the checkpointing scheme is therefore described using a truncated geometric distribution for the rollback distance. Although the optimal checkpoint interval, which minimizes the total expected overhead, is difficult to derive analytically under this distribution, substituting simpler probability distributions for the truncated geometric distribution allows us to derive it explicitly. Numerical examples obtained through simulations show that the new models and analyses achieve a total overhead close to the minimum.
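As a minimal illustration of the truncated geometric distribution mentioned in the abstract, the following Python sketch computes its probability mass function and mean. It is not the paper's model: the parameter names `p` (per-step probability that the rollback stops) and `max_distance` (the bound on the rollback distance), as well as the support {1, ..., max_distance}, are assumptions made here for illustration only, and the abstract does not give the paper's actual parameterization or overhead formula.

```python
def truncated_geometric_pmf(p, max_distance):
    """Return P(X = k) for k = 1..max_distance.

    Assumed form (hypothetical, not taken from the paper): a geometric
    distribution with success probability p, renormalized so that all
    probability mass lies on rollback distances 1..max_distance.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if max_distance < 1:
        raise ValueError("max_distance must be at least 1")
    q = 1.0 - p
    norm = 1.0 - q ** max_distance  # mass of the untruncated law on 1..max_distance
    return [p * q ** (k - 1) / norm for k in range(1, max_distance + 1)]


def expected_rollback_distance(p, max_distance):
    """Mean of the truncated geometric distribution defined above."""
    pmf = truncated_geometric_pmf(p, max_distance)
    return sum(k * prob for k, prob in enumerate(pmf, start=1))
```

Under these assumptions, the mean rollback distance is always smaller than the untruncated geometric mean 1/p, which is one way to see why bounding rollbacks would change the recovery overhead term.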