For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems
Koji HASHIMOTO Tatsuhiro TSUCHIYA Tohru KIKUNO
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2002/03/01
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Fault Tolerance
multiprocessor, fault-tolerant scheduling, task graph, heights, task groups,
Full Text: PDF(733.8KB)>>
In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is O(|V|4) where V is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LU-decomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.