
For FullText PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.

Effective Scheduling of Duplicated Tasks for Fault Tolerance in Multiprocessor Systems
Koji HASHIMOTO Tatsuhiro TSUCHIYA Tohru KIKUNO
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E85D
No.3
pp.525534 Publication Date: 2002/03/01
Online ISSN:
DOI:
Print ISSN: 09168532 Type of Manuscript: PAPER Category: Fault Tolerance Keyword: multiprocessor, faulttolerant scheduling, task graph, heights, task groups,
Full Text: PDF(733.8KB)>>
Summary:
In this paper, we propose a new scheduling algorithm to achieve fault tolerance in multiprocessor systems. This algorithm first partitions a parallel program into subsets of tasks, based on the notion of height of a task graph. For each subset, the algorithm then duplicates and schedules the tasks in the subset successively. We prove that schedules obtained by the proposed algorithm can tolerate a single processor failure and show that the computational complexity of the algorithm is O(V^{4}) where V is the set of nodes of a task graph. We conduct simulations by applying the algorithm to two kinds of practical task graphs (Gaussian elimination and LUdecomposition). The results of this experiment show that fault tolerance can be achieved at the cost of small degree of time redundancy, and that performance in the case of a processor failure is improved compared to a previous algorithm.

