For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
A Scalable and Reconfigurable Fault-Tolerant Distributed Routing Algorithm for NoCs
Zewen SHI Xiaoyang ZENG Zhiyi YU
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2011/07/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Computer System
fault-tolerant routing, network-on-chip (NoC), deadlock-free, divide-and-conquer, system partition,
Full Text: PDF(3.2MB)>>
Manufacturing defects in the deep sub-micron VLSI process and aging resulted problems of devices during lifecycle are inevitable, and fault-tolerant routing algorithms are important to provide the required communication for NoCs in spite of failures. The proposed algorithm, referred to as scalable and reconfigurable fault-tolerant distributed routing (RFDR), partitions the system into nine regions using the concept of divide-and-conquer. It is a distributed algorithm, and each router guarantees fault-tolerance within one's own region and the system can be still sustained with multiple fault areas. The proposed RFDR has excellent scalability with hardware cost keeping constant independent of system size. Also it is completely reconfigurable when new nodes fail. Simulations under various synthetic traffic patterns show its better performance compared to Extended-XY routing algorithm. Moreover, there is almost no hardware overhead compared to Logic-Based Distributed Routing (LBDR), but the fault-tolerance capacity is enhanced in the proposed algorithm. Hardware cost is reduced 37% compared to Reconfigurable Distributed Scalable Predictable Interconnect Network (R-DSPIN) which only supports single fault region.