For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
A Tightly Coupled General Purpose Reconfigurable Accelerator LAPP and Its Power States for HotSpot-Based Energy Reduction
Jun YAO Yasuhiko NAKASHIMA Naveen DEVISETTI Kazuhiro YOSHIMURA Takashi NAKADA
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2014/12/01
Online ISSN: 1745-1361
Type of Manuscript: Special Section PAPER (Special Section on Parallel and Distributed Computing and Networking)
reconfigurable architectures, multi-core processing, energy efficiency,
Full Text: PDF>>
General purpose many-core architecture (MCA) such as GPGPU has recently been used widely to continue the performance scaling when the continuous increase in the working frequency has approached the manufacturing limitation. However, both the general purpose MCA and its building block general purpose processor (GPP) lack a tuning capability to boost energy efficiency for individual applications, especially computation intensive applications. As an alternative to the above MCA platforms, we propose in this paper our LAPP (Linear Array Pipeline) architecture, which takes a special-purpose reconfigurable structure for an optimal MIPS/W. However, we also keep the backward binary compatibility, which is not featured in most special hardware. More specifically, we used a general purpose VLIW processor, interpreting a commercial VLIW ISA, as the baseline frontend part to provide the backward binary compatibility. We also extended the functional unit (FU) stage into an FU array to form the reconfigurable backend for efficient execution of program hotspots to exploit parallelism. The hardware modules in this general purpose reconfigurable architecture have been locally zoned into several groups to apply preferable low-power techniques according to the module hardware features. Our results show that under a comparable performance, the tightly coupled general/special purpose hardware, which is based on a 180nm cell library, can achieve 10.8 times the MIPS/W of MCA architecture of the same technology features. When a 65 technology node is assumed, a similar 9.4x MIPS/W can be achieved by using the LAPP without changing program binaries.