Data Flow Optimization of Dynamically Coarse Grain Reconfigurable Architecture for Multimedia Applications
Xinning LIU Chen MEI Peng CAO Min ZHU Longxing SHI
Publication: IEICE TRANSACTIONS on Information and Systems, Vol.E95-D, No.2, pp.374-382
Publication Date: 2012/02/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
DOI: 10.1587/transinf.E95.D.374
Type of Manuscript: Special Section PAPER (Special Section on Reconfigurable Systems)
Category: Design Methodology
Keyword: REMUS-II, coarse grain reconfigurable architecture, multimedia application, data flow optimization, H.264 HiP
Summary:
This paper proposes a novel sub-architecture to optimize the data flow of REMUS-II (REconfigurable MUltimedia System 2), a dynamically coarse grain reconfigurable architecture. REMUS-II consists of a µPU (Micro-Processor Unit) and two RPUs (Reconfigurable Processor Units), which are used to speed up control-intensive tasks and data-intensive tasks, respectively. The parallel computing capability and flexibility of REMUS-II make it an excellent candidate for processing multimedia applications, which require a large number of memory accesses. In this paper, we specifically optimize the data flow to handle these performance-limiting and energy-hungry memory accesses in order to meet the bandwidth requirement of parallel computing. The RPU internal memory can work in multiple modes, such as 2D-access mode and transformation mode, according to different multimedia access patterns. This novel design can improve performance by up to 26% compared to traditional on-chip memory. Meanwhile, a block buffer is implemented to optimize the off-chip data flow by reducing off-chip memory accesses, which are reduced by up to 43% compared to direct DDR access. Based on RTL simulation, REMUS-II can achieve 1080p@30 fps for H.264 High Profile@Level 4 and High Level MPEG2 at a 200 MHz clock frequency. REMUS-II is implemented in 23.7 mm² of silicon on a TSMC 65 nm logic process with a 400 MHz maximum working frequency.
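To put the 1080p@30 fps at 200 MHz figure in perspective, the per-macroblock cycle budget can be estimated with a short calculation. This is only a back-of-the-envelope sketch, not a figure from the paper: it assumes 16x16 macroblocks and a coded frame height padded to a multiple of 16 (1088 rows for 1080p), and the function names are hypothetical.

```python
import math

MB_SIZE = 16  # assumption: 16x16-pixel macroblocks, as in H.264 and MPEG-2


def macroblocks_per_frame(width: int, height: int) -> int:
    """Count macroblocks, padding each dimension up to a multiple of 16."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def cycles_per_macroblock(clock_hz: float, width: int, height: int, fps: float) -> float:
    """Average clock cycles available per macroblock at the given frame rate."""
    return clock_hz / (macroblocks_per_frame(width, height) * fps)


if __name__ == "__main__":
    mbs = macroblocks_per_frame(1920, 1080)
    budget = cycles_per_macroblock(200e6, 1920, 1080, 30)
    print(f"macroblocks per frame: {mbs}")
    print(f"cycle budget per macroblock: {budget:.0f}")
```

Under these assumptions the budget comes to roughly 817 cycles per macroblock across the whole decoding pipeline, which gives a sense of why memory stalls matter at this operating point.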