|
For Full-Text PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
|
Parallelization of Computing-Intensive Tasks of the H.264 High Profile Decoding Algorithm on a Reconfigurable Multimedia System
Tongsheng GENG Leibo LIU Shouyi YIN Min ZHU Shaojun WEI
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E93-D
No.12
pp.3223-3231 Publication Date: 2010/12/01 Online ISSN: 1745-1361
DOI: 10.1587/transinf.E93.D.3223 Print ISSN: 0916-8532 Type of Manuscript: Special Section PAPER (Special Section on Parallel and Distributed Computing and Networking) Category: Keyword: H.264, reconfigurable multimedia system, parallelization computation, hardware/software partition,
Full Text: PDF>>
Summary:
This paper proposes approaches to perform HW/SW (Hardware/Software) partition and parallelization of computing-intensive tasks of the H.264 HiP (High Profile) decoding algorithm on an embedded coarse-grained reconfigurable multimedia system, called REMUS (REconfigurable MUltimedia System). Several techniques, such as MB (Macro-Block) based parallelization, unfixed sub-block operation etc., are utilized to speed up the decoding process, satisfying the requirements of real-time and high quality H.264 applications. Tests show that the execution performance of MC (Motion Compensation), deblocking, and IDCT-IQ (Inverse Discrete Cosine Transform-Inverse Quantization) on REMUS is improved by 60%, 73%, 88.5% in the typical case and 60%, 69%, 88.5% in the worst case, respectively compared with that on XPP PACT (a commercial reconfigurable processor). Compared with ASIC solutions, the performance of MC is improved by 70%, 74% in the typical and in the worst case, respectively, while those of Deblocking remain the same. As for IDCT_IQ, the performance is improved by 17% no matter in the typical or worst case. Relying on the proposed techniques, 1080p@30 fps of H.264 HiP@ Level 4 decoding could be achieved on REMUS when utilizing a 200 MHz working frequency.
|
|