Scalable VLSI Architecture for Variable Block Size Integer Motion Estimation in H.264/AVC

Yang SONG  Zhenyu LIU  Satoshi GOTO  Takeshi IKENAGA  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E89-A   No.4   pp.979-988
Publication Date: 2006/04/01
Online ISSN: 1745-1337
DOI: 10.1093/ietfec/e89-a.4.979
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section on Selected Papers from the 18th Workshop on Circuits and Systems in Karuizawa)
variable block size motion estimation (VBSME),  H.264/AVC,  very large scale integration (VLSI) architecture,  

Full Text: PDF(1.4MB)>>
Buy this Article

Because of the data correlation in the motion estimation (ME) algorithm of H.264/AVC reference software, it is difficult to implement an efficient ME hardware architecture. In order to make parallel processing feasible, four modified hardware friendly ME workflows are proposed in this paper. Based on these workflows, a scalable full search ME architecture is presented, which has following characteristics: (1) The sum of absolute differences (SAD) results of 44 sub-blocks is accumulated and reused to calculate SADs of bigger sub-blocks. (2) The number of PE groups is configurable. For a search range of MN pixels, where M is width and N is height, up to M PE groups can be configured to work in parallel with a peak processing speed of N16 clock cycles to fulfill a full search variable block size ME (VBSME). (3) Only conventional single port SRAM is required, which makes this architecture suitable for standard-cell-based implementation. A design with 8 PE groups has been realized with TSMC 0.18 µm CMOS technology. The core area is 2.13 mm1.60 mm and clock frequency is 228 MHz in typical condition (1.8 V, 25).