A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding
Yibo FAN Leilei HUANG Zheng XIE Xiaoyang ZENG
IEICE TRANSACTIONS on Electronics
Publication Date: 2017/06/01
Online ISSN: 1745-1353
Type of Manuscript: PAPER
Category: Integrated Electronics
Keyword: reconstruction loop, discrete cosine transform (DCT), inverse discrete cosine transform (IDCT), quantization (Q), de-quantization (IQ), high efficiency video coding (HEVC)
The recently finalized video coding standard, high efficiency video coding (HEVC), introduces new concepts such as the coding unit (CU), prediction unit (PU) and transform unit (TU) to improve coding performance. As a result, the reconstruction loop in intra encoding is heavily burdened with choosing the best partitions and modes for them. To address the resulting bottlenecks in cycle count and hardware cost, this paper proposes a high-throughput and compact implementation of the reconstruction loop. “High-throughput” means that the design has a fixed throughput of 32 pixels/cycle, independent of the TU/PU size (except for 4×4 TUs). “Compact” means that it fully exploits the reusability between the discrete cosine transform (DCT) and the inverse discrete cosine transform (IDCT), as well as between quantization (Q) and de-quantization (IQ). Besides these hardware contributions, this paper also provides a universal formula for analyzing the cycle cost of the reconstruction loop and proposes a parallel-process scheme to further reduce that cost. The design is verified on a Stratix IV FPGA. The basic structure achieves a maximum frequency of 150 MHz at a hardware cost of 64K ALUTs, which can support real-time TU/PU partition decisions for 4K×2K@20fps video.
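The figures quoted in the abstract (150 MHz, 32 pixels/cycle, 4K×2K at 20 fps) can be related with a rough cycle-budget estimate. The Python sketch below is illustrative only and is not the paper's universal cycle-cost formula. It assumes luma samples only, ignores pipeline latency and the 4×4 TU exception, and tries both common readings of "4K×2K" (4096×2048 and 3840×2160), since the abstract does not say which is meant. The function names are hypothetical.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """Clock cycles available per luma pixel under a real-time constraint."""
    return clock_hz / (width * height * fps)


def full_rate_passes(clock_hz, width, height, fps, pixels_per_cycle):
    """Idealized number of times every pixel could pass through a datapath
    with the given throughput while still meeting the frame rate."""
    return cycles_per_pixel(clock_hz, width, height, fps) * pixels_per_cycle


# Figures taken from the abstract.
CLOCK_HZ = 150e6        # maximum frequency on the Stratix IV FPGA
PIXELS_PER_CYCLE = 32   # fixed throughput (4x4 TUs excepted)
FPS = 20

# Assumption: the abstract does not define "4K x 2K", so both readings are tried.
RESOLUTIONS = {"4096x2048": (4096, 2048), "3840x2160": (3840, 2160)}

budget = {
    name: full_rate_passes(CLOCK_HZ, w, h, FPS, PIXELS_PER_CYCLE)
    for name, (w, h) in RESOLUTIONS.items()
}

for name, passes in budget.items():
    print(f"{name}: about {passes:.1f} full-rate passes per pixel")
```

Under these assumptions, either reading leaves a budget of between 28 and 29 full-rate passes per pixel. That is the headroom within which the candidate TU/PU sizes and modes would have to be evaluated. How that budget is actually spent depends on the cycle-cost analysis and parallel-process scheme described in the paper itself.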