Packet Processing Architecture with Off-Chip Last Level Cache Using Interleaved 3D-Stacked DRAM

Tomohiro KORIKAWA  Akio KAWABATA  Fujun HE  Eiji OKI  

Publication
IEICE TRANSACTIONS on Communications   Vol.E104-B   No.2   pp.149-157
Publication Date: 2021/02/01
Publicized: 2020/08/06
Online ISSN: 1745-1345
DOI: 10.1587/transcom.2020EBP3017
Type of Manuscript: PAPER
Category: Network System
Keyword: cache memory, communication system, memory architecture, network function virtualization

Summary: 
The performance of packet processing applications depends on the memory access speed of network systems. Table lookup, one of the most common operations in packet processing applications, requires fast memory access and can become a dominant performance bottleneck. Therefore, in Network Function Virtualization (NFV)-aware environments, the fast on-chip cache memories of general-purpose CPUs are critical to achieving packet processing speeds exceeding tens of Gbps. In addition, carrier network systems run multiple types of applications, including complex ones, simultaneously on the same system, which also requires adequate cache memory capacity. In this paper, we propose a packet processing architecture that uses interleaved 3-Dimensional (3D)-stacked Dynamic Random Access Memory (DRAM) devices as an off-chip Last Level Cache (LLC) in addition to several levels of cache memory dedicated to each CPU core. Lookup table entries are distributed across every bank and vault to exploit both bank interleaving and vault-level memory parallelism. Frequently accessed entries in the 3D-stacked DRAM are also cached in the on-chip caches dedicated to each CPU core. The evaluation results show that, compared to an architecture with a shared on-chip LLC, the proposed architecture reduces memory access latency by 57% and increases throughput by 100% while reducing the blocking probability by about 10%. These results indicate that 3D-stacked DRAM can be practical as an off-chip LLC in parallel packet processing systems.
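
The idea of spreading lookup table entries across every vault and bank can be illustrated with a minimal sketch. The summary does not give the actual mapping, so everything here is an assumption made for illustration: the vault and bank counts, the low-order interleaving rule, and the names `NUM_VAULTS`, `BANKS_PER_VAULT`, `Location`, and `place_entry` are hypothetical and not taken from the paper.

```python
from typing import NamedTuple

# Assumed device geometry, chosen only for illustration (not from the paper).
NUM_VAULTS = 32
BANKS_PER_VAULT = 8


class Location(NamedTuple):
    """Where a lookup table entry is stored in the stacked DRAM."""
    vault: int
    bank: int
    row: int


def place_entry(index: int) -> Location:
    """Map a table entry index to (vault, bank, row) by low-order interleaving.

    Consecutive indices rotate through the vaults first, then through the
    banks of each vault, so neighbouring entries land in different vaults
    and banks and can, in principle, be accessed in parallel.
    """
    if index < 0:
        raise ValueError("entry index must be non-negative")
    vault = index % NUM_VAULTS
    bank = (index // NUM_VAULTS) % BANKS_PER_VAULT
    row = index // (NUM_VAULTS * BANKS_PER_VAULT)
    return Location(vault, bank, row)


if __name__ == "__main__":
    # Show where the first few entries and the first wrap-around points land.
    for i in (0, 1, 2, 31, 32, 255, 256):
        print(i, place_entry(i))
```

Under this assumed mapping, any run of `NUM_VAULTS * BANKS_PER_VAULT` consecutive entries occupies a distinct (vault, bank) pair each, which is the property that bank interleaving and vault-level parallelism rely on. A real design would also have to account for access skew, since frequently accessed entries are not necessarily consecutive.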