A Novel Cache Replacement Policy via Dynamic Adaptive Insertion and Re-Reference Prediction

Xi ZHANG  Chongmin LI  Zhenyu LIU  Haixia WANG  Dongsheng WANG  Takeshi IKENAGA  

Publication
IEICE TRANSACTIONS on Electronics   Vol.E94-C   No.4   pp.468-476
Publication Date: 2011/04/01
Online ISSN: 1745-1353
DOI: 10.1587/transele.E94.C.468
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Circuits and Design Techniques for Advanced Large Scale Integration)
Category: 
Keyword: 
cache replacement,  set dueling,  adaptive insertion,  shared cache,  

Full Text: PDF>>
Buy this Article




Summary: 
Previous research illustrates that LRU replacement policy is not efficient when applications exhibit a distant re-reference interval. Recently RRIP policy is proposed to improve the performance for such kind of workloads. However, the lack of access recency information in RRIP confuses the replacement policy to make the accurate prediction. To enhance the robustness of RRIP for recency-friendly workloads, we propose an Dynamic Adaptive Insertion and Re-reference Prediction (DAI-RRP) policy which evicts data based on both re-reference prediction value and the access recency information. DAI-RRP makes adaptive adjustment on insertion position and prediction value for different access patterns, which makes the policy robust across different workloads and different phases. Simulation results show that DAI-RRP outperforms LRU and RRIP. For a single-core processor with a 1 MB 16-way set last-level cache (LLC), DAI-RRP reduces CPI over LRU and Dynamic RRIP by an average of 8.1% and 2.7% respectively. Evaluations on quad-core CMP with a 4 MB shared LLC show that DAI-RRP outperforms LRU and Dynamic RRIP (DRRIP) on the weighted speedup metric by an average of 8.1% and 15.7% respectively. Furthermore, compared to LRU, DAI-RRP consumes the similar hardware for 16-way cache, or even less hardware for high-associativity cache. In summary, the proposed policy is practical and can be easily integrated into existing hardware approximations of LRU.