High Bandwidth, Variable Line-Size Cache Architecture for Merged DRAM/Logic LSIs

Koji INOUE  Koji KAI  Kazuaki MURAKAMI  

Publication
IEICE TRANSACTIONS on Electronics   Vol.E81-C   No.9   pp.1438-1447
Publication Date: 1998/09/25
Online ISSN: 
DOI: 
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Issue on Novel VLSI Processor Architectures)
Category: 
Keyword: 
cache,  merged DRAM/logic LSIs,  memory system,  

Full Text: PDF(938.1KB)>>
Buy this Article




Summary: 
Merged DRAM/logic LSIs could provide high on-chip memory bandwidth by interconnecting logic portions and DRAM with wider on-chip buses. For merged DRAM/logic LSIs with the memory hierarchy including cache memory, we can exploit such high on-chip memory bandwidth by means of replacing a whole cache line (or cache block) at a time on cache misses. This approach tends to increase the cache-line size if we attempt to improve the attainable memory bandwidth. Larger cache lines, however, might worsen the system performance if programs running on the LSIs do not have enough spatial locality of references and cache misses frequently take place. This paper describes a novel cache architecture suitable for merged DRAM/logic LSIs, called variable line-size cache or VLS cache, for resolving the above-mentioned dilemma. The VLS cache can make good use of the high on-chip memory bandwidth by means of larger cache lines and, at the same time, alleviate the negative effects of larger cache-line size by partitioning each large cache line into multiple sub-lines and allowing every sub-line to work as an independent cache line. The number of sub-lines involved when a cache replacement occurs can be determined depending on the characteristics of programs. This paper also evaluates the cost/performance improvements attainable by the VLS cache and compares it with those of conventional cache architectures. As a result, it is observed that a VLS cache reduces the average memory-access time by 16. 4% while it increases the hardware cost by only 13%, compared to a conventional direct-mapped cache with fixed 32-byte lines.