Reducing On-Chip DRAM Energy via Data Transfer Size Optimization

Takatsugu ONO  Koji INOUE  Kazuaki MURAKAMI  Kenji YOSHIDA  

Publication
IEICE TRANSACTIONS on Electronics   Vol.E92-C   No.4   pp.433-443
Publication Date: 2009/04/01
Online ISSN: 1745-1353
DOI: 10.1587/transele.E92.C.433
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Low-Leakage, Low-Voltage, Low-Power and High-Speed Technologies for System LSIs in Deep-Submicron Era)
Keyword: low power, variable line-size, on-chip DRAM, high bandwidth, embedded systems

Summary: 
This paper proposes a software-controllable variable line-size (SC-VLS) cache architecture for low-power embedded systems. Advanced integration technology makes high bandwidth between logic and DRAM possible, and System-in-Silicon is one architectural framework that realizes it: an ASIC and a specific SRAM are mounted onto a silicon interposer, and each chip is connected to the interposer by eutectic solder bumps. In this framework, reducing DRAM energy consumption is important. The specific DRAM needs a small cache memory to improve performance, and we exploit that cache to reduce DRAM energy consumption. During program execution, the line size that yields the lowest cache miss ratio varies, because the amount of spatial locality in memory references changes. A large line size provides a prefetching effect, but it consumes more DRAM energy than a small line size because many more banks are accessed. The SC-VLS cache can change its line size to an adequate one at runtime with small area and power overheads. Before the program is executed, we analyze the adequate line size and insert line-size change instructions at the beginning of each function of the target program. In our evaluation, the SC-VLS cache reduces DRAM energy consumption by up to 88% compared with a conventional cache with fixed 256 B lines.
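
The trade-off described in the summary can be illustrated with a minimal sketch. It is not the authors' analysis method, which the summary does not detail. Everything concrete here is an assumption made for illustration: the direct-mapped cache, the 1 KB capacity, the 32 B bank width, the energy model (a fixed cost per DRAM access plus a cost per bank accessed), and the two synthetic address traces. The sketch picks, for each function, the line size with the lowest modeled DRAM energy, which is the value a line-size change instruction at the start of that function would select.

```python
# Illustrative only: cache geometry, energy model, and traces are assumptions,
# not values or methods taken from the paper.
CACHE_BYTES = 1024               # hypothetical cache capacity
BANK_BYTES = 32                  # hypothetical bytes delivered by one DRAM bank
FIXED_COST = 4                   # hypothetical fixed energy per DRAM access
BANK_COST = 1                    # hypothetical energy per bank accessed
LINE_SIZES = (32, 64, 128, 256)  # candidate line sizes in bytes


def count_misses(trace, line_size):
    """Count misses of a direct-mapped cache over a list of byte addresses."""
    num_lines = CACHE_BYTES // line_size
    tags = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_size
        index = block % num_lines
        if tags[index] != block:   # line not present: fetch it from DRAM
            tags[index] = block
            misses += 1
    return misses


def dram_energy(trace, line_size):
    """Modeled energy: every miss pays a fixed cost plus a cost per bank."""
    banks = line_size // BANK_BYTES
    return count_misses(trace, line_size) * (FIXED_COST + banks * BANK_COST)


def pick_line_sizes(traces_by_function):
    """Choose, per function, the line size with the lowest modeled energy."""
    return {
        name: min(LINE_SIZES, key=lambda size: dram_energy(trace, size))
        for name, trace in traces_by_function.items()
    }


# Two hypothetical functions with different spatial locality.
traces = {
    "stream_sum": list(range(0, 4096, 4)),       # sequential word accesses
    "sparse_walk": list(range(0, 65536, 512)),   # one access every 512 bytes
}
chosen = pick_line_sizes(traces)
print(chosen)
```

Under these assumptions the sequential function is assigned the 256 B line, because fewer, larger transfers amortize the fixed cost, while the strided function is assigned the 32 B line, because the extra banks fetched by a larger line are never used.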