A Cache Optimized Multidimensional Index in Disk-Based Environments

Myungsun PARK  Sukho LEE  

IEICE TRANSACTIONS on Information and Systems   Vol.E88-D    No.8    pp.1932-1939
Publication Date: 2005/08/01
Online ISSN: 
DOI: 10.1093/ietisy/e88-d.8.1932
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Database
optimize,  cache performance,  I/O performance,  R-tree,  prefetch,  

Full Text: PDF>>
Buy this Article

R-trees have been traditionally optimized for I/O performance with disk pages as tree nodes. Recently, researchers have proposed cache-conscious variations of R-trees optimized for CPU cache performance in main memory environments, where the node size is several cache lines wide and more entries are packed in a node by compressing MBR keys. However, because there is a big difference between the node sizes of two types of R-trees, disk-optimized R-trees show poor cache performance while cache-optimized R-trees exhibit poor disk performance. In this paper, we propose a cache and disk optimized R-tree, called PR-tree (Prefetching R-tree). For cache performance, the node size of the PR-tree is wider than a cache line, and the prefetch instruction is used to reduce the number of cache misses. For I/O performance, the nodes of the PR-tree are fitted into one disk page. We represent the detailed analysis of cache misses for range queries, and enumerate all the reasonable in-page leaf and nonleaf node sizes, and heights of in-page trees to figure out tree parameters for the best cache and I/O performance. The PR-tree that we propose achieves better cache performance than the disk-optimized R-tree: a factor of 3.5-15.1 improvement for one-by-one insertions, 6.5-15.1 improvement for deletions, 1.3-1.9 improvement for range queries, and 2.7-9.7 improvement for k-nearest neighbor queries. All experimental results do not show notable declines of I/O performance.