A Randomized Online Algorithm for the File Caching Problem

Seiichiro TANI  Toshiaki MIYAZAKI  

IEICE TRANSACTIONS on Information and Systems   Vol.E86-D    No.4    pp.686-697
Publication Date: 2003/04/01
Online ISSN: 
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Algorithms
Keyword: network, cache, competitive, online algorithm


Caching web files reduces both user response time and network traffic. Implementing a cache requires addressing the file caching problem: deciding which files to evict when there is insufficient space to store a new file, so that the total cost of the missed (faulted) files is minimized. Greedy-Dual-Size (GDS) is the best online algorithm in terms of competitiveness, i.e., it is k/(k-h+1)-competitive, where k and h are the storage space of GDS and of an optimal offline algorithm, respectively. GDS also performs very well in trace-driven simulations. The worst-case time taken to service a request is another important measure for online file caching algorithms, since slow response times render caching meaningless from the client's viewpoint. This paper proposes a fast randomized k/(k-h+1)-competitive algorithm that takes O(2^{log^* k}) time per file eviction or insertion, whereas GDS takes O(log k) time; 2^{log^* k} grows much more slowly than log k. To confirm its practicality, we conduct trace-driven simulations. Experimental results show that our algorithm attains only slightly worse byte hit rates than GDS together with a sufficiently large reduction in latency, making it a good candidate for caches that require high-speed processing, such as second-level caches in large networks.
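To make the baseline concrete, the following is a minimal sketch of Greedy-Dual-Size as it is commonly described: each cached file carries a priority equal to a running offset plus cost/size, the file with the smallest priority is evicted, and the offset is raised to the evicted priority. It is not the randomized algorithm proposed in the paper, whose details the abstract does not give. The class name `GreedyDualSizeCache`, the `request` method, and the heap with lazy deletion are assumptions made for illustration; the heap is what gives the logarithmic cost per operation mentioned above.

```python
import heapq
import itertools


class GreedyDualSizeCache:
    """Illustrative Greedy-Dual-Size cache (hypothetical names throughout)."""

    def __init__(self, capacity):
        self.capacity = capacity         # total storage space (k in the abstract)
        self.used = 0                    # space currently occupied
        self.offset = 0.0                # running offset, raised on every eviction
        self.entries = {}                # name -> (priority, size, stamp)
        self.heap = []                   # (priority, stamp, name); may hold stale items
        self.stamps = itertools.count()  # tie-breaker and staleness marker
        self.fault_cost = 0.0            # sum of the costs of all faults so far

    def _push(self, name, size, cost):
        # Priority = current offset + cost per unit of size.
        priority = self.offset + cost / size
        stamp = next(self.stamps)
        self.entries[name] = (priority, size, stamp)
        heapq.heappush(self.heap, (priority, stamp, name))

    def _evict_one(self):
        # Pop until a live entry is found; stale ones were superseded by a hit.
        while self.heap:
            priority, stamp, name = heapq.heappop(self.heap)
            entry = self.entries.get(name)
            if entry is not None and entry[2] == stamp:
                del self.entries[name]
                self.used -= entry[1]
                self.offset = priority   # later insertions outrank older files
                return name
        raise RuntimeError("nothing left to evict")

    def request(self, name, size, cost):
        """Serve one request; return True on a hit, False on a fault."""
        if name in self.entries:
            self._push(name, size, cost)  # refresh the priority on a hit
            return True
        self.fault_cost += cost
        if size > self.capacity:
            return False                  # file can never fit; serve uncached
        while self.used + size > self.capacity:
            self._evict_one()
        self._push(name, size, cost)
        self.used += size
        return False
```

A short usage example: with `cache = GreedyDualSizeCache(10)`, requesting files `"a"` (size 4, cost 4), `"b"` (size 4, cost 8), and `"c"` (size 4, cost 4) in that order evicts `"a"`, because it has the lowest cost-to-size ratio among the files present when `"c"` arrives. Lazy deletion keeps the sketch short at the price of stale heap entries; an indexed heap would avoid them.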