Index Interpolation: A Subsequence Matching Algorithm Supporting Moving Average Transform of Arbitrary Order in Time-Series Databases

Woong-Kee LOH  Sang-Wook KIM  Kyu-Young WHANG  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E84-D   No.1   pp.76-86
Publication Date: 2001/01/01
Online ISSN: 
DOI: 
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Databases
Keyword: 
index interpolation,  subsequence matching,  moving average transform,  time-series databases,  

Full Text: PDF>>
Buy this Article




Summary: 
In this paper we propose a subsequence matching algorithm that supports moving average transform of arbitrary order in time-series databases. Moving average transform reduces the effect of noise and has been used in many areas such as econometrics since it is useful in finding the overall trends. The proposed algorithm extends the existing subsequence matching algorithm proposed by Faloutsos et al. (SUB94 in short). If we applied the algorithm without any extension, we would have to generate an index for each moving average order and would have serious storage and CPU time overhead. In this paper we tackle the problem using the notion of index interpolation. Index interpolation is defined as a searching method that uses one or more indexes generated for a few selected cases and performs searching for all the cases satisfying some criteria. The proposed algorithm, which is based on index interpolation, can use only one index for a pre-selected moving average order k and performs subsequence matching for arbitrary order m ( k). We prove that the proposed algorithm causes no false dismissal. The proposed algorithm can also use more than one index to improve search performance. The algorithm works better with smaller selectivities. For selectivities less than 10-2, the degradation of search performance compared with the fully-indexed case--which is equivalent to SUB94--is no more than 33.0% when one index is used, and 17.2% when two indexes are used. Since the queries with smaller selectivities are much more frequent in general database applications, the proposed algorithm is suitable for practical situations.