False Drop Analysis of Set Retrieval with Signature Files

Hiroyuki KITAGAWA  Yoshiharu ISHIKAWA  

IEICE TRANSACTIONS on Information and Systems   Vol.E80-D   No.6   pp.653-664
Publication Date: 1997/06/25
Online ISSN: 
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Databases
access method,  complex object,  advanced data models,  signature file,  false drop,  set retrieval,  

Full Text: PDF(739.9KB)>>
Buy this Article

Modern database systems have to support complex data objects, which appear in advanced data models such as object-oriented data models and nested relational data models. Set-valued objects are basic constructs to build complex structures in those models. Therefore, efficient processing of set-valued object retrieval (simply, set retrieval) is an important feature required of advanced database systems. Our previous work proposed a basic scheme to apply superimposed coded signature files to set retrieval and showed its potential advantages over the B-tree index based approach using a performance analysis model. Retrieval with signature files is always accompanied by mismatches called false drops, and proper control of the false drops is indispensable in the signature file design. This study intensively analyzes the false drops in set retrieval with signature files. First, schemes to use signature files are presented to process set retrieval involving "has-subset," "is-subset," "has-intersection," and "is-equal" predicates, and generic formulas estimating the false drops are derived. Then, three sets of concrete formulas are derived in three ways to estimate the false drops in the four types of set retrieval. Finally, their estimates are validated with computer simulations, and advantages and disadvantages of each set of the false drop estimation formulas are discussed. The analysis shows that proper choice of estimation formulas gives quite accurate estimates of the false drops in set retrieval with signature files.