For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
The BINDS-Tree: A Space-Partitioning Based Indexing Scheme for Box Queries in Non-Ordered Discrete Data Spaces
A. K. M. Tauhidul ISLAM Sakti PRAMANIK Qiang ZHU
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2019/04/01
Online ISSN: 1745-1361
Type of Manuscript: Special Section PAPER (Special Section on Data Engineering and Information Management)
non-ordered discrete data, multidimensional indexing, space-partitioning, box query, and algorithm,
Full Text: PDF(1.1MB)>>
In recent years we have witnessed an increasing demand to process queries on large datasets in Non-ordered Discrete Data Spaces (NDDS). In particular, one type of query in an NDDS, called box queries, is used in many emerging applications including error corrections in bioinformatics and network intrusion detection in cybersecurity. Effective indexing methods are necessary for efficiently processing queries on large datasets in disk. However, most existing NDDS indexing methods were not designed for box queries. Several recent indexing methods developed for box queries on a large NDDS dataset in disk are based on the popular data-partitioning approach. Unfortunately, a space-partitioning based indexing scheme, which is more effective for box queries in an NDDS, has not been studied before. In this paper, we propose a novel indexing method based on space-partitioning, called the BINDS-tree, for supporting efficient box queries on a large NDDS dataset in disk. A number of effective strategies such as node split based on minimum span and cross optimal balance, redundancy reduction utilizing a singleton dimension inheritance property, and a space-efficient structure for the split history are incorporated in the constructing algorithm for the BINDS-tree. Experimental results demonstrate that the proposed BINDS-tree significantly improves the box query I/O performance, comparing to that of the state-of-the-artdata-partitioning based NDDS indexing method.