
For FullText PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.

Pivot Generation Algorithm with a Complete Binary Tree for Efficient Exact Similarity Search
Yuki YAMAGISHI Kazuo AOYAMA Kazumi SAITO Tetsuo IKEDA
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E101D
No.1
pp.142151 Publication Date: 2018/01/01
Online ISSN: 17451361
DOI: 10.1587/transinf.2017EDP7077
Type of Manuscript: PAPER Category: Data Engineering, Web Information Systems Keyword: similarity search, pivot generation, complete binary tree,
Full Text: PDF(756.7KB) >>Buy this Article
Summary:
This paper presents a pivotset generation algorithm for accelerating exact similarity search in a largescale data set. To deal with the largescale data set, it is important to efficiently construct a search index offline as well as to perform fast exact similarity search online. Our proposed algorithm efficiently generates competent pivots with two novel techniques: hierarchical data partitioning and fast pivot optimization techniques. To make effective use of a small number of pivots, the former recursively partitions a data set into two subsets with the same size depending on the rank order from each of two assigned pivots, resulting in a complete binary tree. The latter calculates a defined objective function for pivot optimization with a low computational cost by skillfully operating data objects mapped into a pivot space. Since the generated pivots provide the tight lower bounds on distances between a query object and the data objects, an exact similarity search algorithm effectively avoids unnecessary distance calculations. We demonstrate that the search algorithm using the pivots generated by the proposed algorithm reduces distance calculations with an extremely high rate regarding a range query problem for real largescale image data sets.

