Approximate Nearest Neighbor Search for a Dataset of Normalized Vectors

Kengo TERASAWA  Yuzuru TANAKA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E92-D   No.9   pp.1609-1619
Publication Date: 2009/09/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E92.D.1609
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Algorithm Theory
Keyword: 
nearest neighbor,  randomized algorithm,  locality-sensitive hashing (LSH),  

Full Text: PDF(400.1KB)>>
Buy this Article




Summary: 
This paper describes a novel algorithm for approximate nearest neighbor searching. For solving this problem especially in high dimensional spaces, one of the best-known algorithm is Locality-Sensitive Hashing (LSH). This paper presents a variant of the LSH algorithm that outperforms previously proposed methods when the dataset consists of vectors normalized to unit length, which is often the case in pattern recognition. The LSH scheme is based on a family of hash functions that preserves the locality of points. This paper points out that for our special case problem we can design efficient hash functions that map a point on the hypersphere into the closest vertex of the randomly rotated regular polytope. The computational analysis confirmed that the proposed method could improve the exponent ρ, the main indicator of the performance of the LSH algorithm. The practical experiments also supported the efficiency of our algorithm both in time and in space.