
For FullText PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.

Multiple Binary Codes for Fast Approximate Similarity Search
Shinichi SHIRAKAWA
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E98D
No.3
pp.671680 Publication Date: 2015/03/01 Publicized: 2014/12/11 Online ISSN: 17451361
DOI: 10.1587/transinf.2014EDP7212 Type of Manuscript: PAPER Category: Pattern Recognition Keyword: similarity search, binary hash, approximate nearest neighbor search, machine learning, image retrieval,
Full Text: PDF(1.2MB)>>
Summary:
One of the fast approximate similarity search techniques is a binary hashing method that transforms a realvalued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constanttime similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multiindex hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.

open access publishing via







