|
For Full-Text PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
|
Multiple Binary Codes for Fast Approximate Similarity Search
Shinichi SHIRAKAWA
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E98-D
No.3
pp.671-680 Publication Date: 2015/03/01 Publicized: 2014/12/11 Online ISSN: 1745-1361
DOI: 10.1587/transinf.2014EDP7212 Type of Manuscript: PAPER Category: Pattern Recognition Keyword: similarity search, binary hash, approximate nearest neighbor search, machine learning, image retrieval,
Full Text: PDF(1.2MB)>>
Summary:
One of the fast approximate similarity search techniques is a binary hashing method that transforms a real-valued vector into a binary code. The similarity between two binary codes is measured by their Hamming distance. In this method, a hash table is often used when undertaking a constant-time similarity search. The number of accesses to the hash table, however, increases when the number of bits lengthens. In this paper, we consider a method that does not access data with a long Hamming radius by using multiple binary codes. Further, we attempt to integrate the proposed approach and the existing multi-index hashing (MIH) method to accelerate the performance of the similarity search in the Hamming space. Then, we propose a learning method of the binary hash functions for multiple binary codes. We conduct an experiment on similarity search utilizing a dataset of up to 50 million items and show that our proposed method achieves a faster similarity search than that possible with the conventional linear scan and hash table search.
|
|