An Adaptive Multi-level Hashing Structure for Fast Approximate Similarity Search

Authors

  • Alexander Ocsa ICMC, University of Sao Paulo (USP), Brazil http://www.gbdi.icmc.usp.br/~aocsa/
  • Elaine P. M. de Sousa ICMC, University of Sao Paulo (USP), Brazil

DOI:

https://doi.org/10.5753/jidm.2010.1279

Keywords:

Access methods, LSH, Multidimentional Index, Similarity Information Retrieval

Abstract

Fast information retrieval is an essential task in data management, mainly due to the increasing availability of data. To address this problem, database researchers have developed indexing techniques to logically organize elements from large datasets in order to answer queries efficiently. In this context, an approximate similarity search algorithm known as Locality Sensitive Hashing (LSH) was recently proposed to query high-dimensional datasets with efficient computational time. The query cost of LSH is sub-linear on the dataset size. However, some drawbacks of LSH have not been solved entirely, such as the need of several hash indexes to achieve accurate query results and the critical dependency on data distribution and parameter values. Therefore, the LSH solution is unsuitable to many real applications involving dynamic datasets. In this paper, we propose an adaptive Multi-level hashing to support dynamic index construction efficiently. By employing a Multi-level scheme it is possible to dynamically adapt the data domain parameters and exploit the resulting multi-resolution index structure to speed up the query process. Experimental studies on large real and synthetic datasets show the similarity search performance of the proposed technique.

Downloads

Download data is not yet available.

Downloads

Additional Files

Published

2010-10-06

How to Cite

Ocsa, A., & de Sousa, E. P. M. (2010). An Adaptive Multi-level Hashing Structure for Fast Approximate Similarity Search. Journal of Information and Data Management, 1(3), 359. https://doi.org/10.5753/jidm.2010.1279

Issue

Section

Regular Papers