Coder Social home page Coder Social logo

fmaglia / boi Goto Github PK

View Code? Open in Web Editor NEW
15.0 2.0 3.0 850 KB

Multi-index hashing for the resolution of ANN search problem on large datasets

Home Page: http://implab.ce.unipr.it/?page_id=787

C++ 100.00%
landmark-recognition image-retrieval content-based-image-retrieval large-scale nearest-neighbor-search

boi's Introduction

Bag of Indexes (BoI)

[paper] [project] [poster]

Bag of Indexes (BoI) is a multi-index hashing C++ library, used for solving Approximate Nearest Neighbors (ANN) search problems. In the belove figure, an example of BoI's working is presented.

BoI_overview

LSH and multi-probe LSH are the only, at the moment, projection functions implemented.

Datasets

The original dataset files are converted in binary through the application of a simple C++ script:

After downloading the dat files you need to create a folder called dataset and then put in the uncompressed version.

Installation

  • Requirements:
    • G++ 5.4 or greater
    • openmp
  • Build: g++ -o BoI BoI.cpp -lstdc++fs -std=c++14 -fopenmp -O2

Test

./BoI SIFT1M 16 25 500 true ..

In the previous command the values of the parameters are:

  • δ = 16;
  • L = 25;
  • topN = 500;
  • fast re-ranking = true (loading descriptors from RAM).

Results

SIFT1M

The following table represents the results obtained through the application of BoI adaptive multi-probe LSH, trying to change some parameters:

  • L: number of hash tables;
  • δ: number of buckets per hash table (hash dimension);
  • topN: number of top elements to re-rank based on original distance (usually Euclidean distance).

In every test, the neighborhood is setted to 3.

Configuration 1 10 100 avg retrieval time
δ = 16, L = 25, top500 90.50% 91.57% 91.57% 6 msec
δ = 15, L = 50, top500 98.36% 99.19% 99.19% 18 msec
δ = 15, L = 50, top10k 99.20% 100% 100% 18 msec

The reduced average retrieval time is obtained through the multi-threading and the cut-off (an unsuperivesd reduction of the BoI structure). For more info see the paper cited in the Reference section. The reason behind the same retrieval time changing the top elements to re-rank is due to the speed of loading files from RAM memory.

GIST1M

Configuration 1 10 100 avg retrieval time
δ = 16, L = 100, top500 79.80% 80.80% 80.80% 60 msec
δ = 16, L = 100, top10k 97.40% 98.50% 98.50% 100 msec

All the experiments are evaluated following the Recall@R, that is the average rate of queries for which the 1-nearest neighbor is ranked in the top R positions.

Reference

@article{magliani2018efficient,
  title={Efficient Nearest Neighbors Search for Large-Scale Landmark Recognition},
  author={Magliani, Federico and Fontanini, Tomaso and Prati, Andrea},
  journal={arXiv preprint arXiv:1806.05946},
  year={2018}
}

boi's People

Contributors

fmaglia avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.