Coder Social home page Coder Social logo

cgtuebingen / product-quantization-tree Goto Github PK

View Code? Open in Web Editor NEW
93.0 10.0 26.0 128 KB

GPU-based large scale Approx. Nearest Neighbor Search, accepted at CVPR 2016

Home Page: http://goo.gl/YDwCl4

License: MIT License

C++ 35.41% Cuda 63.14% CMake 0.62% MATLAB 0.73% Shell 0.10%
nearest-neighbor-search gpu parallel-computing high-performance-computing large-scale

product-quantization-tree's Introduction

Efficient Large-scale Approximate Nearest Neighbor Search on the GPU

This repository contains the implementation of Product Quantization Trees (PQT) for large scale nearest neighbor search.

@inproceedings{PQT,
    author = "Patrick Wieschollek and Oliver Wang and Alexander Sorkine-Hornung and Hendrik P.A. Lensch",
    title = "Efficient Large-scale Approximate Nearest Neighbor Search on the GPU",
    booktitle = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
    pages = "",
    month = "June",
    year = "2016",
    url = "goo.gl/4Zl5xB"
}

This experimental implementation is the first running on the GPU outperforming previous CPU approaches.

Usage

Download

Just by

git clone --recursive https://github.com/cgtuebingen/Product-Quantization-Tree.git

Requirements

To build this project you need

We use Toolkit v7.5, gcc 4.8.4 on Ubuntu 14.04 for Nvidia Titan X.

Preprocessing

To handle datasets and efficiently read them from (ram-)disk we converted the official datasets from SIFT1M, SIFT1B in our own format:

Each *.umem, *.imem, *.fmem has the following layout (header and content)

uint number_of_vectors
uint dimen_of_vector
... header are 20 bytes, next data start at byte 20:
T consecutive array of data, each entry is a T

Examples

* .umem: `uint  uint 0 0 ... 0 0 uint8_t uint8_t uint8_t uint8_t uint8_t uint8_t ...  uint8_t`
* .imem: `uint  uint 0 0 ... 0 0 int int int int int int ...  int`
* .fmem: `uint  uint 0 0 ... 0 0 float float float float float float ...  float`

We provide script to convert these datasets

./convert_fvecs --fvecs src/path/to/db.fvecs --umem dst/path/to/db.umem
./convert_bvecs --bvecs src/path/to/db.bvecs --umem dst/path/to/db.umem
./convert_ivecs --ivecs src/path/to/db.ivecs --imem dst/path/to/db.imem

Query (offline phase)

Building the index structure (Product-Quantization-Tree) is done by tool_createdb. This creates the index structure, prepares the database and dump all intermediate values into binary files.

Possible Flags (see ./tool_createdb -h):

- device    "selected cuda device"
- c1        "number of clusters in first level"
- c2        "number of refinements in second level"
- p         "parts per vector"
- dim       "expected dimension for each vector"
- lineparts "vectorparts for reranking informations"
- chunksize "number of vectors per chunk"
- hashsize  "maximal number of bins"
- basename  "prefix for generated data"
- dataset   "patch to vector dataset"

Query (online phase)

The query process is done batchwise using tool_query. The accompanying example return the best and second best found vector. For possible Flags see ./tool_query -h.

product-quantization-tree's People

Contributors

patwie avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.