Coder Social home page Coder Social logo

jqk6 / comparing-filters Goto Github PK

View Code? Open in Web Editor NEW

This project forked from theholyjoker/comparing-filters

0.0 1.0 0.0 13.47 MB

Benchmarking Bloom, Cuckoo, Morton, and PD based filter.

License: Other

CMake 0.40% C++ 99.09% C 0.46% Makefile 0.05% Python 0.01%

comparing-filters's Introduction

Comparing Filters

Currently benchmarking:

  1. Bloomfilter (BF)
  2. Cuckoo filter (CF)
  3. SIMD blocked Bloom filter (SIMD)
  4. Morton filter (MF)
  5. Pocket Dictionary filter (PD)

Validation

The files test.hpp test.cpp contain validation tests on the filter.

  1. Making sure the filter does not have a false negative. (Indicating the element is in not in the filter when it is.)
  2. Checking the filter false positive rate is as expected. Filter often have a parameter controlling on the false positive probability $\epsilon$, when it is increased, the filter uses more space, and has smaller error probability.

Benchmark

There are various benchmark to evaluate the error probability under differents loads, and speed test by four paramaters: Insertions, uniform lookup (uniform lookup result is "no" w.h.p. in standard scenarios), True-lookup (of elements in the filter) and Deletions.

Usage

Dependencies

Since CF uses openssl library, the project won't compile unless it is installed. (See CF repository)

To build

git clone -b Simpler https://github.com/TheHolyJoker/Comparing_Filters.git
cd Comparing_Filters
mkdir build
cd build
cmake..
cmake --build ./ --target Filters

To run

In build directory run

./Filters <filter indicator> <exponent of number of keys> <lookup factor> <rounds>
  1. filter indicator: Which filter to test.

    1. To include BF in the test,filter indicator & 1 should be true.
    2. To include CF in the test,filter indicator & 2 should be true.
    3. To include SIMD in the test,filter indicator & 4 should be true.
    4. To include MF in the test,filter indicator & 8 should be true.
    5. To include PD in the test,filter indicator & 16 should be true.

    The default value is -1 to test all filters.

  2. exponent of the number of keys: Every filter is built to contain at most 2^exponent of the number of keys.
    The default value is 24. (should not be set to less than 16 or MF might fail)

  3. lookup factor: Lookup exponent factor. If set to d and n insertions will be performed, then n*2^d lookups will be performed.
    The default value is 2

  4. rounds: The benchmark performs insertion, and then lookup where each time a fraction of 1/rounds of the total number of elements is queried.
    The default value is 32.

Credit

Large parts of the code and its structure are taken from https://github.com/FastFilter/fastfilter_cpp.

Cuckoo filter is from https://github.com/efficient/cuckoofilter by Bin Fan et al.

SIMD blocked Bloom filter is from https://github.com/apache/impala.

Morton filter is from https://github.com/AMDComputeLibraries/morton_filter.

Counting Quotient Filter (CQF) is from https://github.com/splatlab/cqf. (Currently not in use).

Pocket Dictionary is work in progress see https://github.com/TomerEven/Pocket_Dictionary. The Pocket Dictionary class that uses advanced SIMD instructions, is taken from here by Jim Apple (@Jbapple).

Papers

Bloom filter https://en.wikipedia.org/wiki/Bloom_filter

Cuckoo Filter "Cuckoo Filter: Better Than Bloom" by Bin Fan, Dave Andersen and Michael Kaminsky

SIMD blocked Bloom filter Cache-, Hash- and Space-Efficient Bloom Filters

Morton filter Morton Filters: Faster, Space-Efficient Cuckoo Filters via Biasing, Compression, and Decoupled Logical Sparsity,

Quotient Filter A General-Purpose Counting Filter: Counting Quotient Filter (CQF)

Pocket Dictionary Fully-Dynamic Space-Efficient Dictionaries and Filters with Constant Number of Memory Accesses

To do

  1. Add Filters
    1. Vacuum-Filter paper repository
    2. Quotient-Filter repository
  2. Counting filter benchmark.

comparing-filters's People

Contributors

adam-morrison avatar theholyjoker avatar tomereven avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.