Coder Social home page Coder Social logo

pt1043 / bloomfilter Goto Github PK

View Code? Open in Web Editor NEW

This project forked from lovasoa/bloomfilter

0.0 1.0 0.0 67 KB

Simplistic (but fast) java implementation of a bloom filter.

Home Page: https://en.wikipedia.org/wiki/Bloom_filter

License: MIT License

Java 100.00%

bloomfilter's Introduction

BloomFilter

Build Status Jitpack repository

A Bloom filter is a hashing based data structure for maintaining a set of items in limited memory, allowing false positives but no false negatives.

This repository contains a simple but performant implementation of bloom filters in java.

See the Wikipedia page for Bloom filters.

How to use

Add the project to your dependencies

You can add this project to your dependencies very easily using jitpack.io. See instructions.

Usage example

import com.github.lovasoa.bloomfilter.BloomFilter;

// Create a new bloom filter optimized for containing 100 elements and using 1024 bits of memory
BloomFilter f = new BloomFilter(100, 1024);

// Add elements to the filter
// it uses Object.hashCode() internally, so you can add objects of any type
f.add("hello");

// Check if an element is in the filter
f.contains("hello"); // true
f.contains("hello, world!"); // false

Documentation

Read the full javadoc of the BloomFilter class.

Implementation details

Hashes

It doesn't use any fancy hash function. It uses object.hashCode() instead. You can override your objects' .hashCodemethod if you want better hashes.

No allocation

It doesn't do any allocation when adding new elements or checking if an element is present. It should thus be faster than many other implementations.

Performance

A class is provided that helps making performance measurements: Test.java. It tests the implementation with a Bloom filter containing randomly generated integers.

Here are the results it gives on my laptop (Core i7-4600M CPU @ 2.90GHz) with a set of 10 million integers added to a 10 megabyte Bloom filter:

Testing a bloom filter containing n=10000000 elements in a bit array of m=80000000 bits (=9.5Mib) 

Testing correctness.
Creating a filter, a set, and filling them...
Elements incorrectly found to be inside:   215013/10000000 (2.15%)
done.

Testing insertion speed...
Inserted 10000000 elements in 3445388006 ns.
Insertion speed: 2.90243e+06 elements/second

Testing query speed...
Queried 10000000 elements in 1537504033 ns.
Query speed: 6.50405e+06 elements/second

We see that:

  • The implementation is correct: the error rate is p=exp(-ln(2)^2 * m/n)
  • It is quite fast
    • It can insert around 2 million elements per second.
    • It can query around 6 million elements per second.

bloomfilter's People

Contributors

lovasoa avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.