
HyperLogLog

The HyperLogLog algorithm [1] is a space-efficient method for estimating the cardinality of extraordinarily large data sets. This module is written in C, uses a Murmur3 hash [2], and supports Python 2.7.x and Python 3.x.

[Build Status](https://travis-ci.org/ascv/HyperLogLog)

v 1.0

Setup

You will need the Python development package. On Ubuntu/Mint you can install it with the command below (for Python 3, the package is named python3-dev):

sudo apt-get install python-dev

Now install using pip:

sudo pip install HLL

Alternatively, install using setup.py:

sudo python setup.py install

Quick start

from HLL import HyperLogLog

hll = HyperLogLog(5) # use 2^5 registers
hll.add('some data')
estimate = hll.cardinality()
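
To give a feel for what add and cardinality do, here is a minimal pure-Python sketch of the estimator described in [1]. It is not this module's C implementation. The class name TinyHLL is hypothetical, SHA-1 truncated to 64 bits stands in for Murmur3, and the sketch only accepts k in [4, 16] because the bias constants in [1] are tabulated for 16 or more registers. It targets Python 3.

```python
import hashlib
import math


class TinyHLL:
    """Illustrative pure-Python HyperLogLog; not the HLL module's C code."""

    HASH_BITS = 64

    def __init__(self, k):
        # Narrower range than the module: the alpha constants used below
        # are only given for 16 or more registers.
        if not 4 <= k <= 16:
            raise ValueError("this sketch supports k in [4, 16]")
        self.k = k
        self.m = 1 << k                      # number of registers, 2^k
        self.registers = bytearray(self.m)   # all registers start at 0

    def add(self, data):
        # Assumption: SHA-1 truncated to 64 bits replaces Murmur3 here.
        digest = hashlib.sha1(data).digest()
        h = int.from_bytes(digest[:8], "big")
        tail_bits = self.HASH_BITS - self.k
        index = h >> tail_bits               # top k bits select a register
        tail = h & ((1 << tail_bits) - 1)    # remaining bits
        # Rank is the 1-based position of the leftmost 1-bit in the tail;
        # an all-zero tail gets rank tail_bits + 1.
        rank = tail_bits - tail.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def cardinality(self):
        m = self.m
        if m == 16:
            alpha = 0.673
        elif m == 32:
            alpha = 0.697
        elif m == 64:
            alpha = 0.709
        else:
            alpha = 0.7213 / (1 + 1.079 / m)
        # Raw estimate: normalized harmonic mean of 2^register.
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction (linear counting) from [1].
            return m * math.log(m / zeros)
        return raw


hll = TinyHLL(12)
for i in range(10000):
    hll.add(("item-%d" % i).encode("utf-8"))
estimate = hll.cardinality()
print(estimate)
```

Because the hash is 64 bits wide, the sketch omits the large-range correction that [1] applies to 32-bit hashes. Adding the same item twice leaves the registers unchanged, which is why duplicates do not inflate the estimate.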

Documentation

add(data)

Adds data to the estimator, where data is a string, buffer, or bytes (Python 3.x).

HyperLogLog(k, seed=314)

Creates a new HyperLogLog using 2^k registers, where k must be in the range [2, 16]. Set seed to choose the seed value for the Murmur3 hash. The default value was chosen arbitrarily.
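
When choosing k, it can help to know that [1] gives the asymptotic relative standard error of the estimator as roughly 1.04 / sqrt(m), where m = 2^k is the number of registers. The helper below (a hypothetical name, not part of this module) tabulates that figure. Whether this implementation matches the theoretical error exactly is an assumption.

```python
import math


def expected_relative_error(k):
    """Approximate standard error for a HyperLogLog with 2**k registers."""
    if not 2 <= k <= 16:
        raise ValueError("k must be in the range [2, 16]")
    m = 2 ** k  # number of registers
    return 1.04 / math.sqrt(m)


# Print k, the register count, and the approximate error in percent.
for k in (5, 10, 16):
    print(k, 2 ** k, round(expected_relative_error(k) * 100, 2))
```

Each extra unit of k doubles the memory used by the registers and reduces the expected error by a factor of about sqrt(2).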

merge(hll)

Merges another HyperLogLog into the current one. Merging compares individual registers and takes the maximum value for each one. The registers of the other HyperLogLog are unaffected.
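
The register-wise maximum described above can be sketched in pure Python on plain bytearrays. The function name merge_registers is hypothetical, and the check that both sides have the same length is an assumption of this sketch rather than documented behavior of the module.

```python
def merge_registers(target, other):
    """Merge `other` into `target` in place using the per-register maximum."""
    # Assumption: both estimators were built with the same k.
    if len(target) != len(other):
        raise ValueError("both estimators must use the same number of registers")
    for i, value in enumerate(other):
        if value > target[i]:
            target[i] = value
    return target


a = bytearray([0, 3, 1, 5])
b = bytearray([2, 1, 1, 7])
merge_registers(a, b)
print(list(a), list(b))
```

Only a changes; b is read but never written, matching the statement that the other estimator's registers are unaffected. For the merged estimate to be meaningful, both estimators should presumably have been built with the same seed, since otherwise the same item would hash to different registers.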

murmur3_hash(data, seed=314)

Gets a signed integer from a Murmur3 hash of data, where data is a string, buffer, or bytes (Python 3.x). Set seed to choose the seed value for the Murmur3 hash. The default value was chosen arbitrarily.

registers()

Gets a bytearray of the registers.

seed()

Gets the seed value used in the Murmur3 hash.

set_register(index, value)

Sets the register at index to value. Indexing is zero-based.

set_registers(registers)

Sets the registers to the values in registers. If registers is too long, the extra values are ignored. If registers is too short, the remaining registers are not modified.
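
The length handling described above amounts to overwriting only the overlapping prefix. A minimal pure-Python sketch on bytearrays, with a hypothetical helper name and no claim to mirror the module's internals:

```python
def overwrite_registers(current, new_registers):
    """Copy new_registers over current, limited to the overlapping prefix."""
    n = min(len(current), len(new_registers))
    current[:n] = new_registers[:n]
    return current


# Too long: the extra new values are ignored.
long_case = overwrite_registers(bytearray([1, 1, 1, 1]), bytearray([9] * 6))

# Too short: the trailing registers keep their old values.
short_case = overwrite_registers(bytearray([1, 1, 1, 1]), bytearray([7, 7]))

print(list(long_case), list(short_case))
```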

size()

Gets the number of registers.

License

This software is released under the MIT License.

References

[1] http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf

[2] https://github.com/PeterScott/murmur3
