Coder Social home page Coder Social logo

cortalo / coldfilter Goto Github PK

View Code? Open in Web Editor NEW

This project forked from zhouyangpkuer/coldfilter

0.0 1.0 0.0 5.66 MB

This repository contains the open source related to Cold Filter meta-framework (accepted by SIGMOD2018).

CMake 1.05% C++ 96.13% C 2.82%

coldfilter's Introduction

Cold Filter

Introduction

Approximate stream processing algorithms, such as Count-Min sketch, Space-Saving, etc., support numerous applications in databases, storage systems, networking, and other domains. Unfortunately, because of the unbalanced distribution in real data streams, existing algorithms can hardly achieve small memory usage, fast processing speed, and high accuracy at the same time. To address this gap, we propose a meta-framework, called Cold Filter (CF), that enables faster and more accurate stream processing.

Different from existing filters that mainly focus on hot items, our filter captures cold items in the first stage, and hot items in the second stage. Also, existing filters require two-direction communication โ€“ with frequent exchanges between the two stages; our filter on the other hand is one-direction โ€“ each item enters one stage at most once. Our filter can accurately estimate both cold and hot items, giving it a genericity that makes it applicable to many stream processing tasks. To illustrate the benefits of our filter, we deploy it on three typical stream processing tasks and experimental results show speed improvements of up to 4.7 times, and accuracy improvements of up to 51 times.

About this repo

This repo contains all the algorithms in our experiments, as shown in the following table.

Task Algorithms
Estimating item frequency cm sketch (count min), cm-cu sketch (count min sketch with conservative update), A sketch
Finding top-k hot items cm sketch with heap, cm-cu sketch with heap, space saving
Detecting heavy changes FlowRadar (invertible IBLT)

This repo also contains a small demo to show how to use this algorithms with a small dataset.

Requirements

  • The gather-and-report part of CF use SIMD instructions to achieve high speed, so the cpu must support SSE2 instruction set.
  • cmake >= 2.6
  • g++ (MSVC is not supported currently.)

How to build

The project is built upon cmake. You can use the following commands to build and run.

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
cd ../
./demo

Related paper

Cold Filter: A Meta-Framework for Faster and More Accurate Stream Processing

coldfilter's People

Contributors

cortalo avatar zhouyangpkuer avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.