Coder Social home page Coder Social logo

1brc's Introduction

1BRC

1๏ธโƒฃ๐Ÿ๐ŸŽ๏ธ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated. The challenge was primarily foces on Java but I decided to solve it in Golang!

The original solution with the tests I re-use for this repository:

I wrote a detailed blog about my implementation approach, you can check it out here.

The benchmarking results are run on my personal MacBook Pro 2020, 1.4 GHz Quad-Core Intel Core i5, 8 GB 2133 MHz LPDDR3.

The measurement file is about 13GB in size.

As a baseline, the original naive Java implementation took 3:55.68 to run on this machine.

๐Ÿš€ Results

Attempt Number Approach Execution Time (m:ss) Note Commit
1 Naive approach, read data to map and process sequentially 4:57.19 - -
2 Using readSlice and custom parseInt function 4:18.95 - -
3 Read file by chunks of 20MB 4:07.94 - -
3.1 Read file by chunks of 100MB 4:14.20 - -
3.1 Read file by chunks of 50MB 4:09.89 - -
4 Using concurrency with 6 worker in total including main thread 1:27.80 The chunk size of 20MB and 6 gouroutines seems to be the sweet spot. Other set up are drastically slower, usually take around 3m50s. And not sure why it's not faster in the case of 1M row file. -
5 Replace bytes.Split with finding ; manually 52.513 - -
6 Reduce hash map read, and find new optimal chunk size 44.495 - -
7 Manually parse valid chunk instead of using split bytes 39.524 - -
7.1 Change back chunk size to 20MB 34.692 - -
8 Use unsafe byte slice to string 28.591 - -
8.1 Finding best combination of concurrent degree and chunk size 24.063 - -
9 Using swiss map 19.867 - -
10 Using unbuffered channel 19.674 - -

๐Ÿš€ Bechmark Results

The overall results above is ran on my machine with many other running processes so it might vary. This bechmark is ran using go testing bench, on a file with 1M rows, so it's supposed to be more stable and reliable.

make bench
Attempt Number Evaluate 1M Evaluate 10M
1 289ms
2 245ms
3 254ms
4 241ms
5 175ms
6 117ms
7 108ms
8 88ms
8.1 88ms
8.1 78ms
9 78ms
10 77ms

1brc's People

Contributors

nlgtea avatar shraddhaag avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.