1brc-bench's Introduction

1BRC reproducible benchmark

See results branch for the output.

Works on Ubuntu 22.04. It may work on Debian 12 if you adjust the .NET install step (the apt repo) in prerequisites.sh.

  1. Clone this repo.
  2. Run bash update.sh (not ./update.sh). This checks out the latest repositories and also makes all scripts executable.
  3. Run ./prerequisites.sh.
  4. Run ./build.sh.
  5. Place measurements_1B.txt and measurements_1B_10K.txt input files in ./inputs. You may use zstd and place compressed files here, e.g. measurements_1B.txt.zst.
  6. [Optional] Call use_input.sh 1B or use_input.sh 1B_10K to select a dataset. You may use your own suffix. This places the files in tmpfs, so you need at least 30GB of RAM just for the files. For most implementations using mmap this is all that is needed; the apps themselves use very little. Do not try to run this if you have less than 32GB. With less memory, you will have to adjust the scripts to read files from disk directly, or run only the default dataset.
  7. Run ./run.sh <username> <cores=4> <threads=2*cores> <dataset=1B> <runs=5>, e.g. bash run.sh buybackoff 6 12 1B_10K 5.
  8. Run ./run_ds.sh <dataset> <max_cores=4> <runs=5> <min_cores=4> to run all users on a dataset.
  9. Run ./run_all.sh <max_cores=4> <runs=5> <min_cores=4> for a complete benchmark run.
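Step 6 copies the inputs into tmpfs, so it can help to check total RAM before selecting a dataset. The following is a minimal sketch, not part of the repository: the helper name has_min_ram is hypothetical, and it assumes a Linux /proc/meminfo and treats the 30GB figure above as 30 GiB.

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helper, not one of the repo scripts).

# has_min_ram <required_gib> [meminfo_path]
# Succeeds when MemTotal is at least <required_gib> GiB.
# Returns 2 when MemTotal cannot be read.
has_min_ram() {
  local required_gib=$1
  local meminfo=${2:-/proc/meminfo}
  local total_kb
  # MemTotal is reported in kB, e.g. "MemTotal:       65763456 kB".
  total_kb=$(awk '/^MemTotal:/ { print $2; exit }' "$meminfo" 2>/dev/null) || return 2
  [ -n "$total_kb" ] || return 2
  [ "$total_kb" -ge $(( required_gib * 1024 * 1024 )) ]
}

# Example: warn before copying the inputs into tmpfs.
if has_min_ram 30; then
  echo "RAM check passed"
else
  echo "Less than 30 GiB of RAM detected: read inputs from disk instead" >&2
fi
```

The threshold is a parameter, so you can raise it to leave headroom for the benchmarked apps.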

In the last two options, we start with the min_cores value and double it. If doubling skips max_cores, we run the max_cores config anyway. For example, on a 24-core machine starting with 4 cores, doubling gives 4, 8, 16, 32; this skips 24, so the 24-core config is run anyway. Hyper-threading (HT) is always tried; it is assumed to be enabled. Otherwise you will have to adjust the scripts.
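The doubling rule can be sketched as a small function. This is an illustration, not the code in run_ds.sh: the name core_steps is hypothetical, and it assumes that values above max_cores are skipped and that max_cores is always the final step.

```shell
#!/usr/bin/env bash
# core_steps <min_cores> <max_cores>
# Prints the core counts to benchmark: doubles from min_cores while
# below max_cores, then always finishes with max_cores itself.
# Assumption: counts larger than max_cores are not run.
core_steps() {
  local cores=$1
  local max=$2
  local steps=""
  while [ "$cores" -lt "$max" ]; do
    steps="$steps$cores "
    cores=$(( cores * 2 ))
  done
  echo "$steps$max"
}

core_steps 4 24   # prints: 4 8 16 24
core_steps 4 32   # prints: 4 8 16 32
```

When max_cores is itself a power-of-two multiple of min_cores, the loop stops just before it and the final step adds it once, so no count is repeated.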

JSON output is stored in the results directory.

Sudo is required for numactl, which can get messy if you use remote VSCode. If you want to edit a file but get a permission error, just run sudo chown -R username ./*.

Run on AWS metal instances

Right now, c5.metal spot instances are very cheap, e.g. below $0.60 in Stockholm. Other spot instances are also quite affordable. For AMD, see c6a.metal in London, Mumbai or less busy US regions.

It's likely that your quotas are not sufficient to run many spot vCPUs. You can request an increase from the AWS console; just search for the right link. For c5.metal my request was approved automatically. For c6a.metal my request has been in Unassigned status for more than 12 hours.

Do not go to the Spot Requests section in EC2. Instead, go to Instances and start launching an instance normally. In the advanced settings there is an option to select Spot and set the max price and other parameters.

Before you launch an instance, create a key pair with your local id_rsa.pub or its equivalent. Select that pair during instance setup. After that, ssh [email protected] and VSCode remote just work flawlessly.

Run the steps from above. ./run_all.sh <max_cores> <runs> is a fully automated suite for all benchmarks. The minimum core count is set in ./run_ds.sh as the initial value of the loop counter; it is currently 6.

Usage in Proxmox/LXC

There is no difference between a bare metal setup and a Proxmox container setup, provided the container is privileged. In an unprivileged container numactl does not work.
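A quick way to find out whether numactl is usable in a given container is to run a trivial command under a binding before starting the benchmark. This is a sketch under assumptions: the helper name numactl_works and the NUMACTL override are hypothetical, NUMA node 0 is assumed to exist, and the repo scripts may invoke numactl with different options.

```shell
#!/usr/bin/env bash
# numactl_works: succeeds when a trivial command can run under a
# CPU and memory binding to NUMA node 0 (assumed to exist).
# NUMACTL may be set to another binary; this knob is hypothetical
# and exists only to make the probe easy to test.
numactl_works() {
  "${NUMACTL:-numactl}" --cpunodebind=0 --membind=0 true >/dev/null 2>&1
}

if numactl_works; then
  echo "numactl binding works"
else
  echo "numactl binding failed: is numactl installed and the container privileged?" >&2
fi
```

The probe also fails when numactl is not installed, so run ./prerequisites.sh first to tell the two cases apart.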

Contributors

buybackoff, pedrosakuma
