
jamiealquiza / sangrenel


Apache Kafka load testing "...basically a cloth bag filled with small jagged pieces of scrap iron"

License: MIT License

Go 100.00%
kafka benchmark load-testing go sarama

sangrenel's Introduction

sangrenel

[Update] Sangrenel is currently being updated. Take note of issues.

"...basically a cloth bag filled with small jagged pieces of scrap iron"

Sangrenel is a Kafka cluster load testing tool. It was originally created for some baseline performance testing, exemplified in my Load testing Apache Kafka on AWS blog post.

While using this tool, keep in mind that benchmarking is always an examination of total systems performance, and that the testing software itself is part of the system (meaning you're not just testing Kafka, but Kafka+Sangrenel).

Example

Sangrenel takes configurable message/batch sizing, concurrency and other settings and writes messages to a reference topic. Message throughput, batch write latency (p99, harmonic mean, min, max) and a latency histogram are dumped every 5 seconds.


Installation

Assuming Go is installed (tested with 1.7+) and $GOPATH is set:

  • go get -u github.com/jamiealquiza/sangrenel
  • go install github.com/jamiealquiza/sangrenel

The binary will be found at $GOPATH/bin/sangrenel.

Usage

Usage output:

Usage of sangrenel:
  -api-version string
    	Explicit sarama.Version string
  -brokers string
    	Comma delimited list of Kafka brokers (default "localhost:9092")
  -compression string
    	Message compression: none, gzip, snappy (default "none")
  -graphite-ip string
    	Destination Graphite IP address
  -graphite-metrics-prefix string
    	Top-level Graphite namespace prefix (defaults to hostname) (default "ja.local")
  -graphite-port string
    	Destination Graphite plaintext port
  -interval int
    	Statistics output interval (seconds) (default 5)
  -message-batch-size int
    	Messages per batch (default 500)
  -message-size int
    	Message size (bytes) (default 300)
  -noop
    	Test message generation performance (does not connect to Kafka)
  -produce-rate uint
    	Global write rate limit (messages/sec) (default 100000000)
  -required-acks string
    	RequiredAcks config: none, local, all (default "local")
  -tls
      Whether to enable TLS communcation (default "false")
  -tls-ca-cert string
      Path to the CA SSL certificate
  -tls-cert-file string
      Path to the certificate file
  -tls-key-file string
      Path to the private key file
  -tls-insecure-skip-verify
      TLS insecure skip verify (default false)
  -topic string
    	Kafka topic to produce to (default "sangrenel")
  -workers int
    	Number of workers (default 1)
  -writers-per-worker int
    	Number of writer (Kafka producer) goroutines per worker (default 5)

Sangrenel uses the Kafka client library Sarama. Sangrenel starts one or more workers, each of which instantiates a unique Kafka client connection to the target cluster. Each worker has a number of writers that generate and send message data to Kafka, sharing the parent worker's client connection. The number of workers is configurable via the -workers flag, and the number of writers per worker via the -writers-per-worker flag. This is done for scaling purposes: while a single Sarama client can be used for multiple writers (which live in separate goroutines), performance begins to flatline at some point. It's best to leave -writers-per-worker at the default of 5 and scale the worker count as needed, but the option is exposed for more control. Left as a technical exercise for the user: there's a difference between 2 workers with 5 writers each and 1 worker with 10 writers.
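The worker/writer layout described above can be sketched with the standard library alone. This is a minimal illustration, not Sangrenel's code: the `client` type is a hypothetical stand-in for a per-worker Sarama client, and `send` only counts messages instead of writing to Kafka.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// client is a hypothetical stand-in for one worker's Kafka client connection.
type client struct {
	sent int64 // messages "sent" through this connection
}

// send pretends to write one batch over the shared connection.
func (c *client) send(batchSize int) {
	atomic.AddInt64(&c.sent, int64(batchSize))
}

// run starts `workers` clients, each shared by `writersPerWorker` goroutines.
// Every writer sends `batches` batches of `batchSize` messages.
// It returns the number of messages that went through each client.
func run(workers, writersPerWorker, batches, batchSize int) []int64 {
	clients := make([]*client, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		c := &client{}
		clients[w] = c
		for i := 0; i < writersPerWorker; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				for b := 0; b < batches; b++ {
					c.send(batchSize)
				}
			}()
		}
	}
	wg.Wait()

	counts := make([]int64, workers)
	for i, c := range clients {
		counts[i] = atomic.LoadInt64(&c.sent)
	}
	return counts
}

func main() {
	fmt.Println(run(2, 5, 10, 500))  // [25000 25000]: two connections share the load
	fmt.Println(run(1, 10, 10, 500)) // [50000]: one connection carries everything
}
```

Both calls move the same 50,000 messages, but the second funnels all ten writers through a single connection, which is the difference the exercise above hints at.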

The -topic flag specifies which topic is used, allowing configs such as partition count and replication factor to be prepared ahead of time for performance comparisons (by switching which topic Sangrenel is using). The -message-batch-size, -message-size and -produce-rate flags can be used to dictate the number of messages to batch per write, the message size, and the total Sangrenel write rate. -required-acks sets the Sarama RequiredAcks config. The -api-version flag allows Sarama to be configured at a specific API level (see the Version field of the Sarama Config struct). The supported API versions can be found here.

Two important factors to note:

  • Sangrenel uses Sarama's SyncProducer, meaning messages are written synchronously
  • At a given message size, Sangrenel should be tested in -noop mode to ensure the desired number of messages can be generated (even if a -produce-rate is specified)
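To illustrate why a -noop style check matters, here is a rough sketch that measures how fast random messages of a given size can be produced without touching Kafka. The function names, the character set and the generation strategy are assumptions made for illustration; Sangrenel's own generator may work differently.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// charset is an assumed alphabet for the random payload.
const charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// randomMessage returns a fresh buffer of `size` random characters.
func randomMessage(r *rand.Rand, size int) []byte {
	msg := make([]byte, size)
	for i := range msg {
		msg[i] = charset[r.Intn(len(charset))]
	}
	return msg
}

// generationRate assembles batches for roughly d and returns the observed
// generation rate in messages per second. Nothing is sent anywhere.
func generationRate(size, batchSize int, d time.Duration) float64 {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	start := time.Now()
	generated := 0
	for time.Since(start) < d {
		for i := 0; i < batchSize; i++ {
			_ = randomMessage(r, size)
		}
		generated += batchSize
	}
	return float64(generated) / time.Since(start).Seconds()
}

func main() {
	// 300 bytes and 500 messages per batch mirror the documented defaults.
	rate := generationRate(300, 500, 200*time.Millisecond)
	fmt.Printf("generated ~%.0f messages/sec at 300 bytes\n", rate)
}
```

If the rate measured this way is below the intended -produce-rate, the generator rather than Kafka is the bottleneck.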

Once running, Sangrenel generates and writes messages as fast as possible (or at the configured -produce-rate). Every 5 seconds (the default -interval), message throughput, rates, latency and other metrics (via tachymeter) are printed to console.

If a Graphite destination is defined, some metric data can be written to Graphite. Better metric output options will be added soon.

sangrenel's People

Contributors

dd-caleb, eladleev, jamiealquiza, kapouille, ntk148v, pjagielski, radekg, rgruchalski-klarrio


sangrenel's Issues

kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

When I run sangrenel on CentOS, I get the error below:

# sangrenel 
Starting 1 client workers, 5 writers per worker
Message size 300 bytes, 500 message limit per batch
API Version: automatic, Compression: none, RequiredAcks: local
2020/02/19 14:05:12 kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

any thoughts?
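The run above uses no -brokers flag, so the documented default of localhost:9092 applies. A first check is whether each broker address accepts a TCP connection at all. The sketch below does only that; `unreachable` is a hypothetical helper, and a successful connect does not prove the Kafka protocol exchange will succeed.

```go
package main

import (
	"fmt"
	"net"
	"strings"
	"time"
)

// unreachable takes a comma-delimited broker list (the same shape as the
// -brokers flag) and returns the addresses that fail a TCP dial.
func unreachable(brokers string, timeout time.Duration) []string {
	var failed []string
	for _, addr := range strings.Split(brokers, ",") {
		addr = strings.TrimSpace(addr)
		if addr == "" {
			continue
		}
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			failed = append(failed, addr)
			continue
		}
		conn.Close()
	}
	return failed
}

func main() {
	// localhost:9092 is the documented default for -brokers.
	if bad := unreachable("localhost:9092", 2*time.Second); len(bad) > 0 {
		fmt.Println("unreachable brokers:", bad)
	} else {
		fmt.Println("all brokers accepted a TCP connection")
	}
}
```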

add overhead % in output

Track % of time spent not actually sending messages, rather, preparing them (e.g. batch assembly)

Add additional info outputs

Sangrenel dumps periodic output to console, which is easily parsed for input elsewhere. Would be better to have things like:

  • native output into Graphite so that Sangrenel data can be overlaid with external but relevant data (e.g. Kafka node metrics).
  • an option to run for a fixed period and dump a report in addition to the periodic info output

Add auto-tuning mode for autonomously discovering Kafka performance

Sangrenel has great input controls such as worker concurrency, message sizes and rate limits, so that a Kafka cluster can be observed within a controlled environment. It would be nice if performance thresholds could be pre-defined, such as a desired 90th percentile and worst-case latency threshold, with Sangrenel automatically adjusting workers and message rates to tell you what a spec'd cluster should be able to handle in terms of throughput. For example, "I just built a 6 node Dell R420 cluster, how much throughput can this handle while maintaining a sub 10ms latency?" The user should enter '10ms' and a run duration, fire up Sangrenel and walk away for a jam cookie while the service does the work for them.

Add message template support

Completely random messages have limited value. Sangrenel needs a template system that allows users to model messages that more accurately reflect real world data.

new release version

Could you make a new release version in github?

The current latest release actually has quite a few different features than master. Having it be the latest release was a little confusing when I tried sangrenel out.

Thanks!

Needs better clarity/controls on the underlying Sarama client

As noted in my blog post, benchmarking is complicated; this tool effectively couples (beyond just a Kafka cluster) the Sarama client as a piece of the performance testing. Underlying factors such as batching, connection counts and other controls have significant effects. These effects in relation to Sangrenel need to be clarified and controls need to be exposed where needed.

code.google.com/p/snappy-go/snappy moved to github.com/google/snappy

github.com/jamiealquiza/sangrenel/vendor/github.com/Shopify/sarama/snappy.go: "code.google.com/p/snappy-go/snappy" moved to "github.com/google/snappy"

$ go get github.com/jamiealquiza/sangrenel
package github.com/jamiealquiza/sangrenel
imports code.google.com/p/snappy-go/snappy: unable to detect version control system for code.google.com/ path

Add clustering

A single c3.8xlarge caps out at ~3Gb/s output. Need to build Sangrenel clustering with centralized controls and a working distributed rate limiter.

Provide options for authentication

I currently see no way to use this tool with a Kafka cluster that requires authentication (other than SSL).

In my case we are using SASL_PLAINTEXT and SCRAM-SHA-256, which would imply providing a username and password when attempting to authenticate. I currently see no option to supply those.

It either needs to be implemented or it needs to be documented (if already implemented).

Add AsyncProducer option?

From what I can tell, Sangrenel uses sync producers (right?). If so, it would be nice to be able to play around with an async producer as well to observe the batch sending from the producer side and observe how well the brokers handle different incoming batch sizes.

Invalid API version options

Hi,
The API version flag returns the wrong list of options, which confused me. As the picture below shows, even the correct 2.2.0.0 option is still reported as invalid.

image

arf

I mean, of course.

support a message template file

Allow a template file pre-loaded with a sample message to be used. Not only will it more closely resemble user-specific data, but compression performance will be more realistically represented (compressing the Sangrenel generated random data doesn't result in high ratios).

breakage

panic: strings: negative Repeat count

goroutine 1 [running]:
strings.Repeat(0x710b59, 0x1, 0x8000000000000000, 0x0, 0xc42373a000)
        /usr/local/go/src/strings/strings.go:432 +0x258
github.com/jamiealquiza/tachymeter.(*Histogram).String(0xc420aa5ea0, 0x32, 0xc42270df28, 0x5)
        /home/jamiealquiza/go/src/github.com/jamiealquiza/tachymeter/tachymeter.go:271 +0x2a6
main.main()
        /home/jamiealquiza/go/src/github.com/jamiealquiza/sangrenel/sangrenel.go:161 +0xd72
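The trace shows strings.Repeat being called with a count of 0x8000000000000000, the smallest int64, and strings.Repeat panics on any negative count. One plausible cause, which the trace alone does not confirm, is a bar width computed in floating point from an empty or degenerate histogram and then converted to an integer. The sketch below shows a guarded bar renderer; `bar` is a hypothetical helper, not tachymeter's code.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// bar renders a histogram bar of at most maxWidth characters for a bucket
// holding `count` of `total` samples. It never passes a negative or
// out-of-range count to strings.Repeat.
func bar(count, total, maxWidth int) string {
	if count <= 0 || total <= 0 || maxWidth <= 0 {
		return ""
	}
	w := math.Round(float64(count) / float64(total) * float64(maxWidth))
	if math.IsNaN(w) || w < 0 {
		w = 0
	}
	if w > float64(maxWidth) {
		w = float64(maxWidth)
	}
	return strings.Repeat("-", int(w))
}

func main() {
	fmt.Printf("%q\n", bar(5, 10, 50)) // 25 dashes
	fmt.Printf("%q\n", bar(0, 0, 50))  // "" instead of a panic
}
```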
