
elasticsearch-stress-test's Introduction

THIS PROJECT IS NO LONGER MAINTAINED

Elasticsearch Stress Test

Overview

This script generates a bunch of documents and indexes as many of them as it can into Elasticsearch. While doing so, it prints metrics to the screen so you can follow how your cluster is doing.

How to use

  • Save this script
  • Make sure you have Python 2.7+
  • pip install elasticsearch

How does it work

The script creates document templates based on your input, say 5 different documents. The documents are created without fields, for the purpose of having the same mapping when indexing to ES. After that, the script takes 10 random documents out of the template pool (with replacement) and populates them with random data.

After we have the pool of different documents, we select an index out of the index pool, select documents (bulk size at a time) out of the document pool, and index them.

Documents are generated before the run starts, so generating them does not add extra load during the benchmark itself.
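
A minimal sketch of that flow, with hypothetical helper names (the real script differs in detail):

import random
import string

def random_string(max_len):
    # Random payload for a single field, up to max_len characters.
    length = random.randint(1, max_len)
    return "".join(random.choice(string.ascii_lowercase) for _ in range(length))

def make_template(max_fields):
    # A template is just a list of field names; documents built from the same
    # pool share field names, so they produce the same mapping in Elasticsearch.
    return ["field_{0}".format(i) for i in range(random.randint(1, max_fields))]

def populate(template, max_size_per_field):
    # Fill a randomly drawn template with random data.
    return {field: random_string(max_size_per_field) for field in template}

# Build the pools up front, before the timed run starts.
templates = [make_template(max_fields=100) for _ in range(5)]           # --documents 5
documents = [populate(random.choice(templates), 1000) for _ in range(10)]

# During the run, each client repeatedly picks an index and a bulk of documents
# from these pools and sends them via the bulk API (omitted here).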

Mandatory Parameters

Parameter      Description
--es_address   Address of the Elasticsearch cluster (no protocol or port). You can supply multiple clusters here, but only one node per cluster (preferably a client node)
--indices      Number of indices to write to
--documents    Number of template documents that hold the same mapping
--clients      Number of threads that send bulks to ES
--seconds      How long the test should run. Note: it may take a bit longer, since all bulks whose creation has been initiated are allowed to finish

Optional Parameters

Parameter                   Description                                                                Default
--number-of-shards          Number of shards per index                                                 3
--number-of-replicas        Number of replicas per index                                               1
--bulk-size                 Number of documents in each bulk request                                   1000
--max-fields-per-document   Maximum number of fields each document template can hold                  100
--max-size-per-field        Maximum length of the data each field gets when the templates are populated   1000
--no-cleanup                Boolean flag. Don't delete the indices after completion                   False
--stats-frequency           How often (in seconds) to print statistics                                30
--not-green                 Don't wait for the cluster to be green                                    False
--no-verify                 Don't verify SSL certificates                                              False
--ca-file                   Path to a CA certificate file
--username                  HTTP authentication username
--password                  HTTP authentication password

Examples

Run the test against 2 Elasticsearch clusters, with 4 indices on each, 5 random documents, without waiting for the cluster to be green, with 5 writer threads, for 120 seconds

python elasticsearch-stress-test.py  --es_address 1.2.3.4 1.2.3.5 --indices 4 --documents 5 --seconds 120 --not-green --clients 5

Run the test on ES cluster 1.2.3.4 with 10 indices and 10 random documents of up to 10 fields each, where each field on each document can hold up to 50 characters; each index gets 1 shard and no replicas; the test runs from 1 client (thread) for 300 seconds, prints statistics every 15 seconds, indexes in bulks of 5000 documents, and leaves everything in Elasticsearch after the test

 python elasticsearch-stress-test.py --es_address 1.2.3.4 --indices 10 --documents 10 --clients 1 --seconds 300 --number-of-shards 1 --number-of-replicas 0 --bulk-size 5000 --max-fields-per-document 10 --max-size-per-field 50 --no-cleanup --stats-frequency 15

Run the test with SSL

 python elasticsearch-stress-test.py --es_address https://1.2.3.4 --indices 5 --documents 5 --clients 1 --ca-file /path/ca.pem

Run the test with SSL without verifying the certificate

 python elasticsearch-stress-test.py --es_address https://1.2.3.4 --indices 5 --documents 5 --clients 1 --no-verify

Run the test with HTTP authentication

 python elasticsearch-stress-test.py --es_address 1.2.3.4 --indices 5 --documents 5 --clients 1 --username elastic --password changeme

Contribution

You are more than welcome! Please open a PR or an issue here.

elasticsearch-stress-test's People

Contributors

barakm, danielmitterdorfer, dymil, jayme-github, magicmicah, mathewmeconry, mend-bolt-for-github[bot], noniperi, orzilca, roiravhon, sadok-f, talhibner

elasticsearch-stress-test's Issues

Documents not creating...

python elasticsearch-stress-test.py --es_address 139.0.0.1 128.0.0.2 --indices 4 --documents 5 --seconds 120 --not-green --clients 2

Test is done! Final results:
Elapsed time: 132 seconds
Successful bulks: 0 (0 documents)
Failed bulks: 26 (26000 documents)
Indexed approximately 0 MB which is 0.00 MB/s

Cleaning up created indices.. Done!

Test always fails to bulk insert

python elasticsearch-stress-test.py --es_address https://es-dev.us-east-1.es.amazonaws.com:443 --indices 4 --documents 10 --clients 5 --seconds 10 --not-green --stats-frequency 5 --no-verify

Starting initialization of https://es-dev.us-east-1.es.amazonaws.com:443
/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py:135: UserWarning: Connecting to es-dev.us-east-1.es.amazonaws.com using SSL with verify_certs=False is insecure.
  'Connecting to %s using SSL with verify_certs=False is insecure.' % host)
Done!
Creating indices..
Generating documents and workers..
Done!
Starting the test. Will print stats every 5 seconds.
The test would run for 10 seconds, but it might take a bit more because we are waiting for current bulk operation to complete.

Elapsed time: 6 seconds
Successful bulks: 0 (0 documents)
Failed bulks: 5 (5000 documents)
Indexed approximately 0 MB which is 0.00 MB/s


Test is done! Final results:
Elapsed time: 11 seconds
Successful bulks: 0 (0 documents)
Failed bulks: 13 (13000 documents)
Indexed approximately 0 MB which is 0.00 MB/s

Cleaning up created indices..  Done!

I haven't been able to get this to produce any successful bulks regardless of what I've tried.

Indices are being created, but they never contain any documents.

Default timeout settings cause ConnectionErrors

Problem

When a bulk request takes longer than the default timeout of the Python Elasticsearch client, the script records an error and moves on to the next request. The problem is that the server is likely still processing the request, so the test script actually throws even more load at an already overloaded node.

Steps to reproduce

  1. Start an Elasticsearch node - say 5.5.2 - with out-of-the-box settings on localhost
  2. Run python elasticsearch-stress-test.py --es_address localhost --documents 10 --clients 10 --seconds 120 --indices 5 --no-cleanup --not-green

The script will produce failures due to read timeouts. If you insert a print statement in the try-except block, you'll see errors like:

ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'localhost', port=9200): Read timed out. (read timeout=10))

Proposed solution

Ten seconds is not an unreasonably long time for a bulk request to take when you are hitting a node with default settings with large bulk requests. So I suggest increasing the timeout to e.g. 60 seconds by creating the Elasticsearch client with:

es = Elasticsearch(esaddress, timeout=60)

instead of

es = Elasticsearch(esaddress)
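
For context, a minimal sketch of that change; the build_client helper name is mine, not the script's:

from elasticsearch import Elasticsearch

def build_client(esaddress, timeout=60):
    # A larger client-side timeout stops the script from counting still-running
    # bulks as failures and immediately piling more load onto a busy node.
    return Elasticsearch(esaddress, timeout=timeout)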

Add index name prefix config option

Sometimes the test is killed by the OS if it consumes too much memory. In such cases the database is not cleaned up and the test indices have to be deleted manually. It would be nice to add an index-name-prefix config option for situations like this; the prefix would make it possible to delete leftover indices using wildcards.
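
A rough sketch of what such an option could look like (the --index-prefix flag and its default are assumptions, not existing behaviour):

import argparse
import random
import string

parser = argparse.ArgumentParser()
parser.add_argument("--index-prefix", default="stress-test-",
                    help="Prefix for generated index names")
args, _ = parser.parse_known_args()

def generate_index_name(length=8):
    # e.g. "stress-test-k3q9zt2a"; leftovers from a killed run could then be
    # removed in one go with DELETE /stress-test-*
    suffix = "".join(random.choice(string.ascii_lowercase + string.digits)
                     for _ in range(length))
    return args.index_prefix + suffix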

Index size is not as expected

Hi,
I'm trying to benchmark an Elasticsearch installation, but I don't understand how the parameters max_fields_per_doc and max_size_per_field work.
Are they used to determine the size of a single document?

Thank you for the support
Cristian

Why does the test show different performance between Windows and Linux?

I used your stress test code to check the performance of our systems, such as Microsoft Windows 10, 10 Server, and Linux (CentOS 7).
On Linux (CentOS 7) it shows 27-30 MB/s on the test using example 1.
(python elasticsearch-stress-test.py --es_address 1.2.3.4 1.2.3.5 --indices 4 --documents 5 --seconds 120 --not-green --clients 5)
But Windows 10 and 10 Server show 2.7~4 MB/s.
Why the difference in performance between them?
All of them have the same Elasticsearch configuration, such as heap size (2 GB), and the same hardware (CPU, RAM, SSD).

Doesn't work after 8.x

I tried to build and run the project recently and found out that it doesn't work. After some debugging, it turns out this is because of the pip elasticsearch version: with elasticsearch-py 8.0.0 and later it doesn't work. The last working version is 7.17.7. I'd suggest pinning the elasticsearch-py version in requirements.txt and updating the README.
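
For example, a pin along these lines (a suggestion, not something already in the repository):

# requirements.txt
# elasticsearch-py 8.x removed elasticsearch.connection, which the script imports
elasticsearch>=7.0.0,<8.0.0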

How to reproduce:

python -m venv .venv
source .venv/bin/activate
pip install elasticsearch
python

>>> from elasticsearch.connection import create_ssl_context
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'elasticsearch.connection'
>>> exit()

Works:

pip uninstall elasticsearch
pip install elasticsearch==7.17.7
python

>>> from elasticsearch.connection import create_ssl_context

Could not create index

Could not create index. Is your cluster ok?
TransportError(400, u'index_already_exists_exception', u'already exists')

Indices=30
documents=25
Client=3
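
A possible workaround, assuming the 7.x client (the address and index name below are placeholders):

from elasticsearch import Elasticsearch

es = Elasticsearch("1.2.3.4")  # placeholder address

# In elasticsearch-py 7.x, ignore=400 suppresses the "index already exists"
# error, so re-running against leftover indices does not abort the test.
es.indices.create(index="stress-test-index", ignore=400)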

Script stopping.

Hi,

first of all great work!

I am having some inconsistency issues with the script.

Sometimes it runs well with bulk=1000, documents=1 and clients=100, and sometimes it fails with lower settings. Now I try to run with this command and it gets killed; maybe it's because I am sending it through an NGINX load balancer?

root@XXXXX:~# ./elasticsearch-stress-test.py  --es_address XXXX:8080  --indices 1 --documents 1 --seconds 60 --not-green --clients 10 --stats-frequency 5 --number-of-shards 6 --number-of-replicas 2 --bulk-size 500

Starting initialization of XXX:8080
Done!
Creating indices..
Generating documents and workers..
Done!
Starting the test. Will print stats every 5 seconds.
The test would run for 60 seconds, but it might take a bit more because we are waiting for current bulk operation to complete.

Killed

Thanks,
Tal

It should have a parameter to run the test until my container is deleted or the volume runs out of space

I was running this stress tool inside a container by deploying a job on Kubernetes. The tool works fine and creates a fair load on my volume, but at some point it stops because the --seconds parameter limits how long the test runs. I want to keep my stress test running until:

  1. The volume is filled entirely.
  2. I delete the job or container that is running the stress tool.

The elasticsearch-stress script should have a parameter so that users can run it inside a container until one of the above conditions is met.
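
One way this could be sketched (the "--seconds 0 means run forever" convention is hypothetical, not an existing option):

import time

def should_keep_running(started_at, seconds):
    # Treat --seconds 0 as "run until the job/container is deleted or the
    # volume fills up"; otherwise stop after the requested duration.
    if seconds == 0:
        return True
    return time.time() - started_at < seconds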

Not generating load

Sample output from a script run:

Starting initialization of 127.0.0.1
Done!
Creating indices..
Cluster timeout....
Cleaning up created indices.. Starting the test. Will print stats every 15 seconds.
The test would run for 300 seconds, but it might take a bit more because we are waiting for current bulk operation to complete.

Test is done! Final results:
Elapsed time: 10 seconds
Successful bulks: 0 (0 documents)
Failed bulks: 0 (0 documents)
Indexed approximately 0 MB which is 0.00 MB/s

Add some examples

Can you please add some examples of how to use your script?
Like:

./elasticsearch-stress-test.py es_address "10.10.10.10" indices 3 documents 10 clients 3 seconds 60

Not able to generate high load

Hi, great tool!

Having said that, there are some issues to be solved :)

  • It seems like it's impossible to generate heavy load. No matter which settings I try (clients, duration, doc size), the maximum load I can reach is 2000 docs/second.
    I have tried setting the client count to big numbers such as 5000, but I don't see the indexing rate change in Marvel (it never passes 2000 docs/s).
    Maybe I am missing something?
  • Missing testing of non-bulk indexing (es.index(...)); see the sketch below.
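
On the second point, a minimal sketch of what a non-bulk path could look like (address and index name are placeholders):

from elasticsearch import Elasticsearch

es = Elasticsearch("1.2.3.4")  # placeholder address

# Index a single document without the _bulk API, to compare per-document
# overhead against the bulk path the script currently exercises.
es.index(index="stress-test-index", body={"field_0": "some random payload"})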

Need to create all indices using only one unassigned shard

Hi,
I'm back again.

This time I'm facing another problem.
The Python script creates one unassigned shard per index.
Is it possible to keep the number of unassigned shards at 1? If so, how can I do this?

Or how can I reduce the total number of unassigned shards?

Thanks.
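
A likely cause (not confirmed in this thread) is the default of one replica per index: on a single-node cluster the replica shards can never be assigned. Passing --number-of-replicas 0 avoids creating them, for example:

 python elasticsearch-stress-test.py --es_address 1.2.3.4 --indices 5 --documents 5 --clients 1 --number-of-replicas 0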

Update README with the version support

Hi there, I am trying to run elasticsearch-stress-test in k8s against elasticsearch-7.2.0, but I am unable to run tests there. Please check. I am also using the flags: --not-green --no-verify --not-green
I am attaching the log here:

Starting initialization of http://test:9200
Done!
Creating indices.. 
Could not create index. Is your cluster ok?
ConnectionError(('Connection aborted.', BadStatusLine("''",))) caused by: ProtocolError(('Connection aborted.', BadStatusLine("''",)))

If this does not support specific version(s), I think that should be noted in the README. Or if I am doing something wrong, please let me know.
