ossperf's Introduction

OSSperf

OSSperf is a lightweight command-line tool for analyzing the performance and data integrity of storage services that implement the S3 API, the Swift API, or the Azure Blob Storage API. The tool creates a user-defined number of files with random content and of a specified size inside a local directory, creates a bucket, uploads and downloads the files, and afterward removes the bucket. The time required to carry out these S3/Swift/Azure-related tasks is measured and printed on the command line.

Until November 2017, the OSSperf tool was named S3perf because it initially supported only storage services that implement the S3 API. Since the tool now also targets storage services that implement other APIs, it was renamed OSSperf. OSS stands for Object-based Storage Services.

The storage services tested with this tool so far are:

Publications

Synopsis

ossperf.sh -n files -s size [-b <bucket>] [-u] [-a] [-m <alias>] [-z] [-g] [-w] [-l <location>] [-d <url>] [-k] [-p] [-o]

This script analyzes the performance and data integrity of S3-compatible storage services

Arguments:
-h : show this message on screen
-n : number of files to be created
-s : size of the files to be created in bytes (max 16777216 = 16 MB)
-b : by default, ossperf creates a new bucket named ossperf-testbucket (or OSSPERF-TESTBUCKET if the argument -u is set). This is unproblematic when private cloud deployments are investigated, but it may become a problem in public cloud scenarios, because object-based storage services implement a global bucket namespace, which means that all bucket names must be unique. With the argument -b <bucket>, users of ossperf are free to specify the bucket name
-u : use upper-case letters for the bucket name (this is required for Nimbus Cumulus and S3ninja)
-a : use the Swift API and not the S3 API (this requires the python client for the Swift API and the environment variables ST_AUTH, ST_USER and ST_KEY)
-m : use the S3 API with the Minio Client (mc) instead of s3cmd. It is required to provide the alias of the mc configuration that shall be used
-z : use the Azure CLI instead of the S3 API (this requires the python client for the Azure CLI and the environment variables AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_ACCESS_KEY)
-g : use the Google Cloud Storage CLI instead of s3cmd (this requires the python client for the Google API)
-w : use the AWS CLI instead of s3cmd (this requires the installation and configuration of the AWS CLI client)
-l : use a specific site (location) for the bucket. This is supported e.g. by AWS S3 and Google Cloud Storage
-d : if the AWS CLI shall be used with an S3-compatible non-Amazon service, specify the endpoint URL with this parameter
-k : keep the local files and the directory afterwards (do not clean up)
-p : upload and download the files in parallel
-o : append the results to a local file results.csv

Requirements

These software packages must be installed:

  • bash 4.3.30
  • bc 1.06.95
  • md5sum 8.26
  • parallel 20161222
  • s3cmd -- Command line tool for working with storage services that implement the S3 API (tested with versions 1.5.0, 1.6.1 and 2.0.2)

These software packages are optional:

  • swift -- Python client for the Swift API (tested with version 2.3.1)
  • mc -- Minio Client for the S3 API (tested with version RELEASE.2020-02-05T20-07-22Z)
  • az -- Python client for the Azure CLI (tested with version 2.0)
  • gsutil -- Python client for the Google Cloud Storage as replacement for s3cmd (tested with version 4.27 and 4.38)
  • aws -- AWS CLI client for the S3 API (tested with version 1.15.6)

Examples

This command creates five files of size 1 MB each and uses them to test the performance and data integrity of the storage service. The new bucket used has the name ossperf-testbucket, and the uploads and downloads are carried out in parallel. The s3cmd command line tool is used.

./ossperf.sh -n 5 -s 1048576 -b ossperf-testbucket -p

This command does the same, but uses the Minio Client mc for the S3 API instead of s3cmd.

./ossperf.sh -n 5 -s 1048576 -b ossperf-testbucket -p -m minio
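The alias passed via `-m` must already exist in the local mc configuration. With recent mc versions, an alias can be registered as shown below; the endpoint URL and credentials are placeholders, not values required by ossperf:

```shell
# Register an alias named "minio" for a local Minio server.
# URL, access key, and secret key below are placeholders.
mc alias set minio http://127.0.0.1:9000 ACCESS_KEY SECRET_KEY
```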

This command creates ten files of size 512 kB each and uses them to test the performance and data integrity of AWS S3 in the region eu-central-1 (Frankfurt am Main). The new bucket used has the name my-unique-bucketname, and the uploads and downloads are carried out in parallel. The aws command line tool is used.

./ossperf.sh -n 10 -s 524288 -b my-unique-bucketname -p -w -l eu-central-1
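According to the synopsis, the `-d` argument makes the AWS CLI usable with S3-compatible non-Amazon services. A hypothetical run against a local Minio endpoint could look like this (the URL is a placeholder):

```shell
# Benchmark a local S3-compatible service via the AWS CLI (-w)
# with a non-Amazon endpoint URL (-d).
./ossperf.sh -n 5 -s 1048576 -b ossperf-testbucket -p -w -d http://127.0.0.1:9000
```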

Related Work

Some interesting papers and software projects focusing on the performance evaluation of S3-compatible services:

  • An Evaluation of Amazon's Grid Computing Services: EC2, S3 and SQS. Simson Garfinkel. 2007. In this paper, the throughput that S3 can deliver via HTTP GET requests for objects of different sizes is evaluated over several days from several locations by using a self-written client. The software was implemented in C++ and used libcurl for the interaction with the storage service. The focus of this work is the download performance of Amazon S3; other operations, such as uploads, are not investigated. Unfortunately, this client tool was never released by the author.
  • Amazon S3 for Science Grids: a Viable Solution? Mayur Palankar, Adriana Iamnitchi, Matei Ripeanu, Simson Garfinkel. 2008. Proceedings of the 2008 international workshop on Data-aware distributed computing (DADC 2008). Pages 55-64.
  • Real-world benchmarking of cloud storage providers: Amazon S3, Google Cloud Storage, and Azure Blob Storage. Larry Land. 2015. The author analyzes the performance of different public cloud object-based storage services with files of different sizes, both by using the command line tools of the service providers and by mounting buckets of the services as file systems in user space.
  • Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency. Calder et al. 2011. The authors describe the functioning of the Microsoft Azure Storage service offering and analyze the performance of uploading and downloading objects of 1 kB and 4 MB in size. Unfortunately, the paper provides no further details about the tool that the authors used to carry out the performance measurements.
  • CloudCmp: Comparing Public Cloud Providers. Li et al. 2010. The authors analyzed the performance of the public cloud service offerings Amazon S3, Microsoft Azure Blob Storage, and Rackspace Cloud Files with their self-developed Java software solution CloudCmp for objects of 1 kB and 10 MB in size. Among other things, the authors compare the scalability of the mentioned blob services by sending multiple concurrent operations, making bottlenecks visible when uploading or downloading multiple objects of 10 MB in size.
  • DepSky: Dependable and Secure Storage in a Cloud-of-Clouds. Bessani et al. 2013. The authors analyzed, among other things, the time required to upload and download objects of 100 kB, 1 MB, and 10 MB in size from Amazon S3, Microsoft Azure Blob Storage, and Rackspace Cloud Files, using clients located in different parts of the globe. In their work, the authors describe DepSky, a software solution that creates a virtual storage cloud from a combination of diverse cloud service offerings in order to achieve better levels of availability, security, and privacy, and to prevent vendor lock-in. While the DepSky software has been released to the public, the authors have not published a tool for carrying out performance measurements of storage services so far.
  • AWS S3 vs Google Cloud vs Azure: Cloud Storage Performance. Zach Bjornson. 2015. The author measured the latency (time to first byte, TTFB) and the throughput of different public cloud object-based storage services by using a self-written tool. Sadly, this tool was never released.
  • COSBench - Cloud Object Storage Benchmark. This complex benchmarking tool from Intel is able to measure the performance of different object-based storage services. The tool is written in Java and provides a web-based user interface as well as extensive documentation for users and developers.
  • S3 Performance Test Tool. Jens Hadlich. 2015/2016. A performance test tool, which is implemented in Java and can be used to evaluate the upload and download performance of Amazon S3 or S3-compatible object storage systems.
  • s3-perf. Ross McFarland. 2013. Two simple Python scripts that use the boto library to measure the download and upload data rates of the Amazon S3 service offering for different object sizes.

Web Site

Visit the ossperf web page for more information and the latest revision.

https://github.com/christianbaun/ossperf

Further information is provided in the Wiki.

License

GPLv3 or later.

ossperf's People

Contributors

christianbaun, equipsunglasses, firedigger, rhizoet, spdfnet


ossperf's Issues

any current benchmarks available?

Hey there,

thank you for putting this together :)

I wanted to know if there are any current benchmarks, since the last published one is quite outdated. Otherwise, I will set one up on my own.

Standard_in errors cause no results

I'm running this tool on a local minio instance and have tried both -m and the standard s3cmd utility. Everything appears to work fine, except that many parse errors come from standard_in and, at the end, no results are displayed:

[OK] Bucket ossperf-testbucket has been erased with mc.
(standard_in) 1: illegal character: N
(standard_in) 1: illegal character: N
[OK] The directory testfiles has been erased.
[1] Required time to create the bucket:                  s
[2] Required time to upload the files:                   s
[3] Required time to fetch a list of files:              s
[4] Required time to download the files:                 s
[5] Required time to erase the objects:                  s
[6] Required time to erase the bucket:                   s
(standard_in) 1: parse error
    Required time to perform all S3-related operations:  s

    Bandwidth during the upload of the files:            Mbps
    Bandwidth during the download of the files:          Mbps
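The `(standard_in)` messages in the log above come from bc, which receives the measured values on standard input. Such errors typically appear when a timing variable is empty or when the locale formats numbers with a decimal comma instead of a decimal point. A minimal illustration of the failure mode and a common guard, assuming a GNU userland (this is a sketch of the likely cause, not a confirmed fix for this issue):

```shell
# A decimal comma (as produced by some locales) is not valid bc input
# and triggers an error on standard_in:
echo "1,5 + 2" | bc
# Forcing the C locale ensures a decimal point that bc accepts:
LC_NUMERIC=C printf '%.3f\n' 1.5
```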

Download / erase duration incorrect

Hi,

First, thanks for this amazing tool.

I run ossperf with the following parameters:

./ossperf.sh -n 100 -s 16777216 -b benchmark-test -p -m {mc_alias} -o

I get this result:

[1] Required time to create the bucket:                 2.262 s
[2] Required time to upload the files:                  60.858 s
[3] Required time to fetch a list of files:             0.297 s
[4] Required time to download the files:                0.194 s
[5] Required time to erase the objects:                 0.431 s
[6] Required time to erase the bucket:                  0.692 s
    Required time to perform all S3-related operations: 64.734 s

    Bandwidth during the upload of the files:           220.542 Mbps
    Bandwidth during the download of the files:         69184.395 Mbps

Is 0.194 s the required time to download all 100 files or just one of them? The same question applies to erasing the objects.

How to use -l option

Hi,
I was trying to use the -l location option, but I get an error if I use the command directly

for eg:
./ossperf.sh -n 10 -s 1048576 -b ossperf-testbucket -a -l
./ossperf.sh: option requires an argument -- l
[ERROR] Invalid option!

and if I give some random input, it executes anyway;

eg: ./ossperf.sh -n 10 -s 1048576 -b ossperf-testbucket -a -l 123
[INFO] The tool s3cmd has been found on this system.
s3cmd version 2.0.1
[INFO] The tool bc has been found on this system.
bc 1.07.1
[INFO] The tool md5sum has been found on this system.
md5sum (GNU coreutils) 8.28
[INFO] The tool ping has been found on this system.
ping utility, iputils-s20161105
[INFO] The swift client has been found on this system.
[OK] This computer has a working internet connection.
[OK] The storage service can be accessed via the tool swift.
[OK] The directory testfiles has been created in the local directory.
1048576+0 records in
1048576+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 2.41526 s, 434 kB/s
[OK] File with random content has been created.
^Z
[3]+ Stopped ./ossperf.sh -n 10 -s 1048576 -b ossperf-testbucket -a -l 123

So my question here is: what's the right way to use the -l option when executing the ossperf command?
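For reference, `-l` expects a location (region) name that the target service understands, and per the synopsis it is only meaningful for services that support bucket locations, e.g. AWS S3 and Google Cloud Storage. A hypothetical invocation with the AWS CLI could look like this (the region name is just an example):

```shell
# -l passes a bucket location to the service; the value must be a
# region name the service recognizes (example: eu-central-1 on AWS).
./ossperf.sh -n 10 -s 1048576 -b my-unique-bucketname -w -l eu-central-1
```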
