jay1999ke / plantd-operator

This project forked from carnegiemellon-plantd/plantd-operator

Performance and Latency ANalysis Testing for Data pipelines

License: GNU General Public License v2.0

Data Pipeline Wind Tunnel

Data Pipeline PlantD

PlantD (Performance, Latency ANalysis and Testing for Data pipelines) is a harness for measuring the performance of data pipelines during and after development. PlantD collects a standard suite of metrics and visualizations to use when developing data pipelines or when deciding among pipeline architectures, configurations, and business use cases.

Concepts

To use PlantD, you configure it with the following information:

  • How to reach your pipeline-under-test: a description of the pipeline you want to measure, including at least an IP address and port number to send data in, and tags that uniquely identify your pipeline's resources on your cloud provider.
  • The data schema that your pipeline requires as input, that is, what data items are fed into the pipeline, as well as their data format and allowable values.
    • From this, PlantD will generate a dataset: a quantity of generated fake data that meets that schema, for use in testing
  • A load pattern describing a variable rate of load generation, for example: a steady 100 records per second for 5 minutes, then a ramp up over 1 minute to 200 records per second, held steady for 10 minutes, then a ramp down to 0 over a 2-minute span.
    • PlantD's load generator will send data to your pipeline following this pattern
  • A description of the experiment you want to run: a timed session where the load generator sends a dataset to a pipeline-under-test using a load pattern, and collects metrics during and after the load generation.
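The example load pattern above can be worked through numerically. The following is a minimal sketch, not PlantD's configuration format or API: the `Stage` class and the `rate_at` and `total_records` helpers are hypothetical names introduced only for illustration, and the sketch assumes the rate changes linearly within each ramp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One segment of a load pattern (hypothetical helper, not a PlantD type)."""
    duration_s: float   # length of the stage in seconds
    target_rate: float  # records per second reached at the end of the stage


def rate_at(stages, t, start_rate=0.0):
    """Return the records-per-second rate at t seconds into the pattern.

    Assumes linear interpolation from the previous rate to each stage's target.
    Returns 0.0 before the pattern starts and after it ends.
    """
    if t < 0:
        return 0.0
    rate = start_rate
    elapsed = 0.0
    for stage in stages:
        if t < elapsed + stage.duration_s:
            frac = (t - elapsed) / stage.duration_s
            return rate + frac * (stage.target_rate - rate)
        elapsed += stage.duration_s
        rate = stage.target_rate
    return 0.0


def total_records(stages, start_rate=0.0):
    """Total records sent: the area under the piecewise-linear rate curve."""
    total = 0.0
    rate = start_rate
    for stage in stages:
        total += stage.duration_s * (rate + stage.target_rate) / 2.0
        rate = stage.target_rate
    return total


# The pattern from the text, starting at 100 records per second.
EXAMPLE = [
    Stage(duration_s=300, target_rate=100),  # steady for 5 minutes
    Stage(duration_s=60, target_rate=200),   # ramp up over 1 minute
    Stage(duration_s=600, target_rate=200),  # steady for 10 minutes
    Stage(duration_s=120, target_rate=0),    # ramp down over 2 minutes
]

print(rate_at(EXAMPLE, 330, start_rate=100))    # halfway through the ramp up
print(total_records(EXAMPLE, start_rate=100))   # records sent over 18 minutes
```

Under these assumptions the pattern sends 171,000 records in total, which gives a rough lower bound on the size of the dataset an experiment using it would need to send.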

Prerequisites

You will need:

  • A pipeline to measure
  • A Kubernetes Cluster (Managed or Standalone)
  • kubectl with access to the cluster

Test Pipeline (Coming soon)

If you aren't ready to test your own pipeline, we supply a toy pipeline for demonstration.

Kubernetes cluster

If you don't have a Kubernetes cluster handy, you can create a small test cluster with Minikube. Make sure the cluster has at least 10 GB of memory assigned.

Note that this will not scale up well to measuring dataflow of large pipelines, but it's enough to experiment and find out how PlantD works.

Type kubectl cluster-info to check that the cluster is running.

Deploying the Operator

The easiest way to set up the operator is to use the bundle.yaml deployments.

Bundle deployments

### Install the k6 Operator
curl https://raw.githubusercontent.com/grafana/k6-operator/main/bundle.yaml | kubectl create -f -

### Install the Prometheus Operator
curl https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/bundle.yaml | kubectl create -f -


### Install the PlantD Operator
curl https://raw.githubusercontent.com/CarnegieMellon-PlantD/PlantD-operator/main/bundle.yaml | kubectl create -f - 

### Get the Studio service hostname
kubectl get svc plantd-studio-service -n plantd-operator-system -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{"\n"}'

Note that it may take up to 2-3 minutes for PlantD Studio to become available at the above hostname.
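Because the hostname is not assigned immediately, it can help to poll for it rather than rerun the command by hand. The sketch below wraps the kubectl command shown above in a retry loop. The function names, the 5-minute timeout, and the 10-second interval are assumptions made for illustration and are not part of PlantD. It only waits for the load balancer hostname to be assigned; Studio itself may need a little longer before it responds.

```python
import subprocess
import time

# The same query as the kubectl command above, minus the trailing newline.
KUBECTL_CMD = [
    "kubectl", "get", "svc", "plantd-studio-service",
    "-n", "plantd-operator-system",
    "-o", "jsonpath={.status.loadBalancer.ingress[0].hostname}",
]


def fetch_hostname():
    """Run kubectl once; return '' if no hostname has been assigned yet."""
    result = subprocess.run(KUBECTL_CMD, capture_output=True, text=True, check=False)
    return result.stdout.strip()


def wait_for_hostname(fetch=fetch_hostname, timeout_s=300.0, interval_s=10.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until it returns a non-empty hostname or the timeout passes.

    fetch, sleep, and clock are injectable so the loop can be tried without a cluster.
    """
    deadline = clock() + timeout_s
    while True:
        hostname = fetch()
        if hostname:
            return hostname
        if clock() >= deadline:
            raise TimeoutError("no hostname assigned to plantd-studio-service yet")
        sleep(interval_s)


# Usage, with kubectl configured for the cluster:
#     print(wait_for_hostname())
```

Passing the fetch function as a parameter keeps the retry logic separate from kubectl, so the same loop works with a different lookup, for example on a cluster that reports an IP address instead of a hostname.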

Contributing

We welcome contributions from the open-source community, from bug fixes to new features and improvements. See CONTRIBUTING.md for more information on how to contribute.

Funding

PlantD is funded by Honda's 99P Labs, with implementation and ongoing support provided by the TEEL team at Carnegie Mellon University.

License

PlantD is licensed under the GPLv2 License. See LICENSE for more details.

Documentation

For more detailed information about how to use PlantD, see our full documentation.

API documentation can be found in the docs.

Contact

For more information about the PlantD project, please contact us:

We are always open to collaboration, questions, and suggestions!
