This project forked from nsdi2017-ddn/ddn

Code repo for Pytheas (formerly DDN), a control platform for enabling data-driven control for network applications

ddn's Introduction

This directory archives the Pytheas implementation as it existed prior to the NSDI submission.

The newest version of the Pytheas code is available at https://github.com/nsdi2017-ddn/pytheas

Table of Contents

  1. Environment
  2. Front Server
    1. Web Server
    2. Group Manager
  3. Kafka
  4. Spark
    1. Decision Maker
    2. Communicator
  5. Benchmark
    1. Response Time
      1. Python Script
      2. Apache Benchmark
    2. Python Benchmark
      1. Standalone Benchmark
      2. Distributed Benchmark
    3. Kafka Benchmark
  6. Trace
    1. Algorithm Comparison
    2. Fault Tolerance
    3. One Host Experiment
  7. Abandoned

##Environment

System: Ubuntu 15.10

Java compiler tools (Maven) installation:

$ sudo apt-get update
$ sudo apt-get install -y default-jdk maven

##Front Server

Contains the programs that need to be deployed on each front-end server host.

###Web Server

Auto-deployment script (for Apache httpd and the PHP programs):

../front_server $ sudo ./frontserver_deploy.sh

###Group Manager

Compile using Maven:

../GroupManager $ mvn package

Run:

../GroupManager $ java -cp target/GroupManager-1.0-SNAPSHOT.jar frontend.GroupManager <cluster_ID> <kafka_server> <config_file>

<cluster_ID> is the ID of the current cluster

<kafka_server> is a comma-separated list of the IP addresses of the Kafka servers

<config_file> contains the labels of the update info and the reduced labels
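
As a rough illustration of how the three arguments fit together, the sketch below assembles the command line from a list of Kafka server IPs. The function name `group_manager_command` and all example values (cluster ID, IP addresses, config file name) are hypothetical and not part of this repo; only the jar path and class name come from the command above.

```python
import shlex


def group_manager_command(cluster_id, kafka_hosts, config_file):
    """Build the argv list for launching GroupManager."""
    return [
        "java", "-cp", "target/GroupManager-1.0-SNAPSHOT.jar",
        "frontend.GroupManager",
        str(cluster_id),
        ",".join(kafka_hosts),  # <kafka_server>: IPs separated by comma
        config_file,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    argv = group_manager_command("cluster1", ["10.0.0.1", "10.0.0.2"], "gm.conf")
    print(shlex.join(argv))
```

The returned list can be passed to `subprocess.run` from within the GroupManager directory.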

##Kafka

Deploy Kafka on one or more hosts in each cluster to manage the communication between the functional modules.

Kafka deployment:

../kafka $ sudo ./kafka_deploy.sh <host_list> <host_number>

<host_list> is a comma-separated list of the IP addresses of all Kafka servers

<host_number> is the sequence number of the current host in host_list

Run:

$ cd /usr/share/kafka
$ sudo bin/zookeeper-server-start.sh config/zookeeper.properties &
$ sudo bin/kafka-server-start.sh config/server.properties

Note: When running Kafka on more than one host, execute the third command (kafka-server-start.sh) only after the second command (zookeeper-server-start.sh) has been executed on every host.
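
One way to check this ordering before starting the brokers is to probe the ZooKeeper port on every host. The sketch below is not part of this repo. It assumes ZooKeeper listens on port 2181 (the default clientPort in Kafka's config/zookeeper.properties) and treats an open TCP port as "started", which is only an approximation. The names `tcp_probe` and `hosts_not_ready` are hypothetical.

```python
import socket

ZOOKEEPER_PORT = 2181  # assumed: default clientPort in config/zookeeper.properties


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def hosts_not_ready(hosts, port=ZOOKEEPER_PORT, probe=tcp_probe):
    """Return the hosts on which ZooKeeper is not yet accepting connections."""
    return [h for h in hosts if not probe(h, port)]
```

If `hosts_not_ready(host_list)` returns an empty list, it should be safe to run kafka-server-start.sh on each host.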


##Spark

Contains the decision-making module and the communication module. Each uses Spark and can be run on one or more hosts.

Spark deployment:

../spark $ sudo ./spark_deploy.sh

###Decision Maker

Makes a decision for each group.

Compile using Maven and submit it to Spark.

###Communicator

Communicates with the backend cluster and other frontend clusters.

Like DecisionMaker, compile using Maven and submit it to Spark.

References:

Run Spark on Multi-hosts

Spark Submitting Applications


##Benchmark

Some small scripts and programs to test the scalability of the frontend cluster.

###Response Time

Test the response time of requests.

Python Script

A simple Python program that performs an HTTP POST request 1000 times and plots the CDF of the response time:

$ ./post_time.py
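
For readers unfamiliar with the CDF plot, the sketch below shows how an empirical CDF can be computed from a list of measured response times. It is an illustration only, not the code in post_time.py, and the function name `empirical_cdf` is hypothetical.

```python
def empirical_cdf(samples):
    """Return (sorted_values, cumulative_fractions) for a list of samples.

    The i-th fraction is the share of samples less than or equal to
    the i-th sorted value.
    """
    values = sorted(samples)
    n = len(values)
    fractions = [(i + 1) / n for i in range(n)]
    return values, fractions


if __name__ == "__main__":
    # Made-up response times in seconds, for illustration only.
    values, fractions = empirical_cdf([0.12, 0.05, 0.30, 0.08])
    for v, f in zip(values, fractions):
        print(v, f)
```

The two returned lists can be passed directly to a plotting call such as matplotlib's `plt.step(values, fractions, where="post")`.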

Apache Benchmark

A shell script that uses Apache Benchmark to test the response time of the frontend server.

$ ./responseTime.ssh

###Python Benchmark

Standalone Benchmark

A standalone benchmark that performs HTTP POST requests. The test time and the requests per second (RPS) can be controlled.

$ ./benchmark.py
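
A minimal sketch of how a test duration and a target RPS can drive a request loop is shown below. This is not the logic of benchmark.py. The function `run_paced` and its parameters are hypothetical, the duration is assumed to be in seconds, and `send` stands in for whatever performs the HTTP POST.

```python
import time


def run_paced(send, duration_s, rps, clock=time.monotonic, sleep=time.sleep):
    """Call send() about `rps` times per second for `duration_s` seconds.

    Each call is scheduled on a fixed grid (start + i / rps) so that
    small delays do not accumulate. Returns the number of calls made.
    """
    total = int(duration_s * rps)
    start = clock()
    for i in range(total):
        delay = start + i / rps - clock()
        if delay > 0:
            sleep(delay)
        send()  # synchronous: a slow send() lowers the achieved rate
    return total


if __name__ == "__main__":
    sent = run_paced(lambda: None, duration_s=1, rps=20)
    print("requests issued:", sent)
```

Because `send` is called synchronously here, the achieved rate drops below the target whenever a request takes longer than 1 / rps.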

Distributed Benchmark

A distributed benchmark to perform the HTTP POST request.

Run the slave program on every host that will take part in the benchmark, then run the master program on one host to start the test. When the test finishes, the master program generates three figures (Response Time, Successful RPS, CDF of Response Time).

Run slave:

$ ./dbenchmark_slave <url>

<url>: destination URL that the slave program will send requests to

Run master:

$ ./dbenchmark_master <Time> <RPS>

<Time>: how long the test will last

<RPS>: requests per second. This parameter is only positively correlated with the real RPS; the real RPS is shown in the result figure.
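
To make the difference between the requested and the real RPS concrete, the sketch below counts completed requests per one-second bucket from a list of completion timestamps. It is an illustration under assumptions, not the master program's code. The function name `achieved_rps` is hypothetical, and timestamps are assumed to be in seconds.

```python
from collections import Counter


def achieved_rps(completion_times):
    """Count completed requests in each one-second bucket.

    Buckets are measured from the earliest completion time, so the
    first element covers the first second of the test.
    """
    if not completion_times:
        return []
    t0 = min(completion_times)
    buckets = Counter(int(t - t0) for t in completion_times)
    return [buckets.get(s, 0) for s in range(max(buckets) + 1)]


if __name__ == "__main__":
    # Made-up completion timestamps, for illustration only.
    print(achieved_rps([10.0, 10.2, 10.9, 11.1, 12.5, 12.6]))
```

Comparing this per-second series with the `<RPS>` argument shows how far the real load deviates from the requested one.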

Note: the host that runs the master program needs matplotlib installed:

sudo apt-get install -y python-matplotlib

###Kafka Benchmark

This benchmark is specially designed to test the throughput of Kafka and Spark Streaming. It relies on a special message format.

Compile:

../KafkaBenchmark $ mvn package

Run:

Send messages to Kafka:

java -cp target/KafkaBenchmark-1.0-SNAPSHOT.jar mybenchmark.MsgReader <kafka_server> <mps>

<kafka_server>: hostname of the Kafka server

<mps>: messages per second

Note: By default, all messages are sent to the topic internal_groups

Receive messages from Kafka:

java -cp target/KafkaBenchmark-1.0-SNAPSHOT.jar mybenchmark.MsgReader <kafka_server> <topic>

<kafka_server>: hostname of the Kafka server

<topic>: Kafka topic this reader will consume

##Trace

Some scripts to test the system or algorithm performance using traces.

trace_sort.sh : sort the trace by timestamp
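
The trace format is not described here, so the following is only a sketch of sorting by timestamp under an assumed layout: one record per line, whitespace-separated, with a numeric timestamp in the first field. The function name `sort_trace` and its parameters are hypothetical, and trace_sort.sh itself may work differently.

```python
def sort_trace(lines, ts_field=0, sep=None):
    """Return non-empty trace lines ordered by a numeric timestamp field.

    ts_field is the index of the timestamp column (assumed to be 0);
    sep=None splits on any run of whitespace.
    """
    def timestamp(line):
        return float(line.split(sep)[ts_field])

    return sorted((l for l in lines if l.strip()), key=timestamp)


if __name__ == "__main__":
    # Made-up trace records, for illustration only.
    for line in sort_trace(["3.0 c", "1.5 a", "", "2.0 b"]):
        print(line)
```

Python's `sorted` is stable, so records with equal timestamps keep their original relative order.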

###Algorithm Comparison

Main scripts for the algorithm comparison:

auto_plot.sh : plot the algorithm comparison results

combine.py : process raw data

cost.conf : Gnuplot script for the plot

pull*.sh : pull the test result from cluster to localhost

trace_parser.py : parse the trace and simulate the player

###Fault Tolerance

Main scripts for the fault tolerance experiment:

ft.conf : Gnuplot script

sort : process raw data

trace_parser_multi.py : parse the trace and simulate multiple players

###One Host Experiment

For the real-world trace benchmark, deploy all modules of a frontend cluster on one host. This is more efficient when comparing multiple algorithms.

autoscp.sh : upload the required files

onehost_deploy : deployment script

start_tmux : run necessary programs in tmux


##Abandoned

Code for abandoned modules, including the load balancer (HAProxy) and the proxy server.
