theclaymethod / foundry-vagrant-mesos-kafka-cluster

This project forked from bspaans/vagrant-mesos-cluster

A Vagrant/Ansible => Kafka, Mesos (w/ Marathon/Docker), ZK, Hadoop, and Spark. Service discovery via HAProxy and Bamboo.

License: MIT License

Ruby 28.42% Shell 71.58%

foundry-vagrant-mesos-kafka-cluster's Introduction

The Foundry

A Vagrant configuration that uses Ansible to set up a cluster of Mesos masters, Mesos slaves, and Zookeeper nodes. It can also set up a separate Kafka cluster that piggybacks on the Zookeeper ensemble from the Mesos cluster.

This also installs HDFS in HA mode (Namenodes on mesos-master1 and mesos-master2) and Spark (spark-submit at /home/spark/spark/bin/spark-submit on mesos-master3).

To enable Kafka and Bamboo, uncomment the Kafka VM in the Vagrantfile and in the inventory file. For Bamboo, also uncomment its entry in the cluster.yml playbook. Both are turned off by default because vagrant up takes a very long time with them enabled; they will probably be re-enabled soon.

Usage

Make sure you have the vagrant-hostmanager plugin installed:

vagrant plugin install vagrant-hostmanager
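
If you want to script the prerequisite check, the following is a minimal sketch. It assumes that vagrant plugin list prints one plugin per line in the form "name (version, ...)"; the helper names has_plugin and installed_plugins are hypothetical and not part of this repository.

```python
import subprocess


def has_plugin(listing: str, name: str) -> bool:
    """Return True if `name` is the first field of any line in the listing."""
    for line in listing.splitlines():
        # Assumed format: "<plugin-name> (<version>, ...)".
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def installed_plugins() -> str:
    """Run `vagrant plugin list` and return its stdout (needs Vagrant on PATH)."""
    return subprocess.run(
        ["vagrant", "plugin", "list"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
```

With Vagrant installed, has_plugin(installed_plugins(), "vagrant-hostmanager") tells you whether the install step above is still needed.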

Clone the repository, and run:

vagrant up

This will provision a mini Mesos cluster with three masters (mesos-master1 to mesos-master3), three slaves, and, if enabled, one Kafka instance. The Mesos master servers also run Zookeeper and the Marathon framework. The slaves come with HAProxy and Docker installed, plus Bamboo when it is enabled.

After provisioning the servers you can access Marathon here: http://100.0.10.11:8080/ and the master itself here: http://100.0.10.11:5050/
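
To confirm that both web interfaces answer after provisioning, a small probe like the one below may help. It is only a sketch: the two URLs are the ones given above, while the names ENDPOINTS, is_up, and check_all are hypothetical. The opener parameter exists so the check can be exercised without a running cluster.

```python
import urllib.error
import urllib.request

# URLs taken from this README; adjust them if your Vagrantfile differs.
ENDPOINTS = {
    "marathon": "http://100.0.10.11:8080/",
    "mesos-master": "http://100.0.10.11:5050/",
}


def is_up(url, opener=urllib.request.urlopen, timeout=5):
    """Return True if an HTTP GET on `url` answers with a 2xx status."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def check_all(opener=urllib.request.urlopen):
    """Map each endpoint name to True or False depending on reachability."""
    return {name: is_up(url, opener) for name, url in ENDPOINTS.items()}
```

Calling check_all() from the host machine returns a dictionary with one entry per endpoint.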

Bamboo handles service discovery and reconfigures HAProxy. See usage instructions here: https://github.com/QubitProducts/bamboo

Non-High Availability Mode

There is also a Vagrantfile for a plain one-master, one-slave setup in /nonHA, which takes much less time to provision.

Chronos

You can register Chronos as a framework running in a Docker container with the following command:

curl -X POST -H "Content-Type: application/json" http://100.0.10.11:8080/v2/apps -d@docker-payloads/chronos.json

You can access the Chronos UI at http://mesos-slave:8081

Deploying Docker containers

You submit a Docker container to run on the cluster by making a call to Marathon's REST API.

First create a file, ubuntu.json, with the details of the Docker container that you want to run:

{
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "libmesos/ubuntu"
    }
  },
  "id": "ubuntu",
  "instances": 1,
  "cpus": 0.5,
  "mem": 128,
  "uris": [],
  "cmd": "while sleep 10; do date -u +%T; done"
}

And second, submit this container to Marathon by using curl:

curl -X POST -H "Content-Type: application/json" http://100.0.10.11:8080/v2/apps -d@ubuntu.json
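
The same submission can be scripted. The sketch below builds an app definition that mirrors ubuntu.json, with the resource values written as JSON numbers, and prepares the POST to /v2/apps. The helper names ubuntu_app and create_app_request are hypothetical; the request is only constructed here, not sent.

```python
import json
import urllib.request

MARATHON = "http://100.0.10.11:8080"  # Marathon address from this README


def ubuntu_app(instances=1, cpus=0.5, mem=128):
    """Build an app definition equivalent to ubuntu.json as a dict."""
    return {
        "container": {"type": "DOCKER", "docker": {"image": "libmesos/ubuntu"}},
        "id": "ubuntu",
        "instances": instances,
        "cpus": cpus,
        "mem": mem,
        "uris": [],
        "cmd": "while sleep 10; do date -u +%T; done",
    }


def create_app_request(app, base_url=MARATHON):
    """Prepare (but do not send) the POST /v2/apps request for `app`."""
    return urllib.request.Request(
        f"{base_url}/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually submit the app, pass the prepared request to urllib.request.urlopen(create_app_request(ubuntu_app())) from a machine that can reach the cluster.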

You can monitor and scale the instance by going to the Marathon web interface linked above.
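
Scaling can also be done without the web interface, because Marathon's REST API accepts a PUT on /v2/apps/<app id> with a new instance count. The following is a rough sketch under that assumption; scale_request is a hypothetical helper name, and the request is only prepared, not sent.

```python
import json
import urllib.request


def scale_request(app_id, instances, base_url="http://100.0.10.11:8080"):
    """Prepare a PUT /v2/apps/<app_id> request that sets the instance count."""
    if instances < 0:
        raise ValueError("instances must be zero or greater")
    return urllib.request.Request(
        f"{base_url}/v2/apps/{app_id}",
        data=json.dumps({"instances": instances}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

For example, urllib.request.urlopen(scale_request("ubuntu", 3)) would ask Marathon to run three instances of the ubuntu app defined above.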

Using Spark

Load up the spark-shell using:

/home/spark/spark/bin/spark-shell --master mesos://mesos-master1:5050,mesos-master2:5050,mesos-master3:5050 --executor-memory 128M

And execute a simple script:

sc.parallelize(1 to 10).count()

Go to http://mesos-master3:5050/#/frameworks to see the workers in action.

There should also be a Spark UI at http://mesos-master3:4040.

You can run a Spark Jobserver with the following:

curl -X POST -H "Content-Type: application/json" http://100.0.10.11:8080/v2/apps -d@docker-payloads/spark-job-server-host.json

The Spark Jobserver should then be running at http://mesos-master:8090. You can upload a jar and submit a job to it:

curl --data-binary @job-server-tests/target/job-server-tests-0.4.2-SNAPSHOT.jar mesos-slave:8090/jars/test

curl -d "input.string = a b c a b see" 'mesos-slave:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'
{
  "status": "STARTED",
  "result": {
    "jobId": "0d734d5b-9dc8-4b85-aa2e-3f36b0fe4d91",
    "context": "abbe3d0b-spark.jobserver.WordCountExample"
  }
}

From this point, you can asynchronously query the status and results, substituting the jobId returned by the previous call:

curl mesos-master:8090/jobs/3d4ef63e-1222-41f1-ad43-164e0412a99b
{
  "status": "OK",
  "result": {
    "a": 2,
    "b": 2,
    "see": 1,
    "c": 1
  }
}
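
Because the job runs asynchronously, a client usually polls until the status changes. The sketch below illustrates one way to do that. It assumes that "STARTED" (shown above) and "RUNNING" (an assumption, not shown in this README) mean the job is still in progress and that any other status is final. The names fetch_job and wait_for_job are hypothetical.

```python
import json
import time
import urllib.request


def fetch_job(job_id, base_url="http://mesos-master:8090"):
    """GET /jobs/<job_id> from the Jobserver and decode the JSON reply."""
    with urllib.request.urlopen(f"{base_url}/jobs/{job_id}") as response:
        return json.load(response)


def wait_for_job(job_id, fetch=fetch_job, interval=1.0, attempts=30,
                 pending=("STARTED", "RUNNING")):
    """Poll until the job leaves a pending status, then return the reply."""
    for _ in range(attempts):
        job = fetch(job_id)
        if job.get("status") not in pending:
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {attempts} polls")
```

Pass the jobId from the submit response, for example wait_for_job(job_id), and read the word counts from the "result" field of the returned dictionary.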

Hadoop Example

Note: This will only work on a sufficiently large cluster. This may not be possible via VMs on your local machine.

su mapred
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so
export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
echo "Hello World Bye World" > /tmp/file0
echo "Hello Hadoop Goodbye Hadoop" > /tmp/file1
hdfs dfs -mkdir -p /user/foo/data
hdfs dfs -copyFromLocal /tmp/file? /user/foo/data
hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples-2.5.0-mr1-cdh5.2.0.jar wordcount /user/foo/data /user/foo/out
hdfs dfs -ls /user/foo/out
hdfs dfs -cat /user/foo/out/part*
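
To sanity-check the job output, you can compute the same counts locally. This is a plain Python sketch, not the Hadoop example itself, and it assumes the wordcount job splits on whitespace and emits keys in sorted order; word_count is a hypothetical helper name.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated words and return (word, count) pairs sorted by word."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return sorted(counts.items())


# The same two lines that the commands above write to /tmp/file0 and /tmp/file1.
inputs = ["Hello World Bye World", "Hello Hadoop Goodbye Hadoop"]

for word, n in word_count(inputs):
    print(f"{word}\t{n}")
# prints:
# Bye      1
# Goodbye  1
# Hadoop   2
# Hello    2
# World    2
```

Under those assumptions, the contents of /user/foo/out/part* should match these five lines.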

foundry-vagrant-mesos-kafka-cluster's People

Contributors

ndemoor, rasputnik, theclaymethod

foundry-vagrant-mesos-kafka-cluster's Issues

Ansible provisioning fails on Mac OS X 10.10.x "Yosemite"

PLAY [all] ******************************************************************** 

GATHERING FACTS *************************************************************** 
fatal: [mesos-slave2] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
fatal: [mesos-slave3] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
fatal: [mesos-master1] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
fatal: [mesos-slave1] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
fatal: [mesos-master3] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue
fatal: [mesos-master2] => SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue

TASK: [common | create mesosphere repo] *************************************** 
FATAL: no hosts matched or all hosts have already failed -- aborting

N.B. I see many reports on the web that Yosemite has issues in recognizing modifications to /etc/hosts, and DNS resolution generally, so this may actually be an issue that needs to be reported with respect to the vagrant-hosts plugin. If you think it is, please let me know. Thanks!

error during ansible install

I get this after ansible starts running.

PLAY [all] ********************************************************************

GATHERING FACTS ***************************************************************
ESTABLISH CONNECTION FOR USER: vagrant
ESTABLISH CONNECTION FOR USER: vagrant
REMOTE_MODULE setup
REMOTE_MODULE setup
EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/Users/hsunami/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=vagrant', '-o', 'ConnectTimeout=10', 'mesos-master', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043 && echo $HOME/.ansible/tmp/ansible-tmp-1423134787.37-127624957695043'"]
EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/Users/hsunami/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=vagrant', '-o', 'ConnectTimeout=10', 'mesos-slave', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113 && echo $HOME/.ansible/tmp/ansible-tmp-1423134787.37-187272507066113'"]
fatal: [mesos-slave] => SSH encountered an unknown error. The output was:
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/hsunami/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 102: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/hsunami/.ansible/cp/ansible-ssh-mesos-slave-22-vagrant" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to mesos-slave [100.0.10.101] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 9919 ms remain after connect
debug3: Incorrect RSA1 identifier
debug3: Could not load "/Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key" as a RSA1 public key
debug1: identity file /Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key type -1
debug1: identity file /Users/hsunami/dev/Foundry-vagrant-mesos-kafka-cluster/nonHA/.vagrant/machines/mesos-slave/virtualbox/private_key-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.2
ssh_exchange_identification: read: Connection reset by peer

fatal: [mesos-master] => SSH encountered an unknown error. The output was:
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/hsunami/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 102: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/hsunami/.ansible/cp/ansible-ssh-mesos-master-22-vagrant" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to mesos-master [100.0.10.11] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: connect to address 100.0.10.11 port 22: Operation timed out
ssh: connect to host mesos-master port 22: Operation timed out

TASK: [common | create mesosphere repo] ***************************************
FATAL: no hosts matched or all hosts have already failed -- aborting

PLAY RECAP ********************************************************************
to retry, use: --limit @/Users/hsunami/cluster.retry

mesos-master : ok=0 changed=0 unreachable=1 failed=0
mesos-slave : ok=0 changed=0 unreachable=1 failed=0

error for bamboo-haproxy

When building the environment we get this error:
TASK: [bamboo-haproxy | Build bamboo image] ***********************************
failed: [100.0.10.102] => {"failed": true, "parsed": false}
SUDO-SUCCESS-gygrgcpeatjghxdfsyqbeefhjlvjjmra
Traceback (most recent call last):
  File "/home/vagrant/.ansible/tmp/ansible-tmp-1415895628.48-91606373400347/docker", line 2419, in <module>
    main()
  File "/home/vagrant/.ansible/tmp/ansible-tmp-1415895628.48-91606373400347/docker", line 802, in main
    manager.start_containers(deployed_containers)
  File "/home/vagrant/.ansible/tmp/ansible-tmp-1415895628.48-91606373400347/docker", line 661, in start_containers
    self.client.start(i['Id'], **params)
TypeError: start() got an unexpected keyword argument 'network_mode'

FATAL: all hosts have already failed -- aborting

Otherwise, great work and thank you!
