ale93p / g5k-flink-cluster

Automated deployment of an Apache Flink cluster in your Grid'5000 reserved nodes.

License: MIT License

Python 100.00%
grid5000 apache-flink flink flink-stream-processing deployment-automation ansible ansible-playbook data-stream-processing bigdata deploy g5k cluster

Grid’5000 Apache Flink Cluster

This script deploys a basic Apache Flink cluster on your reserved nodes in Grid’5000. I hope to improve it in the future; any help is welcome.

Dependencies

  • Python 3.x
  • Ansible (tested with version 2.5)

To install Ansible from the frontend:

export PATH=$HOME/.local/bin:$PATH
easy_install --user ansible netaddr
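
Before running the script it can help to confirm that the tools listed above are on the PATH. The following is a minimal sketch and not part of the repository: the helper names `python_is_supported` and `missing_tools` are hypothetical, and it assumes Ansible is invoked through the `ansible-playbook` executable.

```python
import shutil
import sys


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.x, as the README requires."""
    return version_info[0] == 3


def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools(["ansible-playbook"])` returns an empty list when Ansible is found. If it is reported missing right after installation, check that `$HOME/.local/bin` is on the PATH, as in the export line above.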

How to run it

First, clone this repository on the Grid’5000 frontend node.

Download the disk image and the env file from here, then move them inside the repository folder.

Reserve resources

To reserve nodes in Grid’5000, run the following command (adapted to your situation):

frontend > oarsub -t deploy -p "cluster='suno'" -I -l nodes=4,walltime=2 -k

In this example we are in the Sophia region and request 4 nodes for 2 hours in the cluster named "suno".
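
If you reserve nodes often with different parameters, the reservation command can be assembled programmatically. This is a minimal sketch under the assumption that only the cluster name, node count and walltime vary; the function name `build_oarsub_command` is hypothetical and is not part of deploy.py.

```python
def build_oarsub_command(cluster, nodes, walltime_hours):
    """Build the interactive deploy reservation from the README as an argv list."""
    return [
        "oarsub",
        "-t", "deploy",                # job of type "deploy"
        "-p", f"cluster='{cluster}'",  # restrict the request to one cluster
        "-I",                          # open an interactive job shell
        "-l", f"nodes={nodes},walltime={walltime_hours}",
        "-k",                          # kept unchanged from the README command
    ]
```

`build_oarsub_command("suno", 4, 2)` produces the same arguments as the command above. Because the result is a list, it can be passed to `subprocess.run` without any shell quoting.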

For more detailed information, follow the Grid’5000 Getting Started tutorial.

Pre-tasks

Prepare the config file

Open the file cluster.conf and adjust the parameters to match your system configuration (g5k, Flink and folders). Be sure to replace the username with your Grid’5000 username and to specify the correct Grid’5000 image name. Also check the other settings, which I may have changed during testing.

Install Flink in your frontend

Download and extract a Flink binary release (version 1.7.2) into your home directory on the frontend. The script copies your cluster's configuration into it, so you can submit jobs directly from the frontend.

Run it

frontend > python3 deploy.py

To access your nodes use:

ssh root@node-name

That’s all. Simple, no?

Post-Run

Connect to Flink Dashboard

To connect to the Web UI, we need to open an ssh tunnel to the web service port:

localhost > ssh {{ g5k.username }}@access.grid5000.fr -N -L8081:{{ nimbus_node_address }}:8081

The web UI should now be reachable at localhost:8081.
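
To check from a script that the local end of the tunnel is listening, a plain TCP connection attempt is enough. This is a minimal sketch; the helper name `port_is_open` is hypothetical. It only confirms that something accepts connections on the local port, not that the Flink web service behind the tunnel is healthy.

```python
import socket


def port_is_open(host="localhost", port=8081, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

With the tunnel running, `port_is_open()` should return True. If it returns False, check that the ssh command is still active and that no other process is using port 8081.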

Multi-Cluster Run

The script can also deploy Flink in a multi-cluster environment. To make the reservation, use:

frontend > oargridsub -t deploy -w '0:59:00' suno:rdef="/nodes=6",parapide:rdef="/nodes=6"

In this case we do not enter the job shell, so the OAR_NODE_FILE environment variable is not set. We can retrieve the list of reserved machines with:

frontend > oargridstat -w -l {{ GRID_RESERVATION_ID  }} | sed '/^$/d' > ~/machines
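
To inspect the resulting file from Python, the sketch below reads it into a list of host names. It assumes one host name per line and is not taken from deploy.py; the function name `read_machines` is hypothetical. Dropping blank lines mirrors the `sed` filter above, and removing duplicates is an extra precaution in case a host is listed more than once.

```python
from pathlib import Path


def read_machines(path):
    """Return the unique, non-empty host names in `path`, in first-seen order."""
    hosts = []
    seen = set()
    for line in Path(path).expanduser().read_text().splitlines():
        host = line.strip()
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts
```

For example, `len(read_machines("~/machines"))` gives the number of distinct machines in the reservation.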

Finally, edit the configuration file: set the location of the file just created (oar.file.location=~/machines) and set the multi-cluster option to "yes" (multi.cluster=yes).
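
The two options can be checked with a few lines of Python before launching the deployment. This is a minimal sketch that assumes the configuration file uses plain `key=value` lines with `#` for comments; the real layout of cluster.conf may differ. The function names `parse_conf` and `multi_cluster_settings` are hypothetical.

```python
import os


def parse_conf(text):
    """Parse `key=value` lines, skipping blank lines, `#` comments and lines without `=`."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def multi_cluster_settings(settings):
    """Return (enabled, machines_path) based on the two multi-cluster options."""
    enabled = settings.get("multi.cluster", "no").lower() == "yes"
    # Expand "~" so the path can be opened directly.
    path = os.path.expanduser(settings.get("oar.file.location", ""))
    return enabled, path
```

Reading the file with `parse_conf(open("cluster.conf").read())` and passing the result to `multi_cluster_settings` shows whether multi-cluster mode is enabled and which machines file is used.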

For more information, see the Grid'5000 Multi-site jobs tutorial.

More

Check out also the other script for:
