databill86 / docker-hadoop-cluster

This project forked from lewuathe/docker-hadoop-cluster

Multiple node cluster on Docker for self development.

License: Apache License 2.0

docker-hadoop-cluster

A multi-node Hadoop cluster on Docker for local development. docker-hadoop-cluster is suitable for testing Hadoop patches that need a cluster with multiple nodes.

Build images from your Hadoop source code

hadoop-base is the base image for the Hadoop services. It includes the JDK, the Hadoop package, configuration files, and so on. It can also include a Hadoop package you built yourself. To bundle it, put the tar.gz package under the hadoop-base directory.

$ cd docker-hadoop-cluster
$ cp /path/to/hadoop-3.0.0-alpha3-SNAPSHOT.tar.gz hadoop-base
$ make
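
Before running make, it can help to confirm that a Hadoop tarball is actually present under hadoop-base. The following is a minimal Python sketch, assuming the package follows the hadoop-*.tar.gz naming used above. The helper name find_hadoop_tarballs is hypothetical and is not part of this repository.

```python
from pathlib import Path


def find_hadoop_tarballs(base_dir="hadoop-base"):
    """Return hadoop-*.tar.gz files found directly under base_dir, sorted by name.

    Hypothetical helper: the naming pattern is an assumption taken from the
    example file name in the README, not something the Makefile is known to enforce.
    """
    base = Path(base_dir)
    if not base.is_dir():
        # A missing directory simply means there is nothing to build from.
        return []
    return sorted(base.glob("hadoop-*.tar.gz"))


def describe_tarballs(base_dir="hadoop-base"):
    """Build a one-line, human-readable summary of what was found."""
    tarballs = find_hadoop_tarballs(base_dir)
    if not tarballs:
        return f"No hadoop-*.tar.gz under {base_dir}; copy your build there first."
    return "Found: " + ", ".join(p.name for p in tarballs)
```

Calling print(describe_tarballs()) from the docker-hadoop-cluster directory reports whether the copy step has been done.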

Once the hadoop-base image is built, you can launch the Hadoop cluster with docker-compose.

$ docker-compose up -d

or

$ make run

The NameNode web UI is available at http://localhost:9870 and the ResourceManager web UI at http://localhost:8088.
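
The services may take a while to start after the containers come up, so the web UIs are not necessarily reachable right away. The following is a minimal Python sketch that polls a URL until it answers. The function names, timeout, and polling interval are assumptions chosen for illustration, and only the two URLs come from this README.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=60.0, interval=2.0):
    """Poll url until it returns any HTTP response or timeout seconds pass.

    Returns True as soon as the server answers (even with an error status),
    and False if the deadline passes without any answer.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # The server answered with an error status, so it is up.
            return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Connection refused, reset, or timed out: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def check_cluster_uis():
    """Report readiness of the two web UIs mentioned in the README."""
    uis = {
        "NameNode": "http://localhost:9870",
        "ResourceManager": "http://localhost:8088",
    }
    return {name: wait_for_http(url) for name, url in uis.items()}
```

check_cluster_uis() returns a dictionary mapping each service name to True or False, which can be printed or used to gate a test run.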

$ make down

shuts down your cluster.

Build images from the latest trunk

docker-hadoop-cluster also publishes images built from the HEAD of trunk. They are hosted on Docker Hub. To try trunk (which can be unstable), use a docker-compose.yml like the one below. It launches a Hadoop cluster with three slave nodes.

version: '2'

services:
  master:
    image: lewuathe/hadoop-master
    ports:
      - "9870:9870"
      - "8088:8088"
      - "19888:19888"
      - "8188:8188"
    container_name: "master"
  slave1:
    image: lewuathe/hadoop-slave
    container_name: "slave1"
    depends_on:
      - master
    ports:
      - "9901:9864"
      - "8041:8042"
  slave2:
    image: lewuathe/hadoop-slave
    container_name: "slave2"
    depends_on:
      - master
    ports:
      - "9902:9864"
      - "8042:8042"
  slave3:
    image: lewuathe/hadoop-slave
    container_name: "slave3"
    depends_on:
      - master
    ports:
      - "9903:9864"
      - "8043:8042"

Log in to the cluster

$ docker exec -it master bash
bash-4.1# cd /usr/local/hadoop
bash-4.1# bin/hadoop version
Hadoop 3.0.0-SNAPSHOT
Source code repository git://git.apache.org/hadoop.git -r 0c7d3f480548745e9e9ccad1d318371c020c3003
Compiled by lewuathe on 2015-09-13T01:12Z
Compiled with protoc 2.5.0
From source with checksum 9174a352ac823cdfa576f525665e99
This command was run using /usr/local/hadoop-3.0.0-SNAPSHOT/share/hadoop/common/hadoop-common-3.0.0-SNAPSHOT.jar
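
When testing a patch, it is useful to confirm that the cluster runs the build you expect. The following is a minimal Python sketch that extracts the version and source revision from output shaped like the sample above. The function names are hypothetical, and the parsing assumes the "Hadoop <version>" and "Source code repository <url> -r <revision>" lines keep this layout.

```python
import re
import subprocess


def parse_hadoop_version(output):
    """Extract version and source revision from `hadoop version` output.

    Returns a dict with keys "version" and "revision"; a value is None
    when the corresponding line is not found.
    """
    version = re.search(r"^Hadoop (\S+)", output, re.MULTILINE)
    revision = re.search(
        r"^Source code repository \S+ -r ([0-9a-f]+)", output, re.MULTILINE
    )
    return {
        "version": version.group(1) if version else None,
        "revision": revision.group(1) if revision else None,
    }


def hadoop_version_in_container(container="master"):
    """Run `bin/hadoop version` inside the container and parse the result.

    Assumes Docker is installed and the container from the compose file
    above is running; the path matches the login example in this README.
    """
    result = subprocess.run(
        ["docker", "exec", container, "/usr/local/hadoop/bin/hadoop", "version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_hadoop_version(result.stdout)
```

With the cluster running, hadoop_version_in_container() returns a dictionary whose revision can be compared against the commit you built from.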

Deploy on EC2

You can run docker-hadoop-cluster on an EC2 instance using the script under ec2/.

$ python ec2/ec2.py -k <Key Name> -s <Security Group ID> -n <Subnet ID> launch

This script launches an EC2 instance and prepares the prerequisites for running docker-hadoop-cluster.

Docker Hub

Image name
lewuathe/hadoop-base
lewuathe/hadoop-master
lewuathe/hadoop-slave

License

Apache License, Version 2.0. These images are a modified version of sequenceiq/hadoop-docker.

Contributors

lewuathe, tasanuma, unik
