terraform-docker-swarm's Introduction

This document describes how to build a Docker swarm with a monitoring setup in an AWS VPC using Terraform.

The repository has three major parts, plus a set of examples.

  • Customized AMI
  • Docker swarm mode
  • Prometheus with Grafana
  • Examples
    • Logging
    • Reverse proxy
    • Dynamic storage
              ┌──────────────────────────────┐                 
              │ Docker VPC                   │                 
              │     ┌───────────────────┐    │                 
              │ ┌──▶│      Manager 0    │    │                 
              │ │   └───────────────────┘    │                 
              │ │   ┌───────────────────┐    │                 
 ┌───┐  ┌───┐ │ ├──▶│      Manager 1    │    │                 
 │ R │  │ E │ │ │   └───────────────────┘    │                 
 │ 5 │─▶│ L │─┼─┤   ┌───────────────────┐    │                 
 │ 3 │  │ B │ │ ├──▶│       Node 0      │──┐ │                 
 └───┘  └───┘ │ │   └───────────────────┘  │ │                 
              │ │   ┌───────────────────┐  │ │                 
              │ ├──▶│       Node 1      │  │ │ ┌──────────────┐
              │ │   └───────────────────┘  └─┼▶│ EBS Metric   │
              │ │   ┌───────────────────┐    │ └──────────────┘
              │ └──▶│       Node 2      │    │ ┌──────────────┐
              │     └───────────────────┘  ┌─┼▶│ S3 Registry  │
              │     ┌───────────────────┐  │ │ └──────────────┘
              │     │      Bastion      │──┘ │                 
              │     └─────┬─────────────┘    │                 
              └───────────┼──────────────────┘                 
 ┌───┐                    │                                    
 │ R │            .───────▼───────────.                        
 │ 5 │──────────▶(  Private Registry   )                       
 │ 3 │            `───────────────────'                        
 └───┘                                                         

Customized AMI

The AMI build comes as a git submodule and uses packer.io to build our own base image in AWS. The submodule lives under .packer-docker.

See docker.json for the detailed configuration of the customized AMI. In brief, it takes Ubuntu 18.04 and installs Docker CE 18.06, docker-compose, cloud-init for the attached EBS volume, and the AWS CLI.

docker.options enables metrics, sets experimental=true, and sets insecure-registry to 10.0.0.0/8, 192.0.0.0/8 and 172.0.0.0/8 for testing purposes.
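To make the scope of those insecure-registry ranges concrete, here is a minimal Python sketch that checks whether a registry address falls inside them. It assumes the three ranges are 10.0.0.0/8, 192.0.0.0/8 and 172.0.0.0/8; the function name is made up for illustration and is not part of the repository.

```python
import ipaddress

# Ranges quoted from docker.options above (assumed to be three separate /8 blocks).
INSECURE_REGISTRY_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.0.0.0/8"),
    ipaddress.ip_network("172.0.0.0/8"),
]


def is_insecure_registry(address: str) -> bool:
    """Return True if the IP address lies inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in INSECURE_REGISTRY_RANGES)


if __name__ == "__main__":
    for candidate in ("10.0.1.15", "172.31.4.2", "192.168.0.10", "8.8.8.8"):
        print(candidate, is_insecure_registry(candidate))
```

Note that the /8 blocks are much wider than the private address space, which is one reason to keep this setting for testing only.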

Prerequisites

Install awscli on your host machine and configure your access key and secret token.

Command

git submodule update --init
cd .packer-docker
packer build docker.json

Note 1: To build your own AMI, you need to update three parameters: region, security_group_ids and subnet_id.
Note 2: Update README.md and commit it to the repository when you produce a new AMI.

Docker Swarm Mode

It spins up the whole infrastructure for Docker swarm in AWS.

Terraform Module VPC

This module builds the foundation of the infrastructure, including the VPC, subnets, gateway and route table. It also builds the infrastructure for a specific environment and project, for example the stg environment for project WRS.

You can change the environment profile with Terraform's workspace command.

terraform workspace -help

You can change the project name and region by updating variable.tf.

variable "project" {
    default = "SampleProject"
}
variable "region" {
    default = "us-east-1"
}

Terraform - Swarm

This is the primary module in this repository. It carries everything Docker swarm mode needs, including:

Resource         Purpose
Bastion          With EIP
Manager          With swarm init ready
Node             With swarm join ready
Security Groups  Restrict policy
EBS              Persist storage attached on Node
IAM              IAM Role for Instance Profile
ELB              For Grafana:3000
ELB              For Kibana:5601
R53              For Logstash:5000/udp

There are a few parameters that you will need to know.

  1. count_instance_per_az: how many instances are created in each availability zone. Default is 2.
  2. count_swarm_manager: how many managers are created in total. Default is 3, the minimum for HA.
  3. count_swarm_node: how many nodes are created in total. Default is count_instance_per_az * len(vpc.availability_zones) - count_swarm_manager.

Those parameters can be found in the VPC module.

A simple algorithm spreads the EC2 instances across all subnets created by the VPC module, as evenly as possible, so that every availability zone in the region is used for the best availability.
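The default node count and the spreading idea can be sketched in a few lines of Python. This is only an illustration: the round-robin placement is an assumption about what the "simple algorithm" does, and the subnet names are made up.

```python
from collections import Counter


def default_node_count(count_instance_per_az, availability_zones, count_swarm_manager):
    """Default for count_swarm_node, following the formula given above."""
    return count_instance_per_az * len(availability_zones) - count_swarm_manager


def spread_round_robin(instance_count, subnets):
    """Assign each instance index to a subnet in turn (assumed strategy)."""
    return [subnets[index % len(subnets)] for index in range(instance_count)]


if __name__ == "__main__":
    zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
    subnets = ["subnet-a", "subnet-b", "subnet-c"]  # hypothetical names, one per zone
    nodes = default_node_count(2, zones, 3)
    print("nodes:", nodes)
    print("placement:", Counter(spread_round_robin(nodes, subnets)))
```

With the defaults (2 instances per zone, 3 managers) and three availability zones, this gives 3 nodes, one per subnet.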

IAM and Instance Profile

This creates an Instance Profile under the storage-node IAM Role. The profile is attached to every manager and worker node.

The storage-node role comes with two major permissions:

  • EBS management
  • S3 management

Terraform Module Registry

This module creates a private registry that stores its images in an S3 bucket; the registry container runs on the Bastion machine. The default Route53_record for the private registry is {ENV}-registry.{PROJECT}.internal.

Resource  Purpose
S3        Private registry run on Bastion
Route53   Point to private registry dns
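As a small illustration of the default record pattern {ENV}-registry.{PROJECT}.internal, the Python sketch below fills in the placeholders. Lower-casing the result is an assumption, based on the stg-logstash.sampleproject.internal host used later in this document; the function name is hypothetical.

```python
def registry_record(env: str, project: str) -> str:
    """Build the default registry DNS name from environment and project.

    Lower-casing is an assumption; the Terraform module may behave differently.
    """
    return f"{env}-registry.{project}.internal".lower()


if __name__ == "__main__":
    print(registry_record("stg", "SampleProject"))
```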

Important Note

Add a /docker folder under the root of the registry bucket; this works around a bug in the registry.

This has been added to the Terraform script, which creates the bucket and the folder at the same time once enabled.

Bucket: registry.hub.internal
        /docker

Terraform Module Backup

This module creates a scheduler that backs up every EBS volume mounted at /dev/xvd*. Each backup is tagged with DeleteOn, computed from the number of days given by the retention tag, or 14 days by default. A CloudWatch trigger runs the Lambda function every day at 08:00 UTC.

The Lambda script is written in Go, as practice for myself.
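The retention rule can be sketched as follows. This is a Python illustration, not the Go code of the actual Lambda; the ISO date format for DeleteOn and the helper names are assumptions.

```python
from datetime import date, timedelta

DEFAULT_RETENTION_DAYS = 14  # default stated above


def delete_on(snapshot_date: date, tags: dict) -> str:
    """Compute the DeleteOn value from the retention tag, or the default."""
    try:
        days = int(tags.get("retention", ""))
    except (TypeError, ValueError):
        days = DEFAULT_RETENTION_DAYS
    if days <= 0:
        days = DEFAULT_RETENTION_DAYS
    # ISO format (YYYY-MM-DD) is an assumption about how the tag is stored.
    return (snapshot_date + timedelta(days=days)).isoformat()


def is_expired(delete_on_tag: str, today: date) -> bool:
    """A backup is due for cleanup once its DeleteOn date has been reached."""
    return date.fromisoformat(delete_on_tag) <= today


if __name__ == "__main__":
    print(delete_on(date.today(), {}))
    print(delete_on(date.today(), {"retention": "30"}))
```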

Resource    Purpose
CloudWatch  Scheduled backup for persist storage
Lambda      For backup and cleanup script

Terraform Module Script

The most beautiful feature: it generates your ssh config file from the Terraform state file. This version comes with Bastion server settings.
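To show the kind of output such a generator produces, here is a minimal Python sketch that renders ssh config entries reaching the private hosts through the bastion. It is not the repository's Ruby script and does not parse a real state file: the host names, the IP addresses, the ubuntu user and the use of ProxyJump are all assumptions for illustration.

```python
def render_ssh_config(bastion_ip, private_hosts, user="ubuntu"):
    """Render ssh config text.

    private_hosts maps a host alias to (private_ip, identity_file).
    The user name is an assumption based on the Ubuntu base image.
    """
    lines = [
        "Host bastion",
        f"    HostName {bastion_ip}",
        f"    User {user}",
        "    IdentityFile keys/bastion",
        "",
    ]
    for name, (ip, key) in sorted(private_hosts.items()):
        lines += [
            f"Host {name}",
            f"    HostName {ip}",
            f"    User {user}",
            f"    IdentityFile {key}",
            "    ProxyJump bastion",  # hop through the bastion host
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    hosts = {
        "manager-0": ("10.0.1.10", "keys/manager"),  # hypothetical addresses
        "node-0": ("10.0.2.10", "keys/node"),
    }
    print(render_ssh_config("203.0.113.10", hosts))
```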

(Optional)

Make sure you can access the S3 bucket that is set up in variable.tf if you are using the Terraform backend feature to store the state file.

terraform {
    backend "s3" {
        bucket = "internal"
        key    = "terraform.tfstate"
        region = "us-east-1"
    }
}

Command

Initialize Terraform (one time job)

terraform init
terraform get

Generate SSH keys for the bastion, manager and node instances (one time job)

ssh-keygen -q -t rsa -b 4096 -f keys/node -N ''
ssh-keygen -q -t rsa -b 4096 -f keys/manager -N ''
ssh-keygen -q -t rsa -b 4096 -f keys/bastion -N ''

Import the persistent storage

terraform import module.registry.aws_s3_bucket.registry registry.hub.internal
terraform import module.swarm.aws_ebs_volume.storage-metric vol-034afe17b80deb0f7

Copy default.tfvars.example and modify the variables

cp default.tfvars.example default.auto.tfvars

Apply

terraform apply

Additional

Update your ssh config

ruby keys/ssh_config_*.rb

Tear down the infrastructure

terraform state rm module.registry.aws_s3_bucket.registry
terraform state rm module.registry.aws_s3_bucket_object.docker
terraform state rm module.swarm.aws_ebs_volume.storage-metric
terraform destroy -force

Prometheus and Grafana

These docker-compose files bring up the complete Prometheus and Grafana stack.

Command

Build your docker image

cd prometheus
docker-compose build

Spin up

docker stack deploy prometheus -c docker-compose.yml

You can find the admin password in docker-compose.yml under the grafana service. The dashboard that fits us best is Docker Swarm & Container Overview. Follow the on-screen steps to set up your metric source.

Docker Stack for Logging

These docker-compose files bring up the complete ELK 6.2 stack.

Command

Build your docker image

cd elk
docker-compose build

Spin up

docker stack deploy elk -c docker-compose.yml

Docker Logging Driver

docker run --rm -it \
             --log-driver gelf \
             --log-opt gelf-address=udp://stg-logstash.sampleproject.internal:5000 \
             busybox echo This is my message.

Or

docker service create \
             --name temp_service \
             --network elk_logging \
             --log-driver gelf \
             --log-opt gelf-address=udp://stg-logstash.sampleproject.internal:5000 \
             busybox echo This is my message.

Docker log driver document (https://docs.docker.com/engine/admin/logging/gelf/#gelf-options)
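For a rough idea of what the gelf log driver sends over UDP, the Python sketch below builds a GELF 1.1 message, gzip-compresses it and sends it as a single datagram. It is a simplified illustration (no chunking of large messages), and the host name and extra field are made up.

```python
import gzip
import json
import socket
import time


def build_gelf_message(short_message, host, level=6, **extra):
    """Build a gzip-compressed GELF 1.1 payload.

    Additional fields must be prefixed with an underscore in GELF.
    """
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # 6 = informational (syslog levels)
    }
    for key, value in extra.items():
        message["_" + key] = value
    return gzip.compress(json.dumps(message).encode("utf-8"))


def send_gelf(payload, address):
    """Send one payload as a single UDP datagram to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


if __name__ == "__main__":
    payload = build_gelf_message("This is my message.", "my-host", service="temp_service")
    send_gelf(payload, ("stg-logstash.sampleproject.internal", 5000))
```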

Docker Stack for Docker Flow Proxy

cd dfproxy
docker stack deploy app -c docker-compose.yml

Docker Stack for Dynamic Storage

An explicitly defined Docker volume is required here if the service/stack needs persistent storage.

Here is an example:

version: "3.5"
volumes:
    data-rexray-ebs:
        name: 'rexray_ebs_{{.Service.Name}}-{{.Task.Slot}}'
        driver: rexray/ebs
        driver_opts:
            size: 29
            snapshotID: ""
            volumeType: "gp2"    #io1 needs to come with iops
            iops: 0              #default 100 in gp2
            availabilityZone: "" #No quick solution for changing azs in docker compose
            encrypted: "false"
            encryptionKey: "test"
            force: "false"
            fsType: "ext4"

    data-rexray-s3fs:
        name: 'rexray_s3fs_{{.Service.Name}}-{{.Task.Slot}}'
        driver: rexray/s3fs
services:
    sample:
        image: "busybox"
        command: "tail -f /dev/null"
        volumes:
           - data-rexray-ebs:/data
        deploy:
            placement:
                constraints:
                    - node.labels.azs == 0
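The volume names above use template placeholders, so each task of a service can end up with its own volume. The Python sketch below only mimics that expansion for the two placeholders used here, to show the resulting names; the stack and service names are made up, and the real expansion is done by Docker, not by this code.

```python
import re


def render_volume_name(template, service_name, task_slot):
    """Expand {{.Service.Name}} and {{.Task.Slot}} in a volume name template.

    Any other placeholder raises KeyError in this simplified sketch.
    """
    values = {".Service.Name": service_name, ".Task.Slot": str(task_slot)}
    return re.sub(r"\{\{\s*([.\w]+)\s*\}\}", lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    template = "rexray_ebs_{{.Service.Name}}-{{.Task.Slot}}"
    for slot in (1, 2):
        print(render_volume_name(template, "app_sample", slot))
```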

Known issues

  1. There is no easy way to list all availability zones so that users can pick one and use it in their docker-compose file.
  2. In practice, every deployment command should run on the bastion machine, which is located in {regionName}-{index}a, so we can append - node.labels.azs == 0 to the services section.
  3. Don't know what to set in the configuration file? Take a look here.
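Until there is a better answer to the first two issues, the mapping from an availability zone to the azs label can be sketched by hand. The Python snippet below assumes the label is simply the position of the zone letter (a is 0, b is 1, and so on); only the a-to-0 case is stated in this document, the rest is an assumption.

```python
def az_label(availability_zone: str) -> int:
    """Map a zone name such as us-east-1a to an index (assumed: a=0, b=1, ...)."""
    suffix = availability_zone[-1:]
    if len(suffix) != 1 or not ("a" <= suffix <= "z"):
        raise ValueError(f"unexpected availability zone: {availability_zone!r}")
    return ord(suffix) - ord("a")


def placement_constraint(availability_zone: str) -> str:
    """Build the constraint line for the services section."""
    return f"node.labels.azs == {az_label(availability_zone)}"


if __name__ == "__main__":
    print(placement_constraint("us-east-1a"))
```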

tags: amazon web services, aws, terraform, docker, docker swarm, ELK, HAProxy

Deprecated: ebs.tf. Take a look here if you need the old-school way to handle EBS.

terraform-docker-swarm's People

Contributors

lancekuo, lancekuo-newtopia


terraform-docker-swarm's Issues

Rexray/ebs/s3fs feature redesign

Comes with redesigned modules.

The swarm module keeps getting bigger. It works well and fulfills my requirements, but it would be better to redesign the module structure.

A number of variable names and references might need to change.

lancekuo/tf-swarm#1

README updates for storage

  • Added a section for docker stack on how to place your container with the right storage (azs) in the docker-compose file

Add healthcheck for all services

To make service health visible, it would be good to have a health check in every service I have created.

Add healthcheck in all services.

    healthcheck:
      test: nc -z localhost 9090
      interval: 30s
      timeout: 30s
      retries: 3
  • ELK
  • Prometheus
  • DFP
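The nc -z test above only checks that a TCP port accepts connections. For reference, the same probe can be sketched in Python; this is an equivalent illustration, not something the stacks use, and the port is just the one from the example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(port_is_open("localhost", 9090))
```

A check like this says nothing about whether the service answers correctly; an HTTP request against a status endpoint would be a stricter test where one exists.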
