Comments (8)

baskinsy commented on May 3, 2024

Hello,

After also trying @tiangolo's setup, I ran into several synchronization issues with Consul, as well as redundancy problems that left Traefik unable to obtain certificates or create routes in several tests, such as rebooting nodes or redeploying services. After digging deeper and researching several points, I have deployed the following on two clusters so far, and it seems to work as expected. I have tested redundancy and synchronization of the Consul cluster nodes and have also upgraded from 1.5.0 to 1.5.1 without issues. I'm also using a Traefik init service to store the config in Consul, which I found in the official Traefik docs.

version: '3.4'

services:
  traefik_init:
    image: traefik:1.7
    command: 
      - "storeconfig"
      - "--docker"
      - "--docker.swarmMode"
      - "--docker.watch"
      - "--docker.exposedbydefault=false"
      - "--constraints=tag==traefik-public"
      - "--entrypoints=Name:http Address::80 Redirect.EntryPoint:https"
      - "--entrypoints=Name:https Address::443 TLS"
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
      - "--acme"
      - "--acme.email=[email protected]"
      - "--acme.storage=traefik/acme/account"
      - "--acme.entryPoint=https"
      - "--acme.httpChallenge.entryPoint=http"
      - "--acme.onHostRule=true"
      - "--acme.onDemand=false"
      - "--acme.acmelogging=true"
      - "--logLevel=INFO"
      - "--accessLog"
      - "--api"
    networks:
      - consul
    deploy:
      restart_policy:
        condition: on-failure
    depends_on:
      - consul

  consul:
    image: consul:latest
    command: agent -server -client=0.0.0.0 -bootstrap-expect=3 -ui -data-dir /consul/data -retry-join consul.cluster
    volumes:
      - consul-data:/consul/data
    environment:
      - 'CONSUL_LOCAL_CONFIG={ "skip_leave_on_interrupt": true, "leave_on_terminate": false, "datacenter":"staging", "data_dir":"/consul/data", "server":true }'
      - CONSUL_BIND_INTERFACE=eth0
    networks:
      consul:
        aliases:
          - consul.cluster
      traefik-public:
    deploy:
      endpoint_mode: dnsrr
      mode: global
      placement:
        constraints:
          - node.role == manager
      resources:
        reservations:
          cpus: '0.5'
          memory: 128M 
      update_config:
        parallelism: 1
        delay: 30s
      restart_policy:
        condition: on-failure
      labels:
        - traefik.frontend.rule=Host:consul.domain.tld
        - traefik.enable=true
        - traefik.port=8500
        - traefik.tags=traefik-public
        - traefik.docker.network=traefik-public
        # Traefik service that listens to HTTP
        - traefik.redirectorservice.frontend.entryPoints=http
        - traefik.redirectorservice.frontend.redirect.entryPoint=https
        # Traefik service that listens to HTTPS
        - traefik.webservice.frontend.entryPoints=https
        - traefik.frontend.auth.basic.users=admin:xxxxxxxxx

  traefik:
    image: traefik:1.7
    ports:
      - 80:80
      - 443:443
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure  
      labels:
        - traefik.frontend.rule=Host:traefik.domain.tld
        - traefik.enable=true
        - traefik.port=8080
        - traefik.tags=traefik-public
        - traefik.docker.network=traefik-public
        - traefik.backend.loadbalancer.stickiness=true
        # Traefik service that listens to HTTP
        - traefik.redirectorservice.frontend.entryPoints=http
        - traefik.redirectorservice.frontend.redirect.entryPoint=https
        # Traefik service that listens to HTTPS
        - traefik.webservice.frontend.entryPoints=https
        - traefik.frontend.auth.basic.users=admin:xxxxxxxx
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command:
      - "--consul"
      - "--consul.endpoint=consul:8500"
      - "--consul.prefix=traefik"
    networks:
      - consul
      - traefik-public
    depends_on:
      - consul
      - traefik_init      
        
networks:
  traefik-public:
    driver: overlay
    external: true
  consul:
    driver: overlay
    external: true

volumes:
  consul-data:
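
For anyone reproducing this, here is a minimal sketch of how a stack like the one above could be deployed. Both networks are declared `external: true`, so they have to exist before the deploy. The stack name `traefik-consul`, the file name `docker-compose.yml`, and the helper function names are assumptions for illustration, not something stated in this thread.

```shell
# Hypothetical names: adjust STACK_NAME and COMPOSE_FILE to your setup.
STACK_NAME="${STACK_NAME:-traefik-consul}"
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.yml}"

# Create an overlay network only if it does not exist yet.
ensure_overlay_network() {
  docker network inspect "$1" >/dev/null 2>&1 \
    || docker network create --driver=overlay "$1"
}

# Create both external networks, then deploy the stack.
deploy_stack() {
  ensure_overlay_network traefik-public
  ensure_overlay_network consul
  docker stack deploy -c "$COMPOSE_FILE" "$STACK_NAME"
}
```

Calling `deploy_stack` on a manager node runs the three steps. Note that `docker stack deploy` ignores `depends_on`, so start-up ordering here presumably relies on the `on-failure` restart policies rather than on that key.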

In addition, I changed the autopilot settings after deploying by entering a consul container and executing consul operator autopilot set-config -parameter=value. I only changed LastContactThreshold and ServerStabilizationTime, to 1s and 20s respectively.

/ # consul operator autopilot get-config
CleanupDeadServers = true
LastContactThreshold = 1s
MaxTrailingLogs = 250
ServerStabilizationTime = 20s
RedundancyZoneTag = ""
DisableUpgradeMigration = false
UpgradeVersionTag = ""
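
The `-parameter=value` placeholder above maps onto the flags that `consul operator autopilot set-config` accepts. A sketch of the two changes described, assuming it is run in a shell inside one of the Consul server containers; the function name `tune_autopilot` is made up for this example.

```shell
# Hypothetical helper; run inside a Consul server container.
tune_autopilot() {
  # Apply the two autopilot values mentioned in the comment above.
  consul operator autopilot set-config \
    -last-contact-threshold=1s \
    -server-stabilization-time=20s
  # Print the resulting configuration to confirm the change.
  consul operator autopilot get-config
}
```

Running `tune_autopilot` should end with output similar to the `get-config` listing shown above.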

Hope it helps. I'm still evaluating and working on it; this combines information and solutions from multiple sources.

from dockerswarm.rocks.

tiangolo commented on May 3, 2024

Are you deploying a single node or several nodes?

darkl0rd commented on May 3, 2024

I'm deploying in a swarm cluster, the instances are spread across my 3 manager nodes. The volumes are backed by EBS Volumes through RexRay.

tiangolo commented on May 3, 2024

Have you tried without EBS Volumes and RexRay?

That would be the first thing, to debug if it's something related to Consul/Docker Swarm or EBS Volumes and RexRay.

darkl0rd commented on May 3, 2024

I did, that's how I started. But since I ran into this error, I figured it had to do with it losing state. So I backed it with EBS volumes (which, FWIW, I use at very large scale in production), but the problem persists. Somehow, the consul-leader doesn't seem to consider itself the leader, even though it's started with --bootstrap-expect=1.

I have also tried starting it in phases (first consul-leader, waiting for it to be up, then the agents), but with the same result, since the consul-leader itself frequently fails to acquire leadership.
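
When a server does not seem to hold leadership, it can help to look at what the cluster itself reports. A small sketch using standard Consul commands and the HTTP status endpoint; `CONSUL_ADDR` and the function name are assumptions for this example, and it is meant to be run somewhere the `consul` binary and `curl` can reach the agent.

```shell
# Hypothetical variable: host:port of the agent's HTTP API (only used by curl).
CONSUL_ADDR="${CONSUL_ADDR:-127.0.0.1:8500}"

check_consul_leader() {
  # Gossip view: which agents are known and whether they are alive.
  consul members
  # Raft view: which servers are peers and which one is the leader.
  consul operator raft list-peers
  # HTTP API: returns the leader address, or "" when there is none.
  curl -s "http://${CONSUL_ADDR}/v1/status/leader"
  echo
}
```

An empty string from the last call would match the symptom described here, a server that never considers itself the leader.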

tiangolo commented on May 3, 2024

Ah, yes, I think I had similar issues using --bootstrap-expect=1, what seemed to work correctly was -bootstrap.

I know they recommend --bootstrap-expect=1 instead, but in my tests it didn't work as expected, while -bootstrap did.

It also allowed having a single deployment Docker Stack (Docker Compose) that would work for a single consul server or for multiple (with several replicas).
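
For reference, the two ways of starting a single server that are being compared here look roughly like this. The remaining flags are borrowed from the compose file in this thread and are only an assumption about the actual setup; the function names are hypothetical.

```shell
# Variant reported above as working: the server bootstraps and elects itself.
start_consul_bootstrap() {
  consul agent -server -bootstrap -client=0.0.0.0 -ui -data-dir /consul/data
}

# Variant reported above as problematic: wait for 1 expected server first.
start_consul_bootstrap_expect() {
  consul agent -server -bootstrap-expect=1 -client=0.0.0.0 -ui -data-dir /consul/data
}
```

Additional servers would then join with something like `-retry-join` pointing at the first server, without either bootstrap flag.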

tiangolo commented on May 3, 2024

Thanks for sharing it @baskinsy !

github-actions commented on May 3, 2024

Assuming the original issue was solved, it will be automatically closed now. But feel free to add more comments or create new issues.
