A set of scripts and documentation for adding redundancy (etcd cluster, multiple masters) to a cluster set up with kubeadm 1.8 and above

License: Apache License 2.0

Shell 58.67% Jinja 41.33%

kubeadm2ha's Introduction

kubeadm2ha - Automatic setup of HA clusters using kubeadm

This code largely follows the instructions published on kubernetes.io and provides convenient automation for setting up a highly available Kubernetes cluster (i.e. one with more than one master node) and the Dashboard. It also provides limited support for installing and running systems not connected to the internet.

July 2024: removal of additional software

As of 2024 this repository no longer contains automatic installation support for additional software (e.g. the EFK stack, the etcd operator, ...) and focuses solely on the core: Kubernetes and the Dashboard. Maintaining support was no longer feasible because, one after another, these components were abandoned by their respective developers.

Overview

This repository contains a set of Ansible scripts for this purpose. The following playbooks are provided:

  1. playbook-00-os-setup.yaml sets up the prerequisites for installing Kubernetes on Oracle Linux 7 nodes.
  2. playbook-01-cluster-setup.yaml sets up a complete cluster including the HA setup. See below for more details.
  3. playbook-51-cluster-uninstall.yaml removes data and configuration files to the point that playbook-01-cluster-setup.yaml can be used again.
  4. playbook-02-dashboard.yaml sets up the dashboard and the metrics-server.
  5. playbook-03-local-access.yaml creates a patched admin.conf file in /tmp/-admin.conf. After copying it to ~/.kube/config remote kubectl access via V-IP / load balancer can be tested.
  6. playbook-00-cluster-images.yaml prefetches all images needed for Kubernetes operations and transfers them to the target hosts.
  7. playbook-52-uninstall-dashboard.yaml removes the dashboard.
  8. playbook-31-cluster-upgrade.yaml upgrades a cluster.
  9. playbook-zz-zz-configure-imageversions.yaml updates the image tag names in the file vars/imageversions.yaml.

Due to the frequent upgrades to both Kubernetes and kubeadm, these scripts cannot support all possible versions. For both fresh installs and upgrades, please refer to the value of KUBERNETES_VERSION in ansible/group_vars/all.yaml to find out which target version has been used for developing them. Other versions may work, too, but you may turn out to be the first to try them. Please refer to the following documents for compatibility information:

Prerequisites

Ansible version 2.4 or higher is required. Older versions will not work.

Prepare an environment for ansible

In order to use the ansible scripts, at least two files need to be configured:

  1. Either edit my-cluster.inventory or create your own. The inventory must define the following groups: primary_master (a single machine on which kubeadm will be run), secondary_masters (the other masters), masters (all masters), minions (the worker nodes), nodes (all nodes) and etcd (all machines on which etcd is installed, usually the masters). A sketch of such an inventory follows this list.
  2. Create a file in group_vars named after the group defined in your inventory file, overriding the defaults from all.yaml where necessary. You may also want to change some of the defaults for your environment: LOAD_BALANCING (kube-vip, haproxy or nginx), NETWORK_PLUGIN (weavenet, flannel or calico) and ETCD_HOSTING (stacked if etcd runs on the masters, else external).
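
For orientation, a minimal inventory sketch with the required groups could look like the following (host names are placeholders; see my-cluster.inventory in the repository for the authoritative layout):

# hypothetical my-cluster.inventory
[primary_master]
master-1.example.com

[secondary_masters]
master-2.example.com
master-3.example.com

[masters:children]
primary_master
secondary_masters

[minions]
worker-1.example.com
worker-2.example.com

[nodes:children]
masters
minions

[etcd:children]
masters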

Prepare your hosts

On the target environment, some settings are necessary for a successful installation of Kubernetes. The "Before you begin" section in the official Kubernetes documentation applies; nevertheless, here is a convenient list of things to take care of (a shell sketch covering several of these items follows the list):

  1. Set the value of /proc/sys/net/bridge/bridge-nf-call-iptables to 1. There may be different, distro-dependent ways to accomplish this persistently, but most people will get by with editing /etc/sysctl.conf.
  2. Load the ip_vs kernel module. Most people will want to create a file in /etc/modules-load.d for this.
  3. Disable swap. Make sure to edit /etc/fstab to remove the swap mount from it.
  4. Make sure to have enough disk space on /var/lib/docker, ideally you should set up a dedicated partition and mount it, so that if downloaded docker images exceed the available space the operating system still works.
  5. Make sure that the docker engine is installed. Other container engine implementations may work but have not been tested.
  6. The primary master host requires passwordless ssh access to all other cluster hosts as root. After successful installation this can be removed if desired.
  7. The primary master host needs to be able to resolve all other cluster host names as configured in the environment's inventory file (see the previous step), either via DNS or via entries in /etc/hosts.
  8. Activate and start containerd and docker.
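
A rough shell sketch covering several of these items, to be run as root (paths and persistence mechanisms may differ between distributions):

# make bridged traffic visible to iptables, persistently
echo "net.bridge.bridge-nf-call-iptables = 1" >> /etc/sysctl.conf
sysctl -p

# load the ip_vs kernel modules now and on every boot
modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh
printf 'ip_vs\nip_vs_rr\nip_vs_wrr\nip_vs_sh\n' > /etc/modules-load.d/ip_vs.conf

# disable swap and comment out the swap entry in /etc/fstab
swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab

# activate and start the container runtime
systemctl enable --now containerd docker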

What the cluster setup does

  1. Set up an etcd cluster with self-signed certificates on all hosts in group etcd.
  2. Set up a virtual IP and load balancing: either using a static pod for kube-vip or a keepalived cluster with nginx on all hosts in group masters.
  3. Set up a master instance on the host in group primary_master using kubeadm.
  4. Set up master instances on all hosts in group secondary_masters by copying and patching (replace the primary master's host name and IP) the configuration created by kubeadm and have them join the cluster.
  5. Use kubeadm to join all hosts in the group minions.
  6. Set up a service account 'admin-user' and a cluster role binding for the role 'cluster-admin' for remote access (if wanted); a manifest sketch follows the note below.

Note that this step assumes that the Kubernetes software packages can be downloaded from a repository (via yum, apt or similar). If your system has no connection to the internet you will need to set up a repository in your network and install the required packages beforehand.
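
The service account and cluster role binding created in the last step correspond roughly to the following manifest (a sketch only; the namespace and exact resource names used by the playbooks may differ):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system       # assumption - the playbooks may use a different namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system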

What the images setup does

  1. Pull all required images locally (this requires docker to be installed on the host from which you run ansible).
  2. Export the images to tar files.
  3. Copy the tar files over to the target hosts.
  4. Import the images from the tar files on the target hosts (see the sketch below).
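
Per image, this is conceptually similar to the following (a sketch with a placeholder image name and tag; the playbook iterates over the list in vars/imageversions.yaml):

# on the ansible control host: pull and export (image name/tag is a placeholder)
docker pull registry.k8s.io/kube-apiserver:v1.28.0
docker save -o kube-apiserver.tar registry.k8s.io/kube-apiserver:v1.28.0

# copy the tar file to a target host and import it there
scp kube-apiserver.tar root@target-host:/tmp/
ssh root@target-host docker load -i /tmp/kube-apiserver.tar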

What the image tag update does

  1. Detect the currently latest image tags with respect to the configured Kubernetes version
  2. Overwrite the file vars/imageversions.yaml with the latest image names and versions

Note that the image versions configured by this playbook will not necessarily work, as more recent versions may introduce incompatibilities; hence they are merely a starting point and a helper for Kubernetes upgrades.

Setting up the dashboard

The playbook-02-dashboard.yaml playbook does the following:

  1. Install the dashboard and metrics-server components.
  2. Scale the number of dashboard instances to the number of master nodes (roughly the command sketched below).
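
The scaling step is roughly equivalent to the following command (deployment and namespace names assume the upstream dashboard defaults):

kubectl -n kubernetes-dashboard scale deployment kubernetes-dashboard --replicas=<number-of-masters>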

To access the dashboard, run kubectl proxy on your local host (this requires kubectl to be configured on your local host, see Configuring local access below for automating this), then open http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/login

The dashboard will ask you to authenticate. A user with admin privileges has been created during installation. In order to log in as this user, use this command to generate a token:

kubectl -n kubernetes-dashboard create token admin-user

Copy the token from the console and use it for logging in to the dashboard.

Alternatively, use the playbook-03-local-access.yaml playbook to generate a configuration file. That file can be copied to ~/.kube/config for local kubectl access, or uploaded as kubeconfig file in the dashboard's login dialogue.

Configuring local access

Running the playbook-03-local-access.yaml playbook creates a file /tmp/-admin.conf that can be used as ~/.kube/config. If the dashboard has been installed (see above), the file will contain the 'admin-user' service account's token, so that root-like access is possible for both kubectl and the dashboard. If that service account does not exist, the client-side certificate is used instead, which is acceptable for test environments but generally not recommended, because the client-side certificates are not supposed to leave their master host.
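
A typical usage sequence might look like this (the exact file name under /tmp depends on your cluster name; the placeholders below are illustrative):

ansible-playbook -i <your-environment>.inventory playbook-03-local-access.yaml
cp /tmp/<your-cluster>-admin.conf ~/.kube/config
# verify that access via the virtual IP / load balancer works
kubectl get nodes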

Upgrading a cluster

Note: this automatic upgrade will delete local pod storage. Pods may keep local data (e.g. the dashboard and its metrics components); whether such data needs to be preserved depends on your application. If the answer is yes, do not use the automatic upgrade; upgrade manually instead.

For upgrading a cluster several steps are needed:

  1. Find out which software versions to upgrade to.
  2. Set the ansible variables to the new software versions.
  3. Run the playbook-00-cluster-images.yaml playbook if the cluster has no Internet access.
  4. Run the playbook-31-cluster-upgrade.yaml playbook.

Note: Never upgrade a productive cluster without having tried it on a reference system before.

Preparation

The first thing to do is to find out which version you want to upgrade to. Only setups are supported in which all Kubernetes-related components have the same version, i.e. the native packages (kubelet, kubectl, kubeadm) and whatever they run in containers (API Server, Controller Manager, Scheduler, Kube Proxy). Hence, after having determined the version to upgrade to, update the variable KUBERNETES_VERSION either in group_vars/all.yaml (global) or in group_vars/.yaml (your environment only).
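
For example (the version number is purely illustrative; use the release you actually validated and the value format expected by the playbooks):

# group_vars/all.yaml or group_vars/<your-environment>.yaml
KUBERNETES_VERSION: "1.28.2"   # illustrative value only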

Next, you need to be able to upgrade the kubelet, kubectl, kubeadm and - if upgraded, too - kubernetes-cni on your cluster's machines using their package manager (yum, apt or whatever). If you are connected to the internet, this is a no-brainer; the automatic upgrade will actually take care of this.

However, in an isolated environment without internet access you will need to download these packages elsewhere and make them available to your nodes so that they can be installed using their package managers. This will most likely mean creating local repositories on the nodes or on a server in the same network and configuring the package managers to use them. If your system is set up like that, the automatic upgrade will again take care of upgrading the packages for you. I strongly recommend following this pattern, because the package upgrade needs to take place at a specific moment during the upgrade; handling it by hand effectively forces you to perform the whole upgrade manually.
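
On RPM-based systems this could look roughly as follows (a sketch only; host names and paths are placeholders, and the repository layout is an assumption):

# on a host with internet access: download the packages and generate repo metadata
yumdownloader kubelet kubectl kubeadm kubernetes-cni
createrepo .
# serve the directory via HTTP, then point every node at it, e.g. in /etc/yum.repos.d/k8s-local.repo:
#   [k8s-local]
#   name=Local Kubernetes packages
#   baseurl=http://repo-server.example.com/kubernetes/
#   enabled=1
#   gpgcheck=0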

Note that upgrading etcd is only supported if it is installed on the master nodes (ETCD_HOSTING is stacked). Otherwise you will have to upgrade etcd manually, which is beyond the scope of this document.

Having configured this, you may now want to fetch and install the new images for your to-be-upgraded cluster if it has no internet access. If it does, you may still want to do this to make the upgrade more seamless.

To do so, run the following command:

ansible-playbook -f <good-number-of-concurrent-processes> -i <your-environment>.inventory playbook-00-cluster-images.yaml

I usually set the number of concurrent processes manually, because if a cluster consists of more than five nodes (Ansible's default), picking a higher value here significantly speeds up the process.

Perform the upgrade

You may want to back up /etc/kubernetes on all your master machines before running the upgrade.
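
An ad-hoc backup with Ansible could look like this (a sketch; adjust user, inventory and archive path as needed):

ansible -u root -i <your-environment>.inventory masters -m command \
  -a "tar czf /root/etc-kubernetes-backup.tar.gz /etc/kubernetes"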

The actual upgrade is automated. Run the following command:

ansible-playbook -f <good-number-of-concurrent-processes> -i <your-environment>.inventory playbook-31-cluster-upgrade.yaml

See the comment above on setting the number of concurrent processes.

The upgrade is not fully free of disruptions:

  • while kubeadm applies the changes on a master, it restarts a number of services, hence they may be unavailable for a short time
  • containers running on the minions that keep local data must be able to rebuild it when they are relocated to different minions during the upgrade (i.e. local data is not preserved)

If any of these is unacceptable, a fully automated upgrade does not really make sense, because working around these issues requires deep knowledge of the application running in the cluster. In that case a manual upgrade is recommended.

If something goes wrong

If the upgrade fails the situation afterwards depends on the phase in which things went wrong.

If kubeadm failed to upgrade the cluster it will try to perform a rollback. Hence if that happened on the first master, chances are pretty good that the cluster is still intact. In that case all you need is to start docker, kubelet and keepalived on the secondary masters and then uncordon them (kubectl uncordon <secondary-master-fqdn>) to be back where you started from.
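
Bringing the secondaries back can be done ad hoc, for example (service names assume the default keepalived-based setup):

# on each secondary master
systemctl start docker kubelet keepalived
# then, from a host with working kubectl access
kubectl uncordon <secondary-master-fqdn>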

If kubeadm on one of the secondary masters failed you still have a working, upgraded cluster, but without the secondary masters which may be in a somewhat undefined condition. In some cases kubeadm fails if the cluster is still busy after having upgraded the previous master node, so that waiting a bit and running kubeadm upgrade apply v<VERSION> may even succeed. Otherwise you will have to find out what went wrong and join the secondaries manually. Once this has been done, finish the automatic upgrade process by processing the second half of the playbook only:

ansible-playbook -f <good-number-of-concurrent-processes> -i <your-environment>.inventory playbook-31-cluster-upgrade.yaml --tags nodes

If upgrading the software packages (i.e. the second half of the playbook) failed, you still have a working cluster. You may try to fix the problems and continue manually. See the .yaml files under roles/upgrade-nodes/tasks for what you need to do.

If you are trying out the upgrade on a reference system, you may have to downgrade at some point to start again. See the sequence for reinstalling a cluster below for instructions on how to do this (hint: it is important to erase some base software packages before setting up a new cluster based on a lower Kubernetes version).

Examples

To run one of the playbooks (e.g. to set up a cluster), run ansible like this:

ansible-playbook -i <your-inventory-file>.inventory playbook-01-cluster-setup.yaml

You might want to adapt the number of parallel processes to your number of hosts using the -f option.

A sane sequence of playbooks for a complete setup would be:

  • playbook-00-cluster-images.yaml
  • playbook-01-cluster-setup.yaml
  • playbook-02-cluster-dashboard.yaml

The following playbooks can be used as needed:

  • playbook-51-cluster-uninstall.yaml
  • playbook-03-local-access.yaml
  • playbook-52-uninstall-dashboard.yaml

Sequence for reinstalling a cluster:

INVENTORY=<your-inventory-file> 
NODES=<number-of-nodes>
ansible-playbook -f $NODES -i $INVENTORY playbook-51-cluster-uninstall.yaml 
# if you want to downgrade your kubelet, kubectl, ... packages you need to uninstall them first
# if this is not the issue here, you can skip the following line
ansible -u root -f $NODES -i $INVENTORY nodes -m command -a "rpm -e kubelet kubectl kubeadm kubernetes-cni"
for i in playbook-01-cluster-setup.yaml playbook-02-cluster-dashboard.yaml; do 
    ansible-playbook -f $NODES -i $INVENTORY $i || break
    sleep 15s
done

Known limitations

This is a preview in order to obtain early feedback. It is not done yet. Known limitations are:

  • The code has been tested almost exclusively in a Redhat-like (RHEL) environment. More testing on other distros is needed.

kubeadm2ha's People

Contributors

joshuacox, mbert, ptrsauer, researchiteng

kubeadm2ha's Issues

modprobe ip_vs - suggestion

"/root/join-worker-node.sh" - WARNING
(ignorable, as it should be done as a prereq maybe)
[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_wrr ip_vs_sh ip_vs ip_vs_rr]
add:
modprobe ip_vs_wrr ip_vs_sh ip_vs ip_vs_rr
But also make them persistent.

- name: load ip_vs kernel modules
  modprobe: name={{ item }} state=present
  with_items:
  - ip_vs_wrr
  - ip_vs_rr
  - ip_vs_sh
  - ip_vs

- name: persist ip_vs kernel modules
  copy:
    dest: /etc/modules-load.d/ip_vs.conf
    content: |
      ip_vs_wrr
      ip_vs_rr
      ip_vs_sh
      ip_vs

20-etcd-service-manager.conf -> missing cgroup-driver

20-etcd-service-manager.conf is missing the "--cgroup-driver=systemd"
BTW, FYI, starting 1.11 this setting is handled automatically by kubeadm, but this step is before calling kubeadm.

"kubeadm now detects the Docker cgroup driver and starts the kubelet with the matching driver. This eliminates a common error experienced by new users in when the Docker cgroup driver is not the same as the one set for the kubelet due to different Linux distributions setting different cgroup drivers for Docker, making it hard to start the kubelet properly. (#64347, @neolit123)"

The prepare-nodes cgroup driver part does not apply to this because:
a) 20-etcd-service-manager.conf overrides the 10-kubeadm.conf
b) the prepare-nodes code won't do anything any longer, as 10-kubeadm.conf no longer holds the "cgroup-driver" string.

swapoff - suggestion

The prepare-nodes role doesn't run swapoff or remove the swap entry from /etc/fstab (ignorable, as it should perhaps be done as a prerequisite).

Different kubeadm-init Configs

Hi,

why are you using different kubeadm-init.yaml.j2 template files for master / secondary? I just ran your playbook, and you should always set "endpoint-reconciler-type" for the apiserver (not only on the secondaries but also on the master):

apiServerExtraArgs:
  {% if KUBERNETES_VERSION | match('^1\.8') %}apiserver-count: "{{ groups['masters'] | length }}"{% else %}endpoint-reconciler-type: "lease"{% endif %}

You should always use the "global" templates/kubeadm-init.yaml.j2 file:
template/kubeadm-init.yaml.j2

etcd? yaml to json?

โ— etcd.service - Etcd Server
   Loaded: loaded (/usr/lib/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2018-03-01 18:56:36 UTC; 26min ago
  Process: 13849 ExecStart=/bin/bash -c GOMAXPROCS=$(nproc) /usr/bin/etcd --name="${ETCD_NAME}" --data-dir="${ETCD_DATA_DIR}" --listen-client-urls="${ETCD_LISTEN_CLIENT_URLS}" (code=exited, status=1/FAILURE)
 Main PID: 13849 (code=exited, status=1/FAILURE)

Mar 01 18:56:35 localhost.localdomain systemd[1]: etcd.service: main process exited, code=exited, status...LURE
Mar 01 18:56:35 localhost.localdomain systemd[1]: Failed to start Etcd Server.
Mar 01 18:56:35 localhost.localdomain systemd[1]: Unit etcd.service entered failed state.
Mar 01 18:56:35 localhost.localdomain systemd[1]: etcd.service failed.
Mar 01 18:56:36 localhost.localdomain systemd[1]: etcd.service holdoff time over, scheduling restart.
Mar 01 18:56:36 localhost.localdomain systemd[1]: start request repeated too quickly for etcd.service
Mar 01 18:56:36 localhost.localdomain systemd[1]: Failed to start Etcd Server.
Mar 01 18:56:36 localhost.localdomain systemd[1]: Unit etcd.service entered failed state.
Mar 01 18:56:36 localhost.localdomain systemd[1]: etcd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

but if I log in directly to the host and su to the etcd user:

-bash-4.2$ etcd --config-file=/etc/etcd/etcd.conf
2018-03-01 19:22:21.671961 I | etcdmain: Loading server configuration from "/etc/etcd/etcd.conf"
2018-03-01 19:22:21.672261 E | etcdmain: error verifying flags, error converting YAML to JSON: yaml: line 7: did not find expected <document start>. See 'etcd --help'.

etcd is still 3.2.7

etcd --version
etcd Version: 3.2.7
Git SHA: bb66589
Go Version: go1.8.3
Go OS/Arch: linux/amd64

I see this as an example:
https://github.com/coreos/etcd/blob/master/etcd.conf.yml.sample

am I missing something? Should this be an environment file instead?

kube component can not do `Watch` when apiserver is set to VIP:PORT?

hi (:
the default load balancing strategy of nginx is round-robin (rr), so when a pod (e.g. kube-proxy) does a Watch, it prints a lot of warning log messages like

W0301 02:10:52.929987       1 reflector.go:341] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:85: watch of *core.Service ended with: very short watch: k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:85: Unexpected watch close - watch lasted less than a second and no items received

How to deal with this issue? or just ignore it?

etcd role works only from root

If the playbook is run without root, the etcd role fails due to lack of permissions when it tries to copy the certs from localhost to all etcd machines:

- name: Copy certs to all etcd nodes

Reason being: at unarchive, all files are owned by root (as they were on the primary etcd), and a non-root user on the control machine (local_action) cannot read them.

  • Ugly workaround: just add mode=0755 at the unarchive local action:
    - name: Unarchive certificates on localhost...
  • Ideally, split the certs.tar.gz archive into two archives: one for kubeadmcfg.yaml, one for the certificates.
    Once there are two archives, there is no need to unarchive on the control machine; ansible can transfer the archive from the local control machine to the destination and unarchive it there, with the right permissions.

bridge-nf-call-iptables - suggestion

"/root/join-worker-node.sh" - FATAL ERROR
(ignorable, as it should be done as a prereq maybe)
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
echo "1" > /proc/sys/net/bridge/bridge-nf-call-iptables
but also make it persistent

kubeadm token create --print-join-command - suggestion

I noticed there is a nice option --print-join-command, which prints the complete join command:
kubeadm token create --print-join-command
I0914 14:11:14.396695 3948 feature_gate.go:230] feature gates: &{map[]}
kubeadm join 10.1.3.2:6443 --token aaaaaa.bbbbbbbbbbbbbbbb --discovery-token-ca-cert-hash sha256:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb

You may want to look at it for the join-token role, which is quite complicated now with the openssl option.

No package matching 'nginx-1.12.2' found available, installed or updated

I am testing your scripts out and getting the following error:

TASK [nginx : Install nginx via package manager] ***********************************************************************************************************************************************************
fatal: [my-cluster-master-1]: FAILED! => {"changed": false, "failed": true, "msg": "No package matching 'nginx-1.12.2' found available, installed or updated", "rc": 126, "results": ["No package matching 'nginx-1.12.2' found available, installed or updated"]}
fatal: [my-cluster-master-2]: FAILED! => {"changed": false, "failed": true, "msg": "No package matching 'nginx-1.12.2' found available, installed or updated", "rc": 126, "results": ["No package matching 'nginx-1.12.2' found available, installed or updated"]}
fatal: [my-cluster-master-3]: FAILED! => {"changed": false, "failed": true, "msg": "No package matching 'nginx-1.12.2' found available, installed or updated", "rc": 126, "results": ["No package matching 'nginx-1.12.2' found available, installed or updated"]}

Role: ha-settings improvements

You are doing some steps twice, which is not necessary (it seems like a copy-paste error to me):

ha-settings/tasks/main.yaml

  • Get current kube-proxy settings
  • Edit current kube-proxy settings to use the virtual IP instead of the host IP
  • Apply edited kube-proxy settings
  • Force restart of all kube-proxy pods
