kubernetes-retired / kubeadm-dind-cluster
A Kubernetes multi-node test cluster based on kubeadm
License: Apache License 2.0
I'm back... I have a question: is there a way to enable the docker arg --net=host on each node first, and then on all of their child docker containers? This is specifically for the v1.6 script, and I'm going to need it for the rbd commands, which apparently need a clear networking path to the host. Do you have any pointers for enabling this?
I'm hoping I can continue leveraging this script; I'm trying to avoid going to straight kubeadm :(
If this is not the best way to ask questions please let me know of other avenues.
Thanks,
aramis
Hi there, I am trying to build a dev machine with 1.8, but it is failing at kube-proxy.
Any help really appreciated.
WARNING: cluster glitch: proxy pods aren't removed; pods may 'blink' for some time after restore
NAME READY STATUS RESTARTS AGE
etcd-kube-master 1/1 Running 0 17s
kube-dns-545bc4bfd4-vnq5z 2/3 Terminating 0 59s
kube-dns-855bdc94cb-gj8q9 0/3 Terminating 0 25s
kube-proxy-mfx66 0/1 Terminating 2 59s
kube-scheduler-kube-master 0/1 Pending 0 2s
- Waiting for kube-proxy and the nodes
.......................................................................................................................................................................................................Error waiting for kube-proxy and the nodes
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-kube-master 1/1 Running 1 3m
kube-system kube-apiserver-kube-master 1/1 Running 1 1m
kube-system kube-controller-manager-kube-master 1/1 Running 1 1m
kube-system kube-proxy-d7dlb 0/1 CrashLoopBackOff 4 2m
kube-system kube-proxy-gwdfl 0/1 CrashLoopBackOff 4 2m
kube-system kube-proxy-t49zz 0/1 Error 4 1m
kube-system kube-scheduler-kube-master 1/1 Running 1 3m
W0118 14:16:26.111384 1 server.go:191] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
time="2018-01-18T14:16:26Z" level=warning msg="Running modprobe ip_vs failed with message: ``, error: exec: "modprobe": executable file not found in $PATH"
W0118 14:16:26.123933 1 server_others.go:268] Flag proxy-mode="" unknown, assuming iptables proxy
I0118 14:16:26.126071 1 server_others.go:122] Using iptables Proxier.
W0118 14:16:26.134066 1 proxier.go:476] clusterCIDR not specified, unable to distinguish between internal and external traffic
I0118 14:16:26.134240 1 server_others.go:157] Tearing down inactive rules.
I0118 14:16:26.194376 1 conntrack.go:98] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0118 14:16:26.194605 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0118 14:16:26.201140 1 conntrack.go:83] Setting conntrack hashsize to 32768
error: write /sys/module/nf_conntrack/parameters/hashsize: operation not supported
For our use case we don't require the kubernetes dashboard, so it would be useful for us (and save time) to have a flag that would not require it to be installed and wait for it to come up before completing a run / restore.
According to the kubeadm docs, kubeadm uses etcd v3.0.17, but the etcd image in mirantis/kubeadm-dind-cluster v1.6 and v1.7 is still v2.2.5. Could you upgrade the etcd image to v3.0.17?
I didn't see how save.tar.lz4 was created, so I don't know how to upgrade this myself.
I used the dind-cluster-v1.8.sh script for the first time with up and installed a sample application. Services, deployments, etc. were working fine. After some time the VM got restarted. We tried to bring the dind cluster back up with the script and the reup command, but the services and deployments were all deleted.
How to bring up high availability dind cluster?
The reason is a race condition in the docker info and grep -q pipe-line used as detection mechanism.
This is the detection mechanism used to detect moby Linux:
if docker info|grep -q '^Kernel Version: .*-moby$'; then
is_moby_linux=1
fi
Since grep -q exits immediately with a zero status as soon as a match is found, it sometimes exits while docker info is still writing to the pipe; with the reader gone (because grep has exited), docker info receives a SIGPIPE signal from the kernel and exits with a status of 141.
Now, given that the pipefail shell option is enabled, the pipeline's return status is the value of the last (rightmost) command to exit with a non-zero status, in this case docker info's 141, so the if condition is false in these cases.
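A race-free variant (a sketch, not the project's actual fix) captures the output first, so grep's early exit can no longer SIGPIPE docker info:

```shell
# Capture `docker info` output before matching; grep then reads from a
# variable, so its early exit cannot SIGPIPE the writer anymore.
docker_info="$(docker info 2>/dev/null || true)"
if grep -q '^Kernel Version: .*-moby$' <<<"${docker_info}"; then
  is_moby_linux=1
fi
```

The same trick works for any `producer | grep -q` pipeline running under `set -o pipefail`.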
kubeadm-dind works great on amd64 machines; it would be awesome if the mirantis-dind container ran on the arm and arm64 arches too.
I am using the v1.6 script to deploy a cluster. It works, but it does not respect config changes that I apply to config.sh. For example, I updated DIND_SUBNET and deployed, but the cluster was deployed with the default 10.192.0.0 network. I then ran down/clean, updated NUM_NODES to 1, and ran up, but 2 nodes were deployed.
$ git diff -w
diff --git a/config.sh b/config.sh
index d735523..c399885 100644
--- a/config.sh
+++ b/config.sh
@@ -1,5 +1,5 @@
# DIND subnet (/16 is always used)
-DIND_SUBNET=10.192.0.0
+DIND_SUBNET=2001::
# Apiserver port
APISERVER_PORT=${APISERVER_PORT:-8080}
@@ -7,7 +7,7 @@ APISERVER_PORT=${APISERVER_PORT:-8080}
# Number of nodes. 0 nodes means just one master node.
# In case of NUM_NODES=0 'node-role.kubernetes.io/master' taint is removed
# from the master node.
-NUM_NODES=${NUM_NODES:-2}
+NUM_NODES=${NUM_NODES:-1}
# Use non-dockerized build
# KUBEADM_DIND_LOCAL=
However, changes are respected when the settings are updated directly in the v1.6 script:
$ git diff dind-cluster-v1.6.sh
diff --git a/fixed/dind-cluster-v1.6.sh b/fixed/dind-cluster-v1.6.sh
index 968a85e..75ee043 100755
--- a/fixed/dind-cluster-v1.6.sh
+++ b/fixed/dind-cluster-v1.6.sh
@@ -47,7 +47,7 @@ if [[ ! ${EMBEDDED_CONFIG:-} ]]; then
fi
CNI_PLUGIN="${CNI_PLUGIN:-bridge}"
-DIND_SUBNET="${DIND_SUBNET:-10.192.0.0}"
+DIND_SUBNET="${DIND_SUBNET:-2001::}"
dind_ip_base="$(echo "${DIND_SUBNET}" | sed 's/\.0$//')"
DIND_IMAGE="${DIND_IMAGE:-}"
BUILD_KUBEADM="${BUILD_KUBEADM:-}"
$ ./dind-cluster-v1.6.sh up
* Making sure DIND image is up to date
v1.6: Pulling from mirantis/kubeadm-dind-cluster
Digest: sha256:b81a47264b1992bfeb76f0407e886feded413edd7f5fcbab02ea296831b43db2
Status: Image is up to date for mirantis/kubeadm-dind-cluster:v1.6
* Saving a copy of docker host's /lib/modules
* Starting DIND container: kube-master
docker: Error response from daemon: invalid IPv4 address: 2001::.2.
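A side note on the behavior above (a sketch of the pattern, not an official explanation): the fixed script defaults each setting with `${VAR:-default}`, so exporting the variable in the environment overrides it without editing config.sh or the script itself:

```shell
# The same ${VAR:-default} pattern the fixed scripts use; an exported
# value takes precedence over the baked-in default.
NUM_NODES="${NUM_NODES:-2}"
DIND_SUBNET="${DIND_SUBNET:-10.192.0.0}"
echo "nodes=${NUM_NODES} subnet=${DIND_SUBNET}"
```

e.g. NUM_NODES=1 ./dind-cluster-v1.6.sh up, which avoids relying on config.sh (the fixed scripts embed their own config and may ignore it).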
I'm curious to know if it's possible to start the cluster with RBAC enabled, to allow for "--clusterrole" flag functionality and the like.
Running on Docker for Mac 17.12.0 (and the same problem with 17.09.01, I think).
get pods --all-namespaces
shows the kube-dns and kube-proxy images failing. The logs from kube-proxy look a lot like issue #50:
I0113 07:50:07.318108 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0113 07:50:07.325346 1 conntrack.go:83] Setting conntrack hashsize to 32768
error: write /sys/module/nf_conntrack/parameters/hashsize: operation not supported
Reverting to the prior version of mirantis/kubeadm-dind-cluster:v1.8 fixed the problem.
(We had to hack in our docker engine to find the prior layers in order to restore. It would be quite helpful if Docker Hub had tags for all releases instead of just replacing them; e.g. tag v1.8 could be the most recent, but it would also have been helpful to have explicit tags v1.8.6, v1.8.4, ... for restoring in case of problems. It would also be useful if we could rebuild the image ourselves in cases like this, but we weren't sure how to.)
I am unable to deploy a kubeadm-dind-cluster on Ubuntu 16.04.2 or CentOS 7. I get the following error:
Jun 22 19:23:34 kube-master systemd[1]: Starting Docker Application Container Engine...
Jun 22 19:23:34 kube-master rundocker[180]: Trying to load overlay module (this may fail)
Jun 22 19:23:34 kube-master rundocker[180]: time="2017-06-22T19:23:34.197080260Z" level=info msg="libcontainerd: new containerd process, pid: 189"
Jun 22 19:23:35 kube-master rundocker[180]: time="2017-06-22T19:23:35.208001616Z" level=fatal msg="Error starting daemon: error initializing graphdriver: driver not supported"
Jun 22 19:23:35 kube-master systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jun 22 19:23:35 kube-master systemd[1]: Failed to start Docker Application Container Engine.
Jun 22 19:23:35 kube-master systemd[1]: docker.service: Unit entered failed state.
Jun 22 19:23:35 kube-master systemd[1]: docker.service: Failed with result 'exit-code'.
Here are details of my CentOS 7 setup:
$ docker version
Client:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:20:01 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Go version: go1.6.4
Git commit: 78d1802
Built: Tue Jan 10 20:20:01 2017
OS/Arch: linux/amd64
$ rpm --query centos-release
centos-release-7-3.1611.el7.centos.x86_64
# docker info|grep Storage
Storage Driver: overlay
I have tried the overlay and overlay2 drivers, but I hit the same error.
Here are the details of my Ubuntu setup:
# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
# docker version
Client:
Version: 1.11.2
API version: 1.23
Go version: go1.5.4
Git commit: b9f10c9
Built: Wed Jun 1 22:00:43 2016
OS/Arch: linux/amd64
Server:
Version: 1.11.2
API version: 1.23
Go version: go1.5.4
Git commit: b9f10c9
Built: Wed Jun 1 22:00:43 2016
OS/Arch: linux/amd64
# docker info | grep Stor
WARNING: No swap limit support
Storage Driver: aufs
@pmichali is experiencing the same problem.
We are trying to use kubeadm-dind-cluster for k8s IPv6 e2e testing. More details on the issue can be found at kubernetes/kubernetes#47666
Hello,
I seem to be having an issue getting hostPort working. I am trying to get the off-the-shelf nginx ingress controller working; I am including the YAML for reference. Since this is powered by kubeadm, I assumed I would have to use the hostNetwork: true trick to get hostPort to function correctly.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: default-http-backend
  labels:
    k8s-app: default-http-backend
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: default-http-backend
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: default-http-backend
        # Any image is permissible as long as:
        # 1. It serves a 404 page at /
        # 2. It serves 200 on a /healthz endpoint
        image: gcr.io/google_containers/defaultbackend:1.0
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 5
        ports:
        - containerPort: 8080
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
---
apiVersion: v1
kind: Service
metadata:
  name: default-http-backend
  namespace: kube-system
  labels:
    k8s-app: default-http-backend
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    k8s-app: default-http-backend
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-ingress-controller
  labels:
    k8s-app: nginx-ingress-controller
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: nginx-ingress-controller
    spec:
      # hostNetwork makes it possible to use ipv6 and to preserve the source IP correctly regardless of docker configuration
      # however, it is not a hard dependency of the nginx-ingress-controller itself and it may cause issues if port 10254 already is taken on the host
      # that said, since hostPort is broken on CNI (https://github.com/kubernetes/kubernetes/issues/31307) we have to use hostNetwork where CNI is used
      # like with kubeadm
      hostNetwork: true
      terminationGracePeriodSeconds: 60
      containers:
      - image: gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.3
        name: nginx-ingress-controller
        readinessProbe:
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
        livenessProbe:
          httpGet:
            path: /healthz
            port: 10254
            scheme: HTTP
          initialDelaySeconds: 10
          timeoutSeconds: 1
        ports:
        - containerPort: 80
          hostPort: 80
        - containerPort: 443
          hostPort: 443
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        args:
        - /nginx-ingress-controller
        - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
When trying to start up DinD, during kubeadm init, a failure is seen for Docker Application Container Engine, saying driver not supported.
When checking the master container, docker is not running due to this error. If an attempt is made to manually start docker, the same error is seen.
This works on Ubuntu 16.04. In comparing the two, we see that for Ubuntu the host and the master container use the "overlay2" driver. Both are running Ubuntu. For CentOS 7, the host is using "devicemapper" (overlay2 is not supported), and the master container is using "overlay2". However, the container OS is RHEL 4.8.5-11.
Is it possible that RHEL doesn't support the "overlay2" driver? If so, how can we force a different driver that does work?
The below line is no longer required, as Weave CNI is now supported in kubeadm-dind-cluster:
https://github.com/Mirantis/kubeadm-dind-cluster/blob/master/config.sh#L31
This is to track efforts to add support in DinD for a Kubernetes cluster running in IPv6 mode (initially IPv6-only, later dual-stack), connected to an IPv4 network. The goal is to be able to use DinD for E2E testing of IPv6 functionality in Kubernetes.
Note: IPv6 support in Kubernetes is a WIP, and as such, for now, custom Kubernetes repos are used that have patches/changes that are in-flight for kubernetes.
A DinD PR will be provided, based on the fix-1.8+ branch that is currently available (and based on the latest on master branch).
I am using master for kdc and k8s and the dind::at-least-kubeadm-1-8 function does not work properly for me. My deployment completes successfully, but kubeadm init does not reference kubeadm.conf. When I remove the reference to dind::at-least-kubeadm-1-8 in the dind::init function, kubeadm init works properly by using kubeadm.conf. Here are a few details from my deployment.
I have to use sudo -E when calling the gce-setup.sh script, otherwise I get permission errors when running kdc:
$ . /Users/daneyonhansen/code/go/src/github.com/Mirantis/kubeadm-dind-cluster/gce-setup.sh
<SNIP>
chown: /Users/daneyonhansen/code/go/src/k8s.io/kubernetes/_output/images/kube-build:build-f0510dd6c7-5-v1.9.1-1/Dockerfile: Operation not permitted
<SNIP>
When I run with sudo -E, my deployment completes, but it does not use kubeadm.conf:
$ sudo -E /Users/daneyonhansen/code/go/src/github.com/Mirantis/kubeadm-dind-cluster/gce-setup.sh
<SNIP>
* Starting DIND container: kube-master
* Running kubeadm: init --pod-network-cidr=10.244.0.0/16 --skip-preflight-checks
<SNIP>
It appears the regex used in dind::at-least-kubeadm-1-8 does not work:
$ docker exec kube-master kubeadm version -o short 2>/dev/null|sed 's/^\(v[0-9]*\.[0-9]*\).*')
-bash: syntax error near unexpected token `)'
This is what I get without the regex:
$ sudo -E docker exec kube-master kubeadm version -o short
v1.9.0-alpha.2.852+cbdd18eee97369
When going to use this, I noticed two things that concerned me:
I would expect the default behavior of the v1.6 script, for example, to be to use the latest bugfix version of v1.6, with the option at startup to override it and use a different one.
I am trying to deploy kubeadm-dind-cluster to GCE using the GCE setup script. The deployment fails because the docker engine does not start. The daemon does not start because the systemd drop-in references /usr/bin/docker daemon instead of /usr/bin/dockerd. As soon as I update the drop-in, reload, and restart the daemon, docker starts successfully.
I am running docker-machine version 0.10.0, build 76ed2a6. Here are the details of the issue. Do you have any recommendations for making docker-machine create the proper daemon name in the drop-in?
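The manual fix can be sketched as a sed rewrite of the drop-in (the drop-in file name and contents below are assumptions about a typical docker-machine provisioned VM; the rewrite is demonstrated on a local copy):

```shell
# Rewrite `docker daemon` (removed in Docker 1.12+) to `dockerd` in a
# copy of the systemd drop-in, mirroring the manual fix on the VM.
dropin="${TMPDIR:-/tmp}/10-machine.conf"
printf 'ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376\n' > "${dropin}"
sed -i 's|/usr/bin/docker daemon|/usr/bin/dockerd|' "${dropin}"
grep dockerd "${dropin}"
# On the real VM, follow with: systemctl daemon-reload && systemctl restart docker
```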
With the release of k8s 1.8, are you planning to release a 1.8 version of kubeadm-dind-cluster? If so, what is the planned release date?
On Alpine, including when running in the standard "docker:latest" image, the checksum verification of kubectl fails on first run. This appears to be because the echo piped into sha1sum -c is missing an extra space:
echo "${kubectl_sha1} ${path}" | sha1sum -c
should be (note the two spaces rather than one):
echo "${kubectl_sha1}  ${path}" | sha1sum -c
I haven't tested running it on different Linux variants; as this might be a busybox/Alpine-specific problem, I'm filing an issue rather than a PR.
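The two-space requirement comes from the checksum-file format that sha1sum -c parses ("<hash>  <filename>"), which busybox enforces strictly; a minimal reproduction sketch:

```shell
# busybox's sha1sum -c is strict about the "<hash>  <file>" line format:
# two spaces between the hash and the path.
f="${TMPDIR:-/tmp}/checkme.txt"
echo hello > "${f}"
sha1="$(sha1sum "${f}" | cut -d' ' -f1)"
echo "${sha1}  ${f}" | sha1sum -c -   # two spaces: verification passes
```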
Is there a way to mount host volumes?
Let's start with Flannel and Calico
I see the following error in a gce ipv6 deployment:
dind-cluster.sh: line 486: ip: command not found
Here is the related function in the script:
function dind::ensure-nat {
  if [[ ${IP_MODE} = "ipv6" ]]; then
    if ! docker ps | grep tayga >&/dev/null; then
      docker run -d --name tayga --hostname tayga --net kubeadm-dind-net --label mirantis.kubeadm_dind_cluster \
        --sysctl net.ipv6.conf.all.disable_ipv6=0 --sysctl net.ipv6.conf.all.forwarding=1 \
        --privileged=true --ip 172.18.0.200 --ip6 ${LOCAL_NAT64_SERVER} --dns ${REMOTE_DNS64_V4SERVER} --dns ${dns_server} \
        -e TAYGA_CONF_PREFIX=${DNS64_PREFIX_CIDR} -e TAYGA_CONF_IPV4_ADDR=172.18.0.200 \
        danehans/tayga:latest >/dev/null
      # TODO Way to add route w/o sudo? Need to check/create, as "clean" may remove route
      local route="$(ip route | egrep "^172.18.0.128/25")"
      if [[ -z "${route}" ]]; then
        if [[ "${GCE_HOSTED}" = true ]]; then
          docker-machine ssh k8s-dind sudo ip route add 172.18.0.128/25 via 172.18.0.200
        else
          sudo ip route add 172.18.0.128/25 via 172.18.0.200
        fi
      fi
    fi
  fi
}
I believe how the route is checked should depend on $GCE_HOSTED. If $GCE_HOSTED = true, then:
docker-machine ssh k8s-dind sudo ip route | egrep "^172.18.0.128/25"
else:
ip route | egrep "^172.18.0.128/25"
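A sketch of that conditional (the nat64_route_present helper is hypothetical; the real script inlines the check in dind::ensure-nat):

```shell
# Hypothetical helper: match the NAT64 route in `ip route` output.
nat64_route_present() {
  grep -Eq '^172\.18\.0\.128/25' <<<"$1"
}

# Run `ip route` on the docker-machine VM when GCE-hosted, locally otherwise.
if [[ "${GCE_HOSTED:-}" = true ]]; then
  routes="$(docker-machine ssh k8s-dind sudo ip route 2>/dev/null || true)"
else
  routes="$(ip route 2>/dev/null || true)"
fi
nat64_route_present "${routes}" || echo "route 172.18.0.128/25 missing"
```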
cc @pmichali
Running "./dind-cluster-v1.6.sh up" creates the cluster successfully.
NAME STATUS AGE VERSION
kube-master Ready 9m v1.6.6
kube-node-1 Ready 6m v1.6.6
kube-node-2 Ready 6m v1.6.6
* Access dashboard at: http://localhost:8080/ui
But it took a very long time; checking the output, I found "[apiclient] All control plane components are healthy after 35581.805504 seconds".
Where can I get the full log, so I can analyze what happened during those 35581.805504 seconds? Thanks.
[init] Using Kubernetes version: v1.6.7
[init] Using Authorization mode: RBAC
[preflight] Skipping pre-flight checks
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [kube-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.192.0.2]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 35581.805504 seconds
[apiclient] Waiting for at least one node to register
[apiclient] First node has registered after 2.537246 seconds
[token] Using token: 7561fb.eea50f2e1a373de4
[apiconfig] Created RBAC rules
[addons] Created essential addon: kube-proxy
[addons] Created essential addon: kube-dns
Your Kubernetes master has initialized successfully!
Hello! I think people would see the obvious benefit of your approach if you documented what a workflow for using kubeadm-dind-cluster would look like for a kubernetes core developer. As an example:
git pull foo
It appears to me that steps 3-5 are very well suited to kubeadm-dind-cluster and probably much faster than using a VM. I think that by giving people an example like this, they can see first-hand how it would work.
kube-apiserver and kube-controller-manager failed to start.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
45dd162516f4 40c77120c3b5 "kube-apiserver --ins" 11 seconds ago Exited (2) 9 seconds ago k8s_kube-apiserver_kube-apiserver-kube-master_kube-system_e8e89663840a3c709cb6fb6c80d6114a_5
root@kube-master:/etc/kubernetes/manifests# docker logs 45dd162516f4
unknown flag: --insecure-port
Usage of kube-apiserver:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b8cc58d08dd5 40c77120c3b5 "kube-controller-mana" 11 seconds ago Exited (2) 10 seconds ago k8s_kube-controller-manager_kube-controller-manager-kube-master_kube-system_504f2ed899e013f03c12d042973e8167_6
root@kube-master:/etc/kubernetes/manifests# docker logs b8cc58d08dd5
unknown flag: --root-ca-file
Usage of kube-controller-manager:
Kubenet support is needed for k8s e2e. Here is how you configure kubenet networking:
Do not set up a CNI conf file and do not call any CNI plugin. Kubenet wraps the CNI bridge and local-ipam plugins and dynamically creates the CNI conf file based on the controller-manager args below.
Update all kubelet configs to include:
Environment="KUBELET_NETWORK_ARGS=--network-plugin=kubenet --non-masquerade-cidr=$CLUSTER_CIDR"
Update the controller-manager args to include:
- --allocate-node-cidrs=true
- --cluster-cidr=$CLUSTER_CIDR
Add host routes for the per-node pod subnets, e.g.:
sudo route add -net 10.10.1.0 netmask 255.255.255.0 gw 192.168.100.20
sudo route add -net 10.10.0.0 netmask 255.255.255.0 gw 192.168.100.10
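The kubelet piece above could be delivered as a systemd drop-in; a sketch that writes to a temp path (the real target directory /etc/systemd/system/kubelet.service.d/ and the CLUSTER_CIDR value are assumptions):

```shell
# Write the kubenet kubelet args as a systemd drop-in (to a temp path
# here; the real target would be /etc/systemd/system/kubelet.service.d/).
CLUSTER_CIDR="10.244.0.0/16"   # assumed value
dropin="${TMPDIR:-/tmp}/20-kubenet.conf"
cat > "${dropin}" <<EOF
[Service]
Environment="KUBELET_NETWORK_ARGS=--network-plugin=kubenet --non-masquerade-cidr=${CLUSTER_CIDR}"
EOF
grep kubenet "${dropin}"
```

On a real node one would follow with systemctl daemon-reload and a kubelet restart.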
cc: @pmichali
I installed kubeadm-dind-cluster successfully on my local host using dind-cluster-v1.8.sh up. I created a deployment and exposed a service. I can access that service inside the master container, but not from the local host. Can anybody help me, please?
Hi,
I made a Kubernetes workshop, and some people found a fix to run kubeadm-dind on OSX:
"one1zero1one
Done also in OSX 10.11.6, had to do the following: brew install md5shasum, created a /boot folder in OSX and added it to docker preferences file sharing, then did a chmod a+x kubectl-v1.6.1 in the /Uses/user/.kubeadm-dind-cluster - after that the script runs fine and you get the cluster."
I don't have OSX myself, but I am gonna try to reproduce it via a vagrant OSX box using KVM and nested virtualization.
I think he used Docker-on-mac.
In wrapkubeadm, the script passes --conntrack-max=0 and --conntrack-max-per-core=0 to kube-proxy in an attempt to tell it to skip updating the hashsize when a large conntrack max is configured on the host, to avoid docker issue moby/moby#24000.
Unfortunately, recent changes to kube-proxy have it read the CLI args, and then attempt to read from a config file. If there is a config file (there always is), the CLI args are ignored. As a result, with the reading of the config file, there are no conntrack settings so the default conntrack max-per-core value of 32768 is used. When on a system with 32 CPUs, the resulting conntrack value (1048576) can be more than four times larger than the hashsize (e.g. 262144), which causes kube-proxy to attempt to increase the hashsize and hits the docker issue.
DinD needs to be able to set both conntrack max and max-per-core to zero in the config file, which tells kube-proxy to ignore attempting to modify the hashsize in the condition where there is a large number of conntracks.
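A hedged sketch of such a config fragment (field names follow the kubeproxy.config.k8s.io/v1alpha1 KubeProxyConfiguration schema; verify them against your kube-proxy version):

```shell
# Emit a kube-proxy config fragment zeroing the conntrack tuning,
# mirroring what --conntrack-max=0 --conntrack-max-per-core=0 did on
# the CLI (written to a temp path for illustration).
frag="${TMPDIR:-/tmp}/kube-proxy-conntrack.yaml"
cat > "${frag}" <<'EOF'
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
conntrack:
  max: 0
  maxPerCore: 0
EOF
grep -A2 '^conntrack:' "${frag}"
```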
I'm trying to run your awesome project behind a corporate proxy.
I first ran ./dind-cluster-v1.8.sh up to see what I'd get, and it failed when trying to pull mirantis/kubeadm-dind-cluster from Docker Hub. I didn't configure the proxy on my Docker daemon, as I'm proxying all calls to Docker Hub through an internal Docker registry. So I updated the image names in dind-cluster-v1.8.sh with the prefix of my internal Docker registry.
I ran ./dind-cluster-v1.8.sh up another time. Pulling mirantis/kubeadm-dind-cluster worked, but then I got another issue:
...
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /lib/systemd/system/kubelet.service.
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
unable to get URL "https://dl.k8s.io/release/stable-1.8.txt": Get https://dl.k8s.io/release/stable-1.8.txt: dial tcp 23.236.58.218:443: i/o timeout
More attempts to reach this URL follow and quite logically fail. So I have to exit the process manually at this stage.
Do you have any idea what I have to configure in order to run the cluster from behind a corporate proxy (or whether this has at least been tested)?
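One thing worth trying (an untested sketch; the proxy address is a placeholder): export the standard proxy variables before running the script, and make sure NO_PROXY covers local and cluster traffic:

```shell
# Placeholder proxy address; adjust for your environment.
export HTTP_PROXY="http://proxy.example.com:3128"
export HTTPS_PROXY="${HTTP_PROXY}"
# Keep local and cluster traffic off the proxy (10.192.0.0/16 is the
# default DIND subnet).
export NO_PROXY="127.0.0.1,localhost,10.192.0.0/16"
echo "proxy=${HTTPS_PROXY} no_proxy=${NO_PROXY}"
```

Then run ./dind-cluster-v1.8.sh up; note the inner node containers may need the variables passed through as well, which the script may not do on its own.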
Assuming you've already run ./dind-cluster-v1.6.sh up successfully once, it would be nice to allow the cluster to still start up offline with no internet connection (airplane, park, etc.).
To reproduce: turn off wifi on your laptop and run ./dind-cluster-v1.6.sh up
$ ./dind-cluster-v1.6.sh up
* Making sure DIND image is up to date
Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io on 192.168.65.1:53: read udp 192.168.65.2:43749->192.168.65.1:53: i/o timeout
I have not yet looked at the extent to which a new script and other changes would be needed to support v1.7, but I'm putting a placeholder here so that it can be tracked, if that's OK.
I successfully installed dind-cluster on my laptop using 1.8 sh. I then followed steps from k8s tutorial:
https://kubernetes.io/docs/tutorials/stateless-application/expose-external-ip-address/
It is possible to access the service from inside the cluster.
However, the external port never gets assigned; EXTERNAL-IP stays in pending forever:
sava@DellXPS:/mnt/c/k8s/helloworld$ kubectl get services my-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-service LoadBalancer 10.104.13.14 <pending> 8080:31339/TCP 21m
Could somebody give me a hint if / how would be possible to expose LoadBalancer service on the host with running kubeadm-dind-cluster? Could https://github.com/Mirantis/k8s-externalipcontroller be used?
To allow DinD to be easily used by developers at different locations, it would be nice to be able to specify the "zone" or have it read it in via config settings (like project is set).
May also want to allow the VM name to be overridden, so that the same script could be used from multiple machines.
In dind-cluster.sh, when the E2E command is run, it ends up running e2e.go and passing -check_version_skew, which appears to have been renamed to -check-version-skew.
If wanted, I could do a pull request with these changes.
So I have successfully used the dind-cluster-v1.5.sh script to install a 1.5 version of kubernetes in my docker container.
However, when I use the dind-cluster-v1.6.sh script to install version 1.6, it hangs at the following:
" * Waiting for kube-proxy and the nodes "
At this point Kubectl commands are not executing from the k8s host or the master node. Is this a known issue?
When starting a 1.9 cluster using the fixed version script we fail to pull the image from docker hub:
./fixed/dind-cluster-v1.9.sh up
* Making sure DIND image is up to date
Error response from daemon: manifest for mirantis/kubeadm-dind-cluster:v1.9 not found
The v1.9 tag hasn't been pushed upstream yet: https://hub.docker.com/r/mirantis/kubeadm-dind-cluster/tags/
The solution is to tag and publish this image in the mirantis repo. In the meantime, you can build your own image locally using the non-fixed script:
build/build-local.sh
DIND_IMAGE=mirantis/kubeadm-dind-cluster:local ./dind-cluster.sh up
use mine:
DIND_IMAGE="stealthybox/kubeadm-dind-cluster:v1.9" ./dind-cluster.sh up
As of now, /boot and /lib/modules are not copied from the host, and we don't mount them with -v either, because this is not compatible with Moby Linux-based Dockers (e.g. Mac OS X). The proper solution is copying these directories from the host Linux by means of e.g. a busybox image + nsenter. This is needed for Virtlet to work on kubeadm-dind-cluster.
I'm trying to increase my worker nodes to 3; currently there are 2. I couldn't find any workaround in the wiki.
This will make it easier to use kubeadm-dind-cluster for Virtlet devenv & CI.
Provide prebuilt image for k8s stable (1.5.x).
Try to make an image for k8s 1.4 if this is not too difficult.
Hi,
This is an awesome project. I tried it on "CentOS Linux release 7.3.1611 (Core)" with docker 1.12.6, but got the following error. It looks like devicemapper, the default graphdriver on CentOS, is unsupported; am I right?
[root@localhost kubeadm-dind-cluster]# ./dind-cluster-v1.5.sh up
WARNING: Usage of loopback devices is strongly discouraged for production use. Use --storage-opt dm.thinpooldev to specify a custom block storage device.
WARNING: Usage of loopback devices is strongly discouraged for production use. Use --storage-opt dm.thinpooldev to specify a custom block storage device.
- Making sure DIND image is up to date
Trying to pull repository docker.io/mirantis/kubeadm-dind-cluster ...
v1.5: Pulling from docker.io/mirantis/kubeadm-dind-cluster
952132ac251a: Already exists
82659f8f1b76: Already exists
c19118ca682d: Already exists
8296858250fe: Already exists
24e0251a0e2c: Already exists
2545d638d973: Already exists
e0b45d7ea196: Already exists
8d7d40f3e602: Already exists
216f5a138844: Already exists
c71de27d6b60: Already exists
b4905a66b05c: Already exists
88d9c6d89a0e: Already exists
5b20a29e0052: Already exists
096f47601f48: Already exists
cb5873b128e5: Already exists
90aa4e16a184: Pull complete
b9cbd586a93b: Pull complete
fe48e937b7c1: Pull complete
1a7ea6f613e5: Pull complete
0720888e3849: Pull complete
0e6e0fb90af6: Pull complete
6352f14208e8: Pull complete
5e25d1c1f645: Pull complete
93db4372e6a5: Pull complete
Digest: sha256:051af9b28a1cb767e91a678d89bbaa36007606b39d1242da68f5a069481d016e
- Removing container: 87af074de5cb
87af074de5cb
- Starting DIND container: kube-master
- Running kubeadm: init --skip-preflight-checks
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
docker failed to start. Diagnostics below:
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-dind.conf, 20-fs.conf
Active: failed (Result: exit-code) since Tue 2017-03-28 06:47:20 UTC; 19ms ago
Docs: https://docs.docker.com
Process: 205 ExecStart=/usr/local/bin/rundocker $DOCKER_EXTRA_OPTS (code=exited, status=1/FAILURE)
Main PID: 205 (code=exited, status=1/FAILURE)
Mar 28 06:47:19 kube-master systemd[1]: Starting Docker Application Container Engine...
Mar 28 06:47:19 kube-master rundocker[205]: Trying to load overlay module (this may fail)
Mar 28 06:47:19 kube-master rundocker[205]: time="2017-03-28T06:47:19.483431970Z" level=info msg="libcontainerd: new containerd process, pid: 217"
Mar 28 06:47:20 kube-master rundocker[205]: time="2017-03-28T06:47:20.492311655Z" level=fatal msg="Error starting daemon: error initializing graphdriver: driver not supported"
Mar 28 06:47:20 kube-master systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Mar 28 06:47:20 kube-master systemd[1]: Failed to start Docker Application Container Engine.
Mar 28 06:47:20 kube-master systemd[1]: docker.service: Unit entered failed state.
Mar 28 06:47:20 kube-master systemd[1]: docker.service: Failed with result 'exit-code'.
*** kubeadm failed
[root@localhost kubeadm-dind-cluster]#
Basically the goal is to replace minikube with this. Minikube is CPU- and memory-intensive and has only one master node and no workers.
The dind approach is by far better, just not as mature.
One thing that doesn't work (or I found no way of doing it) is building images locally without pushing them to any registry and then using them in the dind cluster.
That would be awesome.
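Since each DinD node runs its own inner dockerd, one registry-free approach is to stream `docker save` output from the host into each node's `docker load`. A hedged sketch (the image name, node names, and the `DRY_RUN` switch are illustrative; it defaults to printing the commands so you can inspect them first):

```shell
# Sketch: copy a locally built image into every DinD node without a registry.
IMAGE="${IMAGE:-myapp:dev}"
NODES="${NODES:-kube-node-1 kube-node-2}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN= (empty) to actually execute

load_into_nodes() {
  for node in $NODES; do
    # Outer `docker save` piped into the node container's inner `docker load`.
    cmd="docker save $IMAGE | docker exec -i $node docker load"
    if [ -n "$DRY_RUN" ]; then echo "+ $cmd"; else sh -c "$cmd"; fi
  done
}
load_into_nodes
```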
Hi,
I am running the script under Alpine 3.7, and I hit the following issue:
root@host# ls -1 dind-cluster-v1.6.sh
dind-cluster-v1.6.sh
root@host# docker run --privileged -v $PWD:/mnt -it alpine:3.7 /bin/sh
root@container# apk update && apk add curl bash docker
root@container# ./dind-cluster-v1.6.sh up
[...]
No resources found
* Setting cluster config
./dind-cluster-v1.6.sh: line 615: /root/.kubeadm-dind-cluster/kubectl: Permission denied
If I do a chmod +x on it, it then works fine.
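A small workaround along the lines of the chmod fix above can be applied before retrying `up`; the `fix_exec_bits` helper name is mine, and the path is the cache directory the script uses:

```shell
# Sketch: re-apply the execute bit to binaries the script cached under
# ~/.kubeadm-dind-cluster, which on some filesystems/umasks end up
# non-executable (the "Permission denied" above).
fix_exec_bits() {
  dir="${1:-$HOME/.kubeadm-dind-cluster}"
  for bin in "$dir"/kubectl*; do
    [ -e "$bin" ] || continue   # glob matched nothing; skip
    chmod +x "$bin"
  done
}
fix_exec_bits
```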
Current kubeadm no longer supports KUBE_HYPERKUBE_IMAGE and requires passing the control plane image via the config.
kubeadm.conf supports setting extra args for tailoring a kubeadm deployment. See the following example for using extra args to set k8s component IPs:
apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
apiServerExtraArgs:
  etcd-servers: "http://${APISERVER_ADVERTISE_ADDRESS}:2379"
controllerManagerExtraArgs:
  address: "${APISERVER_ADVERTISE_ADDRESS}"
schedulerExtraArgs:
  address: "${APISERVER_ADVERTISE_ADDRESS}"
etcd:
  extraArgs:
    listen-client-urls: "http://${APISERVER_ADVERTISE_ADDRESS}:2379"
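Since the config above contains `${APISERVER_ADVERTISE_ADDRESS}` placeholders, it has to be rendered before `kubeadm init --config` can consume it. A hedged sketch (the template/output file names and `render_config` helper are mine; `kubeadm init --config` is the documented way to pass a config file):

```shell
# Sketch: substitute the advertise address into the config template above.
APISERVER_ADVERTISE_ADDRESS="${APISERVER_ADVERTISE_ADDRESS:-10.192.0.2}"

render_config() {
  # Replace every literal ${APISERVER_ADVERTISE_ADDRESS} in the given file.
  sed "s|\${APISERVER_ADVERTISE_ADDRESS}|$APISERVER_ADVERTISE_ADDRESS|g" "$1"
}

# Usage (on the master node):
#   render_config kubeadm.conf.tmpl > kubeadm.conf
#   kubeadm init --config kubeadm.conf
```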
There's a typo in the bash scripts due to which $POD_NETWORK_CIDR
is not changed to "192.168.0.0/16". I'm not quite sure why such a CIDR adjustment is needed, but it does not work as expected.
Steps to reproduce:
SKIP_SNAPSHOT=true CNI_PLUGIN=calico-kdd ./dind-cluster-v1.8.sh up
and check Pod IPs
Expected result:
Pods have IPs from the "192.168.0.0/16" network.
Actual result:
Pod IPs are from the "10.244.0.0/16" network.
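This is a classic shell failure mode: a misspelled variable name expands to the empty string without any error, so the intended CIDR override silently never takes effect. An illustrative sketch of the bug class (the misspelling `POD_NEWTORK_CIDR` is mine, not the script's exact line); `set -u` turns the silent expansion into a hard error:

```shell
# Sketch: show how `set -u` catches a misspelled-variable bug of the kind
# reported above.
demo() (
  set -u
  POD_NETWORK_CIDR="192.168.0.0/16"
  echo "cidr=${POD_NEWTORK_CIDR}"   # typo: NEWTORK - aborts under set -u
)

if demo 2>/dev/null; then
  echo "typo went unnoticed"
else
  echo "typo caught by set -u"
fi
```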
Hello,
Yes, it's that time again. So I ran into an issue where I use your v1.5 script to set up a 2-node k8s cluster. I then create a service which uses an external IP like
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
web 10.106.52.236 172.17.4.201 80:32023/TCP 1h
I can hit the service from the master k8s node like so:
curl 10.106.52.236:80 --> success 200
curl 0.0.0.0:32023 --> success 200
curl 127.0.0.1:32023 --> success 200
However, whether I am on the master or the docker host, I cannot hit the EXTERNAL-IP of 172.17.4.201. Is there any reason you can think of why this would be an issue?
Thanks,
aramisjohnson
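One likely explanation is that the EXTERNAL-IP only exists inside the DinD network, so the docker host has no route to it. A hedged sketch of a host-side route via one of the node containers (the node name, the /32 route, and the `route_cmd` helper are illustrative; run the printed command as root to apply it):

```shell
# Sketch: print an `ip route` command that sends the service's external IP
# via a DinD node container's address on the docker bridge.
EXTERNAL_IP="${EXTERNAL_IP:-172.17.4.201}"

route_cmd() {
  node="${1:-kube-node-1}"
  # Second arg overrides the node IP; by default ask docker for it.
  node_ip="${2:-$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$node")}"
  echo "ip route add ${EXTERNAL_IP}/32 via ${node_ip}"
}

# On the docker host:  route_cmd kube-node-1 | sudo sh
```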
When running DinD, if a kernel module attempts to target an ip address that is a Kubernetes service IP, the route does not get forwarded.
I've tried the different networking options of bridge, calico, flannel, and weave and they all have the same behavior that the kernel module fails to target the service IP.
The GCE documentation indicates that they had to set iptables rules and sysctl net.ipv4.ip_forward=1
so the kernel will work correctly with bridged containers. It seems like we need something similar for DinD to work with the kernel.
For example, in Rook we have a service defined:
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rook-ceph-mon0 10.101.104.208 <none> 6790/TCP 42m
The service routes to the pod:
NAME READY STATUS RESTARTS AGE IP NODE
rook-ceph-mon0-kqtgv 1/1 Running 0 42m 10.244.1.5 kube-node-1
The routing happens perfectly from other pods/user mode. However, the rbd kernel module is not able to route with the service ip. Any input on this would be appreciated!
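The GCE-style settings mentioned above boil down to two sysctls. A hedged sketch that prints them so they can be piped to `sudo sh` on the host or into each node; whether these alone make the rbd kernel client reach a Service IP is an assumption to verify:

```shell
# Sketch: the forwarding-related sysctls from the GCE guidance above.
forwarding_sysctls() {
  echo "sysctl -w net.ipv4.ip_forward=1"
  # Let iptables see bridged traffic, which kube-proxy's rules rely on:
  echo "sysctl -w net.bridge.bridge-nf-call-iptables=1"
}
forwarding_sysctls

# Apply on the host:       forwarding_sysctls | sudo sh
# Apply inside a node:     forwarding_sysctls | docker exec -i kube-node-1 sh
```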
So this worked like a charm: I booted an Ubuntu container with DinD enabled on it, then I ran ./dind-cluster-v1.5.sh up. The cluster/node came up in a matter of minutes. I can docker exec into the newly created container/node and start issuing kubectl commands like normal.
The only problem is, if I don't docker exec into the container/node, kubectl commands return the following error:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
How can I enable it so kubectl commands work on the parent container?
Again, great job with this, and any help you can provide will be much appreciated.
Ty
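The "localhost:8080" error above is just kubectl falling back to its default server with no kubeconfig. One fix is copying the admin kubeconfig out of the master container and pointing KUBECONFIG at it; `/etc/kubernetes/admin.conf` is the standard kubeadm location, while the destination path and `fetch_kubeconfig` helper are mine:

```shell
# Sketch: pull the admin kubeconfig out of kube-master so kubectl works
# from the parent container/host.
fetch_kubeconfig() {
  dest="${1:-$HOME/.kube/dind-admin.conf}"
  mkdir -p "$(dirname "$dest")"
  docker cp kube-master:/etc/kubernetes/admin.conf "$dest"
  echo "export KUBECONFIG=$dest"
}

# Usage:  eval "$(fetch_kubeconfig)" && kubectl get nodes
```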
I see that master node has exposed port 8080. I need to access some services exposed with NodePort from my machine. Is there an easy way to enable it?
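Since each node is an ordinary container on the docker bridge, a NodePort service is typically reachable from the host at `<node-container-IP>:<nodeport>`. A hedged sketch that builds that URL (the node name and `nodeport_url` helper are illustrative):

```shell
# Sketch: print the host-reachable URL for a NodePort service on a DinD node.
nodeport_url() {
  node="${1:-kube-node-1}"
  port="${2:?nodeport required}"
  # Third arg overrides the node IP; by default ask docker for it.
  ip="${3:-$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$node")}"
  echo "http://${ip}:${port}"
}

# Usage:  curl "$(nodeport_url kube-node-1 32023)"
```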