
ivanfioravanti / kubernetes-the-hard-way-on-azure


This project is a fork of kelseyhightower/kubernetes-the-hard-way

444 stars · 206 forks · 832 KB

Bootstrap Kubernetes the hard way on Microsoft Azure Platform. No scripts.

License: Apache License 2.0

Languages: Makefile 100.00%
Topics: azure, kubernetes

kubernetes-the-hard-way-on-azure's People

Contributors

4c74356b41, aberoham, adriaandejonge, agrajm, ahrkrak, akarih, alan01252, amouat, artburkart, bhummerstone, borqosky, celamb4, dturkenk, dy-dx, eamonkeane, fmoctezuma, german1608, heoelri, ivanfioravanti, jomagam, kelseyhightower, ksingh7, kyohmizu, lastcoolnameleft, msaravindh, mt-gitlocalize, rahulvmarathe, rkttu, vicperdana, vlad-m-r


kubernetes-the-hard-way-on-azure's Issues

kube-dns does not work after walking through instructions.

After walking through the steps, I was unable to complete the kube-dns verification. When performing the DNS resolution, "nslookup" would hang and ultimately respond with "unable to resolve host". I tried running the same command on the actual kube-dns pod, and the pod was still unable to resolve the host (although the response was instantaneous instead of hanging).

Unfortunately, I lost the logs that showcased this, and my cluster is having problems starting up again. I apologize for not being able to provide better logs.

DNS resolution fails when an app Pod and the coredns Pod are on the same node

I need your help. I have finished the entire Hard Way process and am now checking name resolution from Pods.
In doing so, I noticed a problem: name resolution fails in one scenario.

The Pods are deployed as follows:

  • coredns-59845f77f8-w26gc Pod is on worker-1
  • util3 is on worker-1
  • util4 is on worker-0
kuberoot@controller-0:~/cilium-master$ k get po -o wide -A
NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
default                busybox                                      1/1     Running   10         99d     10.200.0.19   worker-0   <none>           <none>
default                busybox2                                     1/1     Running   1          114m    10.200.1.17   worker-1   <none>           <none>
default                nginx                                        1/1     Running   3          99d     10.200.0.17   worker-0   <none>           <none>
default                sample-2pod                                  2/2     Running   2          94d     10.200.0.18   worker-0   <none>           <none>
default                sample-pod                                   1/1     Running   2          95d     10.200.1.14   worker-1   <none>           <none>
default                ubuntu1                                      1/1     Running   1          112m    10.200.0.20   worker-0   <none>           <none>
default                ubuntu2                                      1/1     Running   1          112m    10.200.1.18   worker-1   <none>           <none>
default                util                                         1/1     Running   0          5h27m   10.200.1.16   worker-1   <none>           <none>
default                util2                                        1/1     Running   0          92m     10.200.0.21   worker-0   <none>           <none>
default                util3                                        1/1     Running   0          70m     10.200.1.19   worker-1   <none>           <none>
default                util4                                        1/1     Running   0          70m     10.200.0.22   worker-0   <none>           <none>
kube-system            coredns-59845f77f8-w26gc                     1/1     Running   0          22m     10.200.1.20   worker-1   <none>           <none>
kubernetes-dashboard   dashboard-metrics-scraper-7b8b58dc8b-m78x4   1/1     Running   3          99d     10.200.0.16   worker-0   <none>           <none>
kubernetes-dashboard   kubernetes-dashboard-866f987876-dxm4c        1/1     Running   9          99d     10.200.1.15   worker-1   <none>           <none>
kuberoot@controller-0:~/cilium-master$
kuberoot@controller-0:~/cilium-master$ k get svc -o wide -A
NAMESPACE              NAME                        TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                  AGE    SELECTOR
default                busybox                     ClusterIP   10.32.0.33    <none>        80/TCP                   117m   run=busybox
default                kubernetes                  ClusterIP   10.32.0.1     <none>        443/TCP                  99d    <none>
default                nginx                       NodePort    10.32.0.41    <none>        80:30557/TCP             99d    run=nginx
default                util                        ClusterIP   10.32.0.202   <none>        80/TCP                   123m   run=util
kube-system            kube-dns                    ClusterIP   10.32.0.10    <none>        53/UDP,53/TCP,9153/TCP   99d    k8s-app=kube-dns
kubernetes-dashboard   dashboard-metrics-scraper   ClusterIP   10.32.0.102   <none>        8000/TCP                 99d    k8s-app=dashboard-metrics-scraper
kubernetes-dashboard   kubernetes-dashboard        ClusterIP   10.32.0.135   <none>        443/TCP                  99d    k8s-app=kubernetes-dashboard

Scenario 1 - name resolution from util3 (the coredns Pod is on the same node):

$ k exec -it util3 -- nslookup www.bing.com
;; reply from unexpected source: 10.200.1.20#53, expected 10.32.0.10#53
;; reply from unexpected source: 10.200.1.20#53, expected 10.32.0.10#53
;; reply from unexpected source: 10.200.1.20#53, expected 10.32.0.10#53
;; connection timed out; no servers could be reached

command terminated with exit code 1

Scenario 2 - name resolution from util4 (the coredns Pod is NOT on the same node):

$ k exec -it util4 -- nslookup www.bing.com
Server:         10.32.0.10
Address:        10.32.0.10#53

Non-authoritative answer:
www.bing.com    canonical name = a-0001.a-afdentry.net.trafficmanager.net.
a-0001.a-afdentry.net.trafficmanager.net        canonical name = dual-a-0001.a-msedge.net.
Name:   dual-a-0001.a-msedge.net
Address: 13.107.21.200
Name:   dual-a-0001.a-msedge.net
Address: 204.79.197.200
Name:   dual-a-0001.a-msedge.net
Address: 2620:1ec:c11::200

I tried deleting the coredns Pod so that it moved to the other node, and the results were reversed, as I expected.
I would like name resolution to work in both scenarios.
Could you give me some ideas?
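For what it's worth, the "reply from unexpected source" symptom usually means the DNS reply from a same-node pod skipped kube-proxy's SNAT, so the client sees the pod IP instead of the service VIP. A common mitigation on each worker is to make bridged traffic traverse iptables; this is a hedged suggestion (the sysctl name is standard, but whether it applies here depends on the CNI in use, and the file name below is hypothetical):

```ini
# /etc/sysctl.d/99-kubernetes.conf (hypothetical file name)
# Ensure packets crossing a Linux bridge traverse iptables, so kube-proxy's
# service NAT rules also see same-node pod-to-service traffic.
net.bridge.bridge-nf-call-iptables = 1
```

Load the br_netfilter module first, then apply with sudo sysctl --system.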

cfssl on OSX

Consider updating the instructions to use the Homebrew install on OSX:

brew install cfssl

Kubectl unable to connect on masters

Hello,

Apologies if this is not the correct place to ask this. I have been following these instructions on Azure using the free tier with demo license.

First off, contrary to the instructions, I am only able to create four VMs. In my case I have settled on 2 controllers and 2 workers.

While following along 08-bootstrapping-kubernetes-controllers.md and attempting to verify everything works, I am unable to connect to the apiserver locally from the master.

kuberoot@controller-0:~$ kubectl get componentstatuses --kubeconfig admin.kubeconfig
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?

If anyone can provide feedback or a point in the right direction, it would be much appreciated.

Thank you kindly,

kuberoot@controller-0:~$ kubectl config view --kubeconfig admin.kubeconfig
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:6443
  name: kubernetes-the-hard-way
contexts:
- context:
    cluster: kubernetes-the-hard-way
    user: admin
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED

Output of journalctl -u kube-apiserver is here:
https://pastebin.com/zgxkAtj3

Update to 1.9.2

The original repo has been updated to Kubernetes 1.9.0. I'll update this one to 1.9.2 in the coming days.

'Error from server (NotFound): deployments.extensions "nginx" not found' when exposing NodePort

@ivanfioravanti, I just ran the command to expose the NodePort based on the nginx deployment and got an error.

kubectl expose deployment nginx --port 80 --type NodePort

Error from server (NotFound): deployments.extensions "nginx" not found

To fix it, I just generated the yaml file and exposed based on that:

kubectl get pods -o yaml nginx > nginx.yaml
kubectl expose -f nginx.yaml --port 80 --type NodePort

service/nginx exposed

curl -I http://${EXTERNAL_IP}:${NODE_PORT}

HTTP/1.1 200 OK
Server: nginx/1.15.8
Date: Mon, 14 Jan 2019 17:36:35 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 25 Dec 2018 09:56:47 GMT
Connection: keep-alive
ETag: "5c21fedf-264"
Accept-Ranges: bytes

Is this repo still active?

Hi,

There are a few things I'd like to contribute back to this repo, but I've noticed issues and PRs lingering for a few months now. I just want to check whether this is still active. I'd prefer to contribute here, but will consider forking to keep the maintenance going if need be.

Thanks

cfssljson version not defined

The issue is caused by the cfssljson version parameter not being defined.

The repo says
cfssljson -version, which is not correct (as follows):

(screenshot omitted)

I believe the correct practice is to provide version and help as parameters, so that we can use the following:

cfssljson version
cfssljson help

Thanks

It's not working with the free tier on Azure

Hey @ivanfioravanti,

thanks for supporting the hard way on Azure. Currently it no longer works with the free tier: after creating the controllers and workers, I got the following error message.

az vm availability-set create -g kubernetes -n controller-as
for i in 0 1 2; do
    echo "[Controller ${i}] Creating public IP..."
    az network public-ip create -n controller-${i}-pip -g kubernetes > /dev/null

    echo "[Controller ${i}] Creating NIC..."
    az network nic create -g kubernetes \
        -n controller-${i}-nic \
        --private-ip-address 10.240.0.1${i} \
        --public-ip-address controller-${i}-pip \
        --vnet kubernetes-vnet \
        --subnet kubernetes-subnet \
        --ip-forwarding \
        --lb-name kubernetes-lb \
        --lb-address-pools kubernetes-lb-pool > /dev/null

    echo "[Controller ${i}] Creating VM..."
    az vm create -g kubernetes \
        -n controller-${i} \
        --image Canonical:UbuntuServer:16.04.0-LTS:latest \
        --nics controller-${i}-nic \
        --availability-set controller-as \
        --nsg '' > /dev/null
done
az vm availability-set create -g kubernetes -n worker-as
[Controller 0] Creating public IP...
[Controller 0] Creating NIC...
[Controller 0] Creating VM...
[Controller 1] Creating public IP...
[Controller 1] Creating NIC...
[Controller 1] Creating VM...
[Controller 2] Creating public IP...
[Controller 2] Creating NIC...
[Controller 2] Creating VM...
[Worker 0] Creating public IP...
[Worker 0] Creating NIC...
[Worker 0] Creating VM...
[Worker 1] Creating public IP...
[Worker 1] Creating NIC...
[Worker 1] Creating VM...
Deployment failed. Tracking ID: xxxxxxx-xxx-xxxx-xxxx-xxxx-xxxxxx. Operation results in exceeding quota limits of Core. Maximum allowed: 4, Current in use: 4, Additional requested: 1. Please read more about quota increase at http://aka.ms/corequotaincrease.
[Worker 2] Creating public IP...
[Worker 2] Creating NIC...
[Worker 2] Creating VM...
Deployment failed. Tracking ID: xxxxxxx-xxx-xxxx-xxxx-xxxx-xxxxxx. Operation results in exceeding quota limits of Core. Maximum allowed: 4, Current in use: 4, Additional requested: 1. Please read more about quota increase at http://aka.ms/corequotaincrease.

So what should we do now?

Error: dial tcp 10.240.0.12:2379: connect: connection refused

I'm having a bear of a time getting the etcd cluster working as expected.
I've recreated all the resources twice now and I keep getting this error:

tim@controller-2:~$ sudo ETCDCTL_API=3 etcdctl member list \
   --endpoints=https://${INTERNAL_IP}:2379 \
   --cacert=/etc/etcd/ca.pem \
   --cert=/etc/etcd/kubernetes.pem \
   --key=/etc/etcd/kubernetes-key.pem

Error: dial tcp 10.240.0.12:2379: connect: connection refused

At this step:
https://github.com/ivanfioravanti/kubernetes-the-hard-way-on-azure/blob/master/docs/07-bootstrapping-etcd.md#verification

I have no other errors, and I'm wondering if I didn't configure the NSG correctly, or if the NICs aren't properly provisioned for the VMs.

I can get this to work: https://github.com/etcd-io/etcd/blob/master/Documentation/demo.md
But then it's not run as a daemon.

Issue when running etcdctl member list

I am using an Azure VM to bootstrap the cluster, using the command below:

sudo ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://${INTERNAL_IP}:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/kubernetes.pem \
  --key=/etc/etcd/kubernetes-key.pem

However, it fails with the error below:
"retrying of unary invoker failed" probably due to "Connection timeout" issue

I tried increasing the size of the VM (as suggested in one of the GitHub articles), but the issue still persists. Any help or guidance you can extend would be appreciated.

'Create the kube-apiserver.service systemd unit file': PUBLIC_IP_ADDRESS not set

When creating kube-apiserver.service, PUBLIC_IP_ADDRESS is not set in that context/scope. For coherence, I believe it should be KUBERNETES_PUBLIC_ADDRESS (the load balancer IP), which is only available on the original machine, not on the controllers.

cat <<EOF | sudo tee /etc/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
  --advertise-address=${INTERNAL_IP} \\
  --allow-privileged=true \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/audit.log \\
  --authorization-mode=Node,RBAC \\
  --bind-address=0.0.0.0 \\
  --client-ca-file=/var/lib/kubernetes/ca.pem \\
  --enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
  --etcd-cafile=/var/lib/kubernetes/ca.pem \\
  --etcd-certfile=/var/lib/kubernetes/kubernetes.pem \\
  --etcd-keyfile=/var/lib/kubernetes/kubernetes-key.pem \\
  --etcd-servers=https://10.240.0.10:2379,https://10.240.0.11:2379,https://10.240.0.12:2379 \\
  --event-ttl=1h \\
  --encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\
  --kubelet-certificate-authority=/var/lib/kubernetes/ca.pem \\
  --kubelet-client-certificate=/var/lib/kubernetes/kubernetes.pem \\
  --kubelet-client-key=/var/lib/kubernetes/kubernetes-key.pem \\
  --runtime-config='api/all=true' \\
  --service-account-key-file=/var/lib/kubernetes/service-account.pem \\
  --service-account-signing-key-file=/var/lib/kubernetes/service-account-key.pem \\
  --service-account-issuer=https://${PUBLIC_IP_ADDRESS}:6443 \\
  --service-cluster-ip-range=10.32.0.0/24 \\
  --service-node-port-range=30000-32767 \\
  --tls-cert-file=/var/lib/kubernetes/kubernetes.pem \\
  --tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
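One way to make this failure loud instead of silent is to resolve the load balancer address before templating the unit file and abort if it is empty. A hedged sketch: the az lookup in the comment follows the guide's kubernetes-pip naming and is an assumption, and the placeholder IP exists only so the snippet runs standalone.

```shell
# In the guide, the load balancer IP is normally fetched like this (assumption):
#   KUBERNETES_PUBLIC_ADDRESS=$(az network public-ip show -g kubernetes \
#     -n kubernetes-pip --query ipAddress -otsv)
# Placeholder so the sketch runs standalone:
KUBERNETES_PUBLIC_ADDRESS="${KUBERNETES_PUBLIC_ADDRESS:-203.0.113.10}"

# Fail fast instead of writing an empty issuer into kube-apiserver.service.
: "${KUBERNETES_PUBLIC_ADDRESS:?must be set on this controller before templating}"

ISSUER_FLAG="--service-account-issuer=https://${KUBERNETES_PUBLIC_ADDRESS}:6443"
echo "${ISSUER_FLAG}"
```

With the guard in place, an unset variable aborts the script instead of producing a unit file with an invalid --service-account-issuer.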

etcdctl: command not found

When following the instructions for Bootstrapping the etcd Cluster, you are instructed to run a specific command to list the etcd cluster members.

sudo ETCDCTL_API=3 etcdctl member list \
  --endpoints=https://${INTERNAL_IP}:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/kubernetes.pem \
  --key=/etc/etcd/kubernetes-key.pem

However, when you do so, you encounter the error:
sudo: etcdctl: command not found

The reason is that etcdctl is not installed by default. The Installing the Client Tools instructions should be updated to also install etcdctl, via the sudo apt install etcd-client command.

Current guidance doesn't work for free tier of Azure

The free tier only allows three PIPs, so the way I was able to proceed was by using ONE controller and ONE worker, since one PIP is already taken up by kubernetes-pip.

If it seems suitable to the maintainer(s), I'm happy to make updates to the docs so that free tier users only create one controller and one worker.

kube-controller-manager.kubeconfig missing?

I am new to this, and for some reason, I am not able to find the file "kube-controller-manager.kubeconfig" here: https://github.com/ivanfioravanti/kubernetes-the-hard-way-on-azure/blob/master/docs/08-bootstrapping-kubernetes-controllers.md#configure-the-kubernetes-controller-manager

I see "kube-controller-manager", which has been moved to "/usr/local/bin/" in an earlier step, but I can't seem to find the "kube-controller-manager.kubeconfig" file itself.

Any help would be appreciated.

kubectl and kube-proxy unable to connect to load balancer in 09-bootstrapping-kubernetes-workers

When executing kubectl get nodes, no resources were found.
After investigating the logs using journalctl -fu kubelet and journalctl -fu kube-proxy, it turned out this was due to the missing IP of the load balancer (the processes were trying to connect to https://:6443 instead of https://${LOAD_BALANCER_IP}:6443).
I had to modify the address in the server section of /var/lib/kubelet/kubeconfig and /var/lib/kube-proxy/kubeconfig
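The fix can be sketched as a sed edit, demonstrated here on a scratch file so it runs anywhere. On the workers the real files are /var/lib/kubelet/kubeconfig and /var/lib/kube-proxy/kubeconfig and need sudo; the IP below is a placeholder.

```shell
LOAD_BALANCER_IP="203.0.113.10"   # placeholder; use the real load balancer IP

# Scratch copy standing in for /var/lib/kubelet/kubeconfig.
kubeconfig="$(mktemp)"
printf '    server: https://:6443\n' > "${kubeconfig}"

# Insert the missing load balancer address into the server URL.
sed -i "s|https://:6443|https://${LOAD_BALANCER_IP}:6443|" "${kubeconfig}"
cat "${kubeconfig}"
```

After editing the real files, restart kubelet and kube-proxy so they pick up the corrected server address.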

On Windows, command needs to be run on C: drive

If you have generated all the configs and keys on some drive other than C:, you will receive an error like below -

PS D:\dev\cka> kubectl config set-credentials admin --client-certificate="admin.pem" --client-key="admin-key.pem"
error: Rel: can't make D:\dev\cka\admin.pem relative to C:\Users\Pranav\.kube

This happens when trying to follow this page. The following command will fail -

kubectl config set-credentials admin \
  --client-certificate=admin.pem \
  --client-key=admin-key.pem

Can some instructions be added to that page or even at the home page to ensure that users always follow this guide while being on C: drive on Windows? I have seen other issues like this happen with kubectl and minikube.

Walk failed: GetFileAttributesEx version: The system cannot find the file specified.

I am getting the error below when I try to install cfssljson:

PS C:\windows\system32> Invoke-WebRequest -Uri https://pkg.cfssl.org/R1.2/mkbundle_windows-amd64.exe -OutFile cfssljson.exe
PS C:\windows\system32> cfssljson version
[INFO] Found version
[ERROR] Walk failed: GetFileAttributesEx version: The system cannot find the file specified.
[INFO] Wrote 0 certificates.

The error persists even after downloading and running the file from the browser link instead of running it in PowerShell.

Mismatching cni and cri versions

There is a small problem in https://github.com/ivanfioravanti/kubernetes-the-hard-way-on-azure/blob/master/docs/09-bootstrapping-kubernetes-workers.md: the instructions ask to download cni-plugins-amd64-v0.7.0.tgz and cri-containerd-1.0.0-beta.1.linux-amd64.tar.gz, but afterwards they extract the previous versions of those packages:

---
sudo tar -xvf cni-plugins-amd64-v0.6.0.tgz -C /opt/cni/bin/
+++
sudo tar -xvf cni-plugins-amd64-v0.7.0.tgz -C /opt/cni/bin/

and

---
sudo tar -xvf cri-containerd-1.0.0-alpha.0.tar.gz -C /
+++
sudo tar -xvf cri-containerd-1.0.0-beta.1.linux-amd64.tar.gz -C /
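A simple way to prevent this class of drift is to keep each version in a single variable so the download and extract steps can never disagree. A hedged sketch (the release URL in the comment is an assumption; the echoes exist only so the snippet is runnable):

```shell
# Pin each version exactly once.
CNI_VERSION="v0.7.0"
CRI_VERSION="1.0.0-beta.1"

# Derive both tarball names from the pinned versions.
CNI_TARBALL="cni-plugins-amd64-${CNI_VERSION}.tgz"
CRI_TARBALL="cri-containerd-${CRI_VERSION}.linux-amd64.tar.gz"
echo "${CNI_TARBALL}"
echo "${CRI_TARBALL}"

# Cluster-only steps, left commented (release URL is an assumption):
# wget -q "https://github.com/containernetworking/plugins/releases/download/${CNI_VERSION}/${CNI_TARBALL}"
# sudo tar -xvf "${CNI_TARBALL}" -C /opt/cni/bin/
# sudo tar -xvf "${CRI_TARBALL}" -C /
```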

unable to start etcd

Please help me fix the error below; I am facing it when I try to start etcd.

kuberoot@controller-0:~$ {

sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl start etcd
}
Created symlink /etc/systemd/system/multi-user.target.wants/etcd.service → /etc/systemd/system/etcd.service.
Job for etcd.service failed because a timeout was exceeded.
See "systemctl status etcd.service" and "journalctl -xe" for details.
kuberoot@controller-0:~$ journalctl -xe
Jan 15 14:06:37 controller-0 etcd[3392]: health check for peer ffed16798470cab5 could not connect: dial tcp 10.240.0.11:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
Jan 15 14:06:37 controller-0 etcd[3392]: health check for peer 3a57933972cb5131 could not connect: dial tcp 10.240.0.12:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
Jan 15 14:06:37 controller-0 etcd[3392]: health check for peer ffed16798470cab5 could not connect: dial tcp 10.240.0.11:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
Jan 15 14:06:38 controller-0 etcd[3392]: f98dc20bce6225a0 is starting a new election at term 148
Jan 15 14:06:38 controller-0 etcd[3392]: f98dc20bce6225a0 became candidate at term 149
Jan 15 14:06:38 controller-0 etcd[3392]: f98dc20bce6225a0 received MsgVoteResp from f98dc20bce6225a0 at term 149
Jan 15 14:06:38 controller-0 etcd[3392]: f98dc20bce6225a0 [logterm: 1, index: 3] sent MsgVote request to ffed16798470cab5 at term 149
Jan 15 14:06:38 controller-0 etcd[3392]: f98dc20bce6225a0 [logterm: 1, index: 3] sent MsgVote request to 3a57933972cb5131 at term 149
Jan 15 14:06:39 controller-0 sshd[3434]: Invalid user baikal from 207.118.182.172 port 34502
Jan 15 14:06:39 controller-0 sshd[3434]: Received disconnect from 207.118.182.172 port 34502:11: disconnected by user [preauth]
Jan 15 14:06:39 controller-0 sshd[3434]: Disconnected from invalid user baikal 207.118.182.172 port 34502 [preauth]
Jan 15 14:06:39 controller-0 etcd[3392]: f98dc20bce6225a0 is starting a new election at term 149
Jan 15 14:06:39 controller-0 etcd[3392]: f98dc20bce6225a0 became candidate at term 150

The NSG parameter needs to be removed when creating the controller virtual machines

In this section - https://github.com/ivanfioravanti/kubernetes-the-hard-way-on-azure/blob/master/docs/03-compute-resources.md#kubernetes-controllers

This modification also needs to happen in this section - https://github.com/ivanfioravanti/kubernetes-the-hard-way-on-azure/blob/master/docs/03-compute-resources.md#kubernetes-workers

The NSG must not be specified, otherwise the command fails with -

When specifying an existing NIC, do not specify NSG, public IP, ASGs, VNet or subnet.

The code should be like below -

for i in 0 1 2; do
    echo "[Controller ${i}] Creating public IP..."
    az network public-ip create -n controller-${i}-pip -g kubernetes > /dev/null

    echo "[Controller ${i}] Creating NIC..."
    az network nic create -g kubernetes \
        -n controller-${i}-nic \
        --private-ip-address 10.240.0.1${i} \
        --public-ip-address controller-${i}-pip \
        --vnet kubernetes-vnet \
        --subnet kubernetes-subnet \
        --ip-forwarding \
        --lb-name kubernetes-lb \
        --lb-address-pools kubernetes-lb-pool > /dev/null

    echo "[Controller ${i}] Creating VM..."
    az vm create -g kubernetes \
        -n controller-${i} \
        --image ${UBUNTULTS} \
        --nics controller-${i}-nic \
        --availability-set controller-as \
        --admin-username 'kuberoot' \
        --generate-ssh-keys > /dev/null
done

Tested on Windows.

Recent update seems to be missing ClusterRole for CoreDNS

This recent update seems to have duplicated the ClusterRoleBinding declaration and removed the ClusterRole declaration for CoreDNS:
fcf8cfd#diff-2225f0b75853ffad6a04bccdf2e6ecf7

Pulling logs for the CoreDNS pod returns the following error:
Failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "namespaces" in API group "" at the cluster scope: RBAC: clusterrole.rbac.authorization.k8s.io "system:coredns" not found
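If the commit did drop the ClusterRole, re-applying it should clear the RBAC error. A hedged sketch that mirrors the rules the upstream CoreDNS deployment manifest carried around that time (names and verbs may have drifted across versions):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:coredns
rules:
- apiGroups: [""]
  resources: ["endpoints", "services", "pods", "namespaces"]
  verbs: ["list", "watch"]
```

Apply with kubectl apply -f and restart the coredns Pod so it re-authenticates.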

Failed create pod sandbox: open /run/systemd/resolve/resolv.conf: no such file or directory

Hi, I followed the steps for Azure and noticed that Ubuntu 16, not 18, was deployed.

When starting up coredns, my Pod would not start, and I see this error:

Warning FailedCreatePodSandBox 88s (x575 over 126m) kubelet, worker-1 Failed create pod sandbox: open /run/systemd/resolve/resolv.conf: no such file or directory

I fixed it by creating a symlink (ln -s /run/resolvconf/ /run/systemd/resolve) as described in
kubernetes/kubeadm#1124
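An alternative to the symlink is to point kubelet directly at the resolv.conf that does exist. --resolv-conf is a standard kubelet flag, though the exact path is an assumption for a resolvconf-managed Ubuntu 16.04 system (kubelet.service excerpt, other flags elided):

```ini
# kubelet.service excerpt (hedged; only the --resolv-conf line is the point)
ExecStart=/usr/local/bin/kubelet \
  --resolv-conf=/run/resolvconf/resolv.conf \
  ...
```

Reload systemd and restart kubelet after the change.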

Full output of kubectl describe for the pod:

kubectl describe pod --namespace=kube-system coredns-699f8ddd77-5x8n6

Name:               coredns-699f8ddd77-5x8n6
Namespace:          kube-system
Priority:           0
PriorityClassName:
Node:               worker-1/10.240.0.21
Start Time:         Sun, 23 Dec 2018 10:58:49 +0000
Labels:             k8s-app=kube-dns
                    pod-template-hash=699f8ddd77
Annotations:
Status:             Pending
IP:
Controlled By:      ReplicaSet/coredns-699f8ddd77
Containers:
  coredns:
    Container ID:
    Image:          coredns/coredns:1.2.2
    Image ID:
    Ports:          53/UDP, 53/TCP, 9153/TCP
    Host Ports:     0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:     100m
      memory:  70Mi
    Liveness:  http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-48jfv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-48jfv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-48jfv
    Optional:    false
QoS Class:       Burstable
Node-Selectors:
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Warning  FailedCreatePodSandBox  88s (x575 over 126m)  kubelet, worker-1  Failed create pod sandbox: open /run/systemd/resolve/resolv.conf: no such file or directory

Thanks
Vittorio

DNS Cluster Add-on issue

Everything seems fine with the exception of the DNS add-on; despite all appearing well, kube-dns fails to resolve anything:

PS C:\Users\user> kubectl get service --all-namespaces
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       kubernetes   ClusterIP   10.32.0.1    <none>        443/TCP         6d
kube-system   kube-dns     ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP   2h

PS C:\Users\user> kubectl exec -ti $POD_NAME -- nslookup kubernetes.default
Server:         10.32.0.10
Address:        10.32.0.10:53

** server can't find kubernetes.default: NXDOMAIN

*** Can't find kubernetes.default: No answer

PS C:\Users\user> kubectl exec -ti $POD_NAME -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local iuupro2pvonutmgl325yqnbsdsh.fx.internal.cloudapp.net
nameserver 10.32.0.10
options ndots:5
