ottoyiu / k8s-ec2-srcdst

A Kubernetes Controller that will ensure that the EC2 Source Destination Check (source-dest-check attribute) is disabled on nodes within the cluster.

License: Apache License 2.0

k8s-ec2-srcdst's Introduction

k8s-ec2-srcdst (formerly kubernetes-ec2-srcdst-controller)

A Kubernetes controller that ensures the Source/Dest Check is disabled on the cluster nodes that are EC2 instances. This is useful for Calico deployments in AWS, where traffic within a VPC subnet can be routed without IPIP encapsulation.

Quick Start

To deploy this controller into your Kubernetes cluster, first make sure the cluster fulfills the requirements listed below. Then see deploy/README.md for a quick start guide on deploying it.

Requirements

k8s-ec2-srcdst needs access to the Kubernetes API to list nodes and to add an annotation to a node (write access). If it runs in-cluster, make sure the service account has sufficient access. Otherwise, make sure the user specified in the kubeconfig has sufficient permissions.

k8s-ec2-srcdst also needs to modify the EC2 instance attributes of the nodes running in the Kubernetes cluster. Make sure to schedule the controller on a node whose IAM policy allows:

  • ec2:ModifyInstanceAttribute

If you are running a Kubernetes cluster in AWS created by kops, only the master node(s) have that IAM policy set (ec2:*). The deployment manifest files (deploy/*/*.yaml) already set the NodeAffinity and Tolerations so that the controller is deployed only on one of the master nodes.
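The README does not say how the controller maps a Kubernetes node to its EC2 instance. One common approach, assumed here, is to read the instance ID from the node's spec.providerID, which on AWS typically looks like aws:///us-west-2a/i-0123456789abcdef0. The function name instanceIDFromProviderID is hypothetical and the sketch below is not taken from the project's source:

```go
package main

import (
	"fmt"
	"strings"
)

// instanceIDFromProviderID (hypothetical helper) extracts the EC2 instance ID
// from a provider ID such as "aws:///us-west-2a/i-0123456789abcdef0".
func instanceIDFromProviderID(providerID string) (string, error) {
	if !strings.HasPrefix(providerID, "aws://") {
		return "", fmt.Errorf("not an AWS provider ID: %q", providerID)
	}
	// The instance ID is assumed to be the last path segment.
	id := providerID[strings.LastIndex(providerID, "/")+1:]
	if !strings.HasPrefix(id, "i-") {
		return "", fmt.Errorf("no instance ID in provider ID: %q", providerID)
	}
	return id, nil
}

func main() {
	id, err := instanceIDFromProviderID("aws:///us-west-2a/i-0123456789abcdef0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id)
}
```

The resulting ID is what an ec2:ModifyInstanceAttribute request would target, which is why the controller must run where that permission is available.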

Kops Integration

k8s-ec2-srcdst has been incorporated as an addon in Kops and is deployed alongside Calico when cross-subnet mode is set. Please read this document for instructions on how to enable this on a cluster deployed using Kops.

Usage

Usage of ./bin/linux/k8s-ec2-srcdst:
  -alsologtostderr
        log to standard error as well as files
  -kubeconfig string
        Path to a kubeconfig file
  -log_backtrace_at value
        when logging hits line file:N, emit a stack trace
  -log_dir string
        If non-empty, write log files in this directory
  -logtostderr
        log to standard error instead of files
  -stderrthreshold value
        logs at or above this threshold go to stderr
  -v value
        log level for V logs
  -version
        Prints current k8s-ec2-srcdst version
  -vmodule value
        comma-separated list of pattern=N settings for file-filtered logging

Setting the logging verbosity to 4 with the -v flag produces debug-level output.

You only need to specify the location of the kubeconfig with the -kubeconfig flag if you are running the controller outside the cluster for development and testing purposes.

The AWS region must be set as an environment variable. In addition, if you are running this controller outside of the cluster, or on a node that does not have the proper IAM instance profile, you will need to specify AWS credentials as environment variables:

Environment Variables

Variable               Description
AWS_REGION             Region name (e.g. us-west-2); required
AWS_ACCESS_KEY         AWS access key (optional if using IAM instance profiles)
AWS_SECRET_ACCESS_KEY  AWS secret access key (optional if using IAM instance profiles)
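
Since AWS_REGION is the only required variable, a program could check it at startup and fail early with a clear message. The following is a minimal sketch of such a check, not the controller's actual code; the function name regionFrom and the error text are assumptions made for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// regionFrom (hypothetical helper) reads AWS_REGION through the given lookup
// function and returns an error when the variable is unset or empty.
// Passing the lookup in makes the check easy to exercise without touching
// the real environment.
func regionFrom(lookup func(string) (string, bool)) (string, error) {
	region, ok := lookup("AWS_REGION")
	if !ok || region == "" {
		return "", errors.New("AWS_REGION must be set (e.g. us-west-2)")
	}
	return region, nil
}

func main() {
	region, err := regionFrom(os.LookupEnv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("using region", region)
}
```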

k8s-ec2-srcdst's People

Contributors

aledbf, ottoyiu, oyiu-dw, rahanar, seh

k8s-ec2-srcdst's Issues

Does k8s-ec2-srcdst have any self-healing?

Is it possible for the cache informer to miss an update on a node somehow?

If so, I think we should also run a periodic check on all nodes to self-heal them...

What do you think, @ottoyiu?
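
The periodic check proposed here could look roughly like the sketch below: on every tick, list all nodes and re-process the ones that are not marked yet. The node type, the annotation key and both function names are hypothetical stand-ins so that the example runs without Kubernetes dependencies. With client-go, a similar effect can usually be had by giving the informer a non-zero resync period, which re-delivers the cached objects to the update handler.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// node is a hypothetical, pared-down stand-in for a Kubernetes Node.
type node struct {
	Name        string
	Annotations map[string]string
}

// markerAnnotation is a hypothetical key; the controller's real annotation
// name may differ.
const markerAnnotation = "example.com/srcdst-check-disabled"

// unmarked returns the names of nodes that lack the marker annotation.
func unmarked(nodes []node) []string {
	var names []string
	for _, n := range nodes {
		if _, ok := n.Annotations[markerAnnotation]; !ok {
			names = append(names, n.Name)
		}
	}
	return names
}

// resyncLoop calls fix for every unmarked node on each tick until ctx is done.
func resyncLoop(ctx context.Context, every time.Duration, list func() []node, fix func(name string)) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			for _, name := range unmarked(list()) {
				fix(name)
			}
		}
	}
}

func main() {
	nodes := []node{
		{Name: "node-a", Annotations: map[string]string{markerAnnotation: "true"}},
		{Name: "node-b"},
	}
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	resyncLoop(ctx, 10*time.Millisecond,
		func() []node { return nodes },
		func(name string) { fmt.Println("would re-check", name) })
}
```

Note that checking only the annotation does not detect a source/destination check that was re-enabled on the EC2 side after the node was marked; a stricter variant would query the instance attribute as well.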

Overrides taints on nodes

👋 Just a heads up on something we've been debugging: when running this controller with nodes that join using the register-with-taints flag, the taints get overridden and removed. We believe this is caused by a lost update when doing the .Update call on a node. We are not sure yet why exactly this is possible (given that the update should fail if the resource version is newer), but that is what we observe.
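
One way to narrow what the controller writes, offered here as a possible mitigation and not as a confirmed fix for this report, is to send a patch that touches only the annotation instead of updating the whole Node object from a cached copy. The sketch below builds a JSON merge patch (RFC 7386) with the standard library; the function name and the annotation key are hypothetical. With client-go, such a payload would typically be sent through the Nodes Patch call using the merge-patch type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationPatch (hypothetical helper) builds a JSON merge patch that sets a
// single annotation and mentions no other field, so fields such as
// spec.taints are left as the API server currently has them.
func annotationPatch(key, value string) ([]byte, error) {
	patch := map[string]interface{}{
		"metadata": map[string]interface{}{
			"annotations": map[string]string{key: value},
		},
	}
	return json.Marshal(patch)
}

func main() {
	// The annotation key is an assumption made for illustration.
	p, err := annotationPatch("example.com/srcdst-check-disabled", "true")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(p))
}
```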

Fails on EC2 instances with multiple interfaces

On a Kubernetes cluster set up with kops and aws-vpc-k8s-cni, k8s-ec2-srcdst fails to disable the source/destination check with the following message:
"srcdst_controller.go:87] Fail to disable src dst check for EC2 instance: i-xxxxx; InvalidInstanceID: There are multiple interfaces attached to instance 'i-xxxxxx'. Please specify an interface ID for the operation instead."

I believe this is because aws-vpc-k8s-cni creates instances with several network interfaces - and srcdst needs to be disabled on each of them separately. The following documentation mentions a different procedure for an instance with more than a single interface: https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Instance.html#EIP_Disable_SrcDestCheck

Maybe k8s-ec2-srcdst should list the interfaces for the current instances and disable srcdestcheck for each of them?
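
The suggestion above can be sketched as follows: list the interfaces attached to the instance, then disable the check on each one. The eniClient interface, its method names and the fake implementation are hypothetical and exist only to keep the example self-contained. In a real implementation they would presumably wrap EC2 calls such as DescribeInstances and ModifyNetworkInterfaceAttribute.

```go
package main

import "fmt"

// eniClient is a hypothetical abstraction over the two EC2 operations needed.
type eniClient interface {
	// InterfaceIDs returns the network interface IDs attached to an instance.
	InterfaceIDs(instanceID string) ([]string, error)
	// DisableSrcDstCheck disables the check on a single network interface.
	DisableSrcDstCheck(interfaceID string) error
}

// disableOnAllInterfaces tries every interface of the instance and reports
// the interfaces that failed instead of stopping at the first error.
func disableOnAllInterfaces(c eniClient, instanceID string) error {
	ids, err := c.InterfaceIDs(instanceID)
	if err != nil {
		return fmt.Errorf("listing interfaces of %s: %w", instanceID, err)
	}
	var failed []string
	for _, id := range ids {
		if err := c.DisableSrcDstCheck(id); err != nil {
			failed = append(failed, id)
		}
	}
	if len(failed) > 0 {
		return fmt.Errorf("could not disable src/dst check on %v", failed)
	}
	return nil
}

// fakeClient is an in-memory stand-in used to exercise the sketch.
type fakeClient struct {
	enis     map[string][]string
	disabled []string
}

func (f *fakeClient) InterfaceIDs(instanceID string) ([]string, error) {
	ids, ok := f.enis[instanceID]
	if !ok {
		return nil, fmt.Errorf("unknown instance %s", instanceID)
	}
	return ids, nil
}

func (f *fakeClient) DisableSrcDstCheck(interfaceID string) error {
	f.disabled = append(f.disabled, interfaceID)
	return nil
}

func main() {
	c := &fakeClient{enis: map[string][]string{"i-example": {"eni-a", "eni-b"}}}
	if err := disableOnAllInterfaces(c, "i-example"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("disabled on:", c.disabled)
}
```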

Panic observed when a node gets deleted

A panic occurs when a node gets deleted and returns a cache.DeletedFinalStateUnknown instead of a Node.

I0305 12:49:48.849075       1 main.go:42] k8s-ec2-srcdst: v0.2.1
E0305 12:56:57.201434       1 reflector.go:205] github.com/ottoyiu/k8s-ec2-srcdst/cmd/k8s-ec2-srcdst/main.go:48: Failed to list *v1.Node: Get https://100.64.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E0305 12:56:58.202361       1 reflector.go:205] github.com/ottoyiu/k8s-ec2-srcdst/cmd/k8s-ec2-srcdst/main.go:48: Failed to list *v1.Node: Get https://100.64.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 100.64.0.1:443: getsockopt: connection refused
E0305 12:57:29.203208       1 reflector.go:205] github.com/ottoyiu/k8s-ec2-srcdst/cmd/k8s-ec2-srcdst/main.go:48: Failed to list *v1.Node: Get https://100.64.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 100.64.0.1:443: i/o timeout
E0305 12:58:00.204087       1 reflector.go:205] github.com/ottoyiu/k8s-ec2-srcdst/cmd/k8s-ec2-srcdst/main.go:48: Failed to list *v1.Node: Get https://100.64.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 100.64.0.1:443: i/o timeout
E0305 12:58:31.205268       1 reflector.go:205] github.com/ottoyiu/k8s-ec2-srcdst/cmd/k8s-ec2-srcdst/main.go:48: Failed to list *v1.Node: Get https://100.64.0.1:443/api/v1/nodes?resourceVersion=0: dial tcp 100.64.0.1:443: i/o timeout
I0305 12:58:32.427858       1 srcdst_controller.go:96] Marking node ip-10-63-163-245.us-west-2.compute.internal with SrcDstCheckDisabledAnnotation
E0305 12:58:32.448368       1 runtime.go:66] Observed a panic: &runtime.TypeAssertionError{interfaceString:"interface {}", concreteString:"cache.DeletedFinalStateUnknown", assertedString:"*v1.Node", missingMethod:""} (interface conversion: interface {} is cache.DeletedFinalStateUnknown, not *v1.Node)
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/home/travis/.gimme/versions/go1.9.linux.amd64/src/runtime/asm_amd64.s:509
/home/travis/.gimme/versions/go1.9.linux.amd64/src/runtime/panic.go:491
/home/travis/.gimme/versions/go1.9.linux.amd64/src/runtime/iface.go:172
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/pkg/controller/srcdst_controller.go:64
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/pkg/controller/srcdst_controller.go:51
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/client-go/tools/cache/controller.go:209
<autogenerated>:1
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/client-go/tools/cache/controller.go:320
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/client-go/tools/cache/delta_fifo.go:451
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/client-go/tools/cache/controller.go:150
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/client-go/tools/cache/controller.go:124
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/vendor/k8s.io/client-go/tools/cache/controller.go:124
/home/travis/gopath/src/github.com/ottoyiu/k8s-ec2-srcdst/cmd/k8s-ec2-srcdst/main.go:48
/home/travis/.gimme/versions/go1.9.linux.amd64/src/runtime/proc.go:185
/home/travis/.gimme/versions/go1.9.linux.amd64/src/runtime/asm_amd64.s:2337

A rewrite is in order, using the newer style of writing these controllers in client-go...

Related to: kubernetes/kops#4466
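
The panic in the trace comes from asserting *v1.Node directly on the object passed to the delete handler. A common way to avoid it is to unwrap the tombstone first. The sketch below uses local stand-in types named Node and DeletedFinalStateUnknown so that it runs without dependencies; the stand-in mirrors the Key and Obj fields of client-go's cache.DeletedFinalStateUnknown. The helper name nodeFromEvent is hypothetical.

```go
package main

import "fmt"

// Node stands in for *v1.Node in this dependency-free sketch.
type Node struct{ Name string }

// DeletedFinalStateUnknown stands in for client-go's tombstone type, which is
// delivered when a delete was missed and only the last known state is cached.
type DeletedFinalStateUnknown struct {
	Key string
	Obj interface{}
}

// nodeFromEvent (hypothetical helper) unwraps the object handed to an
// informer handler and never panics on an unexpected type.
func nodeFromEvent(obj interface{}) (*Node, bool) {
	switch t := obj.(type) {
	case *Node:
		return t, true
	case DeletedFinalStateUnknown:
		// The tombstone carries the last known state of the deleted object.
		n, ok := t.Obj.(*Node)
		return n, ok
	default:
		return nil, false
	}
}

func main() {
	events := []interface{}{
		&Node{Name: "node-a"},
		DeletedFinalStateUnknown{Key: "node-b", Obj: &Node{Name: "node-b"}},
		"unexpected",
	}
	for _, obj := range events {
		if n, ok := nodeFromEvent(obj); ok {
			fmt.Println("node:", n.Name)
		} else {
			fmt.Printf("skipping object of type %T\n", obj)
		}
	}
}
```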

CrashLoopBackOff on RHEL/CentOS Host with KOPS+Calico+crossSubnet

The main problem is that the certificate path is different on CentOS/RHEL systems. On Debian-based systems the root certificate store is '/etc/ssl/certs/ca-certificates.crt', while on CentOS/RHEL it is '/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt'. The pod does not find the Debian path on these hosts and crashes.

https://github.com/ottoyiu/k8s-ec2-srcdst/search?utf8=%E2%9C%93&q=ca-certificates.crt&type=

This may be a kops-related issue, but there is no way to tell kops about a different host OS so that, with this combination of configuration options, it uses a different certificate path.

POD LOG:

ubuntu@ip-10-202-4-127:~$ kubectl describe pod k8s-ec2-srcdst-78f785ff98-gmvsx --namespace kube-system
Name:           k8s-ec2-srcdst-78f785ff98-gmvsx
Namespace:      kube-system
Node:           ip-10-202-41-237.eu-west-1.compute.internal/10.202.41.237
Start Time:     Wed, 24 Jan 2018 08:39:30 +0000
Labels:         k8s-app=k8s-ec2-srcdst
                pod-template-hash=3493419954
                role.kubernetes.io/networking=1
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"k8s-ec2-srcdst-78f785ff98","uid":"f2b81a96-00e1-11e8-bd3d-02...
                scheduler.alpha.kubernetes.io/critical-pod=
Status:         Running
IP:             10.202.41.237
Controlled By:  ReplicaSet/k8s-ec2-srcdst-78f785ff98
Containers:
  k8s-ec2-srcdst:
    Container ID:  docker://4db42f2dc7081d3b99d040935f6443573de47cf936f6143373f90390aa716854
    Image:         ottoyiu/k8s-ec2-srcdst:v0.1.0
    Image ID:      docker-pullable://ottoyiu/k8s-ec2-srcdst@sha256:d156bd23fb1e584fabfded239fcdd3f9612ed16feb941856c21d94390afcc080
    Port:          <none>
    State:         Waiting
      Reason:      CrashLoopBackOff
    Last State:    Terminated
      Reason:      ContainerCannotRun
      Message:     oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:359: container init caused \"rootfs_linux.go:54: mounting \\\"/etc/ssl/certs/ca-certificates.crt\\\" to rootfs \\\"/var/lib/docker/overlay/2123c332b0b9198cbcfc9d82f936a49674af607d8a7e388166b58d6a39616924/merged\\\" at \\\"/var/lib/docker/overlay/2123c332b0b9198cbcfc9d82f936a49674af607d8a7e388166b58d6a39616924/merged/etc/ssl/certs/ca-certificates.crt\\\" caused \\\"not a directory\\\"\""
: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
      Exit Code:    127
      Started:      Wed, 24 Jan 2018 10:23:53 +0000
      Finished:     Wed, 24 Jan 2018 10:23:53 +0000
    Ready:          False
    Restart Count:  25
    Requests:
      cpu:     10m
      memory:  64Mi
    Environment:
      AWS_REGION:  eu-west-1
    Mounts:
      /etc/ssl/certs/ca-certificates.crt from ssl-certs (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from k8s-ec2-srcdst-token-gd9jn (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  ssl-certs:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/ssl/certs/ca-certificates.crt
    HostPathType:  
  k8s-ec2-srcdst-token-gd9jn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  k8s-ec2-srcdst-token-gd9jn
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason      Age                From                                                  Message
  ----     ------      ----               ----                                                  -------
  Warning  FailedSync  4m (x490 over 1h)  kubelet, ip-10-202-41-237.eu-west-1.compute.internal  Error syncing pod

