
kube2iam provides different AWS IAM roles for pods running on Kubernetes

License: BSD 3-Clause "New" or "Revised" License


kube2iam's Introduction


kube2iam

Provide IAM credentials to containers running inside a kubernetes cluster based on annotations.

Context

Traditionally in AWS, service-level isolation is done using IAM roles. IAM roles are attached through instance profiles and are accessible to services through the AWS SDK's transparent use of the EC2 metadata API. When using the AWS SDK, a call is made to the EC2 metadata API, which provides temporary credentials that are then used to make calls to the AWS service.
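For illustration, the same flow can be reproduced by hand from an EC2 instance; the first call lists the role attached to the instance profile, and <role-name> below is a placeholder for that role:

# List the role attached to the instance profile
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

# Fetch temporary credentials for that role (returns AccessKeyId, SecretAccessKey, Token, Expiration)
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/<role-name>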

Problem statement

The problem is that in a multi-tenant, container-based world, multiple containers share the same underlying nodes. Providing access to AWS resources via the node's IAM role would therefore require creating a single IAM role that is the union of all the IAM roles needed by the workloads. This is not acceptable from a security perspective.

Solution

The solution is to redirect the traffic bound for the EC2 metadata API from docker containers to a container running on each instance, which calls the AWS API to retrieve temporary credentials and returns them to the caller. Other calls are proxied through to the EC2 metadata API. This container needs to run with host networking enabled so that it can call the EC2 metadata API itself.

Usage

IAM roles

It is necessary to create an IAM role which can assume other roles and list regions, and assign it to each Kubernetes worker. The ec2:DescribeRegions permission is required because aws-sdk-go-v2 does not ship with a built-in list of regions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "sts:AssumeRole"
      ],
      "Effect": "Allow",
      "Resource": "*"
    },
    {
      "Action": [
        "ec2:DescribeRegions"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

The roles that will be assumed must have a Trust Relationship which allows them to be assumed by the kubernetes worker role. See this StackOverflow post for more details.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/kubernetes-worker-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
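To sanity-check the trust relationship, you can try assuming one of the target roles manually with the AWS CLI from a worker node (the role ARN and session name below are placeholders):

aws sts assume-role \
  --role-arn arn:aws:iam::123456789012:role/my-app-role \
  --role-session-name kube2iam-test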

kube2iam daemonset

Run the kube2iam container as a daemonset (so that it runs on each worker) with hostNetwork: true. The kube2iam daemon and iptables rule (see below) need to run before all other pods that would require access to AWS resources.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:latest
          name: kube2iam
          args:
            - "--base-role-arn=arn:aws:iam::123456789012:role/"
            - "--node=$(NODE_NAME)"
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http

iptables

To prevent containers from directly accessing the EC2 metadata API and gaining unwanted access to AWS resources, the traffic to 169.254.169.254 must be proxied for docker containers.

iptables \
  --append PREROUTING \
  --protocol tcp \
  --destination 169.254.169.254 \
  --dport 80 \
  --in-interface docker0 \
  --jump DNAT \
  --table nat \
  --to-destination `curl 169.254.169.254/latest/meta-data/local-ipv4`:8181
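Once the rule is in place, it can be inspected on the node; the exact output format depends on the iptables version:

sudo iptables --table nat --list PREROUTING --numeric | grep 169.254.169.254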

This rule can be added automatically by setting --iptables=true, setting the HOST_IP environment variable, and running the container in a privileged security context.

Warning: It is possible that other pods are started on an instance before kube2iam has started. Using --iptables=true (instead of applying the rule before starting the kubelet) could give those pods the opportunity to access the real EC2 metadata API, assume the role of the EC2 instance and thereby have all permissions the instance role has (including assuming potential other roles). Use with care if you don't trust the users of your kubernetes cluster or if you are running pods (that could be exploited) that have permissions to create other pods (e.g. controllers / operators).

Note that the interface passed to --in-interface above (or to the --host-interface CLI flag) may differ from docker0 depending on which virtual network you use, e.g.:

  • for Calico, use cali+ (the interface name is something like cali1234567890)
  • for kops (on kubenet), use cbr0
  • for CNI, use cni0
  • for EKS/amazon-vpc-cni-k8s, use eni+ even with Calico installed (each pod gets an interface like eni4c0e15dfb05)
  • for Weave, use weave
  • for flannel, use cni0
  • for kube-router, use kube-bridge
  • for OpenShift, use tun0
  • for Cilium, use lxc+

For example, a DaemonSet that adds the rule automatically:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:latest
          name: kube2iam
          args:
            - "--base-role-arn=arn:aws:iam::123456789012:role/"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--node=$(NODE_NAME)"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

kubernetes annotation

Add an iam.amazonaws.com/role annotation to your pods with the role that you want the pod to assume. The optional iam.amazonaws.com/external-id annotation allows an ExternalId to be passed as part of the AssumeRole call.

apiVersion: v1
kind: Pod
metadata:
  name: aws-cli
  labels:
    name: aws-cli
  annotations:
    iam.amazonaws.com/role: role-arn
    iam.amazonaws.com/external-id: external-id
spec:
  containers:
  - image: fstab/aws-cli
    command:
      - "/home/aws/aws/env/bin/aws"
      - "s3"
      - "ls"
      - "some-bucket"
    name: aws-cli

You can use --default-role to set a fallback role to use when the annotation is not set.
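For example, a minimal invocation that falls back to a low-privilege role for unannotated pods might look like this (the role name is illustrative):

kube2iam --base-role-arn=arn:aws:iam::123456789012:role/ --default-role=kube2iam-default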

ReplicaSet, CronJob, Deployment, etc.

When creating higher-level abstractions than pods, you need to pass the annotation in the pod template of the resource spec.

Example for a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: role-arn
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.9.1
        ports:
        - containerPort: 80

Example for a CronJob:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-cronjob
spec:
  schedule: "00 11 * * 2"
  concurrencyPolicy: Forbid
  startingDeadlineSeconds: 3600
  jobTemplate:
    spec:
      template:
        metadata:
          annotations:
            iam.amazonaws.com/role: role-arn
        spec:
          restartPolicy: OnFailure
          containers:
          - name: job
            image: my-image

Namespace Restrictions

By using the flag --namespace-restrictions you can enable a mode in which the roles that pods can assume are restricted by an annotation on the pod's namespace. This annotation should be in the form of a JSON array.

To allow the aws-cli pod specified above to run in the default namespace, your namespace would look like the following.

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    iam.amazonaws.com/allowed-roles: |
      ["role-arn"]
  name: default

Note: You can also use glob-based matching for namespace restrictions, which works nicely with the path-based namespacing supported for AWS IAM roles.

Example: to allow all roles prefixed with my-custom-path/ to be assumed by pods in the default namespace, the default namespace would be annotated as follows:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    iam.amazonaws.com/allowed-roles: |
      ["my-custom-path/*"]
  name: default

If you prefer regexp to glob-based matching you can specify --namespace-restriction-format=regexp, then you can use a regexp in your annotation:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    iam.amazonaws.com/allowed-roles: |
      ["my-custom-path/.*"]
  name: default

RBAC Setup

This is the bare-minimum RBAC setup needed to get kube2iam working correctly when your cluster is using RBAC.

First we need to make a service account.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube2iam
  namespace: kube-system

Next we need to set up a ClusterRole and ClusterRoleBinding for the kube2iam service account.

---
apiVersion: v1
items:
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: kube2iam
    rules:
      - apiGroups: [""]
        resources: ["namespaces","pods"]
        verbs: ["get","watch","list"]
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: kube2iam
    subjects:
    - kind: ServiceAccount
      name: kube2iam
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: kube2iam
      apiGroup: rbac.authorization.k8s.io
kind: List

You will notice this lives in the kube-system namespace to allow for easier separation between system services and other services.

Here is what a kube2iam daemonset yaml might look like.

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
  labels:
    app: kube2iam
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      serviceAccountName: kube2iam
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:latest
          imagePullPolicy: Always
          name: kube2iam
          args:
            - "--app-port=8181"
            - "--base-role-arn=arn:aws:iam::xxxxxxx:role/"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=weave"
            - "--verbose"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

Using on OpenShift

OpenShift 3

To use kube2iam on OpenShift one needs to configure additional resources.

A complete example for OpenShift 3 looks like this. For OpenShift 4, see the next section.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube2iam
  namespace: kube-system
---
apiVersion: v1
items:
  - apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRole
    metadata:
      name: kube2iam
    rules:
      - apiGroups: [""]
        resources: ["namespaces","pods"]
        verbs: ["get","watch","list"]
  - apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: kube2iam
    subjects:
    - kind: ServiceAccount
      name: kube2iam
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: kube2iam
      apiGroup: rbac.authorization.k8s.io
kind: List
---
kind: SecurityContextConstraints
apiVersion: v1
metadata:
  name: kube2iam
allowPrivilegedContainer: true
allowHostPorts: true
allowHostNetwork: true
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
users:
- system:serviceaccount:kube-system:kube2iam
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
  labels:
    app: kube2iam
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      serviceAccountName: kube2iam
      hostNetwork: true
      nodeSelector:
        role: app
      containers:
        - image: docker.io/jtblin/kube2iam:latest
          imagePullPolicy: Always
          name: kube2iam
          args:
            - "--app-port=8181"
            - "--auto-discover-base-arn"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=tun0"
            - "--verbose"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

Note: In (OpenShift) multi-tenancy setups it is recommended to restrict the assumable roles on the namespace level to prevent cross-namespace trust stealing.

OpenShift 4

To use kube2iam on OpenShift 4, the additional resources are slightly different from those for OpenShift 3 shown above. OpenShift 4 has hard-coded iptables rules that block connections from containers to the EC2 metadata service at 169.254.169.254. Since the kube2iam pods run with host networking enabled, they are not affected by these OpenShift iptables rules.

The OpenShift iptables rules do have implications for pods authenticating through kube2iam, though. Let's look at an example for deploying kube2iam on OpenShift 4 first:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube2iam
  namespace: kube-system
---
apiVersion: v1
items:
  - apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRole
    metadata:
      name: kube2iam
    rules:
      - apiGroups: [""]
        resources: ["namespaces","pods"]
        verbs: ["get","watch","list"]
  - apiVersion: rbac.authorization.k8s.io/v1beta1
    kind: ClusterRoleBinding
    metadata:
      name: kube2iam
    subjects:
    - kind: ServiceAccount
      name: kube2iam
      namespace: kube-system
    roleRef:
      kind: ClusterRole
      name: kube2iam
      apiGroup: rbac.authorization.k8s.io
kind: List
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
  labels:
    app: kube2iam
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      serviceAccountName: kube2iam
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/worker: ''
      containers:
        - image: docker.io/jtblin/kube2iam:latest
          imagePullPolicy: Always
          name: kube2iam
          args:
            - "--app-port=8181"
            - "--auto-discover-base-arn"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=tun0"
            - "--verbose"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http

Compared to the OpenShift 3 example in the previous section, we removed the kube2iam SecurityContextConstraint. In the kube2iam DaemonSet, we changed the nodeSelector to match OpenShift 4 worker nodes, removed the --iptables argument, and removed the privileged securityContext.

We use the OpenShift hostnetwork SecurityContextConstraint for kube2iam:

oc adm policy add-scc-to-user hostnetwork -n kube-system -z kube2iam

For applications, the iptables rule that kube2iam would create to redirect 169.254.169.254 connections to the kube2iam pods has no effect because the hard-coded iptables rules block those connections on OpenShift 4.

As a workaround, the environment variables http_proxy and no_proxy can be set to use kube2iam as an HTTP proxy when accessing the metadata service. Below is an example for the aws-service-operator:

- kind: Deployment
  apiVersion: apps/v1beta1
  metadata:
    name: aws-service-operator
    namespace: aws-service-operator
  spec:
    replicas: 1
    template:
      metadata:
        annotations:
          iam.amazonaws.com/role: aws-service-operator
        labels:
          app: aws-service-operator
      spec:
        serviceAccountName: aws-service-operator
        containers:
        - name: aws-service-operator
          image: awsserviceoperator/aws-service-operator:v0.0.1-alpha4
          imagePullPolicy: Always
          command:
            - /bin/sh
          args:
          - "-c"
          - export http_proxy=${HOST_IP}:8181; /usr/local/bin/aws-service-operator server --cluster-name=<CLUSTER_NAME> --region=<REGION> --account-id=<ACCOUNT_ID> --k8s-namespace=<K8S_NAMESPACE>
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: status.hostIP
            - name: no_proxy
              value: "*.amazonaws.com,<KUBE_API_IP>:443"

Compared to the Deployment definition from aws-service-operator/configs/aws-service-operator.yaml, this adds the http_proxy and no_proxy environment variables.

Because we use the IP address of the OpenShift node to reach the kube2iam pod, we cannot set http_proxy statically in the env list; instead it is exported in the shell command above.

The value for the no_proxy environment variable is specific to the application. kube2iam only allows proxy connections to 169.254.169.254. All other hostnames or IP addresses that the application connects to through HTTP or HTTPS need to be listed in the no_proxy variable.

For example, the aws-service-operator needs access to various AWS APIs and the Kubernetes API. The Kubernetes API listens on the first IP address in the OpenShift service network. If 172.31.0.0/16 is the OpenShift cluster service network, KUBE_API_IP is 172.31.0.1.
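To verify the workaround from inside an application pod, a request sent through the kube2iam proxy should return the pod's role, while direct requests to the metadata service remain blocked (HOST_IP is the node IP injected via the downward API as in the example above):

http_proxy=${HOST_IP}:8181 curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/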

Debug

By using the --debug flag you can enable some extra features making debugging easier:

  • /debug/store endpoint enabled to dump the known namespaces and role associations (see the example request below).
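For example, assuming the default --app-port, the store can be dumped from the node running kube2iam:

curl -s http://localhost:8181/debug/store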

Base ARN auto discovery

By using the --auto-discover-base-arn flag, kube2iam will auto discover the base ARN via the EC2 metadata service.
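The instance profile ARN that the base ARN is derived from can be inspected via the metadata service, which is useful for checking what auto discovery will pick up:

curl -s http://169.254.169.254/latest/meta-data/iam/info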

Using ec2 instance role as default role

By using the --auto-discover-default-role flag, kube2iam will auto discover the base ARN and the IAM role attached to the instance and use it as the fallback role to use when annotation is not set.

AWS STS Endpoint and Regions

STS is a unique service in that it is considered a global service that defaults to the endpoint https://sts.amazonaws.com, regardless of your region setting. However, unlike other global services (e.g. CloudFront, IAM), STS also has regional endpoints which can only be used explicitly (programmatically). Using a regional STS endpoint can reduce the latency of STS requests.

kube2iam supports the use of STS regional endpoints via the --use-regional-sts-endpoint flag together with the appropriate AWS_REGION environment variable set in your daemonset environment. With these two settings configured, kube2iam will use the STS API endpoint for that region. If you enable debug-level logging, the STS endpoint used to retrieve credentials will be logged.
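As an illustration, running kube2iam by hand with both settings might look like this (the region is an example; in the daemonset, AWS_REGION would be set under env and the flag added to args):

export AWS_REGION=eu-west-1
kube2iam --use-regional-sts-endpoint --auto-discover-base-arn --log-level=debug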

Metrics

kube2iam exports a number of Prometheus metrics to assist with monitoring the system's performance. By default, these are exported at the /metrics HTTP endpoint on the application server port (specified by --app-port). This does not always make sense, as anything with access to the application server port can assume roles via kube2iam. To mitigate this use the --metrics-port argument to specify a different port that will host the /metrics endpoint.

All of the exported metrics are prefixed with kube2iam_. See the Prometheus documentation for more information on how to get up and running with Prometheus.
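For example, with --metrics-port=9090 (an arbitrary port choice), the metrics can be checked from the node:

curl -s http://localhost:9090/metrics | grep '^kube2iam_'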

Options

By default, kube2iam will use the in-cluster method to connect to the kubernetes master, and use the iam.amazonaws.com/role annotation to retrieve the role for the container. Either set the base-role-arn option (which applies to all roles) and only pass the role name in the iam.amazonaws.com/role annotation, or pass the full role ARN in the annotation.

$ kube2iam --help
Usage of kube2iam:
      --api-server string                     Endpoint for the api server
      --api-token string                      Token to authenticate with the api server
      --app-port string                       Kube2iam server http port (default "8181")
      --auto-discover-base-arn                Queries EC2 Metadata to determine the base ARN
      --auto-discover-default-role            Queries EC2 Metadata to determine the default Iam Role and base ARN, cannot be used with --default-role, overwrites any previous setting for --base-role-arn
      --backoff-max-elapsed-time duration     Max elapsed time for backoff when querying for role. (default 2s)
      --backoff-max-interval duration         Max interval for backoff when querying for role. (default 1s)
      --base-role-arn string                  Base role ARN
      --iam-role-session-ttl                  Length of session when assuming the roles (default 15m)
      --debug                                 Enable debug features
      --default-role string                   Fallback role to use when annotation is not set
      --host-interface string                 Host interface for proxying AWS metadata (default "docker0")
      --host-ip string                        IP address of host
      --iam-role-key string                   Pod annotation key used to retrieve the IAM role (default "iam.amazonaws.com/role")
      --iam-external-id string                Pod annotation key used to retrieve the IAM ExternalId (default "iam.amazonaws.com/external-id")
      --insecure                              Kubernetes server should be accessed without verifying the TLS. Testing only
      --iptables                              Add iptables rule (also requires --host-ip)
      --log-format string                     Log format (text/json) (default "text")
      --log-level string                      Log level (default "info")
      --metadata-addr string                  Address for the ec2 metadata (default "169.254.169.254")
      --metrics-port string                   Metrics server http port (default: same as kube2iam server port) (default "8181")
      --namespace-key string                  Namespace annotation key used to retrieve the IAM roles allowed (value in annotation should be json array) (default "iam.amazonaws.com/allowed-roles")
      --cache-resync-period                   Refresh interval for pod and namespace caches
      --resolve-duplicate-cache-ips           Queries the k8s api server to find the source of truth when the pod cache contains multiple pods with the same IP
      --namespace-restriction-format string   Namespace Restriction Format (glob/regexp) (default "glob")
      --namespace-restrictions                Enable namespace restrictions
      --node string                           Name of the node where kube2iam is running
      --use-regional-sts-endpoint             use the regional sts endpoint if AWS_REGION is set
      --verbose                               Verbose
      --version                               Print the version and exits

Development loop

  • Use minikube to run cluster locally
  • Build and push dev image to docker hub: make docker-dev DOCKER_REPO=<your docker hub username>
  • Update deployment.yaml as needed
  • Deploy to local kubernetes cluster: kubectl create -f deployment.yaml or kubectl delete -f deployment.yaml && kubectl create -f deployment.yaml
  • Expose as service: kubectl expose deployment kube2iam --type=NodePort
  • Retrieve the services url: minikube service kube2iam --url
  • Test your changes e.g. curl -is $(minikube service kube2iam --url)/healthz

Author

Jerome Touffe-Blin, @jtblin, About me

License

kube2iam is copyright 2020 Jerome Touffe-Blin and contributors. It is licensed under the BSD license. See the included LICENSE file for details.

kube2iam's People

Contributors

ankon, aranair, asafpelegcodes, bbriggs, bharrisau, clasohm, colinhebert, colmm99, danwent, darend, dcheney-atlassian, devkid, hligit, insidexa, jacobious52, jrnt30, jtblin, korjek, ltagliamonte-dd, mariusv, mgoodness, mikkeloscar, mwhittington21, patagona-afriemann, r4vi, rifelpet, sharpedgemarshall, struz, uthark, vtj-ttanaka


kube2iam's Issues

IAM roles not applied to init containers

The title says it all. Pods in the deployment correctly assume the declared IAM role. However, this does not work for init containers. This is a problem for us because our init containers need to bootstrap AWS resources (governed by the IAM policy). Luckily we can get around this for now, but may not be able to in the future. Are init containers supposed to work, or is this a feature request?

Incorrect values for '--base-role-arn' do not throw proper exception

When defining the values for the kube2iam daemonset, if an incorrect value is passed to '--base-role-arn', it takes the incorrect value and sets it as the base ARN. There is no proper exception or error message that the value passed is invalid or incorrect.

We have defined the daemonset.yml file as:

---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:0.5.2
          name: kube2iam
          args:
            - "--base-role-arn=arn:aws:iam::123456789012:role/<role-name>"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=weave"
            - "--verbose"
            - "--debug"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          securityContext:
            privileged: true
          ports:
             - containerPort: 8181
               hostPort: 8181
               name: http

While debugging we found that in this case the 'base-role-arn' was being set as 'arn:aws:iam::123456789012:role//arn:aws:iam::123456789012:role/', which is an invalid role ARN, and hence it was not able to find the correct default ARN value for the base role.

Ideally it should have caught the incorrect value being passed for 'base-role-arn' and reported it in the first place, but it accepted the value as 'arn:aws:iam::123456789012:role/'.

Not sure if we did it the wrong way or if this is the default behavior for the 'base-role-arn' argument, but it took a lot of time for us to debug and figure out.

Not running as DaemonSet?

One of the concerns of running kube2iam as a container on each host through DaemonSet is that it requires all minion nodes to have the sts:AssumeRole permission. This means any escalation to the host network or container interface, intended or otherwise, has the potential of gaining all application roles.

Since kube2iam has dependencies on only the Kubernetes API and the EC2 metadata service, both reachable by network, it seems like, with some modifications, we can run kube2iam on dedicated nodes. This way, we restrict the number of EC2 instances having sts:AssumeRole permission, reducing the exposure of the potentially sensitive permission.

BadRequest: Container kube2iam is not available

This showed up in the logs of my Master Node:

{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"container \"kube2iam\" in pod \"kube2iam-xc54t\" is not available","reason":"BadRequest","code":400}

It doesn't error on my two slave nodes.

How do I go about debugging this?

default-role race condition

With 0.6.2 and 0.6.3 when using --default-role=value created pods get the default role and after that role expires(15-30min) they get the role that is specified in the annotation. If I don't use --default-role the correct role is applied by the first log statement.

kube2iam logs

time="2017-06-07T15:58:34Z" level=debug msg="Pod OnUpdate" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=10.2.5.5
time="2017-06-07T15:58:34Z" level=debug msg="Pod OnUpdate" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=10.2.5.5
time="2017-06-07T15:58:35Z" level=debug msg="Proxy ec2 metadata request" metadata.url=169.254.169.254 req.method=GET req.path="/latest/meta-data/iam/security-credentials" req.remote=10.2.4.8
time="2017-06-07T15:58:35Z" level=info msg="Handling request" req.method=GET req.path="/latest/meta-data/iam/security-credentials" req.remote=10.2.4.8 res.duration=782881 res.status=301
time="2017-06-07T15:58:35Z" level=warning msg="Using fallback role for IP 10.2.4.8"
time="2017-06-07T15:58:35Z" level=info msg="Handling request" req.method=GET req.path="/latest/meta-data/iam/security-credentials/" req.remote=10.2.4.8 res.duration=29529 res.status=200
time="2017-06-07T15:58:35Z" level=warning msg="Using fallback role for IP 10.2.4.8"
time="2017-06-07T15:58:35Z" level=info msg="Handling request" req.method=GET req.path="/latest/meta-data/iam/security-credentials/worker-role" req.remote=10.2.4.8 res.duration=44191759 res.status=200
time="2017-06-07T15:59:05Z" level=debug msg="Pod OnUpdate" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=
time="2017-06-07T15:59:05Z" level=debug msg="Pod OnDelete" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=10.2.5.5
time="2017-06-07T15:59:05Z" level=debug msg="Pod OnAdd" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=
time="2017-06-07T15:59:06Z" level=debug msg="Pod OnUpdate" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=
time="2017-06-07T15:59:17Z" level=debug msg="Pod OnUpdate" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=
time="2017-06-07T15:59:17Z" level=debug msg="Pod OnUpdate" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=
time="2017-06-07T15:59:17Z" level=debug msg="Pod OnDelete" pod.iam.role=worker-dns-role pod.name=external-dns-external-dns-1964887470-j8xm4 pod.namespace=infra pod.status.ip=

external-dns

time="2017-06-07T15:58:35Z" level=info msg="config: &{Master: KubeConfig: Sources:[service] Namespace: FqdnTemplate: Compatibility: Provider:xxxx GoogleProject: DomainFilter:dns.com Policy:upsert-only Registry:txt TXTOwnerID:me TXTPrefix: Interval:1m0s Once:false DryRun:true LogFormat:text MetricsAddress::7979 Debug:false}"
time="2017-06-07T15:58:35Z" level=info msg="running in dry-run mode. No changes to DNS records will be made."
time="2017-06-07T15:58:35Z" level=info msg="Connected to cluster at https://10.3.0.1:443"
time="2017-06-07T15:58:35Z" level=error msg="AccessDenied: User: arn:aws:sts::ACCOUNT:assumed-role/worker-role/f58ef8f8-worker-role is not authorized to perform: route53:ListHostedZones
	status code: 403, request id: 29b95d7c-4b9a-11e7-8fef-7590e06bba06"
time="2017-06-07T15:59:35Z" level=error msg="AccessDenied: User: arn:aws:sts::ACCOUNT:assumed-role/worker-role/f58ef8f8-worker-role is not authorized to perform: route53:ListHostedZones

Provide an example configuration

I'm really struggling to fit the pieces together here. Is is possible to provide a real-world example for say listing the contents of an S3 bucket?

Add tests, coverage and CI

  • Coverage
  • CI
  • Tests for k8s package
  • Tests for iam package
  • Tests for iptables package
  • Tests for mappings package
  • Tests for cmd package

How kube2iam works ?

Does kube2iam listen to docker events and hence detect the start of a new container, and then construct the IAM role to assume using the base role + the role name from the container's iam.amazonaws.com/role value (K8s annotations)?

Are the results cached?
If 2 containers are started at the same time, how does it handle that?

Thanks

Ken

Update example template in README to v1

According to the docs, v1beta1 (as well as v1beta2 and v1beta3) is deprecated, and they recommend users move to v1.

I can submit a PR with any needed changes later on, but thought I should get it noted down first.

Don't hang forever if there is no role defined for a pod

Currently a call to metadata from a pod without a role will hang forever because it retries to get the role from the IP indefinitely: https://github.com/jtblin/kube2iam/blob/master/cmd/server.go#L61-L66.

If the retry really is needed I think it would be better to limit it to X number of retries and then respond. Hanging the connection forever is not nice for the calling application.

I know this could be solved by setting a default role, but it should also work when no default role is defined IMO.

I wouldn't mind making a PR fixing this, but I would like your (@jtblin) opinion before I start, if you don't mind.

  • Do we really need to retry, or would it be OK to respond with a 404? This obviously gives a false positive if kube2iam is just too slow to recognize the pod annotation.
  • If we need to retry, can we limit it to a finite number of retries?

Support adding/modifying annotations for existing pods

First off, thanks for the project, we have found it useful and to the point!

Issue:
When you annotate an existing pod, Kube2IAM will continue to serve up the cached role/credentials for the pod.

In our case, we are attempting to migrate an existing stack over to Kube2IAM with as little impact as possible. This has meant annotating some pods that are part of a stateful set via the kubectl annotate command.

Request:
It would be very useful if the addition or modification of a "relevant" annotation triggered a flush of the kube2iam cache.

$HOST_INTERFACE might not exist at runtime

So,

I'm running a cluster using kops, which means I have to specify --host-interface cbr0. kube2iam worked fine when I first tested it on an existing cluster. However, when including kube2iam as a DaemonSet on a new cluster, it crashes.

What's going on, and I'm only guessing here, is that the instances won't bother creating the cbr0 bridge until that node runs a Pod with hostNetworking: false, at which point kube2iam can successfully run. Any time before that, it CrashLoops with a message about the interface not existing (however it happens to format this error message, https://github.com/jtblin/kube2iam/blob/master/iptables/iptables.go#L38).

I'm willing to send in a patch, but haven't found good way without downsides. Thinking of something like this:

--wait-for-interface, which would let us start the server, and asynchronously add the iptables forwarding, in one of three ways:

Loop / Sleep while the interface doesn't exist

Ugly, but works. Uses extra resources while polling. How aggressively we poll determines how reliable it can be made (eg, how small the time window between pod creation and pods requesting creds could be)

Netlink / RTMGRP_LINK

http://man7.org/linux/man-pages/man7/netlink.7.html. We can get notifications when network interfaces are created/deleted / ip addresses changed etc. However, I'm unsure about how well go supports it atm.

This might be the best option.

TTL mangling hack

If neither of the above, it would also be possible to mangle all the packets destined for the metadata service, unless their TTL had been set to a pre-selected number (used as a sign that we don't need to proxy again). This would definitely work, but at the cost of a malicious pod being able to circumvent the proxy and obtain the host's own IAM creds.

Thoughts?

iptables not set in CoreOS

kube2iam appears to not print any errors but the iptables rule is not created.
Spec file is:


---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:latest
          name: kube2iam
          args:
            - "--base-role-arn=arn:aws:iam::123456789012:role/"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

running the command on the terminal:

$ sudo iptables   --append PREROUTING   --destination 169.254.169.254   --dport 80   --in-interface docker0   --jump DNAT   --protocol tcp   --table nat   --to-destination `curl 169.254.169.254/latest/meta-data/local-ipv4`:818

returns:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    13  100    13    0     0  13742      0 --:--:-- --:--:-- --:--:-- 13000
iptables v1.4.21: unknown option "--dport"
Try `iptables -h' or 'iptables --help' for more information.

If I move the --protocol flag before --dport, the command works as expected.

Provide a way to specify the AWS region

At the moment, when creating the AWS Session object no region can be specified (i.e., https://github.com/jtblin/kube2iam/blob/master/cmd/iam.go#L79).

The effect of this is that calls to STS default to https://sts.amazonaws.com and according to AWS that URL maps to US East but could as well change.

In some corporate environments, such calls are restricted per region and AWS provides local endpoint that can be used.

The AWS APIs provide a way to specify the region when creating the Session object: https://docs.aws.amazon.com/sdk-for-go/api/aws/session

TimeoutError: Could not load credentials from any providers

17-02-28T16:54:07.985641609Z 2017-02-28T16:54:07.985Z - error: Unable to scan. Error: {
2017-02-28T16:54:07.985694116Z   "message": "Missing credentials in config",
2017-02-28T16:54:07.985699668Z   "code": "CredentialsError",
2017-02-28T16:54:07.985704261Z   "time": "2017-02-28T16:54:07.980Z",
2017-02-28T16:54:07.985718282Z   "retryable": true,
2017-02-28T16:54:07.985723359Z   "originalError": {
2017-02-28T16:54:07.985727679Z     "message": "Could not load credentials from any providers",
2017-02-28T16:54:07.985732000Z     "code": "CredentialsError",
2017-02-28T16:54:07.985736229Z     "time": "2017-02-28T16:54:07.980Z",
2017-02-28T16:54:07.985740243Z     "retryable": true,
2017-02-28T16:54:07.985744330Z     "originalError": {
2017-02-28T16:54:07.985748410Z       "message": "Connection timed out after 1000ms",
2017-02-28T16:54:07.985752561Z       "code": "TimeoutError",
2017-02-28T16:54:07.985756792Z       "time": "2017-02-28T16:54:07.979Z",
2017-02-28T16:54:07.985760781Z       "retryable": true
2017-02-28T16:54:07.985764605Z     }
2017-02-28T16:54:07.985768446Z   }
2017-02-28T16:54:07.985772230Z }

This is reported by an app in my container after setting up kube2iam, notice the TimeoutError?

Release 0.5.1 does not work with Kubernetes 1.6.2

It appears that on Kubernetes 1.6.2, release 0.5.1 does not work. It errors with:

2017-04-25T04:24:30.461470394Z kube2iam flag redefined: log_dir
2017-04-25T04:24:30.46394961Z panic: kube2iam flag redefined: log_dir
2017-04-25T04:24:30.469260508Z 
2017-04-25T04:24:30.469269816Z goroutine 1 [running]:
2017-04-25T04:24:30.469274018Z flag.(*FlagSet).Var(0xc42000e120, 0x1e59ba0, 0xc420019780, 0x15b7dc5, 0x7, 0x15e26fa, 0x2f)
2017-04-25T04:24:30.469279025Z 	/usr/local/Cellar/go/1.8/libexec/src/flag/flag.go:793 +0x420
2017-04-25T04:24:30.469282795Z flag.(*FlagSet).StringVar(0xc42000e120, 0xc420019780, 0x15b7dc5, 0x7, 0x0, 0x0, 0x15e26fa, 0x2f)
2017-04-25T04:24:30.469285697Z 	/usr/local/Cellar/go/1.8/libexec/src/flag/flag.go:696 +0x8b
2017-04-25T04:24:30.469288735Z flag.(*FlagSet).String(0xc42000e120, 0x15b7dc5, 0x7, 0x0, 0x0, 0x15e26fa, 0x2f, 0xc420019770)
2017-04-25T04:24:30.469291591Z 	/usr/local/Cellar/go/1.8/libexec/src/flag/flag.go:709 +0x90
2017-04-25T04:24:30.469294465Z flag.String(0x15b7dc5, 0x7, 0x0, 0x0, 0x15e26fa, 0x2f, 0x23)
2017-04-25T04:24:30.469297237Z 	/usr/local/Cellar/go/1.8/libexec/src/flag/flag.go:716 +0x69
2017-04-25T04:24:30.469300033Z github.com/jtblin/kube2iam/vendor/k8s.io/client-go/vendor/github.com/golang/glog.init()
2017-04-25T04:24:30.469303218Z 	/Users/jtblin/src/go/src/github.com/jtblin/kube2iam/vendor/k8s.io/client-go/vendor/github.com/golang/glog/glog_file.go:41 +0x14a
2017-04-25T04:24:30.46930643Z github.com/jtblin/kube2iam/vendor/k8s.io/client-go/kubernetes.init()
2017-04-25T04:24:30.46930955Z 	/Users/jtblin/src/go/src/github.com/jtblin/kube2iam/vendor/k8s.io/client-go/kubernetes/import_known_versions.go:42 +0x48
2017-04-25T04:24:30.469312542Z github.com/jtblin/kube2iam/cmd.init()
2017-04-25T04:24:30.46931517Z 	/Users/jtblin/src/go/src/github.com/jtblin/kube2iam/cmd/store.go:161 +0x7d
2017-04-25T04:24:30.469318023Z main.init()
2017-04-25T04:24:30.469322592Z 	/Users/jtblin/src/go/src/github.com/jtblin/kube2iam/main.go:92 +0x53 

0.5.0, however, does work. I suspect that #60 is the cause. A quick google came across kubernetes/kubernetes#35096, which suggests using a vendoring tool like Godeps or Glide to flatten out these dependencies. Maybe that would work?

The 'base-role-arn' argument or 'auto-discover-base-arn' needs to be passed explicitly

When defining the specs for the kube2iam daemonset, the 'base-role-arn' is not able to correctly find the base ARN. However, when we explicitly define '--auto-discover-base-arn', then it seems to work fine.

As per the document,

Either set the base-role-arn option to apply to all roles and only pass the role name in the iam.amazonaws.com/role annotation, otherwise pass the full role ARN in the annotation.

We initially had the daemonset.yml defined as :

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:0.5.2
          name: kube2iam
          args:
            - "--base-role-arn=arn:aws:iam::123456789012:role/<role-name>"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=weave"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          securityContext:
            privileged: true
          ports:
             - containerPort: 8181
               hostPort: 8181
               name: http

However, this did not seem to work as expected, and our aws-cli pod was not able to access the S3 bucket. The aws-cli pod definition is:

---
apiVersion: v1
kind: Pod
metadata:
  name: aws-cli
  labels:
    name: aws-cli
  annotations:
    iam.amazonaws.com/role: Kube2IAMTest
spec:
  containers:
  - image: fstab/aws-cli
    command:
      - "/home/aws/aws/env/bin/aws"
      - "s3"
      - "ls"
      - "kube2iam-test-bucket"	
    name: aws-cli
  restartPolicy: Never

As can be seen, we have passed the full role ARN in the annotations as 'iam.amazonaws.com/role: Kube2IAMTest', however it does not work.

When we changed the daemonset file definition and added the '--auto-discover-base-arn' argument, it seemed to be able to discover the base ARN and work correctly.

The new daemonset file is

---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
spec:
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:0.5.2
          name: kube2iam
          args:
            - "--auto-discover-base-arn"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--host-interface=weave"
            - "--verbose"
            - "--debug"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          securityContext:
            privileged: true
          ports:
             - containerPort: 8181
               hostPort: 8181
               name: http

Bind IAM credentials to ServiceAccounts, not directly to Pods

From a logical standpoint, I believe it is clearer and better aligned with Kubernetes' security primitives to associate IAM credentials with a service account, rather than directly with a Pod. Specifically, this would mirror how Kubernetes' tokens are automatically mounted into the Pod filesystem, based on the associated ServiceAccount[1].

I don't (yet) have a good idea about how to associate an IAM role with a ServiceAccount. We currently rely on the Namespace restriction annotation to limit the IAM roles that are allowed to be assumed by the Pods within a Namespace. Because we are running a multi-tenant cluster, we cannot allow users to assume every role. At the moment, I'm not sure how to associate an IAM role with a ServiceAccount in a way that both 1) allows users to maintain their own SAs, and 2) prevents users from binding arbitrary/disallowed IAM roles.

Thoughts?

[1] This also parallels Istio's use of the ServiceAccount as the basis/binding point of workload identity: https://github.com/istio/auth#identity

Use pod namespace and name in logs to identify pods

kube2iam currently identifies pods by their "remote address", which means that when scanning through these logs one needs to manually translate IPs into "which pod is that?" This is especially problematic in highly volatile environments, where IPs get recycled over time.

Operationally I do need to be able to review a kube2iam log and check that the pods I assign roles to get those roles, and that pods that should get the default roles do.

It would be really helpful if kube2iam could log the pod namespace and name for each log message related to a specific pod.

Restrict role assignment to single container

Consider a multi-container setup with an application and a maintenance container. The maintenance container is used to copy backups to S3.

How could I restrict access to the injected AWS credentials to only the maintenance container so that the main container with the (internet exposed) application won't have access to them?

Maybe also interesting: how to give different roles to different containers in the same pod?

Throw an error if the wrong role is requested

Currently, Server.roleHandler fetches the role specified in the annotation and ignores the role specified in the URL. I think it would make more sense to return a 404 if the role in the URL doesn't match the role in the annotation.

Dockerfile

Hi,

I want to build my own image and would like to confirm: if I am to run the Alpine container on a CoreOS k8s cluster, shall I compile kube2iam on an Alpine VM, or is a CoreOS build fine?

Thanks

Ken

Nodes are not assuming any roles

Let me first just say that I assume this is something I'm doing wrong since others don't seem to have an issue.

With that said, I'm running a vanilla kops-created kubenet cluster. I've attached an extra role to my masters and nodes both that allow them to assume any role, as well as created a new role with the access I want (DynamoDB) and granted the master/node roles as trusted.

I know the IAM side of things is set up properly since I can ssh into one of my cluster nodes and run aws sts to get credentials for the dynamo role (but not other roles). However, when running the exact example, with iptables=true and s3 ls changed to dynamodb list-tables, my pod appears to just not assume a role at all. The error reported from the pod is this:

An error occurred (AccessDeniedException) when calling the ListTables operation: User: arn:aws:sts::<redacted>:assumed-role/nodes.k8s.dev.redacted/i-068a7d585ad916589 is not authorized to perform: dynamodb:ListTables on resource: *

I even tried an IAM policy that grants everything except DDB access to the nodes but it had no impact. My guess is there's something I missed (perhaps specific to kops) needed to get kube2iam functioning? The logs only show me startup information about listening on 8181 without any updates as I create new pods.

go get fails

$ go get
# github.com/jtblin/kube2iam/cmd
cmd/k8s.go:20: undefined: "k8s.io/kubernetes/pkg/client/unversioned".Client

Set annotation on ReplicaSet/Deployment level

Right now I'm trying to set-up kube2iam but I can't seem to get it working. First of all the Pods are created automatically (using Deis Workflow), which means I cannot annotate them through a file.

Would it be possible to request the role on a Deployment/ReplicaSet/Service level? It could pass the annotation down to pods within.

Monitor iptables rule

When kube2iam is configured to set up the iptables rule I think it should also monitor the rule's presence somehow. I for one am afraid of some process resetting the rule and thus allowing pods some dangerous privilege escalations.

Thoughts on this? If people think it's a good idea I may be able to spend some time on implementing it.

is host ip required when setting `--iptables=true`?

Just wondering: since kube2iam already runs on the host network, it could just acquire the local IP through hostname -i or even 169.254.169.254/latest/meta-data/local-ipv4. I confirmed both by running inside a kube2iam pod. Not sure if it's specific to my networking setup.

Allow policy to control which namespaces are able to assume which roles

We're in the process of experimenting with running a cluster where many different teams would be able to deploy their different apps onto the same cluster. We'd like to be able to control which processes are able to assume which roles- so a role which allows access to a particular KMS resource would only be available to pods running in the kube-system namespace, for example.

Is anyone else working on this (or interested in something like it)?

What version of docker is needed to run kube2iam please ?

I used a CoreOS k8s cluster running docker 1.10.3 to test using kube2iam with fstab/aws-cli
and somehow the aws-cli container could not have the specified role assumed.

Is it due to the docker version being 1.10.3 ?

There is no logging about it in the kube2iam log.
I would like to see if it is possible to turn on more verbose logging to trace the issue.

Many thanks

Ken

Annotation on a CronJob doesn't work?

Hi,

We're using kube2iam and it's been working great, but we just introduced our first CronJob into the cluster. This creates a pod without the annotation. I've been trying to trace through the code on both sides, but I'm pretty clueless. Is this something that can be fixed on the kube2iam side to magically work?

Support for other resource types

I (stupidly) was trying to figure out why it wasn't working but then realised I was testing Job resources (part of the beta batch API). Although they ultimately create pods, the annotations were associated with the Job (and not the Pod).

I'll probably take another look soon as I'd like to start using Job resources for testing a CD system.

Thanks so much for the project!

add healthchecks

kube2iam should provide healthchecks such as /healthz and /healthz/ping so kubelet can ensure the proxies are running correctly.

some requirements:

  • encapsulate it in a different path, like maybe /kube2iam, so it's explicit that 169.254.169.254/kube2iam is specific to the proxy, and does not affect any paths that might be exposed by the metadata service itself.

Having this path also provides an easy way to check whether 169.254.169.254 goes to the EC2 Metadata Service or kube2iam. If it goes to kube2iam, then 169.254.169.254/kube2iam/healthz/ping should return 200. It's potentially useful for debugging the iptables rule to make sure the redirection goes to the proxy correctly, since the bare EC2 Metadata Service doesn't have that endpoint.

/healthz/ping seems easy to implement by just returning a 200 OK. /healthz might be more complicated depending on what we want to check. I might take a stab at /healthz/ping.

IPtables rule not working

I am not 100% sure why, but the iptables rule is not working for our Kubernetes setup.

I've found that if I do not specify the interface as docker0 and instead specify the source as my private subnet (in my case, 100.64.0.0/10) it works perfectly. Unfortunately, for some completely unobvious reason, trying to specify a CIDR range as the source doesn't work in code (I tried it in my fork.) Maybe has to do with incompatibilities of iptables on the host and the container? It works if I run the iptables rule directly on the host.

It is worth noting that I'm using the VPC networking and not an overlay network.

Pods are sometimes assigned to the incorrect IAM role

In our cluster, pods that have a proper "iam.amazonaws.com/role" annotation sometimes do not receive their role when they start up. kube2iam returns the default role to them, which in our case does not have any permissions. Some time later, when the application requests the credentials again, it gets the proper assignment.
Relevant log messages:

level=info msg="Requesting /latest/meta-data/iam/security-credentials/"
level=warning msg="Using fallback role for IP 10.233.109.12"
level=info msg="Requesting /latest/meta-data/iam/security-credentials/kube.no-permissions"
level=warning msg="Using fallback role for IP 10.233.109.12"
.... some time later ....
level=info msg="Requesting /latest/meta-data/iam/security-credentials/"
level=info msg="Requesting /latest/meta-data/iam/security-credentials/kube.kube-system.route53-kubernetes"

I am not really sure how to debug this further, it might be related to the issues described in #32

kube2iam never recovers after node failures

I'm relatively new to Kubernetes and kube2iam, but I've been using both for a little over a month now. Earlier today, I attempted to test the fault tolerance of my cluster (created via kops) by terminating nodes and letting the ASG bring them back. My test cluster had one master and two worker nodes, and I terminated two worker nodes at the same time.

However, when kube2iam's Pods were recreated on the newly started nodes, they did not appear to restore the right metadata from the EC2 metadata service as it was before the outage. There are no errors reported in the log for any of the Pods either (all debug or info). I can confirm that kube2iam was definitely forwarding credentials properly before.

Below is an example config that I was using to verify before/after which will now error consistently:

# simple aws job that will verify kube2iam is working
---
apiVersion: batch/v1
kind: Job
metadata:
  name: aws-cli
  labels:
    name: aws-cli
  namespace: kube-system
spec:
  template:
    metadata:
      annotations:
        # role with s3 get access
        iam.amazonaws.com/role: arn:aws:iam::${account_id}:role/route53-kubernetes-role
    spec:
      restartPolicy: Never
      containers:
        - image: fstab/aws-cli
          command:
            - "/home/aws/aws/env/bin/aws"
            - "s3api"
            - "get-object"
            - "--bucket"
            - "my-bucket-yay"
            - "--key"
            - "feeds/ryan-koval/feed.json"
            - "feed.json"
          name: aws-cli

Below is the config I'm using for kube2iam:

# used to enable IAM access to docker containers within k8s
# https://github.com/jtblin/kube2iam
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  labels:
    app: kube2iam
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - image: jtblin/kube2iam:0.6.1
          name: kube2iam
          args:
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--verbose"
            - "--debug"
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

This is what the debug service was outputting at localhost:8181/debug/store, if it helps:

{
  "namespaceByIP": {
    "": "default",
    "100.96.3.130": "kube-system",
    "100.96.3.5": "kube-system",
    "100.96.3.6": "kube-system",
    "100.96.3.7": "kube-system",
    "100.96.4.2": "kube-system",
    "100.96.4.3": "kube-system",
    "100.96.4.6": "kube-system",
    "100.96.4.8": "kube-system",
    "172.20.36.35": "kube-system",
    "172.20.44.110": "kube-system",
    "172.20.57.20": "kube-system"
  },
  "rolesByIP": {
    "100.96.3.130": "arn:aws:iam::${account_id}:role/route53-kubernetes-role"
  },
  "rolesByNamespace": {}
}

... and the Kubernetes cluster is running on 1.5.2.

This is all the info I can think of for now, but please let me know if there's anything else I can provide to help troubleshoot.

Containers can still access node role

I have these roles:

  • worker_role
  • test

When I curl the metadata endpoint, I receive 'test' as the role I should use. But I can circumvent this by hitting:
curl -s 169.254.169.254/latest/meta-data/iam/security-credentials/worker_role

This provides the temporary credentials for that role. It would be great if we could block those calls; otherwise an attacker could leverage this.
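
A minimal sketch of the kind of check that would block this, assuming the proxy can map the caller's IP back to its annotated role (lookupAnnotatedRole is a hypothetical stand-in for kube2iam's internal IP-to-role mapping):

```go
// Sketch: reject credential requests for any role other than the one the
// calling pod is annotated with.
package rolecheck

import (
	"net"
	"net/http"
	"strings"
)

func credentialsHandler(lookupAnnotatedRole func(ip string) (string, bool)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		remoteIP, _, _ := net.SplitHostPort(r.RemoteAddr)
		allowed, ok := lookupAnnotatedRole(remoteIP)
		requested := strings.TrimPrefix(r.URL.Path, "/latest/meta-data/iam/security-credentials/")
		if !ok || (requested != "" && requested != allowed) {
			// Deny access to the node role (or any other role) not granted to this pod.
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		// ... assume the allowed role via STS and return temporary credentials ...
	}
}
```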

Pod got role from another pod

I'm not sure how to debug this further, so I'm looking for help. I deployed kube2iam via a DaemonSet with the --iptables option. What I saw was this:

$ kubectl exec -it ladder /bin/bash
$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/ && echo
commander_role
$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/ && echo
commander_role
$ <restart local kube2iam pod>
$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/ && echo
ladder_role

The right role for this pod was ladder_role. For some reason kube2iam kept returning a role for an unrelated pod.

Building against kubernetes v1.5.3 client version is broken

gofmt -w=true -s $(find . -type f -name '*.go' -not -path "./vendor/*")
goimports -w=true -d $(find . -type f -name '*.go' -not -path "./vendor/*")
go build -o build/bin/darwin/kube2iam -ldflags "-s -X ""github.com/jtblin"/kube2iam"/version.Version=$(git describe --abbrev=0 --tags) -X ""github.com/jtblin"/kube2iam"/version.GitCommit=$(git rev-parse --short HEAD) -X ""github.com/jtblin"/kube2iam"/version.BuildDate=$(date +%Y-%m-%d-%H:%M)" github.com/jtblin/kube2iam
# github.com/jtblin/kube2iam/cmd
cmd/k8s.go:20: undefined: "github.com/jtblin/kube2iam/vendor/k8s.io/kubernetes/pkg/client/unversioned".Client
make: *** [build] Error 2

After running glide get github.com/aws/aws-sdk-go/blob/master/aws/ec2metadata#^v1.0.8 and doing a diff against glide.lock, I found that it was never updated to v1.5.3:

- name: k8s.io/client-go
  version: 6631b2769fbf8fd8ff6b2074d64774b010c7d37a
  subpackages:
  - 1.4/pkg/api
  - 1.4/pkg/api/endpoints
  - 1.4/pkg/api/errors
  - 1.4/pkg/api/meta
  - 1.4/pkg/api/meta/metatypes
  - 1.4/pkg/api/pod
  - 1.4/pkg/api/resource
  - 1.4/pkg/api/service
  - 1.4/pkg/api/unversioned
  - 1.4/pkg/api/unversioned/validation
  - 1.4/pkg/api/util
  - 1.4/pkg/api/v1
  - 1.4/pkg/api/validation
  - 1.4/pkg/apimachinery
  - 1.4/pkg/apimachinery/registered
  - 1.4/pkg/apis/autoscaling
  - 1.4/pkg/apis/batch
  - 1.4/pkg/apis/extensions
  - 1.4/pkg/auth/user
  - 1.4/pkg/capabilities
  - 1.4/pkg/conversion
  - 1.4/pkg/conversion/queryparams
  - 1.4/pkg/fields
  - 1.4/pkg/labels
  - 1.4/pkg/runtime
  - 1.4/pkg/runtime/serializer
  - 1.4/pkg/runtime/serializer/json
  - 1.4/pkg/runtime/serializer/protobuf
  - 1.4/pkg/runtime/serializer/recognizer
  - 1.4/pkg/runtime/serializer/streaming
  - 1.4/pkg/runtime/serializer/versioning
  - 1.4/pkg/security/apparmor
  - 1.4/pkg/selection
  - 1.4/pkg/third_party/forked/golang/reflect
  - 1.4/pkg/types
  - 1.4/pkg/util

error in logs

This is showing up in the logs every second or so:

E1130 18:15:04.738072       1 reflector.go:214] github.com/jtblin/kube2iam/cmd/k8s.go:39: Failed to list *api.Pod: Get https://172.16.0.1:443/api/v1/pods?resourceVersion=0: x509: cannot validate certificate for 172.16.0.1 because it doesn't contain any IP SANs

Why is this hitting the pod API at the private IP when I have explicitly specified the --api-server option?

Prefetch credentials and refresh periodically to better emulate the metadata service.

As it currently stands, credentials are only requested from STS on an as-needed basis.
Unfortunately, this adds significant latency (I've seen > 100 ms) compared to the metadata service (< 1 ms).

This can cause issues for clients (like boto) that have short timeouts for accessing the metadata service when going through the credentials provider chain, causing them to think that there aren't any credentials available.

This can be solved with three components (see the sketch below):

  • Fetch credentials as soon as the pod is added
  • A goroutine per pod to refresh the credentials before they expire
  • Stop the goroutine when the pod is deleted

I've submitted a PR #30 that implements this solution, but I'm happy to discuss other solutions, or modifications to the existing PR.
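
For illustration only, a minimal sketch of that per-pod refresh loop (this is not the actual code in PR #30; fetchCreds and the 5-minute refresh margin are assumptions):

```go
// Sketch of the prefetch/refresh approach: fetch credentials when a pod is
// added, refresh them before expiry, and stop when the pod is deleted.
package prefetch

import (
	"context"
	"log"
	"time"
)

type credentials struct {
	AccessKeyID     string
	SecretAccessKey string
	Token           string
	Expiration      time.Time
}

// fetchCreds is a placeholder for assuming the pod's role via STS.
func fetchCreds(ctx context.Context, roleARN string) (credentials, error) {
	return credentials{Expiration: time.Now().Add(time.Hour)}, nil
}

// refreshLoop prefetches credentials for one pod and refreshes them shortly
// before they expire. Cancel ctx when the pod is deleted to stop the loop.
func refreshLoop(ctx context.Context, roleARN string, store func(credentials)) {
	for {
		creds, err := fetchCreds(ctx, roleARN)
		if err != nil {
			log.Printf("credential refresh failed for %s: %v", roleARN, err)
			select {
			case <-time.After(5 * time.Second): // crude retry backoff
				continue
			case <-ctx.Done():
				return
			}
		}
		store(creds) // cache so metadata requests are served without an STS round trip
		select {
		case <-time.After(time.Until(creds.Expiration) - 5*time.Minute):
		case <-ctx.Done():
			return
		}
	}
}
```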

0.3.3 prepends base arn to security-credentials

0.3.3 seems to have changed the behavior of this app: it now prepends the base-role-arn to the security-credentials responses, which prevents containers from accessing the credentials for their role.

Is 0.3.3 known to not be backwards compatible (if so, how should we adjust our configuration?) or is this a bug?

With 0.3.3

# run inside a container via kubectl exec
curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

arn:aws-us-gov:iam::ACCOUNT:role/bosh-passed/k8s-logger

curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/arn:aws-us-gov:iam::ACCOUNT:role/bosh-passed/k8s-logger

Invalid role arn:aws-us-gov:iam::ACCOUNT:role/bosh-passed/k8s-logger

On 0.3.1 this worked as expected:

# run inside a container via kubectl exec

curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/

k8s-logger

curl -s http://169.254.169.254/latest/meta-data/iam/security-credentials/k8s-logger

{"Code":"Success","LastUpdated":"2017-03-07T16:23:04Z","Type":"AWS-HMAC","AccessKeyId":"...","SecretAccessKey":"...","Expiration":"2017-03-07T16:53:04Z"}

kube2iam is started with the following options: --base-role-arn=arn:aws-us-gov:iam::ACCOUNT:role/bosh-passed/ --default-role=k8s-node --host-ip=192.0.2.229 --iptables=true. The examples above are from a container annotated with the k8s-logger role rather than the default role.
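
To make the expected behaviour concrete, a small sketch (values are taken from this report and the helper is hypothetical, not kube2iam's actual code): the security-credentials listing should strip the configured base-role-arn so that the returned name round-trips when requested, as it did on 0.3.1.

```go
// Sketch of the expected listing behaviour: return the bare role name with
// the base-role-arn stripped.
package main

import (
	"fmt"
	"strings"
)

func roleNameForListing(roleARN, baseRoleARN string) string {
	return strings.TrimPrefix(roleARN, baseRoleARN)
}

func main() {
	base := "arn:aws-us-gov:iam::ACCOUNT:role/bosh-passed/"
	full := "arn:aws-us-gov:iam::ACCOUNT:role/bosh-passed/k8s-logger"
	fmt.Println(roleNameForListing(full, base)) // prints "k8s-logger", matching 0.3.1
}
```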
