gardener-attic / kubify Goto Github PK

Terraform Template to Setup a Kubernetes Cluster on OpenStack/AWS/Azure

License: Other

Shell 13.29% HCL 86.71%

kubernetes aws azure openstack terraform bootkube kubernetes-setup cluster

kubify's Introduction

Kubify

Kubify is a Terraform based provisioning project for setting up production ready Kubernetes clusters on public and private Cloud infrastructures. Kubify currently supports:

OpenStack
AWS
Azure

Key features of Kubify are:

Kubernetes v1.10.12
Etcd v3.3.10 multi master node setup
Etcd backup and restore
Supports rolling updates

To start using or developing Kubify locally

See our documentation in the /docs repository or find the main documentation here.

Feedback and Support

Feedback and contributions are always welcome. Please report bugs or suggestions about our Kubernetes clusters as such or the Kubify itself as GitHub issues or join our Slack channel #gardener (Invite yourself to the Kubernetes Slack workspace here).

kubify's People

Contributors

Stargazers

Watchers

kubify's Issues

Look into Config Drive instead of calling Metadata Service on OpenStack

Issue by I820491
Monday Dec 04, 2017 at 09:36 GMT
Originally opened as https://git.removed/kubernetes-attic/kubify/issues/13

Since we are facing a few problems with the metadata service on OpenStack we should consider using the Config Drive to retrieving the instance meta data. Terraform has a support for that [1]. Also it looks like the kube-controller-manager can handle this as well [2].

[1] https://www.terraform.io/docs/providers/openstack/r/compute_instance_v2.html#config_drive
[2] kubernetes/kubernetes#23733

Describe or Automate Kubify Cluster Reboot

Under certain circumstances (technical issues in the infrastructure), a Kubify cluster may "lose coherence" and must be "rebootet" (bootstrapped via the bootstrap control plane). This is neither automated nor described (unlike the https://github.com/gardener/kubify#cluster-recovery, which is the last resort if everything breaks). It would be good to do either, please.

rke instead of bootkube

Lot's of good things can be read about rke, one of the more impressive is this: https://medium.com/@cfatechblog/bare-metal-k8s-clustering-at-chick-fil-a-scale-7b0607bd3541

Most interesting features (in my horizon):

OS agnostic
simple cluster config yaml

Maybe not worth it in the (slow, maybe) advent of its obsoletion:

Any thoughts?

Generate List of used Images by Kubify

Create a script to extract images used by Kubify. Ideally we generate a file out of it and keep it in the repository.

Failed to deploy kubify on OpenStack

Hi,

I create a customized dns module to use openstack designate ans dns provider, but after create openstack related variables, when running terraform plan variant the deployment still reaching out to aws ec2. I only use aws s3 for etcd backup

Here's the changes
jiangytcn@630a2b0


$ cat terraform.tfvars | grep -v '#' | grep -v '^$'                                                                    
os_user_name = "admin"
os_password = "ae226d1f8b27c60b31088"
os_auth_url = "http://172.29.236.100:5000/v3"
os_tenant_name = "demo"
os_domain_name = "default"
os_region = "RegionOne"
os_fip_pool_name = "public"
os_lbaas_provider = "haproxy"
os_az = "nova"
event_ttl = "168h0m0s"
os_vpc_cidr = "10.251.0.0/16"
cluster_name = "management"
cluster_type = "eval"
versions = {
  image_name = "coreos-1688.5.3"
}
dns = {
  domain_name = "lab.yacloud.int"
  dns_type = "designate"
  hosted_zone_id = "5ad92a47-def0-45af-8e6d-ed35f6a1fee0"
  access_key = "dummy"
  secret_key = "dummy"
}
master = {
  count = 3
  volume_size = 50
}
worker = {
  count = 3
  volume_size = 50
}
etcd_backup = {
  "access_key" = "XXXX"
  "region" = "ap-northeast-1"
  "secret_key" = "XXXX"
  "storage_type" = "s3"
}
addons = {
  "dashboard" = {
    "app_name" = "kubernetes-dashboard"
  }
  "nginx-ingress" = {
  }
}
dashboard_creds = "admin"
deploy_tiller = false
oidc_issuer_subdomain = "identity.ingress"
oidc_client_id = "kube-kubectl"
oidc_username_claim = "email"
oidc_groups_claim = "groups"
subnet_cidr = "10.251.128.0/17"
service_cidr = "10.241.0.0/17"
pod_cidr = "10.241.128.0/17"
selfhosted_etcd = "false"

Failures in terraform

module.instance.null_resource.master_setup - *terraform.NodePlannableResourceInstance
2019/01/17 05:12:52 [TRACE] Graph after step *terraform.RootTransformer:

module.instance.null_resource.master_setup - *terraform.NodePlannableResourceInstance
2019/01/17 05:12:52 [DEBUG] Resource state not found for "module.instance.local_file.reset_bootkube": module.instance.local_file.reset_bootkube
2019/01/17 05:12:52 [DEBUG] ReferenceTransformer: "module.instance.local_file.reset_bootkube" references: []
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:52 [DEBUG] [aws-sdk-go] DEBUG: Response sts/GetCallerIdentity Details:
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: ---[ RESPONSE ]--------------------------------------
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: HTTP/1.1 403 Forbidden
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Connection: close
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Length: 306
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Type: text/xml
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Date: Thu, 17 Jan 2019 05:12:51 GMT
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: X-Amzn-Requestid: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac
2019-01-17T05:12:52.112Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: -----------------------------------------------------
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:52 [DEBUG] [aws-sdk-go] <ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   <Error>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <Type>Sender</Type>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <Code>InvalidClientTokenId</Code>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <Message>The security token included in the request is invalid.</Message>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   </Error>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   <RequestId>8a1734aa-1a16-11e9-9b9a-eba3b07524ac</RequestId>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: </ErrorResponse>
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:52 [DEBUG] [aws-sdk-go] DEBUG: Validate Response sts/GetCallerIdentity failed, not retrying, error InvalidClientTokenId: The security token included in the request is invalid.
2019-01-17T05:12:52.113Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 	status code: 403, request id: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac
2019/01/17 05:12:52 [ERROR] root: eval: *terraform.EvalConfigProvider, err: error validating provider credentials: error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac
2019/01/17 05:12:52 [ERROR] root: eval: *terraform.EvalSequence, err: error validating provider credentials: error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac
2019/01/17 05:12:52 [ERROR] root: eval: *terraform.EvalOpFilter, err: error validating provider credentials: error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac
2019/01/17 05:12:52 [ERROR] root: eval: *terraform.EvalSequence, err: error validating provider credentials: error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac
2019/01/17 05:12:52 [TRACE] [walkPlan] Exiting eval tree: provider.aws.route53
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:52 [DEBUG] [aws-sdk-go] DEBUG: Response sts/GetCallerIdentity Details:
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: ---[ RESPONSE ]--------------------------------------
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: HTTP/1.1 200 OK
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Connection: close
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Length: 406
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Type: text/xml
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Date: Thu, 17 Jan 2019 05:12:52 GMT
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: X-Amzn-Requestid: 8a5e768b-1a16-11e9-9b72-73f4bff84a99
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: -----------------------------------------------------
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:52 [DEBUG] [aws-sdk-go] <GetCallerIdentityResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   <GetCallerIdentityResult>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <Arn>arn:aws:iam::xxxxxxx:user/jiangytcn</Arn>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <UserId>xxxx</UserId>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <Account>xxxx</Account>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   </GetCallerIdentityResult>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   <ResponseMetadata>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <RequestId>8a5e768b-1a16-11e9-9b72-73f4bff84a99</RequestId>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:   </ResponseMetadata>
2019-01-17T05:12:52.582Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: </GetCallerIdentityResponse>
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:52 [DEBUG] [aws-sdk-go] DEBUG: Request ec2/DescribeAccountAttributes Details:
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: ---[ REQUEST POST-SIGN ]-----------------------------
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: POST / HTTP/1.1
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Host: ec2.ap-northeast-1.amazonaws.com
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: User-Agent: aws-sdk-go/1.16.16 (go1.11.4; linux; amd64) APN/1.0 HashiCorp/1.0 Terraform/0.11.9-beta1
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Length: 87
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Authorization: AWS4-HMAC-SHA256 Credential=xxxxxxx/20190117/ap-northeast-1/ec2/aws4_request, SignedHeaders=content-length;content-type;host;x-amz-date, Signature=e6cfb85b85654a8ccfced1d8d64963e38cd4aafd0e0298fb37c0dcabea1deb43
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Type: application/x-www-form-urlencoded; charset=utf-8
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: X-Amz-Date: 20190117T051252Z
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Accept-Encoding: gzip
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Action=DescribeAccountAttributes&AttributeName.1=supported-platforms&Version=2016-11-15
2019-01-17T05:12:52.583Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: -----------------------------------------------------
2019-01-17T05:12:53.115Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:53 [DEBUG] [aws-sdk-go] DEBUG: Response ec2/DescribeAccountAttributes Details:
2019-01-17T05:12:53.115Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: ---[ RESPONSE ]--------------------------------------
2019-01-17T05:12:53.115Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: HTTP/1.1 200 OK
2019-01-17T05:12:53.115Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Connection: close
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Length: 540
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Content-Type: text/xml;charset=UTF-8
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Date: Thu, 17 Jan 2019 05:12:52 GMT
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: Server: AmazonEC2
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: -----------------------------------------------------
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:53 [DEBUG] [aws-sdk-go] <?xml version="1.0" encoding="UTF-8"?>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: <DescribeAccountAttributesResponse xmlns="http://ec2.amazonaws.com/doc/2016-11-15/">
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <requestId>d408d2ed-58a9-43cf-a9e4-3a7a27d4205f</requestId>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     <accountAttributeSet>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:         <item>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:             <attributeName>supported-platforms</attributeName>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:             <attributeValueSet>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:                 <item>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:                     <attributeValue>VPC</attributeValue>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:                 </item>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:             </attributeValueSet>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:         </item>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4:     </accountAttributeSet>
2019-01-17T05:12:53.116Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: </DescribeAccountAttributesResponse>
2019/01/17 05:12:53 [DEBUG] Resource state not found for "module.instance.module.seed.aws_s3_bucket.s3_etcd_backup": module.instance.module.seed.aws_s3_bucket.s3_etcd_backup
2019/01/17 05:12:53 [TRACE] Graph after step *terraform.AttachStateTransformer:

module.instance.module.seed.aws_s3_bucket.s3_etcd_backup - *terraform.NodePlannableResourceInstance
2019/01/17 05:12:53 [DEBUG] ReferenceTransformer: "module.instance.module.seed.aws_s3_bucket.s3_etcd_backup" references: []
2019/01/17 05:12:53 [DEBUG] plugin: waiting for all plugin processes to complete...

Error: Error running plan: 1 error(s) occurred:

* provider.aws.route53: error validating provider credentials: error calling sts:GetCallerIdentity: InvalidClientTokenId: The security token included in the request is invalid.
	status code: 403, request id: 8a1734aa-1a16-11e9-9b9a-eba3b07524ac


2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform-provider-aws_v1.55.0_x4: 2019/01/17 05:12:53 [ERR] plugin: plugin server: accept unix /tmp/plugin363027272: use of closed network connection
2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform-provider-tls_v1.2.0_x4: 2019/01/17 05:12:53 [ERR] plugin: plugin server: accept unix /tmp/plugin088484845: use of closed network connection
2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform: local-exec-provisioner (internal) 2019/01/17 05:12:53 [DEBUG] plugin: waiting for all plugin processes to complete...
2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform-provider-openstack_v1.13.0_x4: 2019/01/17 05:12:53 [ERR] plugin: plugin server: accept unix /tmp/plugin188386594: use of closed network connection
2019-01-17T05:12:53.134Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-local_v1.1.0_x4
2019-01-17T05:12:53.134Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-archive_v1.1.0_x4
2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform-provider-template_v2.0.0_x4: 2019/01/17 05:12:53 [ERR] plugin: plugin server: accept unix /tmp/plugin938372641: use of closed network connection
2019-01-17T05:12:53.135Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-openstack_v1.13.0_x4
2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform: file-provisioner (internal) 2019/01/17 05:12:53 [DEBUG] plugin: waiting for all plugin processes to complete...
2019-01-17T05:12:53.135Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-aws_v1.55.0_x4
2019-01-17T05:12:53.134Z [DEBUG] plugin: plugin process exited: path=/usr/local/bin/terraform
2019-01-17T05:12:53.134Z [DEBUG] plugin.terraform-provider-random_v2.0.0_x4: 2019/01/17 05:12:53 [ERR] plugin: plugin server: accept unix /tmp/plugin446398090: use of closed network connection
2019-01-17T05:12:53.134Z [DEBUG] plugin: plugin process exited: path=/usr/local/bin/terraform
2019-01-17T05:12:53.135Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-random_v2.0.0_x4
2019-01-17T05:12:53.134Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-template_v2.0.0_x4
2019-01-17T05:12:53.135Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-null_v1.0.0_x4
2019-01-17T05:12:53.136Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2019/01/17 05:12:53 [DEBUG] plugin: waiting for all plugin processes to complete...
2019-01-17T05:12:53.136Z [DEBUG] plugin: plugin process exited: path=/landscape/.terraform/plugins/linux_amd64/terraform-provider-tls_v1.2.0_x4
2019-01-17T05:12:53.139Z [DEBUG] plugin: plugin process exited: path=/usr/local/bin/terraform

Btw, is it possible to disable etcd backup to s3 ?

Generated README.md

Just saw https://git.removed/kubernetes/cluster-dev-infra-ci/blob/master/README.md. Can we please generate the README.md automatically? That would be awesome. ;-)

Things that would be handy are:

kubeconfig (not so much to us, but to the community)
Dashboard URL and credentials
Grafana URL and credentials
Kibana URL and credentials
Summary of the configuration (optional)

Add RolloverIndex to ElasticSearch Deployment

Issue by I820491
Tuesday Nov 14, 2017 at 12:03 GMT
Originally opened as https://git.removed/kubernetes-attic/kubify/issues/9

Otherwise the deployment will run out of disk space eventually.
https://www.elastic.co/guide/en/elasticsearch/reference/master/indices-rollover-index.html

Add Monitoring Stack Addon to Setup

Issue by I820491
Tuesday Nov 14, 2017 at 14:00 GMT
Originally opened as https://git.removed/kubernetes-attic/kubify/issues/11

Add the support for:

Heapster
Prometheus
Grafana
Plus the corresponding configuration to the cluster. Ideally with some meaningful Grafana dashboards and Prometheus alerts.

Upgrade Kubify to Support 1.11.x Cluster

Add node-exporter to monitoring addon

In order to get better insides into the host os metrics we should integrate the node exporter [1] into our monitoring stack.

[1] https://github.com/prometheus/node_exporter

Documentation issue on page "https://github.com/gardener/kubify/blob/master/README.md"

please add the kubify logo on top of this page. good example is https://github.com/gardener/gardenctl

Page: https://github.com/gardener/kubify/blob/master/README.md
don't remove the link to the page. It's required for the processor

Provide Example for Dex Configuration

Currently we only have documentation for the properties needed for configuring dex. What would be great is that we have an example configuration for saml and github which go into the .tfvars file.

Different variants depend on aws route53

Hi,

We are currently trying to use kubify with azure. Looking into the code, cluster.tf shows it's hard depending on route53. People using different variants may want to skip dns configuration, or use their own variant's dns offering like Azure DNS zones.

Tag elastic IPs

Feature Request

Currently elastic IPs are not tagged, which makes it hard to find out which ones belong to a specific Kubify instance. This would for example be helpful in case someone wants/needs to clean up a cluster manually.

Apparently, this also applies to network interfaces as well, and maybe other resources too (didn't check yet).

Proposal: tag each resource created by Kubify with something that makes it possible to identify the Kubify instance it belongs to. Maybe the cluster domain could be used for this.

Migrate to kube-metrics-server

Remove heapster from the current deployment and enable the kube-metrics-server. Currently the horizontal pod autoscaler is broken.

Abnormality observed during Updation of Workers in the kubify cluster

I have setup up the cluster using Kubify. Initially, the worker type of the VM's were m4.large y default and the Master nodes were 3, Worker nodes were 10.
We want the vm types to be m4.4xlarge. I updated the file- https://github.com/gardener/kubify/blob/master/modules/vms/versions.tf with the vm type as m4.4xlarge and increased the worker nodes to 12.
Executed terraform init variant, terraform plan variant and terraform apply varaint, After that checked, 2 nodes of master and 2 nodes of worker were created with m4.4xlarge and the remaining 1 master node and 10 nodes were still of the type m4.large.

~ module.instance.module.worker.module.vms.aws_instance.nodes[4]
      instance_type:                             "m4.large" => "m4.4xlarge"

  + module.instance.module.worker.module.vms.aws_instance.nodes[6]
      id:                                        <computed>
      ami:                                       "ami-b5742acf"
      arn:                                       <computed>
      associate_public_ip_address:               <computed>
      availability_zone:                         "us-east-1b"
      cpu_core_count:                            <computed>
      cpu_threads_per_core:                      <computed>
      disable_api_termination:                   "false"
      ebs_block_device.#:                        <computed>
      ephemeral_block_device.#:                  <computed>
      get_password_data:                         "false"
      host_id:                                   <computed>
      iam_instance_profile:                      "perfgardener-eval-worker"
      instance_state:                            <computed>
      instance_type:                             "m4.large"
      ipv6_address_count:                        <computed>
      ipv6_addresses.#:                          <computed>
      key_name:                                  "perfgardener-eval"
      network_interface.#:                       <computed>
      network_interface_id:                      <computed>
      password_data:                             <computed>
      placement_group:                           <computed>

Ideally, all the master and worker nodes should of type m4.4xlarge as per the configuration. But thats not happening. Kindly check. Thanks !

Generate more helpful context name in kubeconfig

Hi,

I have a command prompt that shows the currently targeted K8s cluster context, but the kubify clusters show only as USER@CLUSTER_NAME where the user is always admin anyways and the CLUSTER_NAME is not the full cluster name, but only a part of it (e.g. garden) and the second important half of it is missing (e.g. dev or staging).

Just from where I stand (my PoV only here), a context name like CLUSTER_NAME-CLUSTER_TYPE would be more helpful (as we use kubify right now). Then I would see garden-dev or garden-staginginstead ofadmin@gardenandadmin@garden`.

Speaking of, why is there the cluster_type at all? I see it only referenced as part of the domain, which is maybe anyhow somewhat confusing, isn't it? So maybe it's also an option to merge it into the name and drop it altogether.

Then the remaining question is whether we can drop the user from the context name, because it's hardcoded anyways, which explains why it isn't helping me in the command prompt (that I naturally try to keep small).

Thanks for considering the above, Vedran

kube-apiserver bootstrap is unstable

landscape-setup-template users frequently hit an error during cluster setup or end up with an unhealthy cluster where only 2 out of 3 kube-apiserver pods are running. Currently, we know of the following symptoms:

Cluster setup fails early due to etcd operator errors (#48):

E0612 12:16:54.377420       1 leaderelection.go:224] error retrieving resource lock kube-system/etcd-operator: Get https://10.241.0.1:443/api/v1/namespaces/kube-system/endpoints/etcd-operator: dial tcp 10.241.0.1:443: getsockopt: connection refused

Cluster is unhealty due to kube-controller-manager continuously throwing errors (pod stays running though):

E0608 08:53:38.676846       1 leaderelection.go:224] error retrieving resource lock kube-system/kube-controller-manager: Get https://10.241.0.1:443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager: dial tcp 10.241.0.1:443: getsockopt: connection refused

The fact that the error (dial tcp 10.241.0.1:443: getsockopt: connection refused) is encountered for all requests looks like a routing error at first: 2 out of 3 apiserver instances are running and reachable after all and we expect the requests to the service IP to be distributed among the set of available pods (i.e. shouldn't 2 out of 3 requests succeed?).

This is most likely due to the (default) sessionAffinity setting for the default/kubernetes service:

  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800

When a request from a source IP was routed to a KUBE-SEP once, it will be routed there for the next 3 hours (10800 seconds). E.g. if the leading controller-manager pod happens to be routed to the faulty node (w/o kube-apiserver running), all requests will end up there until the timeout is reached. The iptables rules for that look like:

-A KUBE-SEP-2KS7SNAGO5W6YUEJ -s 10.251.142.41/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-2KS7SNAGO5W6YUEJ -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-2KS7SNAGO5W6YUEJ --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.251.142.41:443
-A KUBE-SEP-Y44MF4JTQLZ5QDG7 -s 10.251.185.148/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-Y44MF4JTQLZ5QDG7 -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-Y44MF4JTQLZ5QDG7 --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.251.185.148:443
-A KUBE-SEP-YPQKGMLT4E2MSUWB -s 10.251.188.234/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-YPQKGMLT4E2MSUWB -p tcp -m comment --comment "default/kubernetes:https" -m recent --set --name KUBE-SEP-YPQKGMLT4E2MSUWB --mask 255.255.255.255 --rsource -m tcp -j DNAT --to-destination 10.251.188.234:443
-A KUBE-SERVICES ! -s 10.241.0.0/17 -d 10.241.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.241.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-2KS7SNAGO5W6YUEJ --mask 255.255.255.255 --rsource -j KUBE-SEP-2KS7SNAGO5W6YUEJ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-Y44MF4JTQLZ5QDG7 --mask 255.255.255.255 --rsource -j KUBE-SEP-Y44MF4JTQLZ5QDG7
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m recent --rcheck --seconds 10800 --reap --name KUBE-SEP-YPQKGMLT4E2MSUWB --mask 255.255.255.255 --rsource -j KUBE-SEP-YPQKGMLT4E2MSUWB
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m statistic --mode random --probability 0.33332999982 -j KUBE-SEP-2KS7SNAGO5W6YUEJ
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-Y44MF4JTQLZ5QDG7
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-YPQKGMLT4E2MSUWB

By removing the sessionAffinity setting from the default/kubernetes service (e.g. with kubectl edit svc kubernetes), the problem can be fixed for symptom 2 (as described above): controller-manager will eventually hit a healthy apiserver instance and be able go on with its tasks. The missing kube-apiserver pod will be rescheduled shortly after.

Noteworthy is that on the faulty master node where kube-apiserver is not running, the checkpoint is also missing (otherwise the pod should be running again shortly after it stopped), find /etc/kubernetes/ -iname '*api*' is empty. The checkpointer logs shows the following:

[...]
I0614 07:59:20.397358       1 main.go:225] API GC: skipping inactive checkpoint kube-system/kube-apiserver-tdmfk
I0614 07:59:23.410633       1 main.go:394] Checkpoint manifest for "kube-system/kube-apiserver-tdmfk" already exists. Skipping                                                                                                         
I0614 07:59:23.421755       1 main.go:225] API GC: skipping inactive checkpoint kube-system/kube-apiserver-tdmfk
I0614 07:59:26.433228       1 main.go:394] Checkpoint manifest for "kube-system/kube-apiserver-tdmfk" already exists. Skipping                                                                                                         
I0614 07:59:31.270644       1 main.go:225] API GC: skipping inactive checkpoint kube-system/kube-apiserver-tdmfk
I0614 07:59:31.270759       1 main.go:283] Should start checkpoint kube-system/kube-apiserver-tdmfk
I0614 07:59:31.270949       1 main.go:397] Writing manifest for "kube-system/kube-apiserver-tdmfk" to "/etc/kubernetes/manifests/kube-system-kube-apiserver-tdmfk.json"                                                                
I0614 07:59:34.364611       1 main.go:234] API GC: should remove inactive checkpoint kube-system/kube-apiserver-tdmfk                                                                                                                  
I0614 07:59:34.364673       1 main.go:250] API GC: should remove active checkpoint kube-system/kube-apiserver-tdmfk
I0614 07:59:34.364778       1 main.go:735] Removing checkpoint of: kube-system/kube-apiserver-tdmfk

Current status:

I don't know yet why this happens, but the root cause seems to be a problem during kube-apiserver bootstrapping. I'll add more info as I find it.

Any ideas? :)

Use Calico as Default Network Provider for Kubify Clusters

Story

As a seed cluster operator I want to protect my resources using network policies. The current implementation is based on flannel which does not support network policies.

Motivation

Kubify cluster are used in our landscapes as seed clusters for OpenStack. We need to be able to also provide #266 to those clusters. As we are quite familiar with Calico it should be used as the network provider.

Questions

I am not sure whether it should be possible to migrate existing clusters from flannel to calico. I do believe this is not a necessary feature.

Definition of Done

New clusters created by Kubify use Calico as their default network implementation

Release Notes

- Kubify now uses Calico as its default network implementation

Add Logging Stack to Addons

The initial version of the logging stack is working now [1].

To enable it, you need to add the following to your terraform.tfvars project configuration:

# addons
addons = {
  "dashboard" = {}
  "heapster" = {}
  "nginx" = {}
  "fluentd-elasticsearch" = {}
}

Maybe we should rename the addon to "logging" instead of using "fluentd-elasticsearch".

[1] https://github.com/gardener/kubify/tree/master/modules/seed/templates/addons/fluentd-elasticsearch/manifests

Add ES Curator CronJob to enforce Retention of Logs

To prevent elastic search to run out of disk, we need to run a curator [1] job periodically to enforce that.

[1] https://github.com/elastic/curator/tree/master/examples/actions

Use official Gardener Helm Charts

We should get rid of the gardener manifest generation in Kubify and use instead the official helm chart for that [1]. Otherwise we end up duplicating a lot of work on our side.

[1] https://github.com/gardener/gardener/tree/master/charts/gardener

Prepare open sourcing kubify

Issue by D029788
Thursday Dec 14, 2017 at 16:59 GMT
Originally opened as https://git.removed/kubernetes-attic/kubify/pull/15

D029788 included the following code: https://git.removed/kubernetes-attic/kubify/pull/15/commits

kube-controller-manager does only have 2 replicas

kube-apiserver and kube-scheduler are deployed with 3 replicas. Why?

deploy_kubify.sh fails to deploy etcd

I followed the instructions in landscape-setup-template and deploy_kubify.sh fails to deploy etcd:

./deploy_kubify.sh
...
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
./wait_for_cluster.sh: line 103: fail: command not found

More logs:

Waiting for the cluster ...
Give the cluster some time to come up ... (waiting for 1 minute)
Waiting time is up. Now trying to reach the cluster ...
Cluster not yet reachable. Waiting ...
Cluster not yet reachable. Waiting ...
Cluster reachable. Waiting for all pods to be running ...
No resources found.
Amount of api server pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of api server pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of api server pods equals specified amount of master nodes: 3
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Unable to connect to the server: unexpected EOF
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Unable to connect to the server: unexpected EOF
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Amount of etcd pods (1) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Unable to connect to the server: unexpected EOF
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Unable to connect to the server: unexpected EOF
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
Unable to connect to the server: unexpected EOF
Amount of etcd pods (0) doesn't equal specified amount of master nodes (3) yet. Waiting ...
./wait_for_cluster.sh: line 103: fail: command not found

I got more logs from the master 0:

journalctl -u bootkube.service
-- Logs begin at Tue 2018-06-12 12:00:02 UTC, end at Tue 2018-06-12 12:25:26 UTC. --
Jun 12 12:01:36 ip-10-251-161-198.eu-west-1.compute.internal systemd[1]: Starting Bootstrap a Kubernetes cluster...
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: pubkey: prefix: "quay.io/coreos/bootkube"
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: key: "https://quay.io/aci-signing-key"
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: gpg key fingerprint is: BFF3 13CD AA56 0B16 A898  7B8F 72AB F5F6 799D 33BC
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]:         Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Trusting "https://quay.io/aci-signing-key" for prefix "quay.io/coreos/bootkube" without fingerprint review.
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Added key for prefix "quay.io/coreos/bootkube" at "/etc/rkt/trustedkeys/prefix.d/quay.io/coreos/bootkube/bff313cdaa560b16a8987b8f72abf5f6799d33bc"
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Downloading signature:  0 B/473 B
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Downloading signature:  473 B/473 B
Jun 12 12:01:59 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Downloading signature:  473 B/473 B
Jun 12 12:02:00 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Downloading ACI:  0 B/21.2 MB
Jun 12 12:02:00 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Downloading ACI:  32.8 KB/21.2 MB
Jun 12 12:02:01 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: Downloading ACI:  21.2 MB/21.2 MB
Jun 12 12:02:01 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: image: signature verified:
Jun 12 12:02:01 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]:   Quay.io ACI Converter (ACI conversion signing key) <[email protected]>
Jun 12 12:02:41 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  160.119241] bootkube[6]: Starting temporary bootstrap control plane...
Jun 12 12:02:41 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  160.121004] bootkube[6]: Waiting for api-server...
Jun 12 12:02:46 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  165.164586] bootkube[6]: W0612 12:02:46.452714       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:02:51 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  170.124268] bootkube[6]: W0612 12:02:51.412428       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:02:56 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  175.124552] bootkube[6]: W0612 12:02:56.412711       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:01 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  180.124785] bootkube[6]: W0612 12:03:01.412949       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:06 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  185.124346] bootkube[6]: W0612 12:03:06.412508       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:11 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  190.124738] bootkube[6]: W0612 12:03:11.412890       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:16 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  195.124978] bootkube[6]: W0612 12:03:16.413130       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:21 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  200.125271] bootkube[6]: W0612 12:03:21.413425       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:26 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  205.124757] bootkube[6]: W0612 12:03:26.412909       6 create.go:40] Unable to determine api-server readiness: API Server http status: 0
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.149248] bootkube[6]: Creating self-hosted assets...
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.318956] bootkube[6]: I0612 12:03:32.607113       6 log.go:19] secret "etcd-backup" created
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.319340] bootkube[6]:         created             etcd-backup secret
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.384234] bootkube[6]: I0612 12:03:32.671691       6 log.go:19] secret "etcd-client-tls" created
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.384568] bootkube[6]:         created         etcd-client-tls secret
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.546796] bootkube[6]: I0612 12:03:32.834899       6 log.go:19] deployment "etcd-operator" created
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.547310] bootkube[6]:         created           etcd-operator deployment
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.606547] bootkube[6]: I0612 12:03:32.894682       6 log.go:19] secret "etcd-peer-tls" created
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.606950] bootkube[6]:         created           etcd-peer-tls secret
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.672789] bootkube[6]: I0612 12:03:32.960947       6 log.go:19] secret "etcd-server-tls" created
Jun 12 12:03:32 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.673492] bootkube[6]:         created         etcd-server-tls secret
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.740468] bootkube[6]: I0612 12:03:33.028610       6 log.go:19] service "etcd-service" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.740834] bootkube[6]:         created            etcd-service service
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.764242] bootkube[6]: I0612 12:03:33.052009       6 log.go:19] storageclass "etcd" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.764613] bootkube[6]:         created                    etcd storageclasse
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.829026] bootkube[6]: I0612 12:03:33.117196       6 log.go:19] secret "kube-apiserver" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.829469] bootkube[6]:         created          kube-apiserver secret
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.854681] bootkube[6]: I0612 12:03:33.142854       6 log.go:19] daemonset "kube-apiserver" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.855402] bootkube[6]:         created          kube-apiserver daemonset
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.876549] bootkube[6]: I0612 12:03:33.164723       6 log.go:19] poddisruptionbudget "kube-controller-manager" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.877153] bootkube[6]:         created kube-controller-manager poddisruptionbudget
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.934326] bootkube[6]: I0612 12:03:33.222481       6 log.go:19] secret "kube-controller-manager" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.935350] bootkube[6]:         created kube-controller-manager secret
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.943334] bootkube[6]: I0612 12:03:33.231522       6 log.go:19] deployment "kube-controller-manager" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.943878] bootkube[6]:         created kube-controller-manager deployment
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.955722] bootkube[6]: I0612 12:03:33.243901       6 log.go:19] daemonset "kube-dns" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  211.956450] bootkube[6]:         created                kube-dns daemonset
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.024766] bootkube[6]: I0612 12:03:33.312896       6 log.go:19] service "kube-dns" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.025296] bootkube[6]:         created                kube-dns service
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.032971] bootkube[6]: I0612 12:03:33.321141       6 log.go:19] daemonset "kube-etcd-network-checkpointer" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.033680] bootkube[6]:         created kube-etcd-network-checkpointer daemonset
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.091783] bootkube[6]: I0612 12:03:33.379870       6 log.go:19] configmap "kube-flannel-cfg" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.092270] bootkube[6]:         created        kube-flannel-cfg configmap
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.101143] bootkube[6]: I0612 12:03:33.389256       6 log.go:19] daemonset "kube-flannel" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.101568] bootkube[6]:         created            kube-flannel daemonset
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.108735] bootkube[6]: I0612 12:03:33.396906       6 log.go:19] daemonset "kube-proxy" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.109191] bootkube[6]:         created              kube-proxy daemonset
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.119031] bootkube[6]: I0612 12:03:33.407205       6 log.go:19] poddisruptionbudget "kube-scheduler" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.119600] bootkube[6]:         created          kube-scheduler poddisruptionbudget
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.126492] bootkube[6]: I0612 12:03:33.414672       6 log.go:19] deployment "kube-scheduler" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.127129] bootkube[6]:         created          kube-scheduler deployment
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.136583] bootkube[6]: I0612 12:03:33.424734       6 log.go:19] storageclass "aws-ebs" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.137046] bootkube[6]:         created                 aws-ebs storageclasse
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.149423] bootkube[6]: I0612 12:03:33.437600       6 log.go:19] clusterrolebinding "system:default-sa" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.150146] bootkube[6]:         created       system:default-sa clusterrolebinding
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.218409] bootkube[6]: I0612 12:03:33.501847       6 log.go:19] serviceaccount "kubernetes-kube2iam" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.219525] bootkube[6]:         created     kubernetes-kube2iam serviceaccount
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.247072] bootkube[6]: I0612 12:03:33.535235       6 log.go:19] clusterrole "kubernetes-kube2iam" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.247495] bootkube[6]:         created     kubernetes-kube2iam clusterrole
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.269345] bootkube[6]: I0612 12:03:33.557509       6 log.go:19] clusterrolebinding "kubernetes-kube2iam" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.270100] bootkube[6]:         created     kubernetes-kube2iam clusterrolebinding
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.332728] bootkube[6]: I0612 12:03:33.620851       6 log.go:19] daemonset "kube2iam" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.333179] bootkube[6]:         created                kube2iam daemonset
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.340493] bootkube[6]: I0612 12:03:33.628677       6 log.go:19] daemonset "pod-checkpointer" created
Jun 12 12:03:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  212.340770] bootkube[6]:         created        pod-checkpointer daemonset
Jun 12 12:03:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  227.342178] bootkube[6]:         Pod Status:        pod-checkpointer        Pending
Jun 12 12:03:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  227.342579] bootkube[6]:         Pod Status:          kube-apiserver        Pending
Jun 12 12:03:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  227.342895] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:03:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  227.343146] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:03:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  227.343408] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:03:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  227.343603] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:04:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  252.342294] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:04:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  252.342978] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:04:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  252.343474] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:04:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  252.343860] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:04:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  252.344289] bootkube[6]:         Pod Status:          kube-apiserver        Pending
Jun 12 12:04:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  252.344663] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:04:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  262.342236] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:04:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  262.342712] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:04:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  262.343001] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:04:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  262.343253] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:04:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  262.343462] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:04:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  262.343662] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:04:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  267.342244] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:04:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  267.342687] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:04:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  267.343415] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:04:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  267.344278] bootkube[6]:         Pod Status:        pod-checkpointer        Pending
Jun 12 12:04:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  267.344571] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:04:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  267.345179] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:04:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  277.342252] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:04:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  277.342827] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:04:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  277.343282] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:04:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  277.343706] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:04:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  277.344431] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:04:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  277.345091] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:04:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  287.342237] bootkube[6]:         Pod Status:                kube-dns        Running
Jun 12 12:04:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  287.342662] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:04:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  287.342868] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:04:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  287.343108] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:04:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  287.343321] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:04:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  287.343525] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:04:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  292.342228] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:04:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  292.342611] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:04:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  292.342872] bootkube[6]:         Pod Status:          kube-scheduler        Pending
Jun 12 12:04:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  292.343121] bootkube[6]:         Pod Status: kube-controller-manager        Pending
Jun 12 12:04:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  292.343327] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:04:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  292.343531] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:05:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  302.342291] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:05:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  302.342759] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:05:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  302.343075] bootkube[6]:         Pod Status:          kube-scheduler        Running
Jun 12 12:05:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  302.343360] bootkube[6]:         Pod Status: kube-controller-manager        Running
Jun 12 12:05:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  302.343654] bootkube[6]:         Pod Status:           etcd-operator        Pending
Jun 12 12:05:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  302.343958] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:05:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  307.342248] bootkube[6]:         Pod Status: kube-controller-manager        Running
Jun 12 12:05:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  307.342687] bootkube[6]:         Pod Status:           etcd-operator        Running
Jun 12 12:05:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  307.342914] bootkube[6]:         Pod Status:                kube-dns        Pending
Jun 12 12:05:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  307.343169] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:05:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  307.343415] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:05:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  307.343623] bootkube[6]:         Pod Status:          kube-scheduler        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.342232] bootkube[6]:         Pod Status:                kube-dns        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.343016] bootkube[6]:         Pod Status:        pod-checkpointer        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.344046] bootkube[6]:         Pod Status:          kube-apiserver        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.344909] bootkube[6]:         Pod Status:          kube-scheduler        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.345271] bootkube[6]:         Pod Status: kube-controller-manager        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.345608] bootkube[6]:         Pod Status:           etcd-operator        Running
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.345946] bootkube[6]: All self-hosted control plane components successfully started
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.346305] bootkube[6]: Migrating to self-hosted etcd cluster...
Jun 12 12:05:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  312.346684] bootkube[6]: I0612 12:05:13.631325       6 log.go:19] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
Jun 12 12:05:18 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  317.347400] bootkube[6]: I0612 12:05:18.635559       6 migrate.go:65] created etcd cluster CRD
Jun 12 12:05:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  322.398350] bootkube[6]: I0612 12:05:23.686510       6 migrate.go:76] etcd-service IP is 10.241.0.15
Jun 12 12:05:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  322.403314] bootkube[6]: I0612 12:05:23.691482       6 migrate.go:81] created etcd cluster for migration
Jun 12 12:05:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  337.406958] bootkube[6]: I0612 12:05:38.695069       6 migrate.go:86] etcd cluster for migration is now running
Jun 12 12:05:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  347.408000] bootkube[6]: E0612 12:05:48.696154       6 migrate.go:201] failed to create etcd client, will retry: grpc: timed out when dialing
Jun 12 12:05:58 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  357.407460] bootkube[6]: E0612 12:05:58.695613       6 migrate.go:201] failed to create etcd client, will retry: grpc: timed out when dialing
Jun 12 12:06:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  367.407280] bootkube[6]: E0612 12:06:08.695433       6 migrate.go:201] failed to create etcd client, will retry: grpc: timed out when dialing
Jun 12 12:06:18 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  377.407370] bootkube[6]: E0612 12:06:18.695520       6 migrate.go:201] failed to create etcd client, will retry: grpc: timed out when dialing
Jun 12 12:06:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  387.407399] bootkube[6]: E0612 12:06:28.695554       6 migrate.go:201] failed to create etcd client, will retry: grpc: timed out when dialing
Jun 12 12:06:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  397.407297] bootkube[6]: E0612 12:06:38.695449       6 migrate.go:201] failed to create etcd client, will retry: grpc: timed out when dialing
Jun 12 12:06:44 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  402.793096] bootkube[6]: I0612 12:06:44.081254       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:06:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  407.418350] bootkube[6]: I0612 12:06:48.706491       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:06:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  412.422006] bootkube[6]: I0612 12:06:53.710169       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:06:58 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  417.422114] bootkube[6]: I0612 12:06:58.710272       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  422.421303] bootkube[6]: I0612 12:07:03.709467       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  427.421780] bootkube[6]: I0612 12:07:08.709945       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  432.420267] bootkube[6]: I0612 12:07:13.708350       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:18 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  437.421193] bootkube[6]: I0612 12:07:18.709354       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  442.427034] bootkube[6]: I0612 12:07:23.715199       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  447.425255] bootkube[6]: I0612 12:07:28.713306       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  452.421869] bootkube[6]: I0612 12:07:33.710030       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  457.418666] bootkube[6]: I0612 12:07:38.706830       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  462.418576] bootkube[6]: I0612 12:07:43.706645       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  467.419865] bootkube[6]: I0612 12:07:48.708001       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  472.418443] bootkube[6]: I0612 12:07:53.706595       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:07:58 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  477.419238] bootkube[6]: I0612 12:07:58.707408       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  482.419323] bootkube[6]: I0612 12:08:03.707417       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  487.421043] bootkube[6]: I0612 12:08:08.709199       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  492.425894] bootkube[6]: I0612 12:08:13.714054       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:18 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  497.420443] bootkube[6]: I0612 12:08:18.708567       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  502.420021] bootkube[6]: I0612 12:08:23.708095       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  507.420062] bootkube[6]: I0612 12:08:28.708160       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  512.419445] bootkube[6]: I0612 12:08:33.707571       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  517.418333] bootkube[6]: I0612 12:08:38.706493       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  522.418377] bootkube[6]: I0612 12:08:43.706460       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  527.419674] bootkube[6]: I0612 12:08:48.707834       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  532.418884] bootkube[6]: I0612 12:08:53.707016       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:08:58 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  537.418283] bootkube[6]: I0612 12:08:58.706443       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  542.419955] bootkube[6]: I0612 12:09:03.708091       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  547.418278] bootkube[6]: I0612 12:09:08.706412       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  552.418333] bootkube[6]: I0612 12:09:13.706493       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:18 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  557.419737] bootkube[6]: I0612 12:09:18.707893       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  562.419283] bootkube[6]: I0612 12:09:23.707441       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  567.418357] bootkube[6]: I0612 12:09:28.706441       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  572.419225] bootkube[6]: I0612 12:09:33.707367       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  577.418349] bootkube[6]: I0612 12:09:38.706514       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  582.421014] bootkube[6]: I0612 12:09:43.709171       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:48 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  587.419085] bootkube[6]: I0612 12:09:48.707250       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:53 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  592.418540] bootkube[6]: I0612 12:09:53.706686       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:09:58 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  597.418486] bootkube[6]: I0612 12:09:58.706638       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:03 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  602.418789] bootkube[6]: I0612 12:10:03.706943       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:08 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  607.418724] bootkube[6]: I0612 12:10:08.706890       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:13 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  612.418973] bootkube[6]: I0612 12:10:13.706994       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:18 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  617.418589] bootkube[6]: I0612 12:10:18.706711       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:23 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  622.419301] bootkube[6]: I0612 12:10:23.707458       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:28 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  627.422099] bootkube[6]: I0612 12:10:28.710234       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:33 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  632.418561] bootkube[6]: I0612 12:10:33.706726       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  637.418533] bootkube[6]: I0612 12:10:38.706704       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:38 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  637.430243] bootkube[6]: I0612 12:10:38.718423       6 migrate.go:215] still waiting for boot-etcd to be deleted...
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  642.460225] bootkube[6]: Error: failed to wait for boot-etcd to be removed: timed out waiting for the condition
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  642.460644] bootkube[6]: Tearing down temporary bootstrap control plane...
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  642.461553] bootkube[6]: Error: failed to wait for boot-etcd to be removed: timed out waiting for the condition
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  642.461884] bootkube[6]: Error: failed to wait for boot-etcd to be removed: timed out waiting for the condition
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal rkt[3452]: [  642.462087] bootkube[6]: failed to wait for boot-etcd to be removed: timed out waiting for the condition
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal systemd[1]: bootkube.service: Main process exited, code=exited, status=1/FAILURE
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal systemd[1]: Failed to start Bootstrap a Kubernetes cluster.
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal systemd[1]: bootkube.service: Unit entered failed state.
Jun 12 12:10:43 ip-10-251-161-198.eu-west-1.compute.internal systemd[1]: bootkube.service: Failed with result 'exit-code'.

The process etcd-operator is running but no further etcd processes. The Docker logs of the etcd-operator container are the following:

time="2018-06-12T12:14:44Z" level=info msg="DEPRECATION WARNING: the pv-provisioner flag will be removed in next release." 
time="2018-06-12T12:14:44Z" level=info msg="etcd-operator Version: 0.6.1" 
time="2018-06-12T12:14:44Z" level=info msg="Git SHA: ba44539" 
time="2018-06-12T12:14:44Z" level=info msg="Go Version: go1.9.1" 
time="2018-06-12T12:14:44Z" level=info msg="Go OS/Arch: linux/amd64" 
E0612 12:14:44.604912       1 leaderelection.go:224] error retrieving resource lock kube-system/etcd-operator: Get https://10.241.0.1:443/api/v1/namespaces/kube-system/endpoints/etcd-operator: dial tcp 10.241.0.1:443: getsockopt: connection refused
E0612 12:15:49.843625       1 leaderelection.go:224] error retrieving resource lock kube-system/etcd-operator: the server was unable to return a response in the time allotted, but may still be processing the request (get endpoints etcd-operator)
E0612 12:16:50.776395       1 leaderelection.go:224] error retrieving resource lock kube-system/etcd-operator: Get https://10.241.0.1:443/api/v1/namespaces/kube-system/endpoints/etcd-operator: unexpected EOF
E0612 12:16:54.377420       1 leaderelection.go:224] error retrieving resource lock kube-system/etcd-operator: Get https://10.241.0.1:443/api/v1/namespaces/kube-system/endpoints/etcd-operator: dial tcp 10.241.0.1:443: getsockopt: connection refused

And the errors go on for a while.

On master 1, neither etcd-operator or etcd are running.

On master 2, etcd is running but cannot complete an election because it's alone.

Error while running deploy kubify

We are trying to update Gardener to latest version. We have deployed Gardener using this method

When i run deploy kubify, i am facing this error

Error: module.instance.module.iaas.module.bastion.data.aws_ami.image: "owners": required field is not set
Error: module.instance.module.master.module.vms.data.aws_ami.image: "owners": required field is not set
Error: module.instance.module.worker.module.vms.data.aws_ami.image: "owners": required field is not set

Any idea what might be causing this?

terrform plan variant

Issue by C5253435
Saturday Dec 16, 2017 at 18:57 GMT
Originally opened as https://git.removed/kubernetes-attic/kubify/issues/20

My terraform init variant went well but after i do terraform plan and i have those issues.

I started with a fresh install and i follow the readme.

c5253435:dberuben-cluster2/ (master✗) $ terraform plan variant [13:55:08]

Error: Error asking for user input: 4 error(s) occurred:

module.instance.module.ingress_record.module.route53.module.route53_dns_hostedzone.local.value: local.value: element: element() may not be used with an empty list in:

${element(concat(compact(concat(list(var.value, var.default),var.defaults)),local.options[var.optional ? "optional" : "required"]),0)}

module.instance.module.apiserver_record.module.route53.module.route53_dns_hostedzone.local.value: local.value: element: element() may not be used with an empty list in:

${element(concat(compact(concat(list(var.value, var.default),var.defaults)),local.options[var.optional ? "optional" : "required"]),0)}

module.instance.module.bastion_record.module.route53.module.route53_dns_hostedzone.local.value: local.value: element: element() may not be used with an empty list in:

${element(concat(compact(concat(list(var.value, var.default),var.defaults)),local.options[var.optional ? "optional" : "required"]),0)}

module.instance.module.identity_record.module.route53.module.route53_dns_hostedzone.local.value: local.value: element: element() may not be used with an empty list in:

${element(concat(compact(concat(list(var.value, var.default),var.defaults)),local.options[var.optional ? "optional" : "required"]),0)}

My terraform.tfvars file

Azure access info

az_client_id="XXXX"
az_client_secret="XXXXX"
az_tenant_id="XXXX"
az_subscription_id="XXXX"
az_region="westeurope"

cluster_name="dberuben"
cluster_type="test"
#base_domain=".openstack.k8s.sapcloud.io"

create entries here with htpasswd

dashboard_creds = <<EOF
admin:$apr1$BgPuasJn$W7sw.khdm/VqoZirMe6uE1
EOF

DNS

dns= {
dns_type="route53"
route53_access_key="XXXX"
route53_secret_key="XXXXX"
route53_hosted_zone_id="XXXXX"
}

cluster size

master_count=3
worker_count=1

root_certs_file = "SAPcert.pem"

Monitoring Stack for Kubify

Issue by vlerenc
Wednesday Dec 20, 2017 at 10:37 GMT
Originally opened as https://git.removed/kubernetes-attic/kubify/issues/21

Can we please also have a Prometheus, AlertManager and Grafana monitoring stack for the Kubify-based clusters? That would give us monitoring for Garden and Seed clusters, too.

@d062553 implemented a metrics endpoint for the Garden operator in kubernetes/garden-operator#162 that we would like to visualise, but there is of course far more, ideally covering a similar scope like we have for our shoot clusters, but I would say that's another step/ticket and by then @i068969 may be back from vacation and can help here. So this ticket is primarily about the stack itself, maybe preconfigured with the Garden operator dashboard (as a simple/minimal acceptance criterion).

Documentation issue on page "https://github.com/gardener/kubify/blob/master/README.md"

emphasize a word in a MarkDown should be done with "*" or "**" instead of the single quote.
A single quote is for code and commands and some MarkDown rendered handle this in a special way. (e.g. adding copy to clipboard icon )

You can support the new landing page of the gardener if you change this.

Page: https://github.com/gardener/kubify/blob/master/README.md
don't remove the link to the page. It's required for the processor

Remove Terraform Deprecation

Warning: module.instance.module.iaas.openstack_networking_router_v2.cluster: "external_gateway": [DEPRECATED] use external_network_id instead

Help with the Landscape Setup

Could you please:

Setup a staging garden cluster (we already have the soil cluster)
Provide options to configure the OIDC configuration (remove dex addon and add OIDC configuration in current Garden clusters)
Update the dev and staging garden cluster landscapes with above changes
Can we tweak the kube-apiserver log output in a garden cluster? Or use a separate elasticsearch index for it with different roll-over? Then we could keep our gardener and dashboard logs maybe even for longer than a month? That would be really great.

Upgrade Kubernetes to 1.10.x

Kubelet Started with 100Mi Reserved Memory Only

What happened:
Node became NotReady, most likely because kubelet/docker was OOM killed (in our case this killed the Gardener API server just because a logging curator job/pod consumed too much memory).

What you expected to happen:
More buffer before this happens, so that the kubelet can take action in time..

Anything else we need to know?:
While trying to understand why this happens, we (@adracus and I) saw that the reserved memory is configured to only 100Ki, which may lead to the observed instability.

Api server is not up and running

Hello,
I am using OpenStack IaaS provider and I set up the terraform script accordingly through the guide on the official wiki. However after successful execution of "terraform apply variant" the commands "k8s/bin/k" are returning Unable to connect to the server: EOF.
I was also not able to use kubectl by directly using the floating IP of either load balancer or master node

Here is my tfvars file

os_user_name = ""

os_password = ""

os_auth_url = ""

os_tenant_name = ""

os_domain_name = ""

os_region = "eu-de-200"

os_fip_pool_name = ""

#os_lbaas_provider = "haproxy"
os_az = "rot_2_1"

cluster_name = "Kubernetes"

cluster_type = "seed"

# DNS
dns = {
  domain_name    = ""
  dns_type       = "route53"
  hosted_zone_id = ""

  access_key = ""
  secret_key = ""
}

# cluster size
master = {
  count       = 1
  flavor_name = "m1.large"
  assign_fips = "true"
}

worker = {
  count       = 2
  flavor_name = "m1.large"
  assign_fips = "true"
}

addons = {
  "dashboard" = {
    "app_name" = "kubernetes-dashboard"
  }

  heapster      = {}
  nginx-ingress = {}
  gardener      = {}
}

deploy_tiller = "false"

event_ttl = "168h0m0s"

selfhosted_etcd = "true"

#
# use htpasswd to create password entries
# example here: admin:admin
#
dashboard_creds = <<EOF
admin:$apr1$CrBJQtg9$A.BhwGjZ/Iii6KSO72SWQ0
EOF

Readme "setup a new cluster" instructions are not clear enough

Hi,

Currently we are trying to use kubify with Azure. I am trying to follow setup a new cluster instructions, but they are missing steps, and not clear enough.

For instance, for new terraform releases It mentions k8s/bin/prepare script to be run. On MacOs this script fails if greadlink (provided by brew coreutils formulae) is not installed. Also This script is looking for terraform.tfvars to be set which is not actually described as a previous step.

remove HTML tags from documentation

please remove native HTML tags from the documentation. Makes things easier to integrate
in the new gardener.cloud landing page.

not so good

<img align=right src="doc/[email protected]" height="200">

better

![image](doc/[email protected]?raw=true)

in https://github.com/gardener/kubify/blob/master/README.md

Kubernetes Drain Script before Destroying the Cluster

Before running a terraform destroy on the cluster resources, we need to allow the user to drain all cluster resources created by Kubernetes (services with LBs, PVCs etc.).

gardener-attic / kubify Goto Github PK

kubify's Introduction

Kubify

To start using or developing Kubify locally

Feedback and Support

kubify's People

Contributors

Stargazers

Watchers

Forkers

kubify's Issues

Current status:

Story

Motivation

Questions

Definition of Done

Release Notes

Azure access info

create entries here with htpasswd

DNS

cluster size

Recommend Projects

Recommend Topics

Recommend Org