improbable-eng / etcd-cluster-operator Goto Github PK

View Code? Open in Web Editor NEW

126.0 126.0 35.0 561 KB

A controller to deploy and manage etcd clusters inside of Kubernetes

License: MIT License

Dockerfile 1.36% Makefile 4.47% Go 93.17% Shell 1.01%

etcd kubernetes operator

etcd-cluster-operator's People

Contributors

Stargazers

Watchers

etcd-cluster-operator's Issues

Basic design document for EtcdPeer resource controller

So that we have a clear record of the purpose of an EtcdPeer resource, it'd be good to write up some words (probably borrowed from our main proposal doc) that explains the purpose of the EtcdPeer resource vs EtcdCluster, as well as defining its rationale.

At the least, this should define:

when an EtcdPeer resource should exist (i.e. before or after a new member is 'added' to the cluster)
when EtcdPeer resources should be created during bootstrap
how the EtcdPeer resource can be used to reconfigure parameters about the peer, and a not on what any other controller/operator can 'expect' when updating an EtcdPeer (i.e. the change will be applied by the controller to the pod, potentially causing an immediate restart)

We should iterate on this doc quickly, so I think we shouldn't get too hung up on the details from day 1 (let's get started and amend our thoughts along the way!).

The emphasis on defining it early though is to ensure we don't start to conflate the roles and responsibilities of different controllers, as this can lead to a brittle system that is difficult to extend in future (i.e. if we ever want to add the possibility of creating peers to join an existing cluster, or complex backup/restore dances that involve starting peers in order to either backup or restore data and form new clusters).

Etcd server version upgrades

We must be able to deploy updates to the etcd version running in clusters.

AC:

A version field is added to the cluster spec to allow users to specify the etcd version to use
The cluster controller is responsible for upgrading peer resources to specify the required version
The peer controller is responsible for rolling the etcd pod to the upgraded version
No cluster-wide downtime

Add operator metric for backup failures and successes

Users of the operator want to monitor backup failures and successes, in particular to alert on failed backups or a lack of successful ones.

Design

A metric will be added to the exposed operator metrics as a counter of successes and failures for EtcdBackupSchedule resources. This counter will be labelled by the namespace and name of the EtcdBackupSchedule resource.

Other options

Instrumenting all backups

All backups could be counted by building our counter from EtcdBackup resources directly. However as the backup resource has no unique name to operate on, and has only a list of endpoints, there's no good way to provide a unique identity of which cluster is being backed up.

Without labels on the metric it would be hard to identify from a dashboard or alert which etcd cluster (if there are multiple) is failing to backup.

Not using a metric

Alternatively, all of this information is available in the Kubernetes API anyway via a status field on EtcdBackup resources. However this relies on an Kuberntes administrator using and configuring something like kube-state-metrics to support alerts and dashboards on this data.

Persisted data for ETCD clusters

Allow the creation of Etcd clusters with persisted data.

AC:

a PVC is created for each etcdpeer
e2e tests demonstrate the behaviour
docs exist explaining the design

Implement `successfulBackupsHistoryLimit` and `failedBackupsHistoryLimit` in `EtcdBackupSchedule`

We would like to limit the amount of successful/failed EtcdBackup resources that are lying around in clusters.

The EtcdBackupSchedule controller should be adjusted to remove all but the x newest EtcdBackup resources that it was responsible for creating. This should be configurable to be different for successful/failed backups.

Add etcd liveness probes

Investigate what liveness and readiness probes are used by other etcd operators
- Do readiness probes make sense, since etcdclients talk to all nodes anyway.
Devise a test to simulate a "stuck" etcd node
Add liveness probe by calling etcdctl or by using etcd client library

Cluster data can be backed up

Cluster owners should be able to set a backup strategy on their etcd clusters in order to take automated backups of cluster data. Backups must be able to be taken on demand, and on an automated schedule. We must implement at least one backup strategy suitable for use in Improbable's infrastructure - likely dumping of data to some cloud storage bucket.

Acceptance criteria:

The API is extended with capability for performing backups
A backup strategy for dumping cluster data to a GCS bucket (or more generic) is implemented
An e2e test exists which requests a backup, and observes that one is created.
Unit tests exist for automated backups (using something like https://github.com/jonboulle/clockwork)
README is updated to reflect that backups are functional

Add defaulting and validation hooks for EtcdBackup/EtcdBackupSchedule resources

Defaulting is not currently required, we require all fields to be explicit, but there is a good chance it will be required in the future as the API is modified so it makes sense for the framework to exist.

Validation could include

Ensure that exactly one backup destination is set
etc...

TLS for client access

We must be able to serve etcd client traffic over TLS

AC:

The cluster resource contains options for supplying certs for serving traffic over TLS

Push docker images to docker hub on master commit

AC:

Images are published to improbable's public docker hub on successful build of master with a short SHA
Pushed image is tagged with latest

Implement etcd connection pooling

Currently when we connect to etcd from the cluster controller we create a new connection every time. We could save resources by keeping a pool of them available.

Setting up/configuring Prow for repo automation and CI

We should set up some repository automation as well as CI for this repo.

Jetstack run a Prow instance that we can 'piggy back' on for now, but we may need to explore spinning up some dedicated infrastructure for open source projects under this org 😄

To get this arranged, someone with repo-admin permissions will need to collaborate with myself to get a webhook configured to point at our Prow instance (or I'll need to get some more permissions for a short while 😄)

Kubernetes version matrix testing

Currently our end to end testing is executed via KIND (see make kind) and targets only the latest version of k8s. We are documented as supporting all versions of Kubernetes from v1.12 forwards. We should execute a 'matrix' test across all supported Kubernetes versions.

In order to save compute resource, this should not be done on every push to a PR, but only periodically on master when there are changes.

Expose metrics from the etcd operator

The operator should expose metrics on /metrics, port 80, to enable Prometheus to scrape information from the operator on it's performance.

Possible metrics:

Total number of reconcile loops since restart, per controller.

Bootstrapping project using kubebuilder

We will use kubebuilder to at least 'seed' the project. After that we will more than likely be able to work with controller-runtime directly to continue developing.

We should bootstrap a new kubebuilder project using the newly release 2.0.1 release (finally stable 🎉).

This PR should be the bare minimum, and not define any types, implementation logic or customisation (as much as possible at least!)

Scaffold kubebuider resources

Create EtcdCluster and EtcdPeer.

Test that data is persisted if a Pod is deleted

In #46 we documented that etcd-cluster-operator configures nodes with storage, in such a way that a Kubernetes node can be rebooted and when it reboots, a new Etcd node pod can be started on that node and it will have the same data as the Pod that was deleted when the node was rebooted.

We should add an e2e test that deletes a pod and verifies that the pod returns with the same data.

Create a 1-node cluster
Add some data to the cluster
Delete the pod
Query etcd and verify that the data is still there.

Create watch on etcd membership API

We currently use watches to automatically reconcile when something we are observing changes. For resources in the Kubernetes API such as EtcdCluster, EtcdPeer, Service, and ReplicaSet this is natively supported. However we also want this watch behaviour on the etcd membership list.

Without this, we've resorted to a simple 10-second reconcile loop to pick up changes to the membership list. This results in us reconciling far more often than necessary.

We should implement a custom watch on the API, and avoid reconciling unless something has actually changed.

Support modifying pod annotations

Editing an EtcdCluster resource to add, amend, or remove spec.podTemplate.metadata.annotations should push those changes down to the underlying pods.

Add defaults for optional CRD fields

For example, in #27 we added an optional Replicas field as an int32 pointer.
Instead of having to check for nil everytime we reference it, we should apply defaults early on so that we can safely assume that the pointer will never be nil.

We could add defaulting functions for EtcdCluster and for EtcdPeer and use them both in a mutating web hook, but also early in the Reconcile functions, in case the webhook has not been deployed.

Clusters can be scaled up

On receiving an update to an EtcdCluster resource, increasing the replica count, we should observe that the cluster controller creates a new peer resource, the peer resource starts etcd bootstrapping to the current cluster state, and the rest of the nodes recognise the peers existence.

Acceptance criteria:

A doc exists describing what happens when a cluster scale up is requested
The functionality described in this doc is implemented
An e2e test exists to demonstrate that this behaviour works

Add pod resource requirements to EtcdCluster

We should be able to control the resource requirements of pods running etcd peers

Only allow API changes which the operator can reconcile

In #46 for example, the operator is able to reconcile the storage requirements of a new EtcdCluster, but it is not (yet) capable of changing the storage settings of an EtcdCluster or its EtcdPeers.

We should have a white list of fields that are allowed to be changed.
Any other changes should be prevented by a validating webhook.

Update the docs to summarize the changes which are supported.

Support for passing pod affinity to EtcdPeers

We should be able to spread replica pods across AZs for durability.

Set up code lint checks in CI

AC:

We have generated code diff checks (go generate, kubebuilder, go mod tidy, etc)
Linting using golangci-lint or just go vet
Shellcheck?

0.2.0 checklist

This issue tracks the progress towards getting the project into state where we can start deploying it in Improbable's beta infrastructure. We should be reasonably comfortable with the API and anticipate that stability would be acceptable. This ticket does not intend to track the work associated with each of the items, just give a summary of what is left to get to the first beta release.

Please edit this checklist as more issues arise

Features

Observability

ETCD metrics are exposed (#109)
Operator metrics are exposed (#110, #112)

Deployment workflow

Release pipeline created (#37)
Release procedure followed

Misc

v0.3.0 checklist created

Clusters can be scaled down

The inverse to #34: On receiving an update to an EtcdCluster resource, decreasing the replica count, we should observe that the cluster controller selects an EtcdPeer to evict. The peer controller will remove the peer from the cluster, stop the pod running the etcd frontend, and remove the peer resource.

Acceptance criteria:

A doc exists describing what happens when a cluster is scaled down
The functionality in this doc is implemented
An e2e test exists to demonstrate this behaviour works
README is updated to reflect the scaling functionality works

etcd sometimes crashes on startup because it can't resolve its own DNS name

E.g. my-cluster-2 has been restarted during bootstrap here:

kubectl -n teste2e-parallel-scaledown get pod 
NAME                 READY   STATUS    RESTARTS   AGE
my-cluster-0-6lnwt   1/1     Running   1          2m43s
my-cluster-1-tx6gv   1/1     Running   0          2m42s
my-cluster-2-42ddm   1/1     Running   1          2m41s

And the logs of the previous container show that it failed because it can't resolve it's own DNS name as used in the ETCD_INITIAL_ADVERTISE_PEER_URLS and ETCD_INITIAL_CLUSTER variables:

2019-11-20 17:41:52.279072 E | pkg/netutil: could not resolve host my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380

I think the problem is that we're using a relative DNS name (missing the top level cluster domain)
But we're using a relative DNS name which is almost last in the search list defined in resolv.conf

$ kubectl -n teste2e-parallel-scaledown exec my-cluster-2-42ddm  cat /etc/resolv.conf

search teste2e-parallel-scaledown.svc.cluster.local svc.cluster.local cluster.local lan
nameserver 10.96.0.10
options ndots:5

And not a name which is defined in /etc/hosts:

$ kubectl -n teste2e-parallel-scaledown exec my-cluster-2-42ddm  cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
fe00::0	ip6-mcastprefix
fe00::1	ip6-allnodes
fe00::2	ip6-allrouters
10.244.0.102	my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc.cluster.local	my-cluster-2

Possible solutions are:

add a hostAlias https://kubernetes.io/docs/concepts/services-networking/add-entries-to-pod-etc-hosts-with-host-aliases/
But hostalias doesn't support the downward API (kubernetes/kubernetes#74265)
add custom DNS search config: https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config
Use discovery-srv with the headless service rather than setting ETCD_INITIAL_CLUSTER

$ kubectl -n teste2e-parallel-scaledown logs my-cluster-2-42ddm etcd  --previous

2019-11-20 17:41:22.277885 I | pkg/flags: recognized and used environment variable ETCD_ADVERTISE_CLIENT_URLS=http://my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2379
2019-11-20 17:41:22.277945 I | pkg/flags: recognized and used environment variable ETCD_DATA_DIR=/var/lib/etcd
2019-11-20 17:41:22.277968 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_ADVERTISE_PEER_URLS=http://my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380
2019-11-20 17:41:22.277973 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER=my-cluster-0=http://my-cluster-0.my-cluster.teste2e-parallel-scaledown.svc:2380,my-cluster-1=http://my-cluster-1.my-cluster.teste2e-parallel-scaledown.svc:2380,my-cluster-2=http://my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380
2019-11-20 17:41:22.277979 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=new
2019-11-20 17:41:22.277983 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER_TOKEN=my-cluster
2019-11-20 17:41:22.277992 I | pkg/flags: recognized and used environment variable ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
2019-11-20 17:41:22.278000 I | pkg/flags: recognized and used environment variable ETCD_LISTEN_PEER_URLS=http://0.0.0.0:2380
2019-11-20 17:41:22.278012 I | pkg/flags: recognized and used environment variable ETCD_NAME=my-cluster-2
2019-11-20 17:41:22.278058 I | etcdmain: etcd Version: 3.2.27
2019-11-20 17:41:22.278068 I | etcdmain: Git SHA: bdd97d5ff
2019-11-20 17:41:22.278072 I | etcdmain: Go Version: go1.8.7
2019-11-20 17:41:22.278075 I | etcdmain: Go OS/Arch: linux/amd64
2019-11-20 17:41:22.278080 I | etcdmain: setting maximum number of CPUs to 8, total number of available CPUs is 8
2019-11-20 17:41:22.278212 I | embed: listening for peers on http://0.0.0.0:2380
2019-11-20 17:41:22.278256 I | embed: listening for client requests on 0.0.0.0:2379
2019-11-20 17:41:32.299199 W | pkg/netutil: failed resolving host my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380 (lookup my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc on 10.96.0.10:53: no such host); retrying in 1s
2019-11-20 17:41:33.301398 I | pkg/netutil: resolving my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380 to 10.244.0.102:2380
2019-11-20 17:41:43.321848 W | pkg/netutil: failed resolving host my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380 (lookup my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc on 10.96.0.10:53: no such host); retrying in 1s
2019-11-20 17:41:52.279020 W | pkg/netutil: failed resolving host my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380 (i/o timeout); retrying in 1s
2019-11-20 17:41:52.279072 E | pkg/netutil: could not resolve host my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380
2019-11-20 17:41:52.279901 I | etcdmain: --initial-cluster must include my-cluster-2=http://my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380 given --initial-advertise-peer-urls=http://my-cluster-2.my-cluster.teste2e-parallel-scaledown.svc:2380

Investigate behaviour on partial backup upload

The behaviour of failed backup uploading is currently undefined - some backends may leave partial backups around, which the operator will detect as a complete backup and will not retry.

We could checksum the file in the destination rather than simply checking for its existence, but that would involve downloading the entire backup whenever we check if it exists.

We'd like to have defined behaviour for this corner case which is consistent between all storage destinations.

Operator Error: events is forbidden: User cannot create resource "events"

I just noticed this in the controller-manager logs.

2019-11-01T15:31:58.643Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"EtcdCluster","namespace":"default","name":"my-cluster","uid":"1fa5a286-29cc-454a-a592-e2b6e65accd6","apiVersion":"etcd.improbable.io/v1alpha1","resourceVersion":"920"}, "reason": "PeerCreated", "message": "Created a new EtcdPeer with name 'my-cluster-2'"}
E1101 15:31:58.645046       1 event.go:240] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"my-cluster.15d313aaa7086966", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"EtcdCluster", Namespace:"default", Name:"my-cluster", UID:"1fa5a286-29cc-454a-a592-e2b6e65accd6", APIVersion:"etcd.improbable.io/v1alpha1", ResourceVersion:"920", FieldPath:""}, Reason:"PeerCreated", Message:"Created a new EtcdPeer with name 'my-cluster-2'", Source:v1.EventSource{Component:"etcdcluster-reconciler", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbf6731dba0ca9d66, ext:6564875539, loc:(*time.Location)(0x1f724c0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf6731dba0ca9d66, ext:6564875539, loc:(*time.Location)(0x1f724c0)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:serviceaccount:eco-system:default" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)

We need some more RBAC rules for the events in #58

/cc @JamesLaverack

Create a release pipeline

We must be able to create versioned releases of the operator. This includes major, minor & patch releases. A release includes:

At least one commit to this repository
Optional changes to the custom resource versions in line with breaking changes

We must have a documented (or automated) process for releasing, with steps including:

Sanity checking docs, README, etc
Compiling release notes
Tagging the repository
Building & publishing tagged docker images

Create Kubernetes events to show progress of the reconciliation

It'd be useful to generate Kubernetes events at various steps in the Reconcile functions.

This will make it easy to see the progress that the operator is making in reconciling each cluster.

Ensure that events rate limited, e.g. Don't create an event in an endless error loop, even if it's a backoff loop.

Run tests with race checks

running go test ./... -race shows a number of warnings. we should address these where possible, and run automated tests with race checking enabled.

Understanding etcd bootstrap procedure

It is essential that we have a shared understanding of etcd's bootstrap procedure and specifically what it means for:

Forming a new cluster
Recovering an existing cluster
Scaling up a cluster

There's some full information available in the etcd repository on 'clustering': https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/clustering.md.
It'd be great if we can all ensure we've got an understanding of this process before we move on to implementing controllers to automate it.

This ticket does not have a clear output from it, except a more educated team 😄 what we learn about and distill from the process here can be encoded into a design document for how we will bootstrap.

In some experimentation, the static bootstrapping method was sufficient for creating, recovering and resizing clusters (without external dependencies too), so it'd be good to take special note of how static config works (esp. what flags are ignored/not taken account of after the initial bootstrap is complete)

Intermittent unit test error: wrong number of peers

In #89 we got this error which seems to occur intermittently:

    --- FAIL: TestAPIs/ClusterControllers (1.52s)

        --- FAIL: TestAPIs/ClusterControllers/OnCreation (1.52s)

            test.go:26: Failed to update status: Operation cannot be fulfilled on etcdclusters.etcd.improbable.io "cluster1": the object has been modified; please apply your changes to the latest version and try again -- []

            --- FAIL: TestAPIs/ClusterControllers/OnCreation/CreatesPeers (0.50s)

                require.go:752: 

                    	Error Trace:	etcdcluster_controller_test.go:132

                    	Error:      	"[{{EtcdPeer etcd.improbable.io/v1alpha1} {cluster1-0  dear-ape /apis/etcd.improbable.io/v1alpha1/namespaces/dear-ape/etcdpeers/cluster1-0 2c9db0d2-0a1c-11ea-9438-0242c0a87003 71 %!s(int64=1) 2019-11-18 15:57:51 +0000 UTC <nil> %!s(*int64=<nil>) map[app.kubernetes.io/name:etcd etcd.improbable.io/cluster-name:cluster1] map[] [{etcd.improbable.io/v1alpha1 EtcdCluster cluster1 2c7e2ce9-0a1c-11ea-9438-0242c0a87003 %!s(*bool=0xc000f6a4f9) %!s(*bool=0xc000f6a4f8)}] nil []  []} {cluster1 %!s(*v1alpha1.Bootstrap=&{0xc0007a6ea0 New}) %!s(*v1alpha1.EtcdPeerStorage=&{0xc000d2e900})} {}}]" should have 3 item(s), but has 1

                    	Test:       	TestAPIs/ClusterControllers/OnCreation/CreatesPeers

                    	Messages:   	wrong number of peers: &v1alpha1.EtcdPeerList{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{SelfLink:"/apis/etcd.improbable.io/v1alpha1/namespaces/dear-ape/etcdpeers", ResourceVersion:"75", Continue:"", RemainingItemCount:(*int64)(nil)}, Items:[]v1alpha1.EtcdPeer{v1alpha1.EtcdPeer{TypeMeta:v1.TypeMeta{Kind:"EtcdPeer", APIVersion:"etcd.improbable.io/v1alpha1"}, ObjectMeta:v1.ObjectMeta{Name:"cluster1-0", GenerateName:"", Namespace:"dear-ape", SelfLink:"/apis/etcd.improbable.io/v1alpha1/namespaces/dear-ape/etcdpeers/cluster1-0", UID:"2c9db0d2-0a1c-11ea-9438-0242c0a87003", ResourceVersion:"71", Generation:1, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:63709689471, loc:(*time.Location)(0x21c8a00)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string{"app.kubernetes.io/name":"etcd", "etcd.improbable.io/cluster-name":"cluster1"}, Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference{v1.OwnerReference{APIVersion:"etcd.improbable.io/v1alpha1", Kind:"EtcdCluster", Name:"cluster1", UID:"2c7e2ce9-0a1c-11ea-9438-0242c0a87003", Controller:(*bool)(0xc000f6a4f9), BlockOwnerDeletion:(*bool)(0xc000f6a4f8)}}, Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:v1alpha1.EtcdPeerSpec{ClusterName:"cluster1", Bootstrap:(*v1alpha1.Bootstrap)(0xc0007a6e80), Storage:(*v1alpha1.EtcdPeerStorage)(0xc00069a140)}, Status:v1alpha1.EtcdPeerStatus{}}}}

            test.go:26: Failed to update status: Operation cannot be fulfilled on etcdclusters.etcd.improbable.io "cluster1": the object has been modified; please apply your changes to the latest version and try again -- []

FAIL

https://app.circleci.com/jobs/github/improbable-eng/etcd-cluster-operator/416/parallel-runs/0/steps/0-102

Perhaps because of the failure to EtcdCluster.Status.
Perhaps we need to use RetryOnConflict: https://github.com/kubernetes/client-go/blob/master/util/retry/util.go#L68

make docker-build fails due to shell invocation

https://github.com/improbable-eng/etcd-cluster-operator/pull/94/files#diff-3254677a7917c6c01f55212f86c57fbfR33 introduces a shell form of RUN to check for the debug variable. However, the distroless image does not have a shell, so it fails.

$ export IMG=etcd-cluster-operator:test; make docker-build
KUBEBUILDER_ASSETS="/Users/junsiang/projects/etcd-cluster-operator/bin/kubebuilder/bin" go test ./... -coverprofile cover.out
?   	github.com/improbable-eng/etcd-cluster-operator	[no test files]
ok  	github.com/improbable-eng/etcd-cluster-operator/api/v1alpha1	0.043s	coverage: 32.2% of statements
ok  	github.com/improbable-eng/etcd-cluster-operator/controllers	30.775s	coverage: 63.2% of statements
?   	github.com/improbable-eng/etcd-cluster-operator/internal/backup	[no test files]
?   	github.com/improbable-eng/etcd-cluster-operator/internal/etcd	[no test files]
?   	github.com/improbable-eng/etcd-cluster-operator/internal/etcdenvvar	[no test files]
?   	github.com/improbable-eng/etcd-cluster-operator/internal/reconcilerevent	[no test files]
ok  	github.com/improbable-eng/etcd-cluster-operator/internal/test	0.025s	coverage: 45.5% of statements
?   	github.com/improbable-eng/etcd-cluster-operator/internal/test/crontest	[no test files]
ok  	github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e	0.052s	coverage: 0.0% of statements
ok  	github.com/improbable-eng/etcd-cluster-operator/internal/test/try	0.197s	coverage: 95.0% of statements
?   	github.com/improbable-eng/etcd-cluster-operator/webhooks	[no test files]
docker build . -t etcd-cluster-operator:test --build-arg image=gcr.io/distroless/static:nonroot --build-arg user=nonroot
Sending build context to Docker daemon  284.3MB
Step 1/20 : ARG image=alpine:3.10.3
Step 2/20 : ARG user=root
Step 3/20 : FROM golang:1.13.1 as builder
 ---> 52b59e9ead8e
Step 4/20 : WORKDIR /workspace
 ---> Using cache
 ---> 2fdcbef6b169
Step 5/20 : COPY go.mod go.mod
 ---> Using cache
 ---> 13b4df5a1341
Step 6/20 : COPY go.sum go.sum
 ---> Using cache
 ---> b68e7ceceef5
Step 7/20 : RUN go mod download
 ---> Using cache
 ---> 119fd485e2c7
Step 8/20 : COPY main.go main.go
 ---> Using cache
 ---> 1ea3e5b80908
Step 9/20 : COPY api/ api/
 ---> Using cache
 ---> bbf6ccd348dd
Step 10/20 : COPY controllers/ controllers/
 ---> Using cache
 ---> 69caa9e55f15
Step 11/20 : COPY internal/ internal/
 ---> Using cache
 ---> 0c23ba3de7d6
Step 12/20 : COPY webhooks/ webhooks/
 ---> Using cache
 ---> e0e3fb126e9f
Step 13/20 : RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GO111MODULE=on go build -a -o manager main.go
 ---> Using cache
 ---> 75d0ddb5fddf
Step 14/20 : FROM $image
 ---> ef0bddd72d14
Step 15/20 : WORKDIR /
 ---> Using cache
 ---> a1f1bcd67fbf
Step 16/20 : COPY --from=builder /workspace/manager .
 ---> Using cache
 ---> ebcc58339863
Step 17/20 : USER $user:$user
 ---> Using cache
 ---> 041220424939
Step 18/20 : ARG debug=false
 ---> Using cache
 ---> 1d1972add533
Step 19/20 : RUN if [ "$debug" = "true" ] ; then apk update && apk add ca-certificates bash curl drill jq ; fi
 ---> Running in 140c2721af1a
OCI runtime create failed: container_linux.go:346: starting container process caused "exec: \"/bin/sh\": stat /bin/sh: no such file or directory": unknown
make: *** [docker-build] Error 1

Create Etcd cluster failed due to X509 error

Versions of relevant software used
etcd-cluster-operator: v0.1.0
kubernetes version: v1.13.10
cert-manager: v0.9.0

What happened
Deploy etcd-cluster failed due to X509 error

kubectl apply -f config/samples/etcd_v1alpha1_etcdcluster.yaml
Error from server (InternalError): error when creating "config/samples/etcd_v1alpha1_etcdcluster.yaml": Internal error occurred: failed calling webhook "default.etcdclusters.etcd.improbable.io": Post https://eco-webhook-service.eco-system.svc:443/mutate-etcd-improbable-io-v1alpha1-etcdcluster?timeout=30s: x509: certificate signed by unknown authority

What you expected to happen
etcd-cluster successfully deployed

How to reproduce it (as minimally and precisely as possible):

deploy etcd-cluster-operator through below yaml
deploy etcd-cluster in config/samples/etcd_v1alpha1_etcdcluster.yaml

Full logs to relevant components
Etcd-operator deploy yaml:
deploy.yaml.txt

Anything else we need to know

Setting 'crd.spec.preserveUnknownFields: false' on CRD resources

In order to make features like kubectl explain work to easily inspect the schema of custom resources, setting preserveUnknownFields to false is required.

It is required because the apiserver will only publish complete OpenAPI schemas for structural CRDs. The side effect of setting this field is that any fields submitted that are not recognised in the schema will be automatically rejected by kubectl unless --validate=false is set, and the apiserver will also drop fields it does not recognise automatically.

If we are concerned our schema is not accurate/complete due to potential bugs in our project, you can use the crd-schema-fuzz project to run fuzz tests against our CRDs: https://github.com/munnerz/crd-schema-fuzz

You can see an example of this in use here: https://github.com/jetstack/cert-manager/blob/ba354e40784fbed5a25e7796aa54472a3d38a058/pkg/internal/apis/certmanager/install/pruning_test.go#L29-L30

Update contributing docs

Including

Test guidelines for E2E/Integration/Unit tests
Getting started with development (building images, running tests, etc)
Release process

Intermittent E2E test timeout

In #46 an E2E test timed out without enough diagnostic information to know what had gone wrong:

https://circleci.com/gh/improbable-eng/etcd-cluster-operator/198?

go test ./internal/test/e2e --kind --repo-root /home/circleci/go/src/github.com/improbable-eng/etcd-cluster-operator -v --cleanup="true"
=== RUN   TestE2E_Kind
 • Ensuring node image (kindest/node:v1.15.3) 🖼  ...
time="2019-10-24T21:39:03Z" level=info msg="Pulling image: kindest/node:v1.15.3 ..."
 ✓ Ensuring node image (kindest/node:v1.15.3) 🖼
 • Preparing nodes 📦  ...
 ✓ Preparing nodes 📦
 • Creating kubeadm config 📜  ...
 ✓ Creating kubeadm config 📜
 • Starting control-plane 🕹️  ...
 ✓ Starting control-plane 🕹️
 • Installing CNI 🔌  ...
 ✓ Installing CNI 🔌
 • Installing StorageClass 💾  ...
 ✓ Installing StorageClass 💾
Cluster creation complete. You can now use the cluster with:

export KUBECONFIG="$(kind get kubeconfig-path --name="etcd-e2e")"
kubectl cluster-info
panic: test timed out after 10m0s

goroutine 23 [running]:
testing.(*M).startAlarm.func1()
	/usr/local/go/src/testing/testing.go:1377 +0xdf
created by time.goFunc
	/usr/local/go/src/time/sleep.go:168 +0x44

goroutine 1 [chan receive, 10 minutes]:
testing.(*T).Run(0xc0000f4700, 0x17ce67d, 0xc, 0x18646a8, 0x9a1636)
	/usr/local/go/src/testing/testing.go:961 +0x377
testing.runTests.func1(0xc0000f4600)
	/usr/local/go/src/testing/testing.go:1202 +0x78
testing.tRunner(0xc0000f4600, 0xc000105dc0)
	/usr/local/go/src/testing/testing.go:909 +0xc9
testing.runTests(0xc00051e860, 0x2384e60, 0x2, 0x2, 0x0)
	/usr/local/go/src/testing/testing.go:1200 +0x2a7
testing.(*M).Run(0xc0004d9880, 0x0)
	/usr/local/go/src/testing/testing.go:1117 +0x176
main.main()
	_testmain.go:46 +0x135

goroutine 6 [syscall, 10 minutes]:
os/signal.signal_recv(0xc000066787)
	/usr/local/go/src/runtime/sigqueue.go:147 +0x9c
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:29 +0x41

goroutine 7 [chan receive]:
k8s.io/klog.(*loggingT).flushDaemon(0x23ed680)
	/home/circleci/go/pkg/mod/k8s.io/[email protected]/klog.go:1018 +0x8b
created by k8s.io/klog.init.0
	/home/circleci/go/pkg/mod/k8s.io/[email protected]/klog.go:404 +0x6c

goroutine 20 [syscall, 2 minutes]:
syscall.Syscall6(0xf7, 0x1, 0x3a3b, 0xc000589af0, 0x1000004, 0x0, 0x0, 0x9b5301, 0xc000116cc0, 0xc000589b30)
	/usr/local/go/src/syscall/asm_linux_amd64.s:44 +0x5
os.(*Process).blockUntilWaitable(0xc0004f6b70, 0x203000, 0x0, 0x1)
	/usr/local/go/src/os/wait_waitid.go:31 +0x98
os.(*Process).wait(0xc0004f6b70, 0x18651b0, 0x18651b8, 0x18651a8)
	/usr/local/go/src/os/exec_unix.go:22 +0x39
os.(*Process).Wait(...)
	/usr/local/go/src/os/exec.go:125
os/exec.(*Cmd).Wait(0xc0000d58c0, 0x0, 0x0)
	/usr/local/go/src/os/exec/exec.go:501 +0x60
os/exec.(*Cmd).Run(0xc0000d58c0, 0xc0004a8720, 0xc0000d58c0)
	/usr/local/go/src/os/exec/exec.go:341 +0x5c
os/exec.(*Cmd).CombinedOutput(0xc0000d58c0, 0x6, 0xc000589f20, 0x4, 0x4, 0xc0000d58c0)
	/usr/local/go/src/os/exec/exec.go:561 +0x91
github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e.TestE2E_Kind(0xc0000f4700)
	/home/circleci/go/src/github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e/e2e_test.go:79 +0x474
testing.tRunner(0xc0000f4700, 0x18646a8)
	/usr/local/go/src/testing/testing.go:909 +0xc9
created by testing.(*T).Run
	/usr/local/go/src/testing/testing.go:960 +0x350

goroutine 45 [chan receive, 2 minutes]:
github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e.TestE2E_Kind.func1(0xc00055f3e0, 0xc000557e80)
	/home/circleci/go/src/github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e/e2e_test.go:63 +0x34
created by github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e.TestE2E_Kind
	/home/circleci/go/src/github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e/e2e_test.go:62 +0x15b

goroutine 47 [IO wait]:
internal/poll.runtime_pollWait(0x2adf45d208b0, 0x72, 0xffffffffffffffff)
	/usr/local/go/src/runtime/netpoll.go:184 +0x55
internal/poll.(*pollDesc).wait(0xc000116c18, 0x72, 0x801, 0x8ad, 0xffffffffffffffff)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000116c00, 0xc000027553, 0x8ad, 0x8ad, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:169 +0x1cf
os.(*File).read(...)
	/usr/local/go/src/os/file_unix.go:259
os.(*File).Read(0xc0000103f0, 0xc000027553, 0x8ad, 0x8ad, 0x2d, 0x0, 0x0)
	/usr/local/go/src/os/file.go:116 +0x71
bytes.(*Buffer).ReadFrom(0xc0004a8720, 0x1963480, 0xc0000103f0, 0x2adf45d16198, 0xc0004a8720, 0xc00054af01)
	/usr/local/go/src/bytes/buffer.go:204 +0xb4
io.copyBuffer(0x1961ea0, 0xc0004a8720, 0x1963480, 0xc0000103f0, 0x0, 0x0, 0x0, 0x91caa5, 0xc000116b40, 0xc00054afb0)
	/usr/local/go/src/io/io.go:388 +0x2ed
io.Copy(...)
	/usr/local/go/src/io/io.go:364
os/exec.(*Cmd).writerDescriptor.func1(0xc000116b40, 0xc00054afb0)
	/usr/local/go/src/os/exec/exec.go:311 +0x63
os/exec.(*Cmd).Start.func1(0xc0000d58c0, 0xc00044a0a0)
	/usr/local/go/src/os/exec/exec.go:435 +0x27
created by os/exec.(*Cmd).Start
	/usr/local/go/src/os/exec/exec.go:434 +0x608
FAIL	github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e	600.034s
FAIL
make: *** [kind] Error 1
Exited with code 2

The test was stopped after 10 minutes.
The Kind cluster appears to have been created
We're using t.Log to print progress messages, but these do not get printed if the test times out. (see golang/go#23213)
There are no other stdout stderr messages to see which of the e2e steps was running when the test timed out.
There are very few timestamps in the output to allow us to know how long the Kind cluster took to start up. We can see approximately when it began but not when it finished.

• Ensuring node image (kindest/node:v1.15.3) 🖼  ...
time="2019-10-24T21:39:03Z" level=info msg="Pulling image: kindest/node:v1.15.3 ..."

TLS between cluster peers

We must be able to encrypt/authenticate traffic sent between cluster peers.

Flakey test: TestEventually/TestEventually_InitiallyErroring_EventuallySucceeds

I occasionally get this error when running make test.

richard   39-crd-defaults  ~  projects  improbable-eng  etcd-cluster-operator  make test
KUBEBUILDER_ASSETS="/home/richard/projects/improbable-eng/etcd-cluster-operator/bin/kubebuilder/bin" go test ./... -coverprofile cover.out
?   	github.com/improbable-eng/etcd-cluster-operator	[no test files]
ok  	github.com/improbable-eng/etcd-cluster-operator/api/v1alpha1	0.046s	coverage: 52.2% of statements
ok  	github.com/improbable-eng/etcd-cluster-operator/controllers	6.633s	coverage: 79.5% of statements
?   	github.com/improbable-eng/etcd-cluster-operator/internal/etcdenvvar	[no test files]
ok  	github.com/improbable-eng/etcd-cluster-operator/internal/test	0.030s	coverage: 65.2% of statements
ok  	github.com/improbable-eng/etcd-cluster-operator/internal/test/e2e	0.038s	coverage: 0.0% of statements
--- FAIL: TestEventually (0.11s)
    --- FAIL: TestEventually/TestEventually_InitiallyErroring_EventuallySucceeds (0.05s)
        require.go:794: 
            	Error Trace:	try_test.go:130
            	Error:      	Received unexpected error:
            	            	foo
            	Test:       	TestEventually/TestEventually_InitiallyErroring_EventuallySucceeds
            	Messages:   	an error was found, but not expected
FAIL
coverage: 94.4% of statements
FAIL	github.com/improbable-eng/etcd-cluster-operator/internal/test/try	0.227s
?   	github.com/improbable-eng/etcd-cluster-operator/webhooks	[no test files]
FAIL
make: *** [Makefile:36: test] Error 1

Support Prometheus annotations on etcd pods

Etcd itself, in the standard Docker images used by this operator (quay.io/coreos/etcd) exposes metrics on /metrics. Prometheus in some installations requires annotations on the pods themselves to scrape metric information.

You can't scale-up a cluster that has been scaled down

When we scale down (in #93) we leave behind PVCs.
This means that if you scale-up again, the etcdpeer controller will try and use that lingering PVC, and it will fail because the data corresponds to the old member ID.

We don't want to delete the PVC for all deleted EtcdPeers, (as described in the Delete a cluster documentation)

But what we could do is add a finalizer to the EtcdPeer, which will prevent it being instantly deleted.

And a new EtcdPeer.Spec.Decommissioned field, which will be set by the etcdcluster_controller before it deletes the EtcdPeer.

Then the etcdpeer_controller will be able to safely delete PVCs, only for Decommissioned EtcdPeer resources and finally remove the finalizer which will allow the EtcdPeer to be deleted along with its replicaset.

Part of #35

Reconciler error : unable to create service : context deadline exceeded

I keep getting this error in the controller-manager logs when running the E2E tests locally.

I think because we're setting a 10 second limit for all the operations performed in the Reconcile function.

I can increase it to some arbitrary value, or better still would be to have much smaller timeouts on the individual API operations, I think.

2019-11-08T14:54:17.240Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "etcdcluster", "request": "default/my-cluster", "error": "unable to create service: Post https://10.96.0.1:443/api/v1/namespaces/default/services: context deadline exceeded", "errorCauses": [{"error": "unable to create service: Post https://10.96.0.1:443/api/v1/namespaces/default/services: context deadline exceeded"}]}
github.com/go-logr/zapr.(*zapLogger).Error
	/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:218
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:192
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:171
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88

EtcdCluster resource controller

Create a controller for the EtcdCluster resource. This controller should behave as described in the design documentation, and allow a user to create a running etcd cluster using a single resource. For example:

apiVersion: etcd.improbable.io/v1alpha1
kind: EtcdCluster
metadata:
  name: my-cluster
  namespace: default
spec:
  size: 3

Removing a leader member during scale-down causes cluster downtime

In #93 remove the etcd member whose name contains the largest ordinal, but this member may well be the cluster leader.
This forces a leader election which prevents write requests https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#why-does-etcd-lose-its-leader-from-disk-latency-spikes

This is compounded if we are removing multiple members and the next new leader also happens to have the next largest ordinal.

Instead, if we removed only non-leader members, we might avoid these disruptions.

docs: README does not match examples.

Versions of relevant software used
master / e22d7a7

What happened
README.md references examples/operator.yaml but there's no such file.

What you expected to happen
This file would be really useful as I tried to use this project about a week ago and really struggled to get the RBAC right for the operator - it needs a lot of permissions and it took a long time to find them all.

I think the kubebuilder machinery can produce the relevant YAMLs, but I'm not familiar with kube-builder yet so I don't know how to do that - either documentation for that or an easy pre-rendered file would be great :)

How to reproduce it (as minimally and precisely as possible):

Full logs to relevant components

Anything else we need to know

Writing end-to-end test framework

In order to make it easy to write end-to-end/conformance tests, we should define a simple framework/helper to make it easy to define tests.

I'm uncertain what the norm is across other Improbable projects, but kubernetes itself and kubebuilder use Gingko and Gomega for this sort of thing.

I think a lot of the operator's verifications can be encoded into unit/integration tests, but we should definitely have a suite of e2e checks that go through some common 'happy paths'.

Support for enabling auto-compaction in etcd

We should be able to pass config variables down to etcd, eg auto-compaction-retention

improbable-eng / etcd-cluster-operator Goto Github PK

etcd-cluster-operator's People

Contributors

Stargazers

Watchers

Forkers

etcd-cluster-operator's Issues

Design

Other options

Instrumenting all backups

Not using a metric

Features

Observability

Deployment workflow

Misc

Recommend Projects

Recommend Topics

Recommend Org