giantswarm / operatorkit

An opinionated Go framework for developing Kubernetes operators

Home Page: https://godoc.org/github.com/giantswarm/operatorkit

License: Apache License 2.0

Go 97.06% Makefile 2.94%
library framework go kubernetes operator

operatorkit's Introduction

operatorkit

⚠️ Important: This library has been deprecated in favor of kubebuilder. Please use kubebuilder if you want to create a new operator.

Package operatorkit implements an opinionated framework for developing Kubernetes operators. It emerged as we extracted common functionality from a number of the operators we developed at Giant Swarm. The goal of this library is to provide a common structure for operator projects and to encapsulate best practices we learned while running operators in production.

Features

  • CRD primitives to reliably create, watch and delete custom resources, as well as any Kubernetes runtime object.
  • Managing finalizers on reconciled objects, making sure the code is executed at least once for each create/delete/update event.
  • Guarantees to perform at least one successful deletion event reconciliation to avoid unnecessary, possibly expensive interactions with third party systems.
  • Resource wrapping, which makes it possible to compose resources like middleware.
  • Control Flow Primitives that allow cancellation and repetition of resource implementations.
  • Independent packages. It is possible to use only certain parts of the library without being bound to all primitives it provides.
  • Ability to change behaviour that is often specific to an organization like logging and error handling.
  • Pause Reconciliation using pausing annotations on runtime objects to stop and resume reconciliation on demand.

Roadmap

For future planned features and breaking changes see the roadmap.

Docs

Integration Tests

You can simply create a kind cluster to run the integration tests.

kind create cluster

The tests need to figure out how to connect to the Kubernetes cluster. Therefore we need to set an environment variable pointing to your local kube config.

export E2E_KUBECONFIG=~/.kube/config

Now you can easily run the integration tests.

go test -v -tags=k8srequired ./integration/test/<test-name>

Once you are done testing, you may want to delete your local test cluster again.

kind delete cluster

Projects using operatorkit

Giant Swarm operators using operatorkit.

Example

For a detailed, state-of-the-art implementation, please see giantswarm/aws-operator.

Contributing & Reporting Bugs

See CONTRIBUTING for details on submitting patches, the contribution workflow as well as reporting bugs.

License

operatorkit is under the Apache 2.0 license. See the LICENSE file for details.

operatorkit's People

Contributors

anvddriesch, architectbot, asymmetric, averagemarcus, dependabot[bot], fiunchinho, github-actions[bot], headcr4sh, josephsalisbury, kopiczko, ljakimczuk, marcelmue, marians, njuettner, oponder, renovate[bot], rossf7, stone-z, taylorbot, tfussell, theobrigitte, tomahawk28, tuommaki, xh3b4sd, yulianedyalkova


operatorkit's Issues

Support for reconciliation loops

We have a general idea of operators that they should attempt to move actual state to desired state.

For example, a pingdom operator:

  • watches for a pingdom resource in the cluster
  • if the resource is created, creates an appropriate pingdom check
  • if the resource is updated, updates the appropriate pingdom check
  • if the resource is deleted, deletes the appropriate pingdom check

This approach (basically, using informers) works well for basic operators, in that they perform one action per informer event.

More complicated operators may need to perform multiple actions per event.
For example, a cluster operator:

  • when a resource is created, creates a set of VMs and an ELB.
    If the VMs are created successfully but the ELB is not, the operator should attempt to create the ELB in the future. This doesn't really work with the informer-based approach (maybe with relisting? but you need to ensure idempotency throughout).

There is a slightly higher-level idea of reconciliation. My rough idea is that an operator should define some mapping between a resource and its underlying resources, e.g.:
cluster == 3 vms, elb
The framework then ensures that these resources are present (or, in the case of resource deletion, not present).

This would allow us to move the overall reconciliation logic / resource management to one location, and make operators much simpler, essentially being some function from resource to underlying resources.
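
As a rough illustration of the idea in this issue, the sketch below compares a desired set of underlying resources with the set that currently exists and returns what still has to be created or deleted. The `Diff` function and the resource names are assumptions made up for this example; they are not part of operatorkit.

```go
package main

import (
	"fmt"
	"sort"
)

// Diff compares the desired and current sets of underlying resource names
// and returns the names to create and the names to delete. It is a
// hypothetical helper for this sketch only.
func Diff(desired, current []string) (toCreate, toDelete []string) {
	have := map[string]bool{}
	for _, n := range current {
		have[n] = true
	}
	want := map[string]bool{}
	for _, n := range desired {
		if want[n] {
			continue // ignore duplicates in the desired list
		}
		want[n] = true
		if !have[n] {
			toCreate = append(toCreate, n)
		}
	}
	for _, n := range current {
		if !want[n] {
			toDelete = append(toDelete, n)
		}
	}
	sort.Strings(toCreate)
	sort.Strings(toDelete)
	return toCreate, toDelete
}

func main() {
	// cluster == 3 vms, elb
	desired := []string{"vm-0", "vm-1", "vm-2", "elb"}

	// The VMs were created, but creating the ELB failed earlier.
	current := []string{"vm-0", "vm-1", "vm-2"}
	create, remove := Diff(desired, current)
	fmt.Println(create, remove) // [elb] []

	// When the cluster resource is deleted, nothing is desired any more.
	create, remove = Diff(nil, current)
	fmt.Println(create, remove) // [] [vm-0 vm-1 vm-2]
}
```

Running such a comparison on every resync, rather than reacting to a single event, is what lets the missing ELB be picked up on a later pass.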

Documentation hard to follow when instantiating k8srestconfig.Config

Hey everyone!

I've started looking at operatorkit as I've forked giantswarm/net-exporter to write my own exporter, which uses operatorkit to connect to a cluster.

The example in k8srestconfig#Config was sadly not straightforward enough to follow for someone without in-depth knowledge of operatorkit.

Would it be possible to have an example with the actual raw values in the code/doc? Or perhaps a clearer explanation of how to obtain those values for people who are getting started with operatorkit?

For instance, I think kubernetes/client-go has a very good example: main.go

My current use: Connecting to a cluster on GCP from my local network.

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

circleci
.circleci/config.yml
  • architect 4.35.6
gomod
go.mod
  • go 1.21
  • github.com/getsentry/sentry-go v0.25.0
  • github.com/giantswarm/backoff v1.0.0
  • github.com/giantswarm/exporterkit v1.1.0
  • github.com/giantswarm/k8sclient/v7 v7.2.0
  • github.com/giantswarm/microerror v0.4.1
  • github.com/giantswarm/micrologger v1.1.1
  • github.com/giantswarm/to v0.4.0
  • github.com/patrickmn/go-cache v2.1.0+incompatible
  • github.com/prometheus/client_golang v1.17.0
  • github.com/prometheus/client_model v0.5.0
  • github.com/stretchr/testify v1.8.4
  • k8s.io/api v0.28.3
  • k8s.io/apiextensions-apiserver v0.28.3
  • k8s.io/apimachinery v0.28.3
  • k8s.io/client-go v0.28.3
  • sigs.k8s.io/controller-runtime v0.16.3
  • sigs.k8s.io/yaml v1.4.0
  • github.com/gin-gonic/gin v1.9.1
hack/tools/controller-gen/go.mod
  • go 1.21
  • sigs.k8s.io/controller-tools v0.13.0
kubernetes
config/crd/testing.giantswarm.io_examples.yaml
  • CustomResourceDefinition apiextensions.k8s.io/v1

Discuss: Using reconciler framework in AWS and Azure operators

The aws-operator currently has its own resources implementation. Adapting this to use the operatorkit framework would standardise our operators.

The main issues I see are:

Sequence of resources

For AWS we need to create 15 resources in sequence. Either the framework needs to handle the sequence or each resource needs to check all its dependencies exist before creation. The Retry logic can be used to retry until the dependencies exist. But checking for all dependencies could be cumbersome.
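
One way to picture the second option (each resource checks its own dependencies and is retried until they exist) is the sketch below. The `Resource` type, the `Pass` function and the resource names are hypothetical and only illustrate the retry-until-ready idea; they do not reflect how aws-operator or operatorkit implement it.

```go
package main

import "fmt"

// Resource is a hypothetical description of something to create, together
// with the names of the resources that must exist first.
type Resource struct {
	Name      string
	DependsOn []string
}

// ready reports whether all dependencies of r already exist.
func ready(r Resource, existing map[string]bool) bool {
	for _, d := range r.DependsOn {
		if !existing[d] {
			return false
		}
	}
	return true
}

// Pass walks the resources once, creating every missing resource whose
// dependencies exist, and returns how many were created. Resources that
// are not ready are skipped and picked up by a later pass.
func Pass(resources []Resource, existing map[string]bool) int {
	created := 0
	for _, r := range resources {
		if existing[r.Name] || !ready(r, existing) {
			continue
		}
		existing[r.Name] = true // stand-in for the real API call
		created++
	}
	return created
}

func main() {
	// Deliberately declared in the "wrong" order.
	resources := []Resource{
		{Name: "route", DependsOn: []string{"internet-gateway"}},
		{Name: "internet-gateway", DependsOn: []string{"vpc"}},
		{Name: "vpc"},
	}
	existing := map[string]bool{}
	for pass := 1; ; pass++ {
		n := Pass(resources, existing)
		if n == 0 {
			break
		}
		fmt.Printf("pass %d created %d resource(s)\n", pass, n)
	}
}
```

With this shape the declaration order stops mattering, at the cost of extra passes and of every resource having to list its dependencies, which is the cumbersome part the issue mentions.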

Handling IDs generated by AWS

We try to identify resources with predictable names including the unique cluster ID. But several resources like VPC need to be referenced by AWS assigned IDs.

These IDs are needed when creating later resources, e.g. the Internet Gateway. To handle this, a resource would need to check that the dependency exists and get its ID. We usually tag AWS resources with the cluster ID, so this is possible.

Higher level tools

On AWS we could reduce the number of resources needed by using CloudFormation stacks. The resource would reconcile the state from the cluster custom object to the CF stack.

On Azure this doesn't look possible because Resource Manager templates only support provisioning, not updates. So we'll need to create granular resources with the Azure Go SDK, as we currently do in aws-operator.

Adding Example

Currently there is no example. It would be great if there were a small one that adds a TPR and then waits for the add/update/delete events.

Remove certificatetpr dependency

We should move the certificate secrets watching functionality out of operatorkit. As a first iteration we move it to the certificatetpr project.

Reporting a vulnerability

Hello!

I hope you are doing well!

We are a security research team. Our tool automatically detected a vulnerability in this repository. We want to disclose it responsibly. GitHub has a feature called Private vulnerability reporting, which enables security research to privately disclose a vulnerability. Unfortunately, it is not enabled for this repository.

Can you enable it, so that we can report it?

Thanks in advance!

PS: you can read about how to enable private vulnerability reporting here: https://docs.github.com/en/code-security/security-advisories/repository-security-advisories/configuring-private-vulnerability-reporting-for-a-repository

Reconnect issue

In kvm-operator I noticed that after the following error, kvm-operator did not try to re-establish the connection. I was forced to recreate the pod in order for kvm-operator to proceed with my cluster (which was created before the restart).

{"action":"start","caller":"github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/framework/resource/logresource/resource.go:152","component":"operatorkit","function":"ApplyUpdatePatch","time":"2017-11-09 08:07:24.532","underlyingResource":"service"}
{"action":"end","caller":"github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/framework/resource/logresource/resource.go:160","component":"operatorkit","function":"ApplyUpdatePatch","time":"2017-11-09 08:07:24.532","underlyingResource":"service"}
{"action":"end","caller":"github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/framework/framework.go:195","component":"operatorkit","function":"ProcessUpdate","time":"2017-11-09 08:07:24.532"}
{"caller":"github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/framework/framework.go:334","error":"[{/go/src/github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/framework/framework.go:321: } {/go/src/github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/informer/informer.go:205: } {/go/src/github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/informer/informer.go:377: } {/go/src/github.com/giantswarm/kvm-operator/vendor/github.com/giantswarm/operatorkit/informer/watcher_factory.go:13: } {Get https://172.31.0.1:443/apis/cluster.giantswarm.io/v1/watch/kvms: read tcp 172.20.190.27:40842-\u003e172.31.0.1:443: read: connection reset by peer}]","time":"2017-11-09 08:09:28.192"}

Discuss: Handling updates using the reconciler framework

This issue is to get consensus on how we handle updates using the reconciler framework.

Issues

Idempotency

  • Operators should be idempotent. So should the addFunc and updateFunc implementations be the same?
  • Another benefit is simplicity. Only the addFunc and deleteFunc logic needs to be implemented. The update logic is more complex and needs to be implemented in the framework and the operators.

Ordering of updates

  • Due to the resync period and outside factors like the operator being restarted, we cannot guarantee the ordering of add and update events.
  • Since the ordering is not guaranteed, can we rely on using the updateFunc?
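
One possible reading of the idempotency question is sketched below: both handlers delegate to a single `ensure` function that only looks at the latest object, so it does not matter which event arrives first. The `Cluster` type, the `state` map and the handler signatures are assumptions for this illustration and are not taken from the framework.

```go
package main

import "fmt"

// Cluster is a hypothetical custom object used only in this sketch.
type Cluster struct {
	Name    string
	Workers int
}

// state stands in for the external system the operator manages.
type state map[string]int

// ensure drives the external state towards the object's spec. It can be
// called any number of times and reports whether it changed anything.
func ensure(s state, c Cluster) (changed bool) {
	if cur, ok := s[c.Name]; ok && cur == c.Workers {
		return false
	}
	s[c.Name] = c.Workers
	return true
}

// addFunc and updateFunc share the same implementation; the old object
// passed to updateFunc is deliberately ignored.
func addFunc(s state, c Cluster) bool               { return ensure(s, c) }
func updateFunc(s state, _ Cluster, c Cluster) bool { return ensure(s, c) }

func main() {
	s := state{}
	c := Cluster{Name: "demo", Workers: 3}

	// An update event arriving before the add event still converges.
	fmt.Println(updateFunc(s, Cluster{}, c)) // true
	fmt.Println(addFunc(s, c))               // false
	fmt.Println(s["demo"])                   // 3
}
```

The trade-off is that `ensure` has to inspect the current state on every call instead of trusting the event type, which is exactly what makes the event ordering irrelevant.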

Monitor kubernetes events latency

We had an issue where an event for an operator was seriously delayed (> 20 minutes).

It would be very useful for system stability to expose rates of events being listwatched by operators, so we can alert on it.

E.g., a certain operator is expected to get x events per second. If this drops significantly, it implies system issues.

Dependency problem with fixed resource order for creation and deletion

During the aws-operator migration to CloudFormation based templates we are finding some dependency issues because of the order in which the resources are being managed.

We have a legacy resource that creates the AWS components imperatively (the current production version of the operator): the basic components are created first and the components that depend on those are created next. Now we have defined a CloudFormation resource, and we are porting components gradually from the legacy resource to the CloudFormation resource, starting from the "most dependent" (least basic) ones. In the resources slice passed to the router, the CloudFormation resource is declared after the legacy resource.

This worked well until we tried to port a resource that has a dependency on the AWS side. For instance, a NAT gateway depends on a subnet. If we move the NAT gateway from legacy to CF, things work well on creation (the order in which resources are managed is first legacy, then CF) but not on deletion: when we try to delete the subnet in legacy we can't, because the gateway is there until CF is processed. This is just an example; the same happens with many other related components. If we started by porting the most basic components, we would need to reverse the order of the resources slice (first CloudFormation, then legacy), and again, this would work well on creation but not on deletion (we could not, for instance, delete the CF stack including a VPC if there are still instances attached to it).

So, would it be possible to have a different order in which resources are managed for creation and for deletion? IMO this could solve this specific problem (only two resources related in a very specific, kind of linear way), not sure if it would solve more complex relationships between resources, let me know WDYT.
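
The simplest form of this proposal, using the declared order for creation and the reverse order for deletion, could look like the sketch below. `Event`, `Ordered` and the resource names are hypothetical and only illustrate the request; as noted above, a plain reversal only covers linear dependencies.

```go
package main

import "fmt"

// Event distinguishes the two reconciliation directions in this sketch.
type Event int

const (
	Create Event = iota
	Delete
)

// Ordered returns the resources in declared order for creation and in
// reverse order for deletion. The input slice is not modified.
func Ordered(resources []string, e Event) []string {
	out := make([]string, len(resources))
	copy(out, resources)
	if e == Delete {
		for i, j := 0, len(out)-1; i < j; i, j = i+1, j-1 {
			out[i], out[j] = out[j], out[i]
		}
	}
	return out
}

func main() {
	resources := []string{"legacy", "cloudformation"}
	fmt.Println(Ordered(resources, Create)) // [legacy cloudformation]
	fmt.Println(Ordered(resources, Delete)) // [cloudformation legacy]
}
```

With the NAT gateway in the CloudFormation resource and the subnet in the legacy resource, deletion would then process the stack first and the subnet afterwards.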
