update-service-operator

This operator is developed using the Operator SDK, version 1.9.0. Installation docs are here.

Run locally

To run the operator built from the code using a kubeconfig with cluster-admin permissions:

export RELATED_IMAGE_OPERAND="quay.io/app-sre/cincinnati:2873c6b"
export OPERATOR_NAME=updateservice-operator
export POD_NAMESPACE=openshift-update-service
### Ensure the above namespace exists on the cluster and is the current active project
oc create namespace --dry-run=client -o yaml "${POD_NAMESPACE}" | oc apply -f -
oc project "${POD_NAMESPACE}"
KUBECONFIG=path/to/kubeconfig make run

Using an init container to load graph data

The UpdateService graph data is loaded from an init container. Before deploying the update-service-operator, you will need to build and push an init container containing the graph data.
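
As a sketch of what such an init container can look like, a Dockerfile along these lines fetches the published graph data at build time and copies it into a shared volume at runtime. The base image, data source URL, and target paths here are assumptions for illustration, not necessarily this project's canonical values:

```Dockerfile
# Sketch of a graph-data init container; base image, data URL, and the
# /var/lib/cincinnati/graph-data target path are assumptions.
FROM registry.access.redhat.com/ubi9/ubi:latest

# Fetch the published OpenShift update graph data at build time.
RUN curl -L -o cincinnati-graph-data.tar.gz https://api.openshift.com/api/upgrades_info/graph-data

RUN mkdir -p /var/lib/cincinnati-graph-data && \
    tar xvzf cincinnati-graph-data.tar.gz -C /var/lib/cincinnati-graph-data/ --no-same-owner

# At runtime, copy the data into the volume shared with the graph-builder container.
CMD ["/bin/bash", "-c", "exec cp -rp /var/lib/cincinnati-graph-data/* /var/lib/cincinnati/graph-data"]
```

Build and push the image with podman the same way as the operator image, then reference it from the UpdateService spec's graphDataImage field.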

Build operator image

podman build -f ./Dockerfile --platform=linux/amd64 -t your-registry/your-repo/your-update-service-operator:tag .
podman push your-registry/your-repo/your-update-service-operator:tag

Deploy operator

make deploy

By default, the operator is deployed using the default operator image controller:latest. To override it with your own image, set

export RELATED_IMAGE_OPERATOR="your-registry/your-repo/your-update-service-operator-image:tag"

Run functional tests

make func-test

To run the functional test cases locally, you must set the environment variables shown below, along with the optional RELATED_IMAGE_OPERAND and RELATED_IMAGE_OPERATOR.

export KUBECONFIG="path-for-kubeconfig-file"
export GRAPH_DATA_IMAGE="your-registry/your-repo/your-init-container:tag"

Run unit tests

make unit-test

Generating OLM manifests

Here are the steps to generate the operator-framework manifests in the bundle format:

  • Set the OPERATOR_VERSION value in the shell
  • Set the IMG value pointing to the OSUS operator image which should be part of the operator bundle.
  • Run make bundle.

Example:

OPERATOR_VERSION=4.9.0
IMG=registry.com/cincinnati-openshift-update-service-operator:v4.6.0
make bundle

Test a PR with a cluster-bot cluster

Follow ci-docs. E.g., issuing the following message to "Cluster Bot" in Slack

launch 4.16,openshift/cincinnati-operator#185 aws

will launch a 4.16 cluster on AWS and install the operator built from PR #185.

cincinnati-operator's Issues

Document the cincinnati operator API

Document configuration options the operator can accept:

  • Binary parameters like registry location, repository, and other configuration
  • CA cert injection
  • etc.

Support cluster-wide proxy for external registries

If a user inserts their external registry CA cert as an additionalTrustBundle in the install-config and configures a proxy, their external registry CA cert will be delivered to the cincinnati pods. This means two things:
1) Documentation change - A user should not add their CA Cert to the imageConfig API and CertConfigMapKey should be empty. No longer needed after: #26
2) Code change - The cincinnati-operator needs to watch the configmap created by the cluster-network-operator and restart cincinnati pods if there are changes (cert rotation).

Additional context around cluster-wide proxies: https://docs.openshift.com/container-platform/4.3/networking/configuring-a-custom-pki.html#installation-configure-proxy_configuring-a-custom-pki

An updated UpdateService resource does not take effect

When applying an updated UpdateService resource which references a new graph image digest in .spec.graphDataImage, the associated deployment is not restarted.

Without manually restarting the deployment, the pods will continue to reference an incorrect graph image digest in initContainers.
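
Until the operator handles this itself, a possible manual workaround is to force a rollout so new pods pick up the updated digest. The deployment name below is a placeholder; substitute the one created for your UpdateService instance:

```shell
oc -n openshift-update-service rollout restart deployment/<your-updateservice-deployment>
```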

Bundle CA Certs and Mount into Cincinnati

After #26 merges, the user will be required to use the ConfigMap key 'cincinnati-registry' in the openshift-config ConfigMap in order to have their external registry CA Cert injected into the Cincinnati pod. One way to remove this requirement would be to bundle together all the CA Certs from all the keys in the openshift-config ConfigMap and mount them into the Cincinnati pod. The consequence of bundling is that the admin will have to place trust in all the CA Certs that exist in the openshift-config ConfigMap. Whether the API enforces trust or the API expects the admin to enforce trust, if trust in the CA store is a reasonable expectation from Cincinnati, then this would be a good feature to add.

No matches for kind PodDisruptionBudget in OKD 4.12

When running openshift-update-service v5.0.0 (cincinnati-operator) on the latest OKD 4.12 release, installed through the "registry.redhat.io/redhat/redhat-operator-index:v4.12" CatalogSource, I get the following error:

ERROR	cmd	Manager exited non-zero	{"error": "no matches for kind \"PodDisruptionBudget\" in version \"policy/v1beta1\""}

https://catalog.redhat.com/software/operators/detail/5f0f35842991b4207fcdb202 - Here it is stated that 4.12 is supported; are there API differences between OCP 4.12 & OKD 4.12 that could cause this?

Reconciling deployment changes creates a second replicaset

When running the cincinnati-operator and operand in cluster everything starts up just fine. However, if you edit a config option on the cincinnati API, the operator will create a second cincinnati replicaset.

Steps to reproduce:

  • In an OpenShift cluster, start the cincinnati-operator and create an instance of cincinnati
  • Change the 'registry' field

Something I noticed is that many reconcile events occur in quick succession:

{"level":"info","ts":1587079038.504181,"logger":"controller_cincinnati","msg":"Reconciling Cincinnati","Request.Namespace":"openshift-cincinnati","Request.Name":"example-cincinnati"}
{"level":"info","ts":1587079038.5199764,"logger":"controller_cincinnati","msg":"Reconciling Cincinnati","Request.Namespace":"openshift-cincinnati","Request.Name":"example-cincinnati"}
{"level":"info","ts":1587079038.5387392,"logger":"controller_cincinnati","msg":"Reconciling Cincinnati","Request.Namespace":"openshift-cincinnati","Request.Name":"example-cincinnati"}
{"level":"info","ts":1587079038.5540364,"logger":"controller_cincinnati","msg":"Reconciling Cincinnati","Request.Namespace":"openshift-cincinnati","Request.Name":"example-cincinnati"}
{"level":"info","ts":1587079038.5716403,"logger":"controller_cincinnati","msg":"Reconciling Cincinnati","Request.Namespace":"openshift-cincinnati","Request.Name":"example-cincinnati"}

And an error reconciling the deployment status always shows up:

{"level":"error","ts":1587079009.7270434,"logger":"controller_cincinnati","msg":"Failed to update Status","Request.Namespace":"openshift-cincinnati","Request.Name":"example-cincinnati","error":"Operation cannot be fulfilled on cincinnatis.cincinnati.openshift.io \"example-cincinnati\": the object has been modified; please apply your changes to the latest version and try again","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/github.com/go-logr/zapr/zapr.go:128\ngithub.com/openshift/cincinnati-operator/pkg/controller/cincinnati.(*ReconcileCincinnati).Reconcile\n\t/go/src/github.com/openshift/cincinnati-operator/pkg/controller/cincinnati/cincinnati_controller.go:169\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/openshift/cincinnati-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

Add Watches on secondary resources

Copying this comment over to an issue.

I think this needs two more Watches, both with a new handler.

  1. Watch the Image resource. The generation change predicate would be a good idea to use here.
  2. Watch ConfigMap resources in the openshift-config namespace. Might have to use a predicate to limit it to that namespace.

You'll need a new handler of the EnqueueRequestsFromMapFunc type. I think the logic will go like this:

If object is Image:
  return Requests for all Cincinnati instances
If object is ConfigMap:
  If ConfigMap is the one referenced from the Image:
    return Requests for all Cincinnati instances

That'll make sure that if either changes, it'll enqueue reconciliation for all Cincinnati resources, and then they can each react accordingly based on the code you've already written.

Here is an example of a simple mapper and the associated Watch.

remove git fields from CRD

The Cincinnati team wants to move away from having cincinnati obtain graph data directly from git, and instead distribute graph data as a container image. As such they prefer to remove git from the CRD and have the operator only focus on loading data via the init container.

For this task:

  • remove fields related to git from the CRD
  • remove those parts of the config template
  • ensure that it's clear and easy to build and use an init container

Create a condition to track the status of an external CA cert

There are 3 states the cincinnati-operator can be in when looking for an external CA Cert:

  1. ImageConfig.Spec.AdditionalTrustedCA.Name doesn't exist
  2. ImageConfig.Spec.AdditionalTrustedCA.Name exists, but the key cincinnati-registry is not found in the ConfigMap
  3. ImageConfig.Spec.AdditionalTrustedCA.Name exists and the key cincinnati-registry is found in the ConfigMap

We can use conditions on the operator to reflect the status.

Investigate scoping the cincinnati-operator down to multiple-namespaces

After #11 merges, the cincinnati operator will watch all namespaces when it really only needs to watch two: openshift-config and openshift-cincinnati. Since operators can watch multiple namespaces instead of all namespaces, let's see if we can scope down the cincinnati operator.

  • Investigate using a multi-namespace watch as a long term solution
  • Implement a solution using caches from the openshift-config and openshift-cincinnati namespaces

Update service unable to resolve DNS or access network outside of cluster

After the latest update (5.0.1), OSUS has been unable to resolve DNS or access the network outside of the cluster. I see the following error when trying to use quay (or any other registry, local or in the cloud) as the releases repo.

apiVersion: updateservice.operator.openshift.io/v1
kind: UpdateService
metadata:
  name: openshift-update-service
  namespace: openshift-update-service
spec:
  graphDataImage: quay.io/my-repo/my-graph-data-image:v0.0.1
  releases: quay.io/openshift/okd
  replicas: 1
DEBUG graph_builder::graph] graph update triggered
TRACE cincinnati::plugins] Running next plugin 'release-scrape-dockerv2'
ERROR graph_builder::graph] failed to fetch all release metadata from quay.io/openshift/okd
ERROR graph_builder::graph] http transport error: error sending request for url (https://quay.io/v2/): error trying to connect: dns error: failed to lookup address information: Name or service not known
ERROR graph_builder::graph] error sending request for url (https://quay.io/v2/): error trying to connect: dns error: failed to lookup address information: Name or service not known
ERROR graph_builder::graph] error trying to connect: dns error: failed to lookup address information: Name or service not known
ERROR graph_builder::graph] dns error: failed to lookup address information: Name or service not known
ERROR graph_builder::graph] failed to lookup address information: Name or service not known

From a rsh:

sh-4.4$ getent hosts quay.io

Returns nothing from the OSUS operator or instance pods, while it returns a valid IP from other pods.

I tried to manipulate the Operator Deployment with hostNetwork: true (for debugging), but it immediately gets reverted because of the Subscription or SCC. Other pods in other namespaces are able to access the network and resolve DNS. The working pods use the restricted-v2 SCC just like the OSUS pods.

I am using OKD 4.12 / Kubernetes 1.25.4, and there are no Firewall rules, NetworkPolicies or other manifests that would be blocking the outgoing traffic.

This previously worked in version 5.0.0.

restart Pods on configmap changes

If one of the ConfigMaps gets changed, we need to ensure those changes are consumed by graph-builder and policy-engine. They may or may not notice changes to the configmap that's mounted on the filesystem. But the other configmap is consumed with the envFrom feature, which definitely requires a pod restart.
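
As an illustration (the container and ConfigMap names here are hypothetical), a ConfigMap consumed like this is materialized into environment variables once, at container startup, so changes to it cannot take effect without a restart:

```yaml
# Hypothetical deployment fragment: envFrom reads the ConfigMap only when the
# container starts, so updates require restarting the pod.
containers:
  - name: graph-builder
    envFrom:
      - configMapRef:
          name: example-updateservice-env
```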
