catalogd's People

Contributors

anik120, dependabot[bot], dtfranz, everettraven, grokspawn, joelanford, justinkuli, kevinrizza, m1kola, michaelryanpeter, ncdc, oceanc80, rashmigottipati, varshaprasad96


catalogd's Issues

Ensure HTTP endpoints are accessible outside the cluster

Following #113, we should ensure that the necessary HTTP endpoints for getting catalog contents are accessible from outside the cluster. A user shouldn't have to perform any magic to curl the endpoint containing a Catalog's contents.

In essence, a user should be able to fetch the catalog content via the command line by:

  • Running kubectl get catalog {catalogName} -o yaml to fetch Catalog details
  • Running curl {Catalog.Status.contentURL} to fetch all the contents of the Catalog

Disable etcd and apiserver

For now, the custom apiserver and its corresponding etcd instance are not being used. To limit overhead, we should temporarily disable them until we update catalogd to properly use the custom apiserver.

We could simply comment out the following sections with some TODO comments to uncomment them when we are back to using an apiserver:

  • - ../apiserver
    - ../etcd
  • catalogd/Makefile

    Lines 81 to 83 in bc7778c

    .PHONY: build-server
    build-server: fmt vet ## Build api-server binary.
    	CGO_ENABLED=0 GOOS=linux go build -tags $(GO_BUILD_TAGS) $(VERSION_FLAGS) -o bin/apiserver cmd/apiserver/main.go
  • catalogd/Makefile

    Lines 97 to 103 in bc7778c

    .PHONY: docker-build-server
    docker-build-server: build-server test ## Build docker image with the apiserver.
    	docker build -f apiserver.Dockerfile -t ${SERVER_IMG}:${IMG_TAG} .

    .PHONY: docker-push-server
    docker-push-server: ## Push docker image with the apiserver.
    	docker push ${SERVER_IMG}
  • $(KIND) load docker-image $(SERVER_IMG):${IMG_TAG} --name $(KIND_CLUSTER_NAME)
  • cd config/apiserver && $(KUSTOMIZE) edit set image apiserver=${SERVER_IMG}:${IMG_TAG}
  • kubectl wait --for=condition=Available --namespace=$(CATALOGD_NAMESPACE) deployment/catalogd-apiserver --timeout=60s
  • catalogd/.goreleaser.yml

    Lines 24 to 36 in bc7778c

    - id: catalogd-server
      main: ./cmd/apiserver/
      binary: bin/apiserver
      tags: $GO_BUILD_TAGS
      goos:
        - linux
      goarch:
        - amd64
        - arm64
        - ppc64le
        - s390x
      ldflags:
        - -X main.Version={{ .Version }}
  • catalogd/.goreleaser.yml

    Lines 58 to 77 in bc7778c

    - image_templates:
        - "{{ .Env.APISERVER_IMAGE_REPO }}:{{ .Env.IMAGE_TAG }}-amd64"
      dockerfile: apiserver.Dockerfile
      goos: linux
      goarch: amd64
    - image_templates:
        - "{{ .Env.APISERVER_IMAGE_REPO }}:{{ .Env.IMAGE_TAG }}-arm64"
      dockerfile: apiserver.Dockerfile
      goos: linux
      goarch: arm64
    - image_templates:
        - "{{ .Env.APISERVER_IMAGE_REPO }}:{{ .Env.IMAGE_TAG }}-ppc64le"
      dockerfile: apiserver.Dockerfile
      goos: linux
      goarch: ppc64le
    - image_templates:
        - "{{ .Env.APISERVER_IMAGE_REPO }}:{{ .Env.IMAGE_TAG }}-s390x"
      dockerfile: apiserver.Dockerfile
      goos: linux
      goarch: s390x

(I've tried to capture all the places but there could be more)

Flesh out CONTRIBUTING.md file

To help new contributors get started, we should include a CONTRIBUTING.md file that outlines the process for contributing to catalogd.

catalogd doesn't support multiple catalogs

When more than one catalog exists, "oc get packages" doesn't show all packages.

  1. create catalog test-catalog-xzha
zhaoxia@xzha-mac catalogd % cat catalogd-xzha.yaml 
apiVersion: catalogd.operatorframework.io/v1alpha1
kind: Catalog
metadata:
  name: test-catalog-xzha
spec:
  source:
    type: image
    image:
      ref: quay.io/olmqe/nginxolm-operator-index:catalogd-1
  2. get packages
zhaoxia@xzha-mac catalogd % oc get packages
NAME                               AGE
test-catalog-xzha-nginx-operator   19s
  3. create catalog community-operator
zhaoxia@xzha-mac catalogd % cat catalogd-community.yaml 
apiVersion: catalogd.operatorframework.io/v1alpha1
kind: Catalog
metadata:
  name: community-operator
spec:
  source:
    type: image
    image:
      ref: registry.redhat.io/redhat/community-operator-index:v4.14 
  4. get packages
zhaoxia@xzha-mac catalogd % oc get packages
NAME                                                           AGE
community-operator-3scale-community-operator                   3s
community-operator-ack-acm-controller                          3s
community-operator-ack-apigatewayv2-controller                 3s
community-operator-ack-applicationautoscaling-controller       3s
...

zhaoxia@xzha-mac catalogd % oc get packages| grep test-catalog-xzha
zhaoxia@xzha-mac catalogd % 

cannot get packages from test-catalog-xzha

  5. If I delete the pods of all catalogs
zhaoxia@xzha-mac catalogd % oc delete pod test-catalog-xzha  
pod "test-catalog-xzha" deleted
zhaoxia@xzha-mac catalogd % oc delete pod community-operator 
pod "community-operator" deleted

After the pods restart, the output of "oc get packages" is different
zhaoxia@xzha-mac catalogd % oc get pod
NAME                                           READY   STATUS      RESTARTS   AGE
catalogd-controller-manager-5c7768b8b4-vc7bx   2/2     Running     0          5m30s
community-operator                             0/1     Completed   0          26s
test-catalog-xzha                              0/1     Completed   0          26s
zhaoxia@xzha-mac catalogd % oc get packages
NAME                               AGE
test-catalog-xzha-nginx-operator   17s

The example catalogd unpack failed --> "status.phase: Failing"

Installed catalogd according to the docs/demo: https://github.com/operator-framework/catalogd/blob/main/docs/demo.gif
The catalog operatorhubio failed to unpack at first; later the bundle content could be unpacked, but the status still reports Failing.
The steps:
1.$git clone https://github.com/operator-framework/catalogd.git
2.$cd catalogd/
3.

$make kind-cluster
......
Set kubectl context to "kind-catalogd"
You can now use your cluster with:

kubectl cluster-info --context kind-catalogd

Thanks for using kind! 😊
/home/jfan/projects/src/github.com/catalogd/hack/tools/bin/kind export kubeconfig --name catalogd
Set kubectl context to "kind-catalogd"
$kubectl cluster-info --context kind-catalogd
Kubernetes control plane is running at https://127.0.0.1:46095
CoreDNS is running at https://127.0.0.1:46095/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
$make install
......
deployment.apps/catalogd-controller-manager created
kubectl wait --for=condition=Available --namespace=catalogd-system deployment/catalogd-controller-manager --timeout=60s
deployment.apps/catalogd-controller-manager condition met
$kubectl get crds -A
NAME                                           CREATED AT
bundlemetadata.catalogd.operatorframework.io   2023-05-24T07:25:58Z
catalogs.catalogd.operatorframework.io         2023-05-24T07:25:58Z
packages.catalogd.operatorframework.io         2023-05-24T07:25:58Z
$kubectl apply -f config/samples/core_v1beta1_catalog.yaml
catalog.catalogd.operatorframework.io/operatorhubio created
$kubectl get catalog -A
NAME             AGE
operatorhubio   21s
$kubectl wait --for=condition=Ready catalog operatorhubio
error: timed out waiting for the condition on catalogs/operatorhubio
$kubectl get catalog operatorhubio -o yaml
......
status:
  conditions:
  - lastTransitionTime: "2023-05-25T08:38:18Z"
    message: 'create bundle metadata objects: creating bundlemetadata "cloud-native-postgresql.v1.10.0":
      BundleMetadata.catalogd.operatorframework.io "cloud-native-postgresql.v1.10.0"
      is invalid: spec.properties[4].value: Invalid value: "string": spec.properties[4].value
      in body must be of type object: "string"'
    reason: UnpackFailed
    status: "False"
    type: Unpacked
  phase: Failing


$kubectl get catalog operatorhubio -o yaml
apiVersion: catalogd.operatorframework.io/v1beta1
kind: Catalog
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"catalogd.operatorframework.io/v1beta1","kind":"Catalog","metadata":{"annotations":{},"name":"operatorhubio"},"spec":{"source":{"image":{"ref":"quay.io/operatorhubio/catalog:latest"},"type":"image"}}}
  creationTimestamp: "2023-05-25T08:38:18Z"
  generation: 1
  name: operatorhubio
  resourceVersion: "2019"
  uid: 891f895b-4926-4fa8-aa20-92670e392783
spec:
  source:
    image:
      ref: quay.io/operatorhubio/catalog:latest
    type: image
status:
  conditions:
  - lastTransitionTime: "2023-05-25T08:38:18Z"
    message: 'create package objects: creating package "ack-acm-controller": packages.catalogd.operatorframework.io
      "ack-acm-controller" already exists'
    reason: UnpackFailed
    status: "False"
    type: Unpacked
  phase: Failing
$kubectl wait --for=condition=Ready catalog operatorhubio
error: timed out waiting for the condition on catalogs/operatorhubio

Write e2e tests

There are currently no e2e tests for catalogd; we should write some.

Create a minimal client library

Following #113 - to help catalogd clients transition from the old world of Kubernetes CustomResources for serving catalog contents, we should create a minimal client library that can be used to fetch catalog information and return it as a list of deserialized Go objects.

(Idea): Make catalogd CRs "schema based"

In the Operator Framework community call, one of the things mentioned and briefly discussed was creating the CRs in a way that allowed for easier extensibility.

I'm going to try and break down my proposed solution for how the CRs could look if we went this route into a section for each CR.

CatalogSource

Taking a schema based approach here would allow us to more easily extend the ways in which catalog information can be retrieved. This will loosely tie in the concept of "sources" for catalogs (git, http, image, etc.), but that concept is likely to be discussed in a separate issue. If the change to the CatalogSource CR is better suited as part of that work we can move this item there.

All that being said, here is what a schema-based CatalogSource CR in its simplest form could look like:

apiVersion: catalogsource.catalogd.io/v1
kind: CatalogSource
metadata:
  name: sample-catalogsource
spec:
  source:
    type: "catalogd.source.image"
    config:
      image: docker.io/repo/some-index-image:tag

The benefit of taking this approach is that it makes the way in which a catalog is fetched highly extensible and configurable. rukpak already has a similar concept of sources.

Package

Taking a schema-based approach to the Package CR would allow for a wider range of catalogd use cases. It allows a "package" to expose a standard interface for basic package information (like its name) while letting each package have its own information structure. This more generic interface inherently makes the Package CR more resilient to changes in the way package data is formatted (i.e., if the olm.package spec changed, we wouldn't have to change anything).

A schema-based Package CR in its simplest form could look like:

apiVersion: package.catalogd.io/v1
kind: Package
metadata:
  name: some-package
spec:
  schema: "olm.package"
  config:
    channels: ...
    description: ...
    icon: ...
    ...

BundleMetadata

Taking a schema based approach to the BundleMetadata CR has a similar benefit to making the Package CR schema based. If the olm.bundle spec were to suddenly change we would be able to handle it the same as before.

A schema-based BundleMetadata CR in its simplest form could look like:

apiVersion: bundlemetadata.catalogd.io/v1
kind: BundleMetadata
metadata:
  name: some-bundle
spec:
  schema: "olm.bundle"
  config:
    properties: ...
    constraints: ...
    ...

Enhance the logic for syncing changes to catalog contents

Follow up to #65 as per #65 (comment)

The current logic doesn't allow updates to already-existing unpacked catalog contents (we always try to create the resource and fail if it already exists). It should be improved to update the contents when they have been modified in the source of the parent Catalog resource.

Ensure the `CatalogSource` controller is properly setting owner references

Currently the CatalogSource controller creates Package and BundleMetadata resources as part of reconciling a CatalogSource CR, but doesn't set owner references on the child resources.

We should update the CatalogSource controller to properly set owner references on the resources it creates, so that Kubernetes garbage collection automatically deletes these resources if the owning CatalogSource CR is deleted.

Additional Context

Optimize memory usage of unpacking FBC from pod logs

In these lines, we copy the logs to a byte buffer, and then turn around and create a reader from the byte buffer.

We may be able to avoid the buffer entirely by feeding the podLog stream directly into declcfg.LoadReader.

Acceptance Criteria:

Update the `Catalog` controller to use a `Storage` interface

As part of implementing the RFC for catalogd's content storage and serving mechanism and following #113 we need to update the Catalog controller to use the Storage interface for storing catalog content after it is unpacked.

This functionality should be feature gated.

Acceptance Criteria

  • Updates Catalog controller to have and use a Storage interface
  • Adds a feature gate to enable use of the functionality
  • Updates unit/e2e tests as necessary

How to handle non-unique catalog items?

In #76 , we updated catalogd to prefix the names of Package and BundleMetadata objects with the metadata.name of the Catalog from which they are derived.

This solved the problem of making it possible to distinguish between these duplicate names, but it raises another question: Do we need mechanisms to help clients disambiguate? For example, the operator-controller project has an Operator API with spec.package which is expected to be a simple package name (e.g. etcd, not operatorhubio-catalog-etcd).

Does catalogd need to do anything here to help clients disambiguate or get a non-ambiguous view (e.g. maybe via filtering?)

`Package` resource naming clash if same named `Package` exists in multiple `CatalogSource`s

Currently the CatalogSource controller creates Package resources when reconciling a CatalogSource CR. It doesn't use any special naming logic to ensure the uniqueness of the Package resources being created, which means that if multiple catalogs define a package of the same name (but technically different packages), there will be a naming clash resulting in an error creating the corresponding Package resource.

We should evaluate what our options are to prevent this naming clash.

One suggestion is to see what options we have to specify the Package admission criteria on our custom apiserver. For more context see: #1 (comment)

Non-atomic syncing of Package/BundleMetadata/CatalogMetadata resources for a particular catalog.

When the resources (Package+BundleMetadata / CatalogMetadata) are being synced for one catalog and an error is encountered in the middle of an operation (e.g., in the middle of a loop where existing Packages are being deleted), the cluster ends up in a state where some of the resources have been operated on while the rest remain in the previous/stale state. This gives clients an incorrect view of the world.

Instead, operations should be atomic: either an operation succeeds wholly, operating on all of the resources it was supposed to, or none of the resources are updated at all and an error is surfaced about the failure.

Don't unpack again if unchanged

Creating this issue to continue discussion from #30 (comment)

Currently the CatalogSource controller will attempt to unpack the catalog contents and recreate the metadata on cluster every reconciliation, even if it has already been unpacked previously. This causes some unnecessary overhead by running the unpacking logic every single reconciliation loop. Instead, we should explore ways to signal that the unpacking process has already been completed and that the unpacking logic does not need to be run.

Some potential solutions:

  • Update the CatalogSource status to also communicate the observed generation and/or the hash of the catalog contents (maybe image sha if we can easily fetch that info?). If there is a change in generation / catalog contents then unpack again
    • Downside: We can't determine the hash of the catalog contents without running the unpack job again so that would still be a process that is run every time (although maybe we gate it with a generation change?)
  • Add a label/annotation to signal it has been unpacked and doesn't need to run the unpack process again

Catalogd installation fails on 1.26 k8s cluster

What did you do?

Followed the steps in readme to deploy this project. To be specific, the exact steps are:

1. Create a kind cluster.
2. Install cert-manager
3. Apply crds: k apply -f config/crd/bases
4. Install config files: k apply -f config/

The etcd and catalogd-apiserver pods fail with an OOM error. This happens even before creating a CR that specifies the catalog image we want to use.

This does not happen after downgrading clusters to k8s 1.24 or 1.25.

Introduce spec.Source type in the CatalogSource CRD

Catalog images are exclusively container images hosted in image registries today. There's growing evidence to suggest that allowing catalog content to be packaged and published in other ways (such as OCI images, git repositories, or even configmaps added to the cluster) could streamline the process of building and maintaining these catalogs, improving quality of life for catalog authors.

Introducing a new field spec.Source in the CatalogSource CRD, with the value image mapping to the unpacking job that exists today, will lay the groundwork for introducing new spec.Source types in the near term.

Rename `CatalogSource` to `OperatorCatalog`

Motivation:

CatalogSource is a hangover from OLM v0.

"Here's how you can add this operator catalog to your cluster" sounds less confusing than "Here's how you can add this catalog source to your cluster", possibly because CatalogSource encompasses engineering implementation details that our customers don't need to be exposed to.

Goal:

This is a proposal to change the name CatalogSource to OperatorCatalog in catalogd.

Refactor `CatalogSource` controller

Currently the CatalogSource controller has its own implementation structure that doesn't match that of the existing OLMv1 controllers. In the interest of consistency between the OLMv1 controller implementations, we should update the CatalogSource controller's implementation to be more like:

Introduce a means to track catalog content changes

Currently, catalogs provide no easy way to tell when their contents change on an update. This makes it difficult to know when updates may be available on a catalog for operators on cluster. This may be possible by periodically running resolution for all installed operators blindly, but having a field to reference to check if catalog contents have changed will cut down on the need for this wasted periodic resolution.

Requirements:
Provide an easy way to track catalog content change, such as a field or annotation with a content hash.

Goals:
Communicate catalog content change without requiring catalog consumers to go through all of the catalog contents

Non-goals:
Provide specifics of contents affected by a catalog change

The catalog example 'catalog-sample' can't be Ready

Installed catalogd according to the docs/demo: https://github.com/operator-framework/catalogd/blob/main/docs/demo.gif
But the status of catalog catalog-sample never becomes Ready.
The steps:
1.$git clone https://github.com/operator-framework/catalogd.git
2.$cd catalogd/
3.

$make kind-cluster
......
Set kubectl context to "kind-catalogd"
You can now use your cluster with:

kubectl cluster-info --context kind-catalogd

Thanks for using kind! 😊
/home/jfan/projects/src/github.com/catalogd/hack/tools/bin/kind export kubeconfig --name catalogd
Set kubectl context to "kind-catalogd"
$kubectl cluster-info --context kind-catalogd
Kubernetes control plane is running at https://127.0.0.1:46095
CoreDNS is running at https://127.0.0.1:46095/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
$make install
......
cd config/manager && /home/jfan/projects/src/github.com/catalogd/hack/tools/bin/kustomize edit set image controller=quay.io/operator-framework/catalogd-controller:devel
/home/jfan/projects/src/github.com/catalogd/hack/tools/bin/kustomize build config/default | kubectl apply -f -
namespace/catalogd-system created
customresourcedefinition.apiextensions.k8s.io/bundlemetadata.catalogd.operatorframework.io created
customresourcedefinition.apiextensions.k8s.io/catalogs.catalogd.operatorframework.io created
customresourcedefinition.apiextensions.k8s.io/packages.catalogd.operatorframework.io created
serviceaccount/catalogd-controller-manager created
role.rbac.authorization.k8s.io/catalogd-leader-election-role created
clusterrole.rbac.authorization.k8s.io/catalogd-manager-role created
clusterrole.rbac.authorization.k8s.io/catalogd-metrics-reader created
clusterrole.rbac.authorization.k8s.io/catalogd-proxy-role created
rolebinding.rbac.authorization.k8s.io/catalogd-leader-election-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/catalogd-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/catalogd-proxy-rolebinding created
service/catalogd-controller-manager-metrics-service created
deployment.apps/catalogd-controller-manager created
kubectl wait --for=condition=Available --namespace=catalogd-system deployment/catalogd-controller-manager --timeout=60s
deployment.apps/catalogd-controller-manager condition met
$kubectl get crds -A
NAME                                           CREATED AT
bundlemetadata.catalogd.operatorframework.io   2023-05-24T07:25:58Z
catalogs.catalogd.operatorframework.io         2023-05-24T07:25:58Z
packages.catalogd.operatorframework.io         2023-05-24T07:25:58Z
$kubectl apply -f config/samples/core_v1beta1_catalog.yaml
catalog.catalogd.operatorframework.io/catalog-sample created
$kubectl get catalog -A
NAME             AGE
catalog-sample   21s
$kubectl wait --for=condition=Ready catalog catalog-sample
error: timed out waiting for the condition on catalogs/catalog-sample
$kubectl get packages
No resources found
$kubectl get bundlemetadata
No resources found
$kubectl get catalog catalog-sample -o yaml
apiVersion: catalogd.operatorframework.io/v1beta1
kind: Catalog
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"catalogd.operatorframework.io/v1beta1","kind":"Catalog","metadata":{"annotations":{},"labels":{"app.kuberentes.io/managed-by":"kustomize","app.kubernetes.io/created-by":"catalogd","app.kubernetes.io/instance":"catalog-sample","app.kubernetes.io/name":"catalog","app.kubernetes.io/part-of":"catalogd"},"name":"catalog-sample"},"spec":{"image":"quay.io/operatorhubio/catalog:latest","pollingInterval":"45m"}}
  creationTimestamp: "2023-05-24T07:30:00Z"
  generation: 2
  labels:
    app.kuberentes.io/managed-by: kustomize
    app.kubernetes.io/created-by: catalogd
    app.kubernetes.io/instance: catalog-sample
    app.kubernetes.io/name: catalog
    app.kubernetes.io/part-of: catalogd
  name: catalog-sample
  resourceVersion: "1398"
  uid: 6e0fe471-2698-4ed9-af59-e93a1c3fd146
spec:
  image: quay.io/operatorhubio/catalog:latest
  pollingInterval: 45m0s

Add version information

Currently neither the catalogd-controller nor the catalogd-server contains version information. We should have a way for the version information to be printed (flag, subcommand, etc.), and we should capture the version information at build time.

References:

Improve `CatalogSource` controller error handling during reconciliation

Upon reconciliation of a CatalogSource resource, the CatalogSource controller does a few things:

  • Creates a Job to unpack catalog contents
  • Once unpack Job finishes, reads logs from the Job Pod to get the catalog contents
  • Creates Package CRs for each package in the catalog
  • Creates BundleMetadata CRs for each bundle in the catalog

Currently, the CatalogSource controller's error handling is very basic and should be updated to handle the following scenarios:

  • IF the unpack Job can not be created
    • Update the CatalogSource resource status indicating the unpack failure
  • IF the unpack Job's Pod's logs can not be read
    • Update the CatalogSource resource status indicating the unpack failure
    • IF the error is because the Pod no longer exists, requeue to attempt unpacking process again as the Job could have been cleaned up by another process
  • IF a Package CR can not be created
    • IF caused by the Package already existing - continue
    • ELSE (?)
      • Update the CatalogSource resource status to indicate failure to create children resources
      • Cleanup already created child Package CRs
  • IF a BundleMetadata CR can not be created
    • IF caused by the BundleMetadata already existing - continue
    • ELSE (?)
      • Update the CatalogSource resource status to indicate failure to create children resources
      • Cleanup already created children BundleMetadata & Package CRs

Note: All the scenarios here are just proposed solutions. Ones marked with (?) are ones I feel could have better solutions

Update the README

Looking at the README, it is starting to become a bit out of date. We should spend some time updating it to be consistent with the current state of catalogd.

Remove Catalogmetadata CRD and serve CatalogMetadata with aggregated apiserver

Motivation:

The child resources created by the Catalog controller are registered as CRDs, which are served by the main kube-apiserver. These resources should instead be served by an aggregated APIService, so that the consumption and delivery of these API objects can be customized according to catalogd's use cases (e.g., custom storage for the objects so that the main etcd storage is not misused, and custom CRUD permissions such as "only the catalog controller can CRUD the objects").

Additional background: https://hackmd.io/-Rrvhgb3Q7m9HYZW2gxQtQ

Goal:

Configure the apiserver to administer the child resources, and get rid of the CRDs (except for the CatalogSource CRD, since that doesn't pose any resource-constraint risk for the main apiserver).

Create an `OwnerReference` based watch on the unpack job

In the controller's Reconcile function, the existing logic creates the unpack job and obtains the pods for the job. If there is an error while parsing the unpack logs to obtain the pods, we check whether the error is due to the pod not being present and, if so, requeue after 10s. This is essentially polling every 10s even when no events come in, and it also means ignoring any events that arrive within that 10s window.

Ideally, we should reconcile when the state of the job changes instead of polling. The better solution is to put an OwnerReference on the job (which we already do) and create an OwnerReference-based watch on the job, so that we reconcile when the state of the job changes. We may also need to do a similar thing for the underlying pod itself.

Also, instead of doing a RequeueAfter every 10s, we should just return the error, as that would trigger exponential backoff.

See: #34 (comment)

Explore alternative methods for unpacking catalog data

Following #65 it is probably a good idea to explore what alternatives exist for unpacking methods separate from copying implementations from rukpak.

Some potential alternatives:

  • Make catalogd a rukpak provisioner
  • Make the rukpak unpacking implementation a library

Unpack job does not honor cluster pull configuration

When we create the unpack job, we directly run opm render <catalogImage> in a single container. This means that that container directly pulls and reads the catalog image. If we want to take advantage of cluster image pull configurations, we should refactor this job into two containers. One container is the catalog image, from which we copy the FBCs into a shared empty dir. The other container renders that shared empty dir.

Note that this only makes sense in the case that the catalog image is a standard docker or OCI image containing FBC data.
If we're using other container formats or FBC data sources, this strategy would not work.

Longer term, I think this highlights the need for rukpak-esque spec.source union type in the catalog source API.

Update unit tests for the catalog controller

Existing unit tests for the catalog controller don't include any testing of the sync logic for bundlemetadata and packages (and catalogmetadata, once the catalogmetadata API PR merges).

It would be nice to have the following based on this review suggestion to fully test out the sync logic:

  • Add some other blobs to the test catalog that have (a) just schema, (b) schema and name, (c) schema and package, (d) schema, name, and package, and where all of those have other non-meta fields. This would make sure that we treat FBC meta objects opaquely
  • Update this test to check the full contents of what is synced.
  • Change the Catalog to point to a different image, which would result in existing catalog metadatas being deleted/updated, and at least one new metadata created.

Refactor manifests and manifest generation

Currently there are a lot of manual changes that need to be made with the manifests (mostly related to RBAC) as the controller and API implementations are modified. We should probably try to use kustomize to make this process as automated as possible (i.e just have to run a make target)

The Operator-SDK testdata could be used as a reference to understand how the end result of this refactor could look:

Update `CatalogSource` controller logging

We should make sure that we are logging useful information throughout the reconciliation process. The zap logger that is used has configurable levels of verbosity and we should make sure we are taking advantage of that to allow for more or less verbose log configuration.

Issue created due to #30 (comment)

Create an unpacking interface

Add a Rukpak-like spec.source union type to the Catalog API, which would allow catalog filesystems to be sourced in a variety of ways. It would also give catalogd developers an extensible API surface to add new source mechanisms in the future.

The above quote was an idea mentioned in the OLMv1 Milestone 3 ideation discussion. We should implement an unpacking interface that allows us to easily support different sources of information for catalogs

An example of what this could look like on a CatalogSource CR in yaml:

apiVersion: catalogd.operatorframework.io/v1beta1
kind: CatalogSource
metadata: 
  name: catalogsource-sample
spec:
  source:
    type: catalogd.source.image
    image: quay.io/operatorhubio/catalog:latest

Acceptance Criteria:

  • Unit tests
  • An interface implemented for an unpacking source
  • An image unpacking source implementation
  • Updating the CatalogSource controller to use the unpacking interface instead of the currently hardcoded unpacking process

Optimize `CatalogSource` controller memory consumption

Based on the initial findings from #3, we noticed that there is a fairly large and persistent memory-consumption increase (~232Mi) in the CatalogSource controller after reconciling a CatalogSource CR and creating the child Package and BundleMetadata CRs.

This increase in memory consumption is likely caused by the CatalogSource controller caching these child resources (which there can be a large number of). We should explore methods we can use to decrease the memory consumption to prevent the CatalogSource controller from hogging large amounts of memory.

Additional Context:

Serve locally stored fbc content via a server

As part of implementing the RFC for catalogd's content storage and serving mechanism we should import the RukPak storage library (dependent on operator-framework/rukpak#656). Once imported we will need to create a custom implementation of the Storage interface.

The custom Storage implementation should:

  • Store all FBC JSON blobs in a singular {catalogName}/all.json file
  • Serve the {catalogName}/all.json file via a go http.FileServer handler

Note: If operator-framework/rukpak#656 is not accepted by the RukPak maintainers, it would be acceptable to copy the Storage interface into a catalogd package. This is not ideal, but rather a last resort if the RukPak maintainers deny the request to externalize the storage package.

Acceptance Criteria

  • Storage interface implementation that:
    • Stores all FBC JSON blobs in a singular {catalogName}/all.json file
    • Serves the {catalogName}/all.json file via a go http.FileServer handler
  • Unit tests

This is a bug in catalogd. We should not require `relatedImages` to be populated. In fact, I would argue that `[{}]` is more invalid, because it says "here's a related image; its image is `""` and its name is `""`."

Originally posted by @joelanford in operator-framework/operator-controller#250 (comment)

Use FBC as the only API for exposing catalog content on cluster

Motivation:

Right now, the content of a file-based catalog (distributed in an OCI container image) is exposed on cluster using the Package and BundleMetadata APIs. However, this translation adds another layer of APIs that must be maintained and supported, beside the FBC API, without adding any extra value toward the goal of exposing the catalog metadata on cluster. Since the FBC API itself is

a) already an API we will be maintaining and providing support guarantees for, and
b) sufficient to communicate all of the information in the catalog to on-cluster consumers (as opposed to the legacy SQLite DB API, which needed the translation in order to be interpretable/consumable),

the FBC API should be exposed on cluster without any translation.

Goal:

One way to go about this is consolidating the Package/BundleMetadata APIs into a single CatalogMetadata API that contains the schemas in its spec field.

For example, for a package foo that has bundles foo.v0.1.0 and foo.v0.2.0, the following CatalogMetadata objects would be created:

kind: CatalogMetadata 
metadata: 
    labels: 
       catalog: bar
       name: foo.v0.1.0
       package: foo
       schema: olm.bundle
    name: foo.v0.1.0
spec:
   catalog: bar
   package: foo
   schema: olm.bundle
   content: 
      <fbc blob rendered as yaml>
       .
       .
kind: CatalogMetadata 
metadata: 
    labels: 
       catalog: bar
       name: foo.v0.2.0
       package: foo
       schema: olm.bundle
    name: foo.v0.2.0
spec:
   catalog: bar
   package: foo
   schema: olm.bundle
   content: 
      <fbc blob rendered as yaml>
       .
       .
kind: CatalogMetadata 
metadata: 
    labels: 
       catalog: bar
       name: foo
       package: foo
       schema: olm.package
    name: foo
spec:
   catalog: bar
   package: foo
   schema: olm.package
   content: 
      <fbc blob rendered as yaml>
       .
       .

This would allow clients to query for the content on cluster in a Kube-native way by taking advantage of label selectors, and to read the content from the CatalogMetadata objects.

$ kubectl get catalogmetadata -l package=foo
NAME         CATALOG       SCHEMA        PACKAGE              
foo           bar         olm.package    foo   
foo.v0.1.0    bar         olm.bundle     foo 
foo.v0.2.0    bar         olm.bundle     foo   
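The same selection can be sketched in plain Go to show the equality-based matching a label selector performs. A real client would use a `metav1.LabelSelector` via client-go or controller-runtime; this stdlib-only version is just an illustration of the semantics:

```go
package main

import "fmt"

// catalogMeta models only the name and labels of the proposed
// CatalogMetadata API, which is all a label selector looks at.
type catalogMeta struct {
	name   string
	labels map[string]string
}

// selectByLabels mimics an equality-based Kubernetes label selector:
// every key/value pair in sel must match the object's labels.
func selectByLabels(objs []catalogMeta, sel map[string]string) []string {
	var out []string
	for _, o := range objs {
		match := true
		for k, v := range sel {
			if o.labels[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, o.name)
		}
	}
	return out
}

func main() {
	objs := []catalogMeta{
		{"foo", map[string]string{"catalog": "bar", "package": "foo", "schema": "olm.package"}},
		{"foo.v0.1.0", map[string]string{"catalog": "bar", "package": "foo", "schema": "olm.bundle"}},
		{"foo.v0.2.0", map[string]string{"catalog": "bar", "package": "foo", "schema": "olm.bundle"}},
	}
	// Equivalent of: kubectl get catalogmetadata -l package=foo,schema=olm.bundle
	fmt.Println(selectByLabels(objs, map[string]string{"package": "foo", "schema": "olm.bundle"}))
}
```

Adding the schema label to the selector narrows the result from all of foo's objects down to just its bundles, which is exactly why each CR carries catalog, package, and schema labels.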

Acceptance Criteria:

  • The Package and BundleMetadata APIs are removed and replaced with the CatalogMetadata API
  • spec.content of the API contains individual FBC blobs
  • The labels mentioned in the description are implemented for each CR

Cleanup after catalogsource is deleted

Issue:
The Package and BundleMetadata objects do not seem to be deleted when the CatalogSource is deleted.

Steps to reproduce:

  • Follow the steps in the README.md.
  • Create a catsrc CR with the image docker.io/anik120/community-operator-index:v4.11
  • Delete the catsrc CR
  • The Package and BundleMetadata objects are expected to be cleaned up, but they still persist.

The owner references do not seem to be set on the Package objects:

➜  catalogd git:(poc/catsrc-v2) ✗ k describe packages.core.catalogd.io zookeeper-operator
Name:         zookeeper-operator
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  core.catalogd.io/v1beta1
Kind:         Package
Metadata:
  Creation Timestamp:  2023-03-30T16:30:51Z
  Generation:          1
  Managed Fields:
    API Version:  core.catalogd.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        .:
        f:catalogSource:
        f:channels:
        f:defaultChannel:
        f:description:
        f:icon:
      f:status:
    Manager:         manager
    Operation:       Update
    Time:            2023-03-30T16:30:51Z
  Resource Version:  1489
  UID:               8eab1b43-2623-46f8-b423-9e264c8c8a4f
Spec:
  Catalog Source:  catalogsource-sample
  Channels:
    Entries:
      Name:         zookeeper-operator.v0.10.3
      Name:         zookeeper-operator.v0.11.0
      Replaces:     zookeeper-operator.v0.10.3
    Name:           alpha
  Default Channel:  alpha
  Description:
  Icon:
Status:
Events:  <none>
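For reference, garbage collection only cleans up a child object once it carries an ownerReferences entry pointing at the CatalogSource. In the controller this would normally be set with controller-runtime's `controllerutil.SetControllerReference` rather than built by hand; the sketch below (with an illustrative placeholder UID) just shows the fields that must end up on each Package and BundleMetadata object:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OwnerReference mirrors the shape of metav1.OwnerReference. Kubernetes
// garbage collection deletes an object once all of its owners are gone.
type OwnerReference struct {
	APIVersion         string `json:"apiVersion"`
	Kind               string `json:"kind"`
	Name               string `json:"name"`
	UID                string `json:"uid"`
	Controller         *bool  `json:"controller,omitempty"`
	BlockOwnerDeletion *bool  `json:"blockOwnerDeletion,omitempty"`
}

// ownedByCatalogSource builds the owner reference a child Package or
// BundleMetadata object needs so it is garbage collected with its parent.
func ownedByCatalogSource(name, uid string) OwnerReference {
	t := true
	return OwnerReference{
		APIVersion:         "core.catalogd.io/v1beta1",
		Kind:               "CatalogSource",
		Name:               name,
		UID:                uid, // must be the parent CatalogSource's UID
		Controller:         &t,
		BlockOwnerDeletion: &t,
	}
}

func main() {
	// The UID here is a placeholder; the real value comes from the live
	// CatalogSource object at reconcile time.
	ref := ownedByCatalogSource("catalogsource-sample", "uid-of-catalogsource")
	out, _ := json.Marshal(ref)
	fmt.Println(string(out))
}
```

Because cluster-scoped owners can only own cluster-scoped children (and namespace-scoped owners only children in the same namespace), the fix also needs to confirm the scopes of CatalogSource and its child APIs line up.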
