odpi / egeria-charts

Helm chart repository

Home Page: https://odpi.github.io/egeria-charts

License: Apache License 2.0

Shell 82.43% Mustache 9.39% Smarty 8.18%

operator kubernetes egeria hacktoberfest

egeria-charts's People

cmgrote planetf1 fei-shen tcnt max-simon davidradl lpalashevski atruvia dwolfson pdr-associates

egeria-charts's Issues

Allow certificates/CA to be easily configured for docker images & charts

Follow-on from odpi/egeria#4670

We need to document, and make easy, the approach to configuring the certificates and certificate authority (CA) used across all components when making use of our docker image (extra JVM parameters are a minimum) and helm charts. This also extends to the egeria operator.

Currently the image/charts rely on the default behaviour and file locations

Optional resource specification

On some K8s clusters, such as IBM ICP, we must explicitly specify resources for each pod; this is an ICP requirement. Also, performance-wise, some Egeria components work much better with more resources available. On ICP the Atlas pod, for example, wouldn't even start without at least 2 CPUs, and a significant improvement is visible with 4 CPUs, while igcproxy requires 2 CPUs.
It should be possible to optionally specify resources for each pod in values.yaml, e.g. (suggested default values):
resources:
  limits:
    cpu: 1
    memory: 2Gi
  requests:
    cpu: 10m
    memory: .01Gi

Create openlineage demo

Is there an existing issue for this?

I have searched the existing issues

Please describe the new behavior that will improve Egeria

Create an end-to-end demonstration of openlineage

Based on coco pharma labs
demonstrates event source creating openlineage events, being received in egeria
demonstrates openlineage events being fed to downstream system (ie marquez)

This would be based on the coco helm charts, and would make use of the marquez docker image to show this integration

Alternatives

No response

Any Further Information?

No response

Would you be prepared to be assigned this issue to work on?

I can work on this

Add docs on cleaning up kubernetes resources manually

If helm gets out of sync with kubernetes (for example due to a bug or timeout), various egeria resources are left active.

It's helpful to document how these can be cleaned up.

For example

kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=odpi-egeria-lab
kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=kafka
kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=zookeeper

Similarly use odpi-egeria-vdc but add in whatever is needed for ldap
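
The three commands above differ only in the label value, so they can be wrapped in a small helper. This is a minimal sketch, not something shipped with the charts: the function name `cleanup_leftovers` and the `EXECUTE` switch are hypothetical, and the label values are the ones quoted in this issue. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Print (or, with EXECUTE=1, run) a kubectl delete for each app.kubernetes.io/name
# label value passed as an argument. Dry-run is the default to avoid surprises.
cleanup_leftovers() {
  for name in "$@"; do
    cmd="kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=${name}"
    if [ "${EXECUTE:-0}" = "1" ]; then
      # Intentionally unquoted so the command string is split into words.
      $cmd
    else
      echo "$cmd"
    fi
  done
}

# Dry run for the lab chart and its kafka/zookeeper dependencies.
cleanup_leftovers odpi-egeria-lab kafka zookeeper
```

Setting `EXECUTE=1` in the environment before calling the function runs the deletes against the current kubectl context and namespace.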

Helm charts - pull request limits on nginx

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

Dockerhub has limits (for free accounts) on pulling container images.

This can mean that an attempt to use a Kubernetes Helm chart which uses nginx fails with:

  Warning  Failed          56s                kubelet            Failed to pull image "docker.io/nginx": rpc error: code = Unknown desc = Error reading manifest latest in docker.io/library/nginx: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

Expected Behavior

Image is pulled without error

Steps To Reproduce

No response

Environment

n/a

Any Further Information?

See:

Simplify description of helm charts

Current helm charts are :

NAME                  	CHART VERSION     	APP VERSION 	DESCRIPTION
egeria/egeria-base    	3.3-prerelease.1  	3.3-SNAPSHOT	Egeria simple deployment to Kubernetes
egeria/egeria-cts     	3.3-prerelease.0  	3.3-SNAPSHOT	Egeria Conformance Test Suite deployment to Kub...
egeria/egeria-pts     	3.3-prerelease.0  	3.3-SNAPSHOT	Egeria Performance Test Suite deployment to Kub...
egeria/odpi-egeria-lab	3.3.0-prerelease.6	3.3-SNAPSHOT	Egeria lab environment

The 'to Kubernetes' part of the description is not needed.

Maybe

Egeria single-server example
Egeria Conformance Test Suite
Egeria Performance Test Suite
Egeria Coco Pharmaceuticals lab

Improve docs on getting started with egeria lab k8s environment

Improve the docs for getting started with the egeria lab k8s environment, in particular locally.

Once a kubernetes environment is available, installing the helm chart, notebooks etc goes well.
However it would be useful to have a simple on-boarding route for those new to k8s.

Previously docker-for-desktop has been suggested, but this can be problematic
Look at options, possibly provide a few....

cc: @grahamwallis @CDaRip2U

Insertable yaml fragments for charts (extra container runtimes, volumes etc)

Our egeria-base chart is a good starting point for a simple egeria deployment (with persistence)

However it's become clear when trying to add a new postgres chart (one example of many) that we may wish to
easily add

additional configuration variables
additional scripts (for example to configure integration service)
additional volumes (for example for new connectors)
additional containers (for example postgres)

These are common practice when developing helm charts and some useful tips are documented at https://dzone.com/articles/the-art-of-the-helm-chart-patterns-from-the-offici

Egeria 3.3 prerelease chart - egeria-ui does not load

In the egeria 3.3 prerelease (.2) chart, the Egeria UI is not loading under either OpenShift 4.8 or microk8s (macOS)

In both cases, going to the nginx-server endpoint results in a single webpage showing 'Loading ...'

The nginx container looks ok and is showing page loads. See https://gist.github.com/05f5833105da7b59cb5687f3fff012ab
The ui-chassis container has the regular 'ready' message, no errors seen. See https://gist.github.com/5311f2d064ff1c69fd187476b211b0c1
The static ui container also looks ok, and is serving pages as expected. see https://gist.github.com/95dcae9bad53851153cada5e5582d5d3

It's possible there is a version discrepancy. Egeria ui chassis/egeria is at 3.3-SNAPSHOT, whilst egeria-ui is at 3.0.1
Note that this is the latest release (from August) as tested with egeria 3.2. However the release version is still set to 3.0.1 in main, meaning the images are constantly being updated rather than being set to a future prerelease (though this may not be the cause of this issue)

@lpalashevski @sarbull can you take a look? Let me know if you need anything. If we are to have the UI enabled for 3.3 we need to get a stable UI configuration in the charts.

Use fixed pod names for egeria-ui

When analysing egeria logs using LogDNA I noticed that whilst some of our pods have a fixed name in the helm chart, others are still uniquely generated ie:

For ease of filtering it would be useful if pod names were constant.

I also noticed the base chart has a pod with a unique name ie

base                                               base-kafka-0                                              1/1     Running     1          5d3h
base                                               base-zookeeper-0                                          1/1     Running     0          5d3h
base                                               egeria-base-config-m88vh                                  0/1     Completed   0          5d3h
base                                               egeria-base-platform-0

Constant names would make it easier to set up filters.
The pod name is already qualified by helm chart deployment name, which I think is suitably unique already.

NodePort setup not working

Attempting to override the default ClusterIP services with a NodePort type instead, to ease external exposure for vanilla k8s clusters, does not work. Helm notes an error when rendering the templates:

Error: YAML parse error on odpi-egeria-lab/templates/egeria-core.yaml: error converting YAML to JSON: yaml: line 16: mapping values are not allowed in this context

Investigating the debug output of Helm indicates that the line-chomping - }} on the conditionals that wrap the NodePort configuration is a bit too aggressive, and is causing the nodePort: .... line to be directly concatenated to the end of the targetPort: ... value.

Figure out how to incorporate egeria-ui into demo scenario - coco or other

Currently the 'odpi-egeria-lab' contains

kafka
zookeeper
egeria platforms (*4)
react UI (node based app)
egeria-ui (static content for polymer UI)
ui-chassis
nginx (proxy for UI)

However

We don't currently have any 'lab' oriented content for the UI
The sample data loaded by our 'data catalog' notebook doesn't typically work with searching in the UI (it doesn't work in 3.0 for example)
There is no clear documentation on use of rex/tex (which do work in the react UI)
We don't have any open lineage functionality integrated in the lab demo

In part this is because the drive for this UI has focussed on different use cases, and has been more oriented to initial production use where it offers a Business level interface supporting functionality such as Open Lineage

For this reason it doesn't fit well in the current lab demo.

Therefore I propose to remove

egeria-ui
ui-chassis
nginx

From the 'odpi-egeria-lab'

Then, we can offer an 'odpi-egeria-lab-withui' or similar chart, which will have the odpi-egeria-lab chart as a dependency (which will include the other core elements), and then add in the ui static/api content & proxy

Furthermore since the release cycle of the UI & team ownership differs this allows for appropriate delegation and asynchronous development of the two pieces. The egeria release focussing on the core, and a UI release where the additional features can be developed & tested, notebooks updated & docs clarified.

Finally this tightens up a scenario where the operator can be dropped in more cleanly when ready.

This should add more flexibility, clarity & still retain the container-based demo deployment approach.

cc: @lpalashevski

Is coexistence of multiple charts using strimzi possible

Installed egeria-base repo

Then tried to install odpi-egeria-lab

Install fails with:

Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "strimzi-cluster-operator-namespaced" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "lab": current value is "base"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "lab": current value is "base"

Consolidate web-ui access for coco pharma tutorial

Currently coco pharma exposes http/https for web ui via

egeria static ui content (via nginx)
egeria ui api (via nginx)
presentation server (node)
jupyter UI

If this is deployed to kubernetes in the cloud - or indeed locally, it requires at least 3 distinct load balancer configurations or port forwards, making the setup more convoluted (and possibly more costly) [feedback received]

We could consolidate these APIs all behind an nginx configuration, especially as we already use nginx. This would mean only a single port would need forwarding.

Additionally we should clarify in the docs the approaches to exposing this port

via nodeport
via kube proxy
via ingress (which will vary by cloud platform, but azure, IBM cloud (ROKS), GCP are known to be used)

(Will move this issue to the docs repo once any code changes are complete)

We could further include routing to the egeria platforms (also web) via nginx
Kafka is best left distinct as it does not use http, but clarifying how to read/write to the topic would be useful

Document/setup use of prometheus with egeria container deployment in k8s (helm)

Tools like prometheus are used extensively to understand how an application is behaving/performing in large scale environments

We already have many pieces to make this work for egeria

Spring can be configured with the appropriate endpoints for prometheus
The RedHat UBI8 openjdk image that we use to run egeria exposes endpoints

We should add documentation on what is already possible.

In addition we should consider

The kafka/zookeeper image currently used from bitnami (in our lab) may not be integrated. How could we do this? Can we add prometheus support if needed? Do we just offer an alternative config that uses an existing cloud messaging bus (we did this with the vdc chart, using IBM event streams as an option)
We may learn more about additional data egeria itself may need to provide

Egeria UI failing in 3.1 (and 3.0)

Environment: 3.1 Egeria release, and specifically the coco pharma lab environment. Using Egeria-UI 3.0.1

Asset Search does not appear to work. The drop-down list for EntityType offers no choices, and does not accept known type names. There is no documentation hinting at what should be selected.
Tex and Rex fail. The default configuration of https://lab-datalake:9443 cocoMDS4 does not work, but nor do any of the other platforms or servers. There are no docs addressing whether this has changed, or what OMASs are needed.

As such the UI is broken and undocumented in this demo environment.

For release 3.1 I will disable these components until such time as we fix, or refactor/drop the support as per #19

Microk8s (& other k8s implementation) support/doc

A user recently had problems running the odpi-egeria-vdc helm chart on 'microk8s' + linux.

Kafka failed to start up reporting name resolution issues - other containers would likely experience similar issues.

It appears microk8s may not enable DNS by default - however after trying an initial fix to enable DNS we switched to docker-for-desktop instead.

To a large extent we have to assume a 'working' k8s environment, and indeed most cloud environments just work at this level (openshift, IBM iks, azure, civo ...)

But as developers experiment with egeria some may have little knowledge of k8s - whilst docker-for-desktop works well on macOS, and to some extent on Windows, on Linux you are on your own. Even for developers using mac/windows they may need to run k8s on linux for resource reasons (ram etc)

The most likely environments may be

minikube
openshift OKD
microk8s
k3s

Raising issue

for visibility - so any microk8s users are directed here
To investigate which options work the best
To address any issues in our charts if found
To add appropriate documentation

nginx requires container setuid/setgid

In the latest 3.3 prerelease chart, the egeria-ui cannot be accessed via :443 (directed at nginx service) due to

$ kubectl logs lab-odpi-egeria-lab-nginx-5d874c84d7-8q4bt
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
20-envsubst-on-templates.sh: Running envsubst on /etc/nginx/templates/..data/default.conf.template to /etc/nginx/conf.d/..data/default.conf
20-envsubst-on-templates.sh: Running envsubst on /etc/nginx/templates/default.conf.template to /etc/nginx/conf.d/default.conf
20-envsubst-on-templates.sh: Running envsubst on /etc/nginx/templates/..2021_10_12_09_35_45.797156721/default.conf.template to /etc/nginx/conf.d/..2021_10_12_09_35_45.797156721/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2021/10/12 09:35:53 [notice] 1#1: using the "epoll" event method
2021/10/12 09:35:53 [notice] 1#1: nginx/1.21.3
2021/10/12 09:35:53 [notice] 1#1: built by gcc 8.3.0 (Debian 8.3.0-6)
2021/10/12 09:35:53 [notice] 1#1: OS: Linux 3.10.0-1160.42.2.el7.x86_64
2021/10/12 09:35:53 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2021/10/12 09:35:53 [notice] 1#1: start worker processes
2021/10/12 09:35:53 [notice] 1#1: start worker process 42
2021/10/12 09:35:53 [notice] 1#1: start worker process 43
2021/10/12 09:35:53 [notice] 1#1: start worker process 44
2021/10/12 09:35:53 [notice] 1#1: start worker process 45
2021/10/12 09:35:53 [emerg] 42#42: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 42
2021/10/12 09:35:53 [notice] 1#1: worker process 42 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 42 exited with fatal code 2 and cannot be respawned
2021/10/12 09:35:53 [emerg] 43#43: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 43
2021/10/12 09:35:53 [notice] 1#1: worker process 43 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 43 exited with fatal code 2 and cannot be respawned
2021/10/12 09:35:53 [emerg] 44#44: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 44
2021/10/12 09:35:53 [notice] 1#1: worker process 44 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 44 exited with fatal code 2 and cannot be respawned
2021/10/12 09:35:53 [emerg] 45#45: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 29 (SIGIO) received
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 45
2021/10/12 09:35:53 [notice] 1#1: worker process 45 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 45 exited with fatal code 2 and cannot be respawned

cc: @lpalashevski

New base chart to support dojo or modify existing base chart

For day 1 of the egeria dojo we want to run a containerized version of egeria with zero configuration done, so that the user can be guided through that configuration as part of the dojo.

The egeria-base chart currently runs an OOTB configuration with a single metadata server and all OMASs enabled.

Either we need an alternative chart based on this one:

may be clearer as there are no parameters to worry about

OR configuration of the existing chart:

changing the default may affect existing chart users
adding parameters is extra confusion for a newbie
a single chart means less maintenance

kafka for arm64 (pi/apple m1) - move to Strimzi

Our helm charts use kafka for cohort communication.

We currently use kafka charts provided by bitnami - https://bitnami.com/stack/kafka/helm . These do not yet support arm64, which is needed to run without emulation (if available!) on apple m1 & raspberry pi (64 bit raspbian)

Options to allow our charts to work include

Rely on emulation -- what is the state of this in the various container runtimes?
Wait for bitnami to support arm64 -- we don't know when
locate another helm chart to depend on
Find other docker images, and include in our chart
Build our own kafka on arm64 & embed in a docker image/chart
Make it easier to specify external kafka (see below)

On the last point - changing the chart to do this is relatively easy. However it adds a lot of complexity upon the user -- and these charts are intended to provide a simple out of the box experience to support tutorials and demos... One of the complexities is the network environment, especially if everything else is running in a restricted container-runtime managed network ....

References

Strimzi - no plans to provide an image, but hope to make it buildable - strimzi/strimzi-kafka-operator#3357

Configurable PTS chart

Provide a chart that can be used to run the PTS (performance test suite) against any repository, simply by overriding some of the inputs to the chart.

Fully qualify all image names for containers

Our helm charts refer to images such as 'nginx:latest'. This assumes a default registry of registry-1.docker.io

When using alternate container build tools such as 'podman', there is no default registry (by default....) and best practice is to fully qualify all image names
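
As an illustration of what "fully qualified" means here, the sketch below expands a short image reference using the usual Docker Hub conventions: a name with no slash is treated as an official image under `docker.io/library`, and a name whose first component does not look like a host is placed under `docker.io`. The function name `qualify_image` is hypothetical and the helper is only meant for auditing chart values, not as part of the charts themselves.

```shell
#!/bin/sh
# Expand a short image reference to a fully qualified one.
# Assumption: unqualified names are intended to come from Docker Hub (docker.io).
qualify_image() {
  ref="$1"
  first="${ref%%/*}"   # text before the first slash
  case "$ref" in
    */*)
      # With a slash, the first component is a registry only if it looks
      # like a host name: it contains a dot or a port, or is "localhost".
      case "$first" in
        *.*|*:*|localhost) echo "$ref" ;;
        *) echo "docker.io/$ref" ;;
      esac
      ;;
    *)
      # No slash at all: an official image in the "library" namespace.
      echo "docker.io/library/$ref"
      ;;
  esac
}

qualify_image nginx:latest
```

References that already name a registry, such as `quay.io/strimzi/kafka:0.29.0-kafka-3.0.0`, are passed through unchanged.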

update charts to use 3.5 react ui

Publish helm charts to helm chart repository

The egeria-palisade project needs to make use of our 'lab' helm chart
To enable this easily, the egeria lab chart needs to be served via a helm repo as per https://helm.sh/docs/chart_repository/

Ultimately this needs to be done as part of the build process, to ensure these charts are always current. However for expediency in integration the generated file is being checked in to enable this reuse quickly and will be refined in time.

For now only the 'lab' chart is exposed.

Part 1 - create repo (can only be tested after merge due to the way github pages works....)

Add support for inmemory repository as a native provider in cts/pts

Currently if the cts/pts charts are setup to use the 'native' providers and we set

 tut:
   connectorProvider: "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider"

then the cts (for example) will fail to initialize with

 > Configuring technology under test:

{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-type?typeName=TUT)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/organization-name?name=Egeria)
{"class":"VoidResponse","relatedHTTPCode":400,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","actionDescription":"setLocalMetadataCollectionName","exceptionErrorMessage":"OMAG-ADMIN-400-008 The local repository mode has not been set for OMAG server tut","exceptionErrorMessageId":"OMAG-ADMIN-400-008","exceptionErrorMessageParameters":["tut"],"exceptionSystemAction":"The local repository mode must be enabled before the event mapper connection is set.  The system is unable to configure the local server.","exceptionUserAction":"The local repository mode is supplied by the caller to the OMAG server. This call to enable the local repository needs to be made before the call to set the event mapper connection."}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/metadata-collection-name/TUT_MDR)
{"class":"VoidResponse","relatedHTTPCode":200}
   (200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/event-bus?topicURLRoot=egeria)
-- Unknown native repository provider: org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider -- exiting.

This is because the logic in cts/pts for native is

if [ "${TUT_TYPE}" = "native" ]; then
  if [ "${CONNECTOR_PROVIDER}" = "org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider" ]; then
    curl -f -k -w "\n   (%{http_code} - %{url_effective})\n" --silent -X POST \
      "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/local-graph-repository" || exit $?
  else
    echo "-- Unknown native repository provider: ${CONNECTOR_PROVIDER} -- exiting."
    exit 1
  fi
fi

This condition should be extended to allow for the in-mem repository & perform the appropriate configuration
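
One way the condition could be extended is to map the provider class to a repository mode first, then make a single call. This is a sketch under stated assumptions rather than the chart's actual script: the function names `repository_mode_for` and `configure_native_repository` are hypothetical, and the `in-memory-repository` mode name is assumed to be what the admin services accept for the in-memory repository. The environment variables are the ones used in the existing script.

```shell
#!/bin/sh
# Map a connector provider class name to the admin-services repository mode.
# Prints the mode and returns 0, or returns 1 for an unknown provider.
repository_mode_for() {
  case "$1" in
    org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider)
      echo "local-graph-repository" ;;
    org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider)
      # Assumption: the in-memory mode is named "in-memory-repository".
      echo "in-memory-repository" ;;
    *)
      return 1 ;;
  esac
}

# Intended as a replacement for the body of the TUT_TYPE = "native" branch.
configure_native_repository() {
  if mode=$(repository_mode_for "${CONNECTOR_PROVIDER}"); then
    curl -f -k -w "\n   (%{http_code} - %{url_effective})\n" --silent -X POST \
      "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/${mode}" || exit $?
  else
    echo "-- Unknown native repository provider: ${CONNECTOR_PROVIDER} -- exiting."
    exit 1
  fi
}
```

Keeping the mapping in one `case` statement means further native providers can be added as one extra branch, without repeating the curl call.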

Pull-policy: Use IfNotPresent for released images, Always for SNAPSHOTS

Currently the default pullPolicy for images is set to IfNotPresent to reduce the impact on container registries, and specifically to avoid running out of limits on dockerhub. For final/released code, which does not change, this is fine.

However when running charts that refer to our -SNAPSHOT builds (ie in latter stages of testing a release, or in development) one should override the setting with Always to get the latest container image. This is easily done but can be forgotten.

Rather than keep changing charts to switch the default between prerelease and final, it would make sense to modify the logic so that the default pullPolicy is set for an image based on whether it is a SNAPSHOT build. We have done this in other helm charts previously.
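
The rule itself is small. The sketch below expresses it as a shell helper purely to make the intended behaviour concrete; the function name `pull_policy_for_tag` is hypothetical, and in the charts the same test would live in the templates rather than in a script.

```shell
#!/bin/sh
# Choose an imagePullPolicy from an image tag:
# SNAPSHOT builds change under the same tag, so always pull them;
# released tags are immutable, so a cached copy is fine.
pull_policy_for_tag() {
  case "$1" in
    *SNAPSHOT*) echo "Always" ;;
    *) echo "IfNotPresent" ;;
  esac
}

pull_policy_for_tag 3.3-SNAPSHOT
pull_policy_for_tag 3.3.0
```

An explicit pullPolicy set by the user should still take precedence over this computed default.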

odpi-egeria-lab 3.9 fails on openshift: zookeeper: permission denied

When running the lab chart on a completely freshly installed (v 4.10) OpenShift cluster I get:

$ kubectl logs lab-strimzi-zookeeper-0
Detected Zookeeper ID 1
mkdir: cannot create directory '/var/lib/zookeeper/data': Permission denied

This is from Strimzi 0.29, with the image 'quay.io/strimzi/kafka:0.29.0-kafka-3.0.0'

Update lab chart docs with notes on openshift security context

I try to get started with ODPi using the descriptions found here: https://egeria.odpi.org/open-metadata-resources/open-metadata-labs/

I installed the lab in my Openshift cluster using helm. On startup of the pod lab-odpi-egeria-lab-jupyter I get the event:

"Error creating: pods "lab-odpi-egeria-lab-jupyter-56f7fb969f-" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{100}: 100 is not an allowed group]"

and the pod is not starting.

Any suggestions?

Separate PTS server from TUT server

Currently the PTS chart configures the PTS server within the same OMAG Server Platform as the technology under test (TUT) server. While this seems to work at relatively small scales (up to ~300 000 metadata instances), once we grow beyond this (to 500 000+) we start to hit scenarios where we overrun the Java heap.

My hunch is that this is likely due to a combination of the TUT's memory footprint increasing with the increased volume of metadata, while at the same time the PTS itself will be consuming significantly more memory given that its workpad is storing all of its results in Maps (that reside on the JVM heap) -- and thus as the volume of metadata increases, the number of tests (and thus results) stored therein also significantly increases.

This issue is therefore a suggestion that we separate out these two servers into their own OMAG Server Platforms, each therefore having its own dedicated JVM heap space to allocate accordingly.

(We will still hit these limits again at some point, but hopefully not until we reach into multiple millions of metadata instances -- and it will be good to confirm at that point whether it is the PTS server itself that runs out of heap or the TUT.)

Correct broken link to k8s docs & bad install instruction

The k8s docs on egeria-docs were changed in

Need to correct the link in the top level readme (currently https://odpi.github.io/egeria-docs/guides/admin/kubernetes/intro/)

See odpi/egeria-docs#120 3a0874068e8b0a75353789c33fe8780077c91da8

Occasional failure for cts tests to start

On a few occasions when testing CTS, the tests never start.

The symptom is

Mon Mar 28 16:43:15 GMT 2022 tut Cohort OCF-FILE-REGISTRY-STORE-CONNECTOR-0115 Creating new cohort registry store ./data/servers/tut/cohorts/cts.registrystore
Mon Mar 28 16:43:15 GMT 2022 tut Startup OCF-KAFKA-TOPIC-CONNECTOR-0010 The Apache Kafka producer for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances is starting up with 0 buffered messages
Mon Mar 28 16:43:15 GMT 2022 tut Startup OMRS-AUDIT-0015 The listener thread for an OMRS Topic Connector for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances has started
Mon Mar 28 16:43:15 GMT 2022 tut Cohort OMRS-AUDIT-0060 Registering with open metadata repository cohort cts using metadata collection id 278cf6da-3f76-4cf0-9153-eebc93b25366
Mon Mar 28 16:43:15 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Mon Mar 28 16:43:15 GMT 2022 tut Cohort OMRS-AUDIT-0062 Requesting registration information from other members of the open metadata repository cohort cts
Mon Mar 28 16:43:15 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Mon Mar 28 16:43:15 GMT 2022 tut Startup OMRS-AUDIT-0031 The local repository outbound event manager is starting with 1 type definition event consumer(s) and 1 instance event consumer(s)
Mon Mar 28 16:43:15 GMT 2022 tut Startup OMRS-AUDIT-0032 The local repository outbound event manager is sending out the 874 type definition events that were generated and buffered during server initialization
Mon Mar 28 16:43:16 GMT 2022 tut Startup OMAG-ADMIN-0004 The tut server has successfully completed start up.  The following services are running: [Open Metadata Repository Services (OMRS)]
Mon Mar 28 16:44:55 GMT 2022 cts Information CONFORMANCE-SUITE-0008 The Open Metadata Repository Conformance Workbench repository-workbench is waiting for server tut to join the cohort
Mon Mar 28 16:46:36 GMT 2022 cts Information CONFORMANCE-SUITE-0008 The Open Metadata Repository Conformance Workbench repository-workbench is waiting for server tut to join the cohort

It's likely this is some kind of kafka/cohort registry issue

CTS / PTS chart enhancements

Unbundle the init-and-report pod of the CTS and PTS charts to separate out the configuration and startup of the CTS/PTS and the busy-wait loop that eventually collects the detailed outputs.

This is primarily to ensure the configuration and startup is only ever done once, while if the busy-wait loop / results collection should fail for some reason it can safely be re-run without impacting the actual results.

(Currently if the init-and-report pod fails it will be restarted, causing a reconfiguration and new instance calls to be made against the CTS/PTS itself which will destroy any results that may have been collected -- in high volume PTS scenarios that may take days to run, this is a significant loss!)

correct server author config in base egeria chart

Document coding standard for helm charts

Document standards for helm charts
(#107 is a good example where there are approaches we've taken to date that are not clear to others contributing charts; documenting them also provides a mechanism for reviewing & updating guidelines)

storageClass in values.yaml not used correctly in platform.yaml

https://github.com/odpi/egeria-charts/blob/main/charts/egeria-base/templates/platform.yaml
The last lines of the template linked above contain a typo.

The first line below uses storageClass (starting with lowercase), but the second line uses StorageClass (starting with uppercase).
Since the lookup is case sensitive, the value will not be used to set the storageClassName.

{{ if .Values.egeria.storageClass }}
storageClassName: {{ .Values.egeria.StorageClass }}
{{ end }}

Example helm chart for postgres

Provide an example helm chart that

uses the 'egeria-base' chart as a starting point
adds postgres image
adds appropriate configuration for postgres

Depends on:
https://github.com/odpi/egeria/issues/5379
https://github.com/odpi/egeria/issues/1514

Planning to move this to new 'egeria-charts' repo once available.

PR verification - enforce SPDX headers

SPDX headers should be enforced

Add ssh capability between deployed containers (odpi-egeria-lab)

When building demos in our coco pharma environment, on occasion we need to simulate things happening on different systems, outside the scope of egeria operations. In particular this is often in the context of running a demo 'script' via a notebook.

For example

Files being created/deleted in a filesystem
Manipulation of a third party tool (for example, postgres) to simulate activity
Manipulating files before they are made available to the server, where there is no client access (intentionally, ie with content packs)
Directly running egeria samples and utilities
viewing logs, files, perhaps grepping out for errors or demonstrating what one might expect to see - not just for egeria

In some cases there are also API possibilities (files, postgres), sometimes not (egeria utilities -- if we want to demonstrate the utility itself, rather than the API)

Ensuring ssh access between these containers USED FOR A DEMO, particularly from the jupyter environment to the other containers, will make it possible to develop these scenarios more quickly, often before more fundamental changes (adding a file server, finding the best python libraries to use etc) are in place - ie it is more adaptable.

Note that access to the k8s cli is another option (kubectl exec) but in general I would err on using ssh as it's more understandable for most -- unless it is k8s itself being demoed.

Also note that our containers DO NOT RUN AS ROOT, so there are some limits in what can be run. Also containers in general are cut down in what commands are installed.

Data Lineage - helm chart update

Tested latest helm chart for lineage - 3.3.0-prerelease.0

However the data lineage chart is not included, since the included jupyter image (which includes our notebooks) is at version 3.2

As this is a 3.3 pre-release, updating version of all images to 3.3

Create updated prerelease chart with the 3.7 UI

The reactUI provided in version 3.7.0 of the charts is based off 3.5.0
This was due to needing a fix for : odpi/egeria-react-ui#391

This has now been tested, so I will create new prerelease charts (probably 3.7.1-prerelease.0) that deploy the new UI, and test them. These will use UI version 3.8.0-rc.0 and will only be advised to users on request.

The other components in this chart will remain at the same levels

cts (pts?) openshift updates - security, memory

The current cts chart fails due to security limitations on openshift

  Warning  FailedCreate  77s (x15 over 2m40s)  statefulset-controller  create Pod cts-kafka-0 in StatefulSet cts-kafka failed error: pods "cts-kafka-0" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1001}: 1001 is not an allowed group, spec.containers[0].securityContext.runAsUser: Invalid value: 1001: must be in the ranges: [1000670000, 1000679999], provider "ibm-restricted-scc": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "sparkscc": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-scc": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostpath-scc": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostaccess-scc": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "ibm-privileged-scc": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]

Approaches to fix include

Use strimzi, as with our other charts. This also means cts will work on apple silicon & pi - though the startup does take significantly longer (several minutes). However, given that CTS can take a long time to run, this is likely not an issue
Try and fix up the default security context etc as setup by the current cts chart
Add the service account used for this chart in openshift to a less restrictive scc. This involves no code changes, but may impact anyone trying to run these charts

I will propose the first option, as it is aligned with our changes for the base/lab charts. In future it would also allow for better/more realistic kafka throughput with multiple brokers easily scaled.
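
For reference, the third option above (moving the chart's service account to a less restrictive scc) might look roughly like the sketch below. The wrapper name `grant_scc` and the example argument values are hypothetical; the service account a given release actually uses would need to be looked up first.

```shell
#!/bin/sh
# Sketch: grant an SCC to the service account used by a chart release.
# All three arguments are placeholders supplied by the caller.
grant_scc() {
    scc=$1              # e.g. anyuid
    service_account=$2  # hypothetical service account name
    namespace=$3
    oc adm policy add-scc-to-user "$scc" -z "$service_account" -n "$namespace"
}

# Example call (values are illustrative only):
#   grant_scc anyuid default egeria
```

This requires cluster-admin rights, which is part of why it may not suit everyone trying to run the charts.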

Some hostnames etc may change
cc: @cmgrote

Configurable CTS chart

Provide a chart that can be used to run the CTS (repository workbench) against any repository, simply by overriding some of the inputs to the chart.

Shell typo in config-egeria.sh

Reported by snyk:

ShellCheck
Fix rate: > 60%
Command name starts with =. Bad line break?

 charts/.../config-egeria.sh 57:1
 Shellcheck

CTS chart clobbering LOADER_PATH of base image

The default CTS values.yaml file is insufficient to successfully run the CTS with only the default values therein, because it refers to the Graph repository connector and this connector must now be downloaded separately (no longer embedded in the Egeria images).

Fix should be relatively straightforward:

Add a downloads.url value that points to the location for downloading the latest released version of the graph connector (from https://search.maven.org/artifact/org.odpi.egeria/graph-repository-connector)

(Release 3.3) Egeria container fails to start

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

When testing the egeria 3.3 release, the egeria container is failing to start with the error below

$ kubectl logs lab-odpi-egeria-lab-dev-0 [14:57:39]
Starting the Java application using /opt/jboss/container/java/run/run-java.sh ...
INFO exec java -XX:+UseParallelOldGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+ExitOnOutOfMemoryError -XX:MaxMetaspaceSize=1g -cp "." -jar /deployments/server/server-chassis-spring-3.3-SNAPSHOT.jar
Project Egeria - Open Metadata and Governance
(Egeria ASCII-art startup banner)

:: Powered by Spring Boot (v2.5.5) ::

2021-10-28 13:57:15.111 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 9443 (https)
2021-10-28 13:57:29.413 ERROR 1 --- [ main] o.s.boot.SpringApplication : Application run failed

org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'modelConverterRegistrar' defined in class path resource [org/springdoc/core/SpringDocConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springdoc.core.converters.ModelConverterRegistrar]: Factory method 'modelConverterRegistrar' threw exception; nested exception is java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:658) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:638) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1352) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1195) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:582) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:944) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918) ~[spring-context-5.3.10.jar!/:5.3.10]
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583) ~[spring-context-5.3.10.jar!/:5.3.10]
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:145) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:434) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:338) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1332) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.odpi.openmetadata.serverchassis.springboot.OMAGServerPlatform.main(OMAGServerPlatform.java:93) ~[classes!/:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.Launcher.launch(Launcher.java:108) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.PropertiesLauncher.main(PropertiesLauncher.java:467) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springdoc.core.converters.ModelConverterRegistrar]: Factory method 'modelConverterRegistrar' threw exception; nested exception is java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:653) ~[spring-beans-5.3.10.jar!/:5.3.10]
... 27 common frames omitted
Caused by: java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at io.swagger.v3.core.util.Json.mapper(Json.java:13) ~[swagger-core-2.1.11.jar!/:2.1.11]
at io.swagger.v3.core.converter.ModelConverters.(ModelConverters.java:31) ~[swagger-core-2.1.11.jar!/:2.1.11]
at io.swagger.v3.core.converter.ModelConverters.(ModelConverters.java:23) ~[swagger-core-2.1.11.jar!/:2.1.11]
at org.springdoc.core.converters.ModelConverterRegistrar.(ModelConverterRegistrar.java:42) ~[springdoc-openapi-common-1.5.12.jar!/:1.5.12]
at org.springdoc.core.SpringDocConfiguration.modelConverterRegistrar(SpringDocConfiguration.java:229) ~[springdoc-openapi-common-1.5.12.jar!/:1.5.12]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.10.jar!/:5.3.10]
... 28 common frames omitted
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.dataformat.yaml.YAMLFactory
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476) ~[na:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589) ~[na:na]
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:151) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) ~[na:na]
... 38 common frames omitted

Expected Behavior

chassis should launch ok

Steps To Reproduce

No response

Environment

- Egeria:
- OS:
- Java:
- Browser (for UI issues):
- Additional connectors and integration:

Any Further Information?

No response

Incorporate enhancements from contributed charts

Tracking the PR proposals as #107

Review needs to be completed & any followup work identified

(opening to allow prioritization via zenhub)

Add option to use XTDB in coco/base charts

Currently both charts use inmemory or the graph repo. In memory isn't persistent - fine for the first steps of a demo, but can be frustrating if developers use the environment as a springboard for further investigation. The graph repo is slow.

Meanwhile xtdb is compelling, the performance figures look good, so we could point people down a better path by including xtdb in these charts

Additionally I need xtdb to support my operator work. Whilst the charts using the operator will be rather different in terms of egeria, the setup of xtdb would follow a similar pattern

egeria-base chart: Server author fails with "Error getting all servers"

The 3.4 pre-release 'base' helm chart fails to work with Server Author, reporting 'Error getting all servers' when Server Author is clicked in the UI. This was also present in 3.3.

The lab chart works ok, so the base chart is likely missing configuration that has since been added to the notebooks and that we now need to add to this chart.
See odpi/egeria#5903 where the fix was added to that notebook environment only for 3.3

egeria-base chart fails on clean openshift install

In addition to #157 which affected all charts on a clean openshift install, the egeria-base chart 3.9.1 fails during configuration with:

{"class":"VoidResponse","relatedHTTPCode":500,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","exceptionCausedBy":"org.odpi.openmetadata.frameworks.connectors.ffdc.OCFRuntimeException","actionDescription":"setServerURLRoot","exceptionErrorMessage":"OMAG-ADMIN-500-001 Method setServerURLRoot for OMAG server mds1 returned an unexpected exception of org.odpi.openmetadata.frameworks.connectors.ffdc.OCFRuntimeException with message ENCRYPTED-DOC-STORE-400-008  Unable to create secure location for storing encryption key.","exceptionErrorMessageId":"OMAG-ADMIN-500-001","exceptionErrorMessageParameters":["mds1","setServerURLRoot","org.odpi.openmetadata.frameworks.connectors.ffdc.OCFRuntimeException","ENCRYPTED-DOC-STORE-400-008  Unable to create secure location for storing encryption key."],"exceptionSystemAction":"The system is unable to configure the OMAG server.  No change was made to the server's configuration document.","exceptionUserAction":"This is likely to be either an operational or logic error. Look for other errors.  Validate the request.  If you are stuck, raise an issue."} [1160 bytes data]
^M100  1153    0  1153    0     0   3117      0 --:--:-- --:--:-- --:--:--  3141
* Connection #0 to host base-platform left intact

Also the configuration script does not check the response, so the configuration continues & it is not clear it is not working until one tries to use it...
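
A minimal sketch of the kind of check the script could make, assuming each admin call returns a JSON body containing relatedHTTPCode as in the response above. The function name `check_egeria_response` is hypothetical, and sed is used here only to avoid depending on a JSON parser being present in the image.

```shell
#!/bin/sh
# Sketch: succeed only when the response body reports relatedHTTPCode 200.
check_egeria_response() {
    code=$(printf '%s\n' "$1" |
        sed -n 's/.*"relatedHTTPCode":[[:space:]]*\([0-9][0-9]*\).*/\1/p')
    if [ "$code" != "200" ]; then
        echo "Configuration call failed (relatedHTTPCode=${code:-unknown})" >&2
        return 1
    fi
    return 0
}

# Example use inside a configuration script (the URL variable is a placeholder):
#   response=$(curl -k -s -X POST "$some_admin_url")
#   check_egeria_response "$response" || exit 1
```

Stopping at the first failed call would make the problem visible in the job logs rather than only when the deployment is first used.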

The same chart works fine on rancher desktop.

There are no errors recorded in the audit log

The issue is likely the permissions/effective userids used for accessing the data volume

A workaround is to use a more liberal scc than the default. However the chart should work 'as is'

Validate charts by deploying and checking pods ready

Currently there is no validation testing of PRs.

We could deploy each chart against a kind environment (see the postgres database connector pipelines)
and wait to at least check all pods are ready
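
As a rough sketch of that idea, assuming a kind cluster already exists and is the current kubectl context: the function name `verify_chart`, the release name and the default timeout are all illustrative choices.

```shell
#!/bin/sh
# Sketch: install a chart, then wait for its pods to become Ready.
# Arguments: release name, chart path, optional timeout (default 600s).
verify_chart() {
    release=$1
    chart=$2
    timeout=${3:-600s}
    helm install "$release" "$chart" || return 1
    # Pods created by Jobs end up Completed and never report Ready,
    # so a real pipeline may need a label selector instead of --all.
    kubectl wait --for=condition=Ready pods --all --timeout="$timeout"
}

# Example call (values are illustrative only):
#   verify_chart base ./charts/egeria-base 600s
```

The function returns non-zero if either the install or the wait fails, so a PR workflow can use its exit status directly.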
