odpi / egeria-charts
Helm chart repository
Home Page: https://odpi.github.io/egeria-charts
License: Apache License 2.0
Follow-on from odpi/egeria#4670
We need to document, and make easy, how to configure the certificates and certificate authority (CA) used across all components when using our docker image (extra JVM parameters are a minimum) and helm charts. This also extends to the egeria operator.
Currently the image/charts rely on the default behaviour and file locations.
On some K8s clusters, such as IBM ICP, we must explicitly specify resources for each pod; this is an ICP requirement. Also, performance-wise, some Egeria components work much better with more resources available. On ICP the Atlas pod, for example, wouldn't even start without at least 2 CPUs, and a significant improvement is visible with cpu = 4, while igcproxy requires 2 CPUs.
It should be possible to optionally specify resources for each pod in values.yaml, e.g. (suggested default values):
resources:
  limits:
    cpu: 1
    memory: 2Gi
  requests:
    cpu: 10m
    memory: .01Gi
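Until the charts expose such settings, one way to experiment is to generate a small values override and pass it to helm at install time. The sketch below only prints the YAML. The `resources` layout follows the suggestion above, while the function name, argument order and any per-pod nesting are assumptions rather than keys the charts are known to read.

```shell
#!/bin/sh
# Print a Helm values override with resource limits and requests.
# ASSUMPTION: the chart reads a `resources` block shaped like the suggestion
# above; the function name and argument order are hypothetical.
resources_values() {
  # $1 = cpu limit, $2 = memory limit, $3 = cpu request, $4 = memory request
  cat <<EOF
resources:
  limits:
    cpu: $1
    memory: $2
  requests:
    cpu: $3
    memory: $4
EOF
}
```

For instance, `resources_values 4 8Gi 2 4Gi > atlas-resources.yaml` followed by `helm install -f atlas-resources.yaml ...` would ask for the four CPUs mentioned above; the memory figures are purely illustrative.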
Create an end-to-end demonstration of openlineage
This would be based on the coco helm charts, and would make use of the marquez docker image to show this integration
If helm gets out of sync with kubernetes -- for example due to a bug or timeout, various egeria resources are left active.
It's helpful to document how these can be cleaned up.
For example
kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=odpi-egeria-lab
kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=kafka
kubectl delete pods,services,deployments,statefulsets -l app.kubernetes.io/name=zookeeper
Similarly for odpi-egeria-vdc, adding whatever is needed for ldap.
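The same cleanup can be wrapped in a small helper so that the label list is easy to extend. This is a minimal sketch: the `run` and `cleanup_by_name` names and the `DRY_RUN` switch are hypothetical, and the commands act on whatever namespace kubectl currently targets.

```shell
#!/bin/sh
# Remove resources left behind by a failed or timed-out Helm release.
# Set DRY_RUN=1 to print the kubectl commands instead of running them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Delete leftover objects for every app.kubernetes.io/name value given.
cleanup_by_name() {
  for name in "$@"; do
    run kubectl delete pods,services,deployments,statefulsets \
      -l "app.kubernetes.io/name=${name}"
  done
}
```

Running `DRY_RUN=1 cleanup_by_name odpi-egeria-lab kafka zookeeper` first shows exactly what would be deleted before anything is removed.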
Docker Hub has limits (for free accounts) on pulling container images.
This can mean an attempt to use a Kubernetes Helm chart which uses nginx might fail as:
Warning Failed 56s kubelet Failed to pull image "docker.io/nginx": rpc error: code = Unknown desc = Error reading manifest latest in docker.io/library/nginx: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Image is pulled without error
See:
Current helm charts are:
NAME                    CHART VERSION       APP VERSION   DESCRIPTION
egeria/egeria-base      3.3-prerelease.1    3.3-SNAPSHOT  Egeria simple deployment to Kubernetes
egeria/egeria-cts       3.3-prerelease.0    3.3-SNAPSHOT  Egeria Conformance Test Suite deployment to Kub...
egeria/egeria-pts       3.3-prerelease.0    3.3-SNAPSHOT  Egeria Performance Test Suite deployment to Kub...
egeria/odpi-egeria-lab  3.3.0-prerelease.6  3.3-SNAPSHOT  Egeria lab environment
The 'to Kubernetes' part of the description is not needed.
Maybe
Egeria single-server example
Egeria Conformance Test Suite
Egeria Performance Test Suite
Egeria Coco Pharmaceuticals lab
Improve the docs for getting started with the egeria lab k8s environment, in particular locally.
Once a kubernetes environment is available, installing the helm chart, notebooks etc goes well.
However it would be useful to have a simple on-boarding route for those new to k8s.
Previously docker-for-desktop has been suggested, but this can be problematic
Look at options, possibly provide a few....
Our egeria-base chart is a good starting point for a simple egeria deployment (with persistence)
However, it has become clear when trying to add a new postgres chart (one example of many) that we may wish to easily add
These are common practice when developing helm charts and some useful tips are documented at https://dzone.com/articles/the-art-of-the-helm-chart-patterns-from-the-offici
In the egeria 3.3 prerelease (.2) chart, the Egeria UI is not loading under either OpenShift 4.8 or microk8s (macOS)
In both cases, going to the nginx-server endpoint results in a single webpage showing 'Loading ...'
It's possible there is a version discrepancy. Egeria ui chassis/egeria is at 3.3-SNAPSHOT, whilst egeria-ui is at 3.0.1
Note that this is the latest release (from August) as tested with egeria 3.2. However the release version is still set to 3.0.1 in main, meaning the images are constantly being updated rather than being set to a future prerelease (though this may not be the cause of this issue)
@lpalashevski @sarbull can you take a look? Let me know if you need anything. If we are to have the UI enabled for 3.3 we need to get a stable UI configuration in the charts.
When analysing egeria logs using LogDNA I noticed that whilst some of our pods have a fixed name in the helm chart, others are still uniquely generated ie:
For ease of filtering it would be useful if pod names were constant.
I also noticed the base chart has a pod with a unique name ie
base base-kafka-0 1/1 Running 1 5d3h
base base-zookeeper-0 1/1 Running 0 5d3h
base egeria-base-config-m88vh 0/1 Completed 0 5d3h
base egeria-base-platform-0
Constant names would make it easier to set up filters.
The pod name is already qualified by helm chart deployment name, which I think is suitably unique already.
Attempting to override the default ClusterIP services with a NodePort type instead, to ease external exposure for vanilla k8s clusters, does not work. Helm notes an error when rendering the templates:
Error: YAML parse error on odpi-egeria-lab/templates/egeria-core.yaml: error converting YAML to JSON: yaml: line 16: mapping values are not allowed in this context
Investigating the debug output of Helm indicates that the line-chomping - }} on the conditionals that wrap the NodePort configuration is a bit too aggressive, and is causing the nodePort: .... line to be directly concatenated to the end of the targetPort: ... value.
Currently the 'odpi-egeria-lab' contains
However
In part this is because the drive for this UI has focussed on different use cases, and has been more oriented to initial production use where it offers a Business level interface supporting functionality such as Open Lineage
For this reason it doesn't fit well in the current lab demo.
Therefore I propose to remove
From the 'odpi-egeria-lab'
Then, we can offer an 'odpi-egeria-lab-withui' or similar chart, which will have the odpi-egeria-lab chart as a dependency (which will include the other core elements), and then add in the ui static/api content & proxy
Furthermore since the release cycle of the UI & team ownership differs this allows for appropriate delegation and asynchronous development of the two pieces. The egeria release focussing on the core, and a UI release where the additional features can be developed & tested, notebooks updated & docs clarified.
Finally this tightens up a scenario where the operator can be dropped in more cleanly when ready.
This should add more flexibility, clarity & still retain the container-based demo deployment approach.
cc: @lpalashevski
Then tried to install odpi-egeria-lab
Install fails with:
Error: INSTALLATION FAILED: rendered manifests contain a resource that already exists. Unable to continue with install: ClusterRole "strimzi-cluster-operator-namespaced" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "lab": current value is "base"; annotation validation error: key "meta.helm.sh/release-namespace" must equal "lab": current value is "base"
Currently coco pharma exposes http/https for web ui via
If this is deployed to kubernetes in the cloud - or indeed locally, it requires at least 3 distinct load balancer configurations or port forwards, making the setup more convoluted (and possibly more costly) [feedback received]
We could consolidate these APIs all behind an nginx configuration, especially as we already use nginx. This would mean only a single port would need forwarding.
Additionally we should clarify in the docs the approaches to exposing this port
(Will move this issue to the docs repo once any code changes are complete)
Tools like prometheus are used extensively to understand how an application is behaving/performing in large scale environments
We already have many pieces to make this work for egeria
We should add documentation on what is already possible.
In addition should consider
See also odpi/egeria#5066
Environment: 3.1 Egeria release, and specifically the coco pharma lab environment. Using Egeria-UI 3.0.1
As such the UI is broken and undocumented in this demo environment.
For release 3.1 I will disable these components until such time as we fix, or refactor/drop the support as per #19
A user recently had problems running the odpi-egeria-vdc helm chart on 'microk8s' + linux.
Kafka failed to start up reporting name resolution issues - other containers would likely experience similar issues.
It appears microk8s may not enable DNS by default - however after trying an initial fix to enable DNS we switched to docker-for-desktop instead.
To a large extent we have to assume a 'working' k8s environment, and indeed most cloud environments just work at this level (openshift, IBM iks, azure, civo ...)
But as developers experiment with egeria some may have little knowledge of k8s - whilst docker-for-desktop works well on macOS, and to some extent on Windows, on Linux you are on your own. Even for developers using mac/windows they may need to run k8s on linux for resource reasons (ram etc)
The most likely environments may be
Raising issue
In the latest 3.3 prerelease chart, the egeria-ui cannot be accessed via :443 (directed at nginx service) due to
$ kubectl logs lab-odpi-egeria-lab-nginx-5d874c84d7-8q4bt [10:35:51]
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
20-envsubst-on-templates.sh: Running envsubst on /etc/nginx/templates/..data/default.conf.template to /etc/nginx/conf.d/..data/default.conf
20-envsubst-on-templates.sh: Running envsubst on /etc/nginx/templates/default.conf.template to /etc/nginx/conf.d/default.conf
20-envsubst-on-templates.sh: Running envsubst on /etc/nginx/templates/..2021_10_12_09_35_45.797156721/default.conf.template to /etc/nginx/conf.d/..2021_10_12_09_35_45.797156721/default.conf
/docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2021/10/12 09:35:53 [notice] 1#1: using the "epoll" event method
2021/10/12 09:35:53 [notice] 1#1: nginx/1.21.3
2021/10/12 09:35:53 [notice] 1#1: built by gcc 8.3.0 (Debian 8.3.0-6)
2021/10/12 09:35:53 [notice] 1#1: OS: Linux 3.10.0-1160.42.2.el7.x86_64
2021/10/12 09:35:53 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2021/10/12 09:35:53 [notice] 1#1: start worker processes
2021/10/12 09:35:53 [notice] 1#1: start worker process 42
2021/10/12 09:35:53 [notice] 1#1: start worker process 43
2021/10/12 09:35:53 [notice] 1#1: start worker process 44
2021/10/12 09:35:53 [notice] 1#1: start worker process 45
2021/10/12 09:35:53 [emerg] 42#42: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 42
2021/10/12 09:35:53 [notice] 1#1: worker process 42 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 42 exited with fatal code 2 and cannot be respawned
2021/10/12 09:35:53 [emerg] 43#43: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 43
2021/10/12 09:35:53 [notice] 1#1: worker process 43 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 43 exited with fatal code 2 and cannot be respawned
2021/10/12 09:35:53 [emerg] 44#44: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 44
2021/10/12 09:35:53 [notice] 1#1: worker process 44 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 44 exited with fatal code 2 and cannot be respawned
2021/10/12 09:35:53 [emerg] 45#45: setgid(101) failed (1: Operation not permitted)
2021/10/12 09:35:53 [notice] 1#1: signal 29 (SIGIO) received
2021/10/12 09:35:53 [notice] 1#1: signal 17 (SIGCHLD) received from 45
2021/10/12 09:35:53 [notice] 1#1: worker process 45 exited with code 2
2021/10/12 09:35:53 [alert] 1#1: worker process 45 exited with fatal code 2 and cannot be respawned
cc: @lpalashevski
For day 1 of the egeria dojo we want to run a containerized version of egeria with zero configuration done, so that the user can be guided through that configuration as part of the dojo.
The egeria-base chart currently runs an OOTB configuration with a single metadata server and all OMASs enabled.
Either we need an alternative chart based on this one
Configuration
Our helm charts use kafka for cohort communication.
We currently use kafka charts provided by bitnami - https://bitnami.com/stack/kafka/helm . These do not yet support arm64, which is needed to run without emulation (if available!) on Apple M1 and Raspberry Pi (64-bit Raspbian).
Options to allow our charts to work include
On the last point: changing the chart to do this is relatively easy. However, it adds a lot of complexity for the user, and these charts are intended to provide a simple out-of-the-box experience to support tutorials and demos. One of the complexities is the network environment, especially if everything else is running in a restricted network managed by the container runtime.
References
Provide a chart that can be used to run the PTS (performance test suite) against any repository, simply by overriding some of the inputs to the chart.
Our helm charts refer to images such as 'nginx:latest'. This assumes a default registry of registry-1.docker.io
When using alternate container build tools such as 'podman', there is no default registry (by default....) and best practice is to fully qualify all image names
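As an illustration of what "fully qualified" means here, the sketch below expands a short image name the way Docker Hub references are conventionally written: official images live under `docker.io/library/`, and anything whose first path component looks like a host is left alone. The function name is hypothetical and the host test (a dot, a colon, or `localhost`) is a simplification.

```shell
#!/bin/sh
# Expand a short image reference into a fully qualified one.
# ASSUMPTION: unqualified names are meant to come from Docker Hub (docker.io).
qualify_image() {
  image=$1
  case "$image" in
    */*)
      first=${image%%/*}
      case "$first" in
        *.*|*:*|localhost) echo "$image" ;;   # already names a registry host
        *) echo "docker.io/$image" ;;         # user/repo on Docker Hub
      esac
      ;;
    *) echo "docker.io/library/$image" ;;     # official image, e.g. nginx
  esac
}
```

With this, `qualify_image nginx:latest` gives `docker.io/library/nginx:latest`, while an image that already names its registry, such as one on quay.io, passes through unchanged.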
The egeria-palisade project needs to make use of our 'lab' helm chart
To enable this easily, the egeria lab chart needs to be served via a helm repo as per https://helm.sh/docs/chart_repository/
Ultimately this needs to be done as part of the build process, to ensure these charts are always current. However for expediency in integration the generated file is being checked in to enable this reuse quickly and will be refined in time.
For now only the 'lab' chart is exposed.
Part 1 - create repo (can only be tested after merge due to the way github pages works....)
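Serving a chart from a static site comes down to packaging it and generating an index file next to the package. The sketch below shows those two helm steps behind a dry-run switch. The directory names in the usage note are hypothetical, and `run`, `publish_chart` and `DRY_RUN` are illustrative helpers rather than anything in the repository.

```shell
#!/bin/sh
# Package a chart and (re)generate the repository index beside it.
# Set DRY_RUN=1 to print the helm commands instead of running them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

publish_chart() {
  chart_dir=$1   # chart source directory (hypothetical layout)
  out_dir=$2     # directory published by the static site (hypothetical)
  repo_url=$3    # public URL under which out_dir is served
  run helm package "$chart_dir" --destination "$out_dir" || return 1
  run helm repo index "$out_dir" --url "$repo_url"
}
```

For example, `publish_chart charts/odpi-egeria-lab docs https://odpi.github.io/egeria-charts` would leave a packaged chart and an `index.yaml` in `docs`, which is the generated file that then needs to be checked in.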
Currently if the cts/pts charts are set up to use the 'native' providers and we set
tut:
  connectorProvider: "org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider"
then the cts (for example) will fail to initialize with
> Configuring technology under test:
{"class":"VoidResponse","relatedHTTPCode":200}
(200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-url-root?url=https://cts-platform:9443)
{"class":"VoidResponse","relatedHTTPCode":200}
(200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/server-type?typeName=TUT)
{"class":"VoidResponse","relatedHTTPCode":200}
(200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/organization-name?name=Egeria)
{"class":"VoidResponse","relatedHTTPCode":400,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","actionDescription":"setLocalMetadataCollectionName","exceptionErrorMessage":"OMAG-ADMIN-400-008 The local repository mode has not been set for OMAG server tut","exceptionErrorMessageId":"OMAG-ADMIN-400-008","exceptionErrorMessageParameters":["tut"],"exceptionSystemAction":"The local repository mode must be enabled before the event mapper connection is set. The system is unable to configure the local server.","exceptionUserAction":"The local repository mode is supplied by the caller to the OMAG server. This call to enable the local repository needs to be made before the call to set the event mapper connection."}
(200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/local-repository/metadata-collection-name/TUT_MDR)
{"class":"VoidResponse","relatedHTTPCode":200}
(200 - https://cts-platform:9443/open-metadata/admin-services/users/admin/servers/tut/event-bus?topicURLRoot=egeria)
-- Unknown native repository provider: org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider -- exiting.
This is because the logic in cts/pts for native is
if [ "${TUT_TYPE}" = "native" ]; then
  if [ "${CONNECTOR_PROVIDER}" = "org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider" ]; then
    # Only the graph repository provider is recognised here
    curl -f -k -w "\n (%{http_code} - %{url_effective})\n" --silent -X POST \
      "${EGERIA_ENDPOINT}/open-metadata/admin-services/users/${EGERIA_USER}/servers/${TUT_SERVER}/local-repository/mode/local-graph-repository" || exit $?
  else
    # Any other provider, including the in-memory one, ends the script
    echo "-- Unknown native repository provider: ${CONNECTOR_PROVIDER} -- exiting."
    exit 1
  fi
fi
This condition should be extended to allow for the in-mem repository & perform the appropriate configuration
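One possible shape for the extended condition is a lookup from connector provider to repository mode, which keeps the curl call itself unchanged. This is a sketch only: the function name is hypothetical, and the `in-memory-repository` mode segment is an assumption about the admin services that should be checked against the Egeria release in use.

```shell
#!/bin/sh
# Map a native connector provider class to the local-repository mode segment
# used in the admin services URL. Returns non-zero for unknown providers.
repository_mode_for() {
  case "$1" in
    org.odpi.openmetadata.adapters.repositoryservices.graphrepository.repositoryconnector.GraphOMRSRepositoryConnectorProvider)
      echo "local-graph-repository" ;;
    org.odpi.openmetadata.adapters.repositoryservices.inmemory.repositoryconnector.InMemoryOMRSRepositoryConnectorProvider)
      # ASSUMPTION: mode name for the in-memory repository
      echo "in-memory-repository" ;;
    *)
      return 1 ;;
  esac
}
```

The script could then call `repository_mode_for "${CONNECTOR_PROVIDER}"`, append the result to the `.../local-repository/mode/` URL, and keep the existing "Unknown native repository provider" message for the failure case.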
Currently the default pullPolicy for images is set to IfNotPresent to reduce the impact on container registries, and specifically to avoid running out of limits on dockerhub. For final/released code which does not change this is fine.
However, when running charts that refer to our -SNAPSHOT builds (i.e. in the latter stages of testing a release, or in development) one should override the setting with Always to get the latest container image. This is easily done but can be forgotten.
Rather than keep changing charts to switch the default between prerelease and final, it would make sense to modify the logic so that the default pullPolicy is set for an image based on whether it is a SNAPSHOT build. We have done this in other helm charts previously.
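The decision itself is small. Expressed in shell for illustration (in a chart it would more likely live in a template helper), it could look like the sketch below. The function name is hypothetical, and treating only tags that end in `-SNAPSHOT` as moving tags is an assumption, so a tag such as `latest` is not handled.

```shell
#!/bin/sh
# Choose an imagePullPolicy from the image tag: moving -SNAPSHOT tags are
# always re-pulled, anything else is treated as immutable.
pull_policy_for_tag() {
  case "$1" in
    *-SNAPSHOT) echo "Always" ;;
    *)          echo "IfNotPresent" ;;
  esac
}
```

An explicit pullPolicy set by the user would still need to take precedence over this default.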
When running the lab chart on a completely freshly installed (v4.10) OpenShift cluster I get:
$ kubectl logs lab-strimzi-zookeeper-0
Detected Zookeeper ID 1
mkdir: cannot create directory '/var/lib/zookeeper/data': Permission denied
This is from Strimzi 0.29, with the image 'quay.io/strimzi/kafka:0.29.0-kafka-3.0.0'
I try to get started with ODPi using the descriptions found here: https://egeria.odpi.org/open-metadata-resources/open-metadata-labs/
I installed the lab in my Openshift cluster using helm. On startup of the pod lab-odpi-egeria-lab-jupyter I get the event:
"Error creating: pods "lab-odpi-egeria-lab-jupyter-56f7fb969f-" is forbidden: unable to validate against any security context constraint: [fsGroup: Invalid value: []int64{100}: 100 is not an allowed group]"
and the pod is not starting.
Any suggestions?
Currently the PTS chart configures the PTS server within the same OMAG Server Platform as the technology under test (TUT) server. While this seems to work at relatively small scales (up to ~300 000 metadata instances), once we grow beyond this (to 500 000+) we seem to hit scenarios where we overrun the Java heap.
My hunch is that this is likely due to a combination of the TUT's memory footprint increasing with the increased volume of metadata, while at the same time the PTS itself will be consuming significantly more memory given that its workpad is storing all of its results in Maps (that reside on the JVM heap) -- and thus as the volume of metadata increases, the number of tests (and thus results) stored therein also significantly increases.
This issue is therefore a suggestion that we separate out these two servers into their own OMAG Server Platforms, each therefore having its own dedicated JVM heap space to allocate accordingly.
(We will still hit these limits again at some point, but hopefully not until we reach into multiple millions of metadata instances -- and it will be good to confirm at that point whether it is the PTS server itself that runs out of heap or the TUT.)
The k8s docs on egeria-docs were changed in
Need to correct the link in the top level readme (currently https://odpi.github.io/egeria-docs/guides/admin/kubernetes/intro/)
See odpi/egeria-docs#120 3a0874068e8b0a75353789c33fe8780077c91da8
On a few occasions when testing CTS, the tests never start.
The symptom is
Mon Mar 28 16:43:15 GMT 2022 tut Cohort OCF-FILE-REGISTRY-STORE-CONNECTOR-0115 Creating new cohort registry store ./data/servers/tut/cohorts/cts.registrystore
Mon Mar 28 16:43:15 GMT 2022 tut Startup OCF-KAFKA-TOPIC-CONNECTOR-0010 The Apache Kafka producer for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances is starting up with 0 buffered messages
Mon Mar 28 16:43:15 GMT 2022 tut Startup OMRS-AUDIT-0015 The listener thread for an OMRS Topic Connector for topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.instances has started
Mon Mar 28 16:43:15 GMT 2022 tut Cohort OMRS-AUDIT-0060 Registering with open metadata repository cohort cts using metadata collection id 278cf6da-3f76-4cf0-9153-eebc93b25366
Mon Mar 28 16:43:15 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Mon Mar 28 16:43:15 GMT 2022 tut Cohort OMRS-AUDIT-0062 Requesting registration information from other members of the open metadata repository cohort cts
Mon Mar 28 16:43:15 GMT 2022 tut Event OMRS-AUDIT-8009 The Open Metadata Repository Services (OMRS) has sent event of type Registry Event to the cohort topic egeria.openmetadata.repositoryservices.cohort.cts.OMRSTopic.registration
Mon Mar 28 16:43:15 GMT 2022 tut Startup OMRS-AUDIT-0031 The local repository outbound event manager is starting with 1 type definition event consumer(s) and 1 instance event consumer(s)
Mon Mar 28 16:43:15 GMT 2022 tut Startup OMRS-AUDIT-0032 The local repository outbound event manager is sending out the 874 type definition events that were generated and buffered during server initialization
Mon Mar 28 16:43:16 GMT 2022 tut Startup OMAG-ADMIN-0004 The tut server has successfully completed start up. The following services are running: [Open Metadata Repository Services (OMRS)]
Mon Mar 28 16:44:55 GMT 2022 cts Information CONFORMANCE-SUITE-0008 The Open Metadata Repository Conformance Workbench repository-workbench is waiting for server tut to join the cohort
Mon Mar 28 16:46:36 GMT 2022 cts Information CONFORMANCE-SUITE-0008 The Open Metadata Repository Conformance Workbench repository-workbench is waiting for server tut to join the cohort
It's likely this is some kind of kafka/cohort registry issue
Unbundle the init-and-report pod of the CTS and PTS charts to separate out the configuration and startup of the CTS/PTS from the busy-wait loop that eventually collects the detailed outputs.
This is primarily to ensure the configuration and startup is only ever done once, while if the busy-wait loop / results collection should fail for some reason it can safely be re-run without impacting the actual results.
(Currently if the init-and-report pod fails it will be restarted, causing a reconfiguration and new instance calls to be made against the CTS/PTS itself, which will destroy any results that may have been collected. In high-volume PTS scenarios that may take days to run, this is a significant loss!)
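The "only ever done once" property can be illustrated with a guard around the configuration step. This is a minimal sketch, not the proposed pod split itself: `run_once` is a hypothetical helper, and it assumes the marker file sits on storage that survives a pod restart.

```shell
#!/bin/sh
# Run a command only if it has not already succeeded, recording success in a
# marker file. ASSUMPTION: the marker path is on persistent storage.
run_once() {
  marker=$1
  shift
  if [ -e "$marker" ]; then
    return 0            # already done on an earlier run, skip
  fi
  "$@" && : > "$marker" # create the marker only when the command succeeds
}
```

With such a guard around configuration and startup, a restarted results-collection loop could resume polling without reconfiguring the CTS/PTS.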
Document standards for helm charts
(#107 is a good example where there are approaches we've taken to date that are not clear to others contributing charts. Documenting them also provides a mechanism for reviewing and updating the guidelines.)
https://github.com/odpi/egeria-charts/blob/main/charts/egeria-base/templates/platform.yaml
In the above link the last lines contain a typo.
The first line below has storageClass (starting with lowercase),
while the second line has StorageClass (starting with uppercase).
Since the lookup is case sensitive, the value will not be used to set the storageClassName.
{{ if .Values.egeria.storageClass }}
storageClassName: {{ .Values.egeria.StorageClass }}
{{ end }}
Provide an example helm chart that
Depends on:
https://github.com/odpi/egeria/issues/5379
https://github.com/odpi/egeria/issues/1514
Planning to move this to new 'egeria-charts' repo once available.
SPDX headers should be enforced
When building demos in our coco pharma environment, on occasion we need to simulate things happening on different systems, outside the scope of egeria operations. In particular this is often in the context of running a demo 'script' via a notebook.
For example
In some cases there are also API possibilities (files, postgres), sometimes not (egeria utilities -- if we want to demonstrate the utility itself, rather than the API)
Ensuring ssh access between these containers USED FOR A DEMO, particularly from the jupyter environment to the other containers, will make it possible to develop these scenarios more quickly, often before more fundamental changes (adding a file server, finding the best python libraries to use etc) are in place, i.e. it is more adaptable.
Note that access to the k8s cli is another option (kubectl exec) but in general I would err on using ssh as it's more understandable for most -- unless it is k8s itself being demoed.
Also note that our containers DO NOT RUN AS ROOT, so there are some limits in what can be run. Also containers in general are cut down in what commands are installed.
Tested latest helm chart for lineage - 3.3.0-prerelease.0
However the data lineage chart is not included since the included jupyter image (which includes our notebooks) is at version 3.2
As this is a 3.3 pre-release, updating version of all images to 3.3
The reactUI provided in version 3.7.0 of the charts is based off 3.5.0
This was due to needing a fix for : odpi/egeria-react-ui#391
This has now been tested, so I will create some new prerelease charts (3.7.1-prerelease.0 probably) which will deploy the new UI, and test. These will use UI version 3.8.0-rc.0 and will only be advised to users on request.
The other components in this chart will remain at the same levels
The current cts chart fails due to security limitations on openshift
Warning FailedCreate 77s (x15 over 2m40s) statefulset-controller create Pod cts-kafka-0 in StatefulSet cts-kafka failed error: pods "cts-kafka-0" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1001}: 1001 is not an allowed group, spec.containers[0].securityContext.runAsUser: Invalid value: 1001: must be in the ranges: [1000670000, 1000679999], provider "ibm-restricted-scc": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "sparkscc": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-scc": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostpath-scc": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostaccess-scc": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "ibm-privileged-scc": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
Approaches to fix include
I will propose the first option, as it is aligned with our changes for the base/lab charts. In future it would also allow for better/more realistic kafka throughput with multiple brokers easily scaled.
Some hostnames etc may change
cc: @cmgrote
Provide a chart that can be used to run the CTS (repository workbench) against any repository, simply by overriding some of the inputs to the chart.
Reported by snyk:
ShellCheck
Fix rate: > 60%
Command name starts with =. Bad line break?
charts/.../config-egeria.sh 57:1
Shellcheck
The default CTS values.yaml file is insufficient to successfully run the CTS with only the default values therein, because it refers to the Graph repository connector and this connector must now be downloaded separately (it is no longer embedded in the Egeria images).
Fix should be relatively straightforward: provide a downloads.url value that points to the location for downloading the latest released version of the graph connector (from https://search.maven.org/artifact/org.odpi.egeria/graph-repository-connector)
When testing the egeria 3.3 release, the egeria container is failing to start with the error below
$ kubectl logs lab-odpi-egeria-lab-dev-0 [14:57:39]
Starting the Java application using /opt/jboss/container/java/run/run-java.sh ...
INFO exec java -XX:+UseParallelOldGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+ExitOnOutOfMemoryError -XX:MaxMetaspaceSize=1g -cp "." -jar /deployments/server/server-chassis-spring-3.3-SNAPSHOT.jar
Project Egeria - Open Metadata and Governance
:: Powered by Spring Boot (v2.5.5) ::
2021-10-28 13:57:15.111 INFO 1 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 9443 (https)
2021-10-28 13:57:29.413 ERROR 1 --- [ main] o.s.boot.SpringApplication : Application run failed
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'modelConverterRegistrar' defined in class path resource [org/springdoc/core/SpringDocConfiguration.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springdoc.core.converters.ModelConverterRegistrar]: Factory method 'modelConverterRegistrar' threw exception; nested exception is java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:658) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:638) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.instantiateUsingFactoryMethod(AbstractAutowireCapableBeanFactory.java:1352) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1195) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:582) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:944) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918) ~[spring-context-5.3.10.jar!/:5.3.10]
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583) ~[spring-context-5.3.10.jar!/:5.3.10]
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:145) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:754) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.refreshContext(SpringApplication.java:434) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:338) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1343) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1332) ~[spring-boot-2.5.5.jar!/:2.5.5]
at org.odpi.openmetadata.serverchassis.springboot.OMAGServerPlatform.main(OMAGServerPlatform.java:93) ~[classes!/:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:49) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.Launcher.launch(Launcher.java:108) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.Launcher.launch(Launcher.java:58) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at org.springframework.boot.loader.PropertiesLauncher.main(PropertiesLauncher.java:467) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [org.springdoc.core.converters.ModelConverterRegistrar]: Factory method 'modelConverterRegistrar' threw exception; nested exception is java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185) ~[spring-beans-5.3.10.jar!/:5.3.10]
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:653) ~[spring-beans-5.3.10.jar!/:5.3.10]
... 27 common frames omitted
Caused by: java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/yaml/YAMLFactory
at io.swagger.v3.core.util.Json.mapper(Json.java:13) ~[swagger-core-2.1.11.jar!/:2.1.11]
at io.swagger.v3.core.converter.ModelConverters.<init>(ModelConverters.java:31) ~[swagger-core-2.1.11.jar!/:2.1.11]
at io.swagger.v3.core.converter.ModelConverters.<clinit>(ModelConverters.java:23) ~[swagger-core-2.1.11.jar!/:2.1.11]
at org.springdoc.core.converters.ModelConverterRegistrar.<init>(ModelConverterRegistrar.java:42) ~[springdoc-openapi-common-1.5.12.jar!/:1.5.12]
at org.springdoc.core.SpringDocConfiguration.modelConverterRegistrar(SpringDocConfiguration.java:229) ~[springdoc-openapi-common-1.5.12.jar!/:1.5.12]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:na]
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:na]
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:566) ~[na:na]
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.10.jar!/:5.3.10]
... 28 common frames omitted
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.dataformat.yaml.YAMLFactory
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476) ~[na:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589) ~[na:na]
at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:151) ~[server-chassis-spring-3.3-SNAPSHOT.jar:na]
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522) ~[na:na]
... 38 common frames omitted
The chassis should launch without errors.
Tracking the PR proposals as #107
(opening to allow prioritization via zenhub)
Currently both charts use either the in-memory or the graph repository. The in-memory repository isn't persistent, which is fine for the first steps of a demo but can be frustrating if developers use the environment as a springboard for further investigation. The graph repository is slow.
Meanwhile xtdb is compelling and its performance figures look good, so we could point people down a better path by including xtdb in these charts.
Additionally, I need xtdb to support my operator work. Whilst the charts using the operator will be rather different in terms of egeria, the setup of xtdb would follow a similar pattern.
The 3.4 pre-release helm chart 'base' fails to work with Server Author, reporting 'Error getting all servers' when this is clicked in the UI. This was also present in 3.3.
The lab chart works ok, so the base chart is likely missing configuration that has since been added to the notebooks and that we need to add to this chart.
See odpi/egeria#5903 where the fix was added to that notebook environment only for 3.3
In addition to #157 which affected all charts on a clean openshift install, the egeria-base chart 3.9.1 fails during configuration with:
{"class":"VoidResponse","relatedHTTPCode":500,"exceptionClassName":"org.odpi.openmetadata.adminservices.ffdc.exception.OMAGConfigurationErrorException","exceptionCausedBy":"org.odpi.openmetadata.frameworks.connectors.ffdc.OCFRuntimeException","actionDescription":"setServerURLRoot","exceptionErrorMessage":"OMAG-ADMIN-500-001 Method setServerURLRoot for OMAG server mds1 returned an unexpected exception of org.odpi.openmetadata.frameworks.connectors.ffdc.OCFRuntimeException with message ENCRYPTED-DOC-STORE-400-008 Unable to create secure location for storing encryption key.","exceptionErrorMessageId":"OMAG-ADMIN-500-001","exceptionErrorMessageParameters":["mds1","setServerURLRoot","org.odpi.openmetadata.frameworks.connectors.ffdc.OCFRuntimeException","ENCRYPTED-DOC-STORE-400-008 Unable to create secure location for storing encryption key."],"exceptionSystemAction":"The system is unable to configure the OMAG server. No change was made to the server's configuration document.","exceptionUserAction":"This is likely to be either an operational or logic error. Look for other errors. Validate the request. If you are stuck, raise an issue."} [1160 bytes data]
^M100 1153 0 1153 0 0 3117 0 --:--:-- --:--:-- --:--:-- 3141
* Connection #0 to host base-platform left intact
Also, the configuration script does not check the response, so configuration continues and it is not clear that it has failed until one tries to use the environment.
The same chart works fine on Rancher Desktop.
There are no errors recorded in the audit log.
The issue is likely the permissions/effective user IDs used for accessing the data volume.
A workaround is to use a more liberal SCC than the default. However, the chart should work 'as is'.
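The missing response check described above could be addressed along these lines. This is only a minimal sketch in Python, not the chart's actual configuration script: `check_admin_response` and `ConfigStepError` are hypothetical names, and the only assumption about the payload is that it carries the `relatedHTTPCode` and `exceptionErrorMessage` fields visible in the error output quoted in this issue.

```python
import json


class ConfigStepError(RuntimeError):
    """Hypothetical error type: a configuration call did not report success."""


def check_admin_response(body: str, step: str) -> dict:
    """Parse a response body and fail loudly unless relatedHTTPCode is 2xx.

    `step` is a free-text label (for example the admin method being called)
    used only to make the error message easier to trace.
    """
    try:
        response = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ConfigStepError(f"{step}: response is not valid JSON") from exc

    if not isinstance(response, dict):
        raise ConfigStepError(f"{step}: response is not a JSON object")

    code = response.get("relatedHTTPCode")
    if not isinstance(code, int) or not 200 <= code < 300:
        # Field name taken from the error payload shown in this issue.
        message = response.get("exceptionErrorMessage", "no error message in response")
        raise ConfigStepError(f"{step}: relatedHTTPCode={code}: {message}")

    return response
```

A configuration job using a check like this would stop at the first failing call (here, `setServerURLRoot`) instead of continuing with a partially configured server.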
Currently there is no validation testing of PRs.
We could deploy each chart against a kind environment (see the postgres database connector pipelines) and wait to at least check that all pods are ready.
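The "wait until all pods are ready" step could be sketched as below. This is an illustration under stated assumptions rather than an existing pipeline: the function names, namespace argument, and timeout values are hypothetical, and it assumes `kubectl` is on the path and already pointed at the kind cluster. Pods in the `Succeeded` phase are treated as acceptable on the assumption that a chart may include one-shot jobs that finish and never report Ready.

```python
import json
import subprocess
import time


def unready_pods(pod_list: dict) -> list:
    """Return the names of pods that are neither Ready nor successfully completed."""
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue  # completed one-shot pods never report Ready
        conditions = status.get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(pod.get("metadata", {}).get("name", "<unnamed>"))
    return names


def wait_for_pods(namespace: str, timeout_s: int = 600, poll_s: int = 10) -> bool:
    """Poll `kubectl get pods` until every pod is ready or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        out = subprocess.run(
            ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
            check=True, capture_output=True, text=True,
        ).stdout
        pods = json.loads(out)
        # Require at least one pod so an empty namespace does not count as success.
        if pods.get("items") and not unready_pods(pods):
            return True
        time.sleep(poll_s)
    return False
```

A PR check could install a chart into a fresh namespace, call `wait_for_pods` on it, and fail the build when it returns `False`.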