This repository contains the assets and the runbook to follow to containerize SPM.
The runbook is available at this URL: https://merative.github.io/spm-kubernetes/
This repository contains artifacts to assist IBM Cúram SPM customers in their journey to Kubernetes.
License: Apache License 2.0
On the prerequisites page (https://ibm.github.io/spm-kubernetes/prereq/prereq/), the minimum OpenShift version in the table says 4.6, but note 8 still says 4.5: "(8) | IBM Cúram Social Program Management supports OpenShift 4.5 or later".
Frankly, note 8 doesn't provide any more information than the table itself; it may not be needed.
There is also a typo ("introducted") in note 10: "(10) | Support for Docker 20.10 was introducted as part of the SPM@Kubernetes 21.2.0 release."
https://ibm.github.io/spm-kubernetes/01-prereq/3rdparty-sw says to install the IBM Java SDK 8.0.5.41; however, following the link, there is no version for Mac? Thanks
The link to ChartMuseum is not working.
https://ibm.github.io/spm-kubernetes/03-deployment/hc_preparation/
When I try to install SPM (https://ibm.github.io/spm-kubernetes/01-deploy-spm/SPM-sw)
java -jar IBM\ Curam\ Social\ Program\ Management\ Platform\ Development.jar
I get the following error message:
Apr 14, 2020 8:41:50 PM java.io.ObjectInputStream filterCheck
INFO: ObjectInputFilter REJECTED: class com.sun.crypto.provider.SealedObjectForKeyProtector, array length: -1, nRefs: 1, depth: 1, bytes: 70, ex: n/a
java.io.IOException: Invalid secret key format
at com.ibm.crypto.provider.JceKeyStore.engineLoad(Unknown Source)
at java.security.KeyStore.load(KeyStore.java:1445)
at curam.util.security.Encryption$CryptoConfig.getKeyFromKeyStore(Encryption.java:1040)
at curam.util.security.Encryption$CryptoConfig.getCipherKey(Encryption.java:947)
at curam.util.security.Encryption.setConfiguration(Encryption.java:201)
at curam.util.security.EncryptionConfiguration.<init>(EncryptionConfiguration.java:31)
at curam.util.security.EncryptPassword.encryptDBPassword(EncryptPassword.java:16)
at curam.installerinf.panelhelpers.BootstrapHelper.encryptPassword(BootstrapHelper.java:204)
at curam.installerinf.panelhelpers.BootstrapHelper.updateBoostrapFile(BootstrapHelper.java:97)
at curam.installerinf.processmanager.BootstrapManager.run(BootstrapManager.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.izforge.izpack.installer.ProcessPanelWorker$ExecutableClass.run(Unknown Source)
at com.izforge.izpack.installer.ProcessPanelWorker$ProcessingJob.run(Unknown Source)
at com.izforge.izpack.installer.ProcessPanelWorker.run(Unknown Source)
at java.lang.Thread.run(Thread.java:748)
Apr 14, 2020 8:41:50 PM java.io.ObjectInputStream filterCheck
INFO: ObjectInputFilter REJECTED: class com.sun.crypto.provider.SealedObjectForKeyProtector, array length: -1, nRefs: 1, depth: 1, bytes: 70, ex: n/a
java.io.IOException: Invalid secret key format
at com.ibm.crypto.provider.JceKeyStore.engineLoad(Unknown Source)
I have even tried adding the unrestricted JCE jar files to my jre > security folder.
Platform: OpenShift 4.8
After a successful helm install/deployment, there don't seem to be any timestamps in the SPM App Consumer/Producer pod logs, making it very difficult to determine which messages belong to which test activity. Please see the attached logs.
dev01-apps-curam-producer-66788ccb54-rxb85-apps-producer-curam.log
dev01-apps-curam-consumer-856b44855c-ld77w-apps-consumer-curam.log
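A hedged suggestion, not from the runbook: WebSphere Liberty's console output format is controlled by the WLP_LOGGING_CONSOLE_FORMAT environment variable, and switching it from the default dev format to simple (or json) adds a timestamp to each line, assuming the Liberty version in the images supports those values:

```yaml
# Hypothetical container env override in the pod spec / values.yaml:
# the "simple" and "json" console formats include timestamps; "dev" does not.
env:
  - name: WLP_LOGGING_CONSOLE_FORMAT
    value: "simple"
```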
We support ESDC's High Fidelity Prototype Project whose goal is to successfully demonstrate Curam in an OpenShift environment. Our SI incorporates this repo, including charts, directly into a pipeline whenever we build and deploy the Curam application. Since the May 2021 update to the SPM-kubernetes repo, builds have been failing with:
Error: template: spm/charts/apps/templates/configmaps/configmap-sessions.yaml:38:12: executing "spm/charts/apps/templates/configmaps/configmap-sessions.yaml" at <include "apps.dsprops.fragment" (list . "CURAMSESSDB")>: error calling include: template: spm/charts/apps/templates/_database.tpl:50:10: executing "apps.dsprops.fragment" at <include "apps.oracleurl" .>: error calling include: template: spm/charts/apps/templates/_database.tpl:68:24: executing "apps.oracleurl" at <.Values.global.database>: can't evaluate field Values in type []interface {}
Appreciate your support, as this error fails our builds and has delayed progress on current sprints. We are available to provide further information upon request.
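For context, this class of Go template error usually means a template was included with a non-root context: when a helper is called with `(list . "CURAMSESSDB")`, the dot inside it is that list, so `.Values` can no longer resolve. A minimal illustrative fragment (hypothetical names, not the actual SPM chart code) shows the usual pattern of recovering the root context with `index`:

```yaml
{{/* Hypothetical helper: when included with a list argument, "." is the
     list, so ".Values" fails with:
     can't evaluate field Values in type []interface {} */}}
{{- define "example.dburl" -}}
{{- $root := index . 0 -}}  {{/* recover the root context from the list */}}
{{- $name := index . 1 -}}
jdbc:oracle:thin:@{{ $root.Values.global.database.host }}/{{ $name }}
{{- end -}}

{{/* caller */}}
{{ include "example.dburl" (list . "CURAMSESSDB") }}
```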
Regarding the ISAM integration functionality, which seems to be based on the generic SAML 2.0 integration feature (samlWeb-2.0) from WebSphere Liberty:
The "Configuration Reference" link on this page results in a 404 Not Found error: https://ibm.github.io/spm-kubernetes/deployment/hc_preparation
It attempts to link to: https://ibm.github.io/spm-kubernetes/config-reference
After creating the images (https://ibm.github.io/spm-kubernetes/build-images/build_images), the instructions only mention a local registry for Minikube. Is there guidance for using a local registry when following the OpenShift CRC "path"?
name: MQ Server pods fail when deploying to AWS using EFS storage
Describe the bug
When deploying the solution to an OpenShift 4.4 cluster running in AWS, using AWS EFS for persistent storage, the spm-mqserver-curam-0 and spm-mqserver-rest-0 pods fail with the message: Error setting admin password: /usr/bin/sudo: exit status 1: sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
To Reproduce
Steps to reproduce the behavior:
helm upgrade --install spm local-development/spm -f curam_containerisation/static/resources/os-values.yaml
Expected behavior
Solution to be deployed to OpenShift, and the spm-mqserver-curam-0 and spm-mqserver-rest-0 pods to move into a running state.
Please complete the following information:
* OpenShift Version: [4.4.26]
* Cúram SPM Version: [7.0.10]
Log Collection
2020-10-23T11:59:07.670Z CPU architecture: amd64
2020-10-23T11:59:07.670Z Linux kernel version: 4.18.0-193.23.1.el8_2.x86_64
2020-10-23T11:59:07.670Z Container runtime: kube
2020-10-23T11:59:07.670Z Base image: Red Hat Enterprise Linux Server 7.6 (Maipo)
2020-10-23T11:59:07.672Z Running as user ID 1000590000 (1000590000 user) with primary group 0, and supplementary groups 1000590000
2020-10-23T11:59:07.672Z Capabilities (bounding set): chown,dac_override,fowner,fsetid,setpcap,net_bind_service,net_raw,sys_chroot
2020-10-23T11:59:07.672Z seccomp enforcing mode: disabled
2020-10-23T11:59:07.672Z Process security attributes: system_u:system_r:container_t:s0:c19,c24
2020-10-23T11:59:07.672Z Detected 'nfs4' volume mounted to /mnt/mqm-log
2020-10-23T11:59:07.672Z Detected 'nfs4' volume mounted to /mnt/mqm-data
2020-10-23T11:59:07.672Z Detected 'nfs4' volume mounted to /mnt/mqm
2020-10-23T11:59:07.672Z Multi-instance queue manager: enabled
2020-10-23T11:59:07.674Z Error setting admin password: /usr/bin/sudo: exit status 1: sudo: effective uid is not 0, is /usr/bin/sudo on a file system with the 'nosuid' option set or an NFS file system without root privileges?
Issue:
When building the CuramBirtViewer image like so:
# Building curambirtviewer image
cd "${SPM_CONTAINERISATION_HOME}/dockerfiles/Liberty/"
docker build \
  --tag curambirtviewer:latest \
  --file ClientEAR.Dockerfile \
  --build-arg "SERVERCODE_IMAGE=servercode:latest" \
  --build-arg "EAR_NAME=CuramBIRTViewer" .
On deployment of this container using the SPM Helm charts, SPM constantly logs out the logged-in user. This was an issue in OpenLiberty/open-liberty#9663, which has now been fixed in Liberty version 20.0.0.6, the version that seems to be used to build the container images; however, the logouts still occur.
I am following the SPM Runbook for CodeReady Containers: https://ibm.github.io/spm-kubernetes/prereq/openshift/codeready-containers
When executing the following step:
docker login -u kubeadmin -p $(oc whoami -t) $(oc registry info --public)
I get the following Certificate error:
Error response from daemon: Get https://default-route-openshift-image-registry.apps-crc.testing/v1/users/: x509: certificate signed by unknown authority
Let me know if you have any thoughts or advice. Thanks!
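One hedged workaround (an assumption, not taken from the runbook): Docker rejects registries whose CA it does not trust, so either place the CRC route's CA certificate under /etc/docker/certs.d/default-route-openshift-image-registry.apps-crc.testing/ca.crt, or, for a throwaway local CRC sandbox only, mark the registry as insecure in Docker's daemon.json and restart the Docker daemon:

```json
{
  "insecure-registries": [
    "default-route-openshift-image-registry.apps-crc.testing"
  ]
}
```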
Describe the bug
Under the packaging-the-helm-charts section of the documentation, if the instructions are followed, pushing the helm charts fails with the error Error: ce-app chart not found in repo http://chartmuseum.spm-poc.scotgov-dt.internal. This is because the ce-app dependency doesn't have a conditional that defaults to false, and pushing the ce-app chart isn't mentioned in the instructions.
A conditional should be added to the code to allow the dependency to be disabled for users that don't need it, for example:
- name: ce-app
  version: "~1.0.0"
  repository: "@local-development"
  condition: global.ceApp.enabled
To Reproduce
Steps to reproduce the behavior:
helm repo add local-development ${CHART_REPO}
cd $SPM_CONTAINERISATION_HOME/helm-charts
helm push apps local-development
helm push mqserver local-development
helm push configmaps local-development
helm push xmlserver local-development
helm push batch local-development
helm push ihs local-development
helm repo update
helm dep up $SPM_CONTAINERISATION_HOME/helm-charts/spm/
Expected behavior
Helm packages the dependencies and uploads them to ChartMuseum.
Screenshots
Logs from the Jenkins build server:
+ ./build-charts.sh
"local-development" has been added to your repositories
Pushing apps-2.0.0.tgz to local-development...
Done.
Pushing mqserver-1.2.0.tgz to local-development...
Done.
Pushing configmaps-1.2.0.tgz to local-development...
Done.
Pushing xmlserver-1.1.1.tgz to local-development...
Done.
Pushing batch-1.1.1.tgz to local-development...
Done.
Pushing ihs-2.0.0.tgz to local-development...
Done.
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "local-development" chart repository
Update Complete. ⎈ Happy Helming!⎈
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "local-development" chart repository
Update Complete. ⎈Happy Helming!⎈
Error: ce-app chart not found in repo http://chartmuseum.spm-poc.scotgov-dt.internal
Please complete the following information:
* OS: CentOS Linux release 7.6.1810 (Core)
* Docker Version: Docker version 18.09.5, build e8ff056
* Minikube Version: N/a
* Ant Version: 1.10.6
* Java Version: Java 8 (version packaged with Liberty)
* Liberty Version: 19.0.0.12-full-java8-ibmjava
* Cúram SPM Version: 7.0.10
Additional context
Currently our build process uses Helm 3, which has the following bug around conditionally including dependencies: helm/helm#5780. However, the issue can still be worked around by commenting out the ce-app and openldap dependencies in the spm/requirements.yaml file, like:
+ ./build-charts.sh
"local-development" has been added to your repositories
Pushing apps-2.0.0.tgz to local-development...
Done.
Pushing mqserver-1.2.0.tgz to local-development...
Done.
Pushing configmaps-1.2.0.tgz to local-development...
Done.
Pushing xmlserver-1.1.1.tgz to local-development...
Done.
Pushing batch-1.1.1.tgz to local-development...
Done.
Pushing ihs-2.0.0.tgz to local-development...
Done.
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "local-development" chart repository
Update Complete. ⎈ Happy Helming!⎈
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "local-development" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 6 charts
Downloading apps from repo http://chartmuseum.spm-poc.scotgov-dt.internal
Downloading batch from repo http://chartmuseum.spm-poc.scotgov-dt.internal
Downloading configmaps from repo http://chartmuseum.spm-poc.scotgov-dt.internal
Downloading ihs from repo http://chartmuseum.spm-poc.scotgov-dt.internal
Downloading mqserver from repo http://chartmuseum.spm-poc.scotgov-dt.internal
Downloading xmlserver from repo http://chartmuseum.spm-poc.scotgov-dt.internal
Deleting outdated charts
Pushing spm-1.2.0.tgz to local-development...
Done.
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "local-development" chart repository
Update Complete. ⎈ Happy Helming!⎈
Log Collection
see Screenshots
Describe the bug
In the SPM umbrella chart, under helm-charts/spm/values.yaml, the ihs element is set twice, with similar keys present:
ihs:
  runAs: 1000
  replicaCount: 1
  readinessPath: /CuramStatic
  ingressPath: /CuramStatic/*
  resources: {}
and further down:
ihs:
  serviceType: NodePort
  config:
    readinessPath: /CuramStatic
    ingressPath: /CuramStatic/*
This is slightly confusing, and the second section's ingressPath element isn't picked up by the ingress.yaml template:
- path: {{ .Values.global.ihs.ingressPath | default "/CuramStatic" }}
  backend:
    serviceName: {{ $.Release.Name }}-ihs
    servicePort: http
On https://ibm.github.io/spm-kubernetes/01-prereq/Docker-Kubernetes-Helm/, in the section about Kubernetes and Minikube, the next-page link is broken.
The URL it takes you to is https://ibm.github.io/spm-kubernetes/01-prereq/Docker-Kubernetes-Helm/minikube
Issue:
In the Dockerfiles for CE and StaticContent, the FROM statement in both files is currently set to:
ARG DOCKER_REGISTRY="internal.docker.repository"
#Final
FROM ${DOCKER_REGISTRY}/ubi7/ibm-http-server:${HTTP_VERSION}
This repository can't be resolved and is inconsistent with the Liberty, batch, and MQ Dockerfiles.
Solution:
This should be reverted back to:
FROM ibmcom/ibm-http-server:${HTTP_VERSION}
I have been going through setting up SPM with OpenShift Service Mesh. I came across a few inconsistencies in the naming of ports and pod labels which caused some problems getting this set up.
Port names should follow the protocol-suffix convention.
For example, MQ server has:
name: console-https
To show metrics in Kiali, the expected name is:
name: https-console
It would also help to add the app and version labels for pods to the Helm charts, as these are what is displayed on the graph charts, rather than a user having to go through and add them individually.
Labels to add: name and version for each deployment.
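A sketch of both suggestions in one hypothetical fragment (the names are illustrative, not taken from the SPM charts): Istio infers the protocol from the `<protocol>[-<suffix>]` port-name prefix, and Kiali's graph is driven by the `app` and `version` labels on the pod template:

```yaml
# Hypothetical Service: the port name puts the protocol first.
apiVersion: v1
kind: Service
metadata:
  name: mqserver
spec:
  ports:
    - name: https-console   # not "console-https"
      port: 9443
  selector:
    app: mqserver
---
# Hypothetical pod-template labels (in the owning Deployment):
metadata:
  labels:
    app: mqserver
    version: v1
```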
Since 12 December 2020, a JVM segmentation error occurs when building the database from the batch image; prior to 12 December this was not an issue.
See the error below:
dispmsg:
[echo] 10:48:54 Starting batchlauncher
[batchlauncher] Using configured properties for logging.
[batchlauncher] Running a Single Batch Program.
[batchlauncher] Connecting to DB2 data source : com.ibm.db2.jcc.DB2SimpleDataSource.
[batchlauncher] 'batch.username' not found.
[batchlauncher] Batch invoking : 'curam.util.internal.userpreference.intf.UserPreferenceLoader.insertUserPreferencesToDatabase'.
[batchlauncher] Unhandled exception
[batchlauncher] Type=Segmentation error vmState=0x00080002
[batchlauncher] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
[batchlauncher] Handler1=00007F4165F26CA0 Handler2=00007F416580B140 InaccessibleAddress=00007F41AE96D817
[batchlauncher] RDI=00007F4160015CA0 RSI=00007F41AE96D817 RAX=0000000000000000 RBX=00007F41AE96D817
[batchlauncher] RCX=00007F4165FEBD98 RDX=00007F41601F1450 R8=00007F415F606840 R9=000000000000000A
[batchlauncher] R10=0000000000000001 R11=0000000000000000 R12=0000000000000005 R13=0000000000000041
[batchlauncher] R14=00007F41AE96D817 R15=00000000000000A8
[batchlauncher] RIP=00007F4165FB4C26 GS=0000 FS=0000 RSP=00007F4166E61460
[batchlauncher] EFlags=0000000000010202 CS=0033 RBP=00007F4166E61650 ERR=0000000000000004
[batchlauncher] TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=00007F41AE96D817
[batchlauncher] xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[batchlauncher] xmm1 534e656c62616e65 (f: 1650552448.000000, d: 1.981380e+93)
[batchlauncher] xmm2 ffffffffffffffff (f: 4294967296.000000, d: -nan)
[batchlauncher] xmm3 120c00b42a1100c6 (f: 705757376.000000, d: 9.683534e-222)
[batchlauncher] xmm4 011700b92b2e01a7 (f: 724435392.000000, d: 2.096455e-303)
[batchlauncher] xmm5 0500990b00b61f00 (f: 11935488.000000, d: 1.395228e-284)
[batchlauncher] xmm6 c60c00b42a1800c6 (f: 706216128.000000, d: -2.773258e+29)
[batchlauncher] xmm7 00b42ab02b050099 (f: 721748096.000000, d: 2.871841e-305)
[batchlauncher] xmm8 2a1100c60c00b42a (f: 201372720.000000, d: 4.633484e-106)
[batchlauncher] xmm9 ffff00ff0000ffff (f: 65535.000000, d: -nan)
[batchlauncher] xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[batchlauncher] xmm11 000000004d570a3d (f: 1297549824.000000, d: 6.410748e-315)
[batchlauncher] xmm12 000000004a09a025 (f: 1242144768.000000, d: 6.137011e-315)
[batchlauncher] xmm13 000000004b2c0833 (f: 1261176832.000000, d: 6.231042e-315)
[batchlauncher] xmm14 000000004be50e00 (f: 1273302528.000000, d: 6.290950e-315)
[batchlauncher] xmm15 000000004a373e68 (f: 1245134464.000000, d: 6.151782e-315)
[batchlauncher] Module=/opt/ibm/java/jre/lib/amd64/compressedrefs/libj9vm29.so
[batchlauncher] Module_base_address=00007F4165E91000
[batchlauncher] Target=2_90_20201102_458768 (Linux 4.4.0-148-generic)
[batchlauncher] CPU=amd64 (48 logical CPUs) (0x3ef3728000 RAM)
[batchlauncher] ----------- Stack Backtrace -----------
[batchlauncher] (0x00007F4165FB4C26 [libj9vm29.so+0x123c26])
[batchlauncher] (0x00007F4165FB54D3 [libj9vm29.so+0x1244d3])
[batchlauncher] (0x00007F4165F9ACAF [libj9vm29.so+0x109caf])
[batchlauncher] (0x00007F4165FB6BF1 [libj9vm29.so+0x125bf1])
[batchlauncher] (0x00007F4165F9184F [libj9vm29.so+0x10084f])
[batchlauncher] (0x00007F4165F984C6 [libj9vm29.so+0x1074c6])
[batchlauncher] (0x00007F4165F8D8CA [libj9vm29.so+0xfc8ca])
[batchlauncher] (0x00007F4165F8EFC0 [libj9vm29.so+0xfdfc0])
[batchlauncher] (0x00007F4165F8F689 [libj9vm29.so+0xfe689])
[batchlauncher] (0x00007F4165F8FDBE [libj9vm29.so+0xfedbe])
[batchlauncher] (0x00007F4165F84788 [libj9vm29.so+0xf3788])
[batchlauncher] (0x00007F4165F859D6 [libj9vm29.so+0xf49d6])
[batchlauncher] (0x00007F4165F16607 [libj9vm29.so+0x85607])
[batchlauncher] (0x00007F4165F17BBB [libj9vm29.so+0x86bbb])
[batchlauncher] (0x00007F4165F19E01 [libj9vm29.so+0x88e01])
[batchlauncher] (0x00007F4165EA809F [libj9vm29.so+0x1709f])
[batchlauncher] (0x00007F4165EA3C50 [libj9vm29.so+0x12c50])
[batchlauncher] (0x00007F4165F61DB2 [libj9vm29.so+0xd0db2])
[batchlauncher] ---------------------------------------
[batchlauncher] JVMDUMP039I Processing dump event "gpf", detail "" at 2020/12/14 10:49:02 - please wait.
[batchlauncher] JVMDUMP032I JVM requested System dump using '/opt/ibm/Curam/release/buildlogs/core.20201214.231413.421.0001.dmp' in response to an event
[batchlauncher] JVMDUMP010I System dump written to /opt/ibm/Curam/release/buildlogs/core.20201214.231413.421.0001.dmp
[batchlauncher] JVMDUMP032I JVM requested Java dump using '/opt/ibm/Curam/release/buildlogs/javacore.20201214.231413.421.0002.txt' in response to an event
[batchlauncher] JVMDUMP010I Java dump written to /opt/ibm/Curam/release/buildlogs/javacore.20201214.231413.421.0002.txt
[batchlauncher] JVMDUMP032I JVM requested Snap dump using '/opt/ibm/Curam/release/buildlogs/Snap.20201214.231413.421.0003.trc' in response to an event
[batchlauncher] JVMDUMP010I Snap dump written to /opt/ibm/Curam/release/buildlogs/Snap.20201214.231413.421.0003.trc
[batchlauncher] JVMDUMP032I JVM requested JIT dump using '/opt/ibm/Curam/release/buildlogs/jitdump.20201214.231413.421.0004.dmp' in response to an event
I'm following the instructions to install CodeReady Containers (https://ibm.github.io/spm-kubernetes/prereq/openshift/codeready-containers/#creating-a-crc-project). I'm using an RHEL VM, and when I attempt the "crc start" command I get the following error regarding NetworkManager:
[vnc@bluffing1 ~]$ crc start
INFO Checking if running as non-root
INFO Checking if podman remote executable is cached
INFO Checking if admin-helper executable is cached
INFO Checking minimum RAM requirements
INFO Checking if Virtualization is enabled
INFO Checking if KVM is enabled
INFO Checking if libvirt is installed
INFO Checking if user is part of libvirt group
INFO Checking if libvirt daemon is running
INFO Checking if a supported libvirt version is installed
INFO Checking if crc-driver-libvirt is installed
INFO Checking if systemd-networkd is running
INFO Checking if NetworkManager is installed
INFO Checking if NetworkManager service is running
INFO Checking if /etc/NetworkManager/conf.d/crc-nm-dnsmasq.conf exists
File not found: /etc/NetworkManager/conf.d/crc-nm-dnsmasq.conf: stat /etc/NetworkManager/conf.d/crc-nm-dnsmasq.conf: no such file or directory
Can someone advise how the /etc/NetworkManager/conf.d/crc-nm-dnsmasq.conf file should be configured?
There is some info here, but I assume that is just an example? I am not sure what values I should put.
The runbook step to execute helm install using
helm install releasename local-development/spm
is failing on a timeout. There don't appear to be any detailed logs I can find. Please see the attached screenshot of the dashboard; it shows a timeout event on the customsql job. However, when I execute a curl command from the command line, I get a response immediately.
[ibmadmin@localserver helm-charts]$ curl http://minikube.local:5000/v2
<a href="/v2/">Moved Permanently</a>.
If I attempt to install again, I get this error message.
[ibmadmin@localserver helm-charts]$ helm install spm-v1 local-development/spm
Error: cannot re-use a name that is still in use
Please advise.
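A hedged note, not from the runbook: a timed-out install usually leaves the release registered with Helm in a failed state, which is why the name cannot be reused; listing and removing it normally clears the error (the release name spm-v1 is taken from the error above):

```shell
# Show all releases, including failed ones, then remove the stuck release
# before retrying the install.
helm list --all
helm uninstall spm-v1
```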
Kev and I have completed the deployment we have been working on. We struggled a bit with the values.yaml settings that were needed. We got through it by comparing various values.yaml files from working deployments, but it felt a bit haphazard. We know that the configuration settings are documented in the runbook, but in some cases it was difficult to put two and two together to figure out what the documentation was telling us. One suggestion is to add some comments in the provided YAML files. We also feel that a worked-through example would be very helpful: perhaps a values.yaml file with a set of commented overrides in place, based on an example project name and registry, assuming a deployment of Curam, CE, BIRT, REST, etc. We are also still a bit confused by some of the main blocks in the values.yaml file, for example the later blocks that are "turned off" by default using "{}"; we are still not sure what they are there for. A comment highlighting the need to remove these braces when using any of the subsequent overrides would have been useful (yes, I know we should be able to figure that out, and we did, but anything to reduce the cognitive load around this stuff would be appreciated).
In general, though, the documentation was great, and we are not suggesting the need for a fundamental change, rather the provision of some worked-through examples.
The following symptoms appear on a Minikube cluster v1.15.1:
When the SPM Helm chart is installed and the different deployments are created on Kubernetes, the SPM producer and consumer pods fail to start initially, and only after a few rounds of re-attempts do they stabilize to a running state. On the failed start attempts, the following type of error can be observed in the pod events:
MountVolume.SetUp failed for volume "ejb-bindings" : failed to sync configmap cache: timed out waiting for the condition
The same error is repeated for all volumes required by the pod.
When running the createSCC.sh script like below:
./createSCC.sh -n spm-deploy
the script throws the following error on macOS running zsh and on Windows using Git Bash:
./createSCC.sh: line 77: syntax error: unexpected end of file
This is due to indentation in the usage function. Changing this:
function usage() {
    cat <<-USAGE #| fmt
    Usage: $0 [OPTIONS] [arg]
    OPTIONS:
    =======
    --namespace [namespace] - The name of an existing namespace for the SPM deployment.
    USAGE
}
to this:
function usage() {
    cat <<-USAGE #| fmt
    Usage: $0 [OPTIONS] [arg]
    OPTIONS:
    =======
    --namespace [namespace] - The name of an existing namespace for the SPM deployment.
USAGE
}
fixes the issue.
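The shell rule behind this can be sketched in a few lines (a minimal standalone demo, not the actual createSCC.sh code): `<<-` strips leading TAB characters from heredoc lines and the terminator, but it does not strip spaces, so a space-indented terminator is never matched and the shell reports "unexpected end of file".

```shell
#!/bin/sh
# Working form: the heredoc body and the USAGE terminator are indented with
# real tabs, which "<<-" strips. If the terminator were indented with spaces
# instead, the shell would never match it and would fail at end of file.
usage() {
	cat <<-USAGE
	Usage: demo [OPTIONS]
	USAGE
}
usage
```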
Describe the bug
In the SPM umbrella chart, under helm-charts/spm/requirements.yaml, there is a dependency on db2 as follows:
- name: db2
  version: "~1.3.0"
  repository: "@local-development"
There are no valid charts for this dependency, and it should therefore be removed, as it causes the following issue when packaging the spm umbrella chart:
Error: db2 chart not found in repo http://chartmuseum.apps-crc.testing
I use a Mac and am not able to allocate 4 CPUs and 8 GB RAM. Do we really need that much to run Curam? The following downgraded configuration worked; with the settings in the instructions, I always get a timeout.
minikube start --vm-driver=vmware --cpus 2 --memory 4G --insecure-registry "192.168.3.0/16" --disk-size='20G' --kubernetes-version v1.16.7
Describe the bug
In the SPM umbrella chart, under helm-charts/spm/values.yaml, when ceApp has the following values set:
ceApp:
  enabled: false
  replicaCount: 1
  imageLibrary: ''
  imageName: ce-ihs
  imageTag: latest
  ingressPath: /universal/*
  resources: {}
the Helm charts still produce the ingress controller rules, which causes the ingress controller provisioning to fail. This is due to this if statement in helm-charts/spm/templates/ingress.yaml:
{{- if .Values.global.ceApp.imageTag }}
- path: {{ .Values.global.ceApp.ingressPath | default "/universal" }}
  backend:
    serviceName: {{ $.Release.Name }}-ce-app
    servicePort: http
{{- end }}
Shouldn't this instead be:
{{- if .Values.global.ceApp.enabled }}
- path: {{ .Values.global.ceApp.ingressPath | default "/universal" }}
  backend:
    serviceName: {{ $.Release.Name }}-ce-app
    servicePort: http
{{- end }}
To Reproduce
Steps to reproduce the behavior:
{
  "path": "/universal",
  "backend": {
    "serviceName": "spm-dev01-ce-app",
    "servicePort": "http"
  }
}
Since that service is not installed:
[matt.smithson@ssvc-c-403 dashboard]$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
spm-dev01-apps-curam ClusterIP 172.20.35.250 <none> 8443/TCP 5h1m
spm-dev01-apps-rest ClusterIP 172.20.103.175 <none> 8443/TCP 5h1m
spm-dev01-ihs NodePort 172.20.120.160 <none> 8443:31010/TCP 5h1m
spm-dev01-mqserver-curam NodePort 172.20.254.199 <none> 9443:30411/TCP,1414:31859/TCP 137m
spm-dev01-mqserver-rest NodePort 172.20.247.88 <none> 9443:30562/TCP,1414:32102/TCP 137m
spm-dev01-xmlserver ClusterIP 172.20.16.199 <none> 1800/TCP 5h1m
ingress controller setup fails.
Expected behavior
Helm charts should not produce ingress rules for components that aren't enabled.
Screenshots
See logs above
Please complete the following information:
* OS: AWS EKS 1.14
* Docker Version: Docker version 18.09.5, build e8ff056
* Minikube Version: N/a
* Ant Version: 1.10.6
* Java Version: Java 8 (version packaged with Liberty)
* Liberty Version: 19.0.0.12-full-java8-ibmjava
* Cúram SPM Version: 7.0.10
Log Collection
See above.
https://ibm.github.io/spm-kubernetes/03-deployment/hc_deployment
The following line:
docker run --rm -e LICENSE=view websphere-liberty:19.0.0.12-java8-ibmjava
should be:
docker run --rm -e LICENSE=view websphere-liberty:19.0.0.12-full-java8-ibmjava
Issue:
When running MQ with the following properties:
mq:
  version: 9.1.3.0
  # Set to true if running MQ in HA mode
  useConnectionNameList: true
  tlsSecretName: 'spm-dev01-mq-secret'
  queueManager:
    name: 'QM1'
    secret:
      # name is the secret that contains the 'admin' user password and the 'app' user password to use for messaging
      name: ''
      # adminPasswordKey is the secret key that contains the 'admin' user password
      adminPasswordKey: 'adminPasswordKey'
      # appPasswordKey is the secret key that contains the 'app' user password
      appPasswordKey: 'appPasswordKey'
  metrics:
    enabled: false
  resources: {}
  multiInstance:
    cephEnabled: false
    storageClassName: 'nfs'
    nfsEnabled: true
    nfsIP: 'fs-xxxxxxxx.efs.eu-west-2.amazonaws.com'
    nfsFolder: 'spm-dev01'
When the curam-mq and rest-mq pods start, they attempt to mount the AWS EFS file system, and Kubernetes (EKS) returns the following error:
Warning FailedMount 2m31s kubelet, ip-100-64-18-180.eu-west-2.compute.internal MountVolume.SetUp failed for volume "spm-dev01-curam-pv-qm" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/11e52fd5-fb8c-40b6-9cf7-b252f1f4e1ac/volumes/kubernetes.io~nfs/spm-dev01-curam-pv-qm --scope -- mount -t nfs -o hard,nfsvers=4.1,noresvport,retrans=2,rsize=1048576,timeo=600,wsize=1048576 fs-xxxxxxx.efs.eu-west-2.amazonaws.com:/spm-dev01/curam /var/lib/kubelet/pods/11e52fd5-fb8c-40b6-9cf7-b252f1f4e1ac/volumes/kubernetes.io~nfs/spm-dev01-curam-pv-qm
Output: Running scope as unit run-12552.scope.
mount.nfs: Connection timed out
Solution:
Adding mountOptions with the following properties, as recommended here, seems to resolve the issue.
To make this portable, or able to be changed for different service providers, I've added the following code to mqserver/templates/pv-data.yaml, mqserver/templates/pv-logs.yaml, and mqserver/templates/pv-qm.yaml:
{{- if $.Values.global.mq.multiInstance.nfsMountOptions }}
mountOptions:
{{- range $.Values.global.mq.multiInstance.nfsMountOptions }}
  - {{ . | quote }}
{{- end }}
{{- end }}
Adding the following element to the set values then sets the mountOptions:
nfsMountOptions:
  - "nfsvers=4.1"
  - "rsize=1048576"
  - "wsize=1048576"
  - "hard"
  - "timeo=600"
  - "retrans=2"
  - "noresvport"
My helm install failed with:
Error: failed pre-install: timed out waiting for the condition
helm.go:81: [debug] failed pre-install: timed out waiting for the condition
I used the following to get details on the pod: kubectl describe pod releasename-apps-create-ltpa-keys-xrdp5. Note the "toomanyrequests: You have reached your pull rate limit" error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m46s default-scheduler Successfully assigned ocp/releasename-apps-create-ltpa-keys-fnkq7 to crc-ctj2r-master-0
Normal Pulled 3m6s kubelet Container image "ibmcom/websphere-liberty:kernel-java8-ibmjava-ubi" already present on machine
Normal Created 2m58s kubelet Created container create-ltpa-keys
Normal Started 2m48s kubelet Started container create-ltpa-keys
Warning Failed 60s kubelet Failed to pull image "bitnami/kubectl:1.19": rpc error: code = Unknown desc = Error reading manifest 1.19 in docker.io/bitnami/kubectl: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
Warning Failed 60s kubelet Error: ErrImagePull
Normal BackOff 59s kubelet Back-off pulling image "bitnami/kubectl:1.19"
Warning Failed 59s kubelet Error: ImagePullBackOff
Normal Pulling 46s (x2 over 2m9s) kubelet Pulling image "bitnami/kubectl:1.19"
To try to get around that, I authenticated to Docker Hub, pulled the "kubectl" image, tagged it as just "ocp/kubectl:latest", and did:
docker push default-route-openshift-image-registry.apps-crc.testing/ocp/kubectl:latest
I then modified the Helm charts to look for "kubectl:latest" instead of "bitnami/kubectl:1.19", issued
helm push apps + helm repo update
and then tried it all again, but that also failed. Note the "unauthorized: authentication required":
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10m default-scheduler Successfully assigned ocp/releasename-apps-create-ltpa-keys-rfq2j to crc-ctj2r-master-0
Normal AddedInterface 9m56s multus Add eth0 [10.217.0.189/23]
Normal Pulled 9m41s kubelet Container image "ibmcom/websphere-liberty:kernel-java8-ibmjava-ubi" already present on machine
Normal Created 9m32s kubelet Created container create-ltpa-keys
Normal Started 9m19s kubelet Started container create-ltpa-keys
Normal Pulling 7m3s (x4 over 8m45s) kubelet Pulling image "kubectl:latest"
Warning Failed 6m58s (x4 over 8m40s) kubelet Failed to pull image "kubectl:latest": rpc error: code = Unknown desc = Error reading manifest latest in docker.io/library/kubectl: errors:
denied: requested access to the resource is denied
unauthorized: authentication required
Warning Failed 6m58s (x4 over 8m40s) kubelet Error: ErrImagePull
Normal BackOff 6m44s (x5 over 8m14s) kubelet Back-off pulling image "kubectl:latest"
Warning Failed 4m32s (x14 over 8m14s) kubelet Error: ImagePullBackOff
Question: it seems that one way or another, I need to authenticate to a registry so that the docker pulls can succeed. Please advise, thanks!
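One common approach (a sketch, not from the runbook) is to create a Kubernetes image-pull secret from your Docker Hub credentials and attach it to the service account the chart's jobs run under. The namespace (ocp), secret name, and service account below are assumptions for this environment:

```shell
# Create a pull secret from Docker Hub credentials (username/token are placeholders)
kubectl create secret docker-registry dockerhub-creds \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=<your-user> \
  --docker-password=<your-token> \
  --namespace=ocp

# Attach it to the default service account so pod image pulls use it
kubectl patch serviceaccount default --namespace=ocp \
  -p '{"imagePullSecrets": [{"name": "dockerhub-creds"}]}'
```

With the secret in place, authenticated pulls are subject to the higher per-account rate limit rather than the anonymous one.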
The following command is not working; note I am on Helm 3.
helm push apps local-development
Error: unknown command "push" for "helm"
Did you mean this?
pull
Run 'helm --help' for usage.
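For context, chart pushing is not built into Helm 3; it comes from the ChartMuseum helm-push plugin. A sketch, assuming a ChartMuseum-backed repo:

```shell
# Install the ChartMuseum push plugin (not bundled with Helm 3)
helm plugin install https://github.com/chartmuseum/helm-push

# Older plugin versions register the command as "helm push";
# from 0.10 onwards it is "helm cm-push", to avoid clashing with
# the built-in OCI "helm push" introduced in Helm 3.7.
helm cm-push apps local-development
```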
While building the S2I Core base image from source, the base image build fails on the tagging phase:
(base) Mikkos-MacBook-Pro-3:s2i-base-container makelm$ make build TARGET=rhel8 VERSIONS=core VERBOSE=1
Makefile:15: warning: overriding commands for target `core'
common/common.mk:87: warning: ignoring old commands for target `core'
Makefile:15: warning: overriding commands for target `core'
common/common.mk:87: warning: ignoring old commands for target `core'
VERSIONS="core" SKIP_SQUASH=1 UPDATE_BASE= OS=rhel8 CLEAN_AFTER= DOCKER_BUILD_CONTEXT=. OPENSHIFT_NAMESPACES="" CUSTOM_REPO="" REGISTRY="""" /usr/bin/env bash common/build.sh
-> Version core: building image from 'Dockerfile.rhel8' ...
-> Pulling image registry.access.redhat.com/ubi8:latest before building image from Dockerfile.rhel8.
The image registry.access.redhat.com/ubi8:latest is already pulled.
[+] Building 0.2s (10/10) FINISHED
=> [internal] load build definition from Dockerfile.rhel8 0.0s
=> => transferring dockerfile: 43B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for registry.access.redhat.com/ubi8:latest 0.0s
=> [1/5] FROM registry.access.redhat.com/ubi8:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 573B 0.0s
=> CACHED [2/5] RUN INSTALL_PKGS="bsdtar findutils groff-base glibc-locale-source glibc-langpack-en 0.0s
=> CACHED [3/5] COPY ./root/ / 0.0s
=> CACHED [4/5] WORKDIR /opt/app-root/src 0.0s
=> CACHED [5/5] RUN rpm-file-permissions && useradd -u 1001 -r -g 0 -d /opt/app-root/src -s /sbin/nologin 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:2001f8f3115e301855c50ee940924cd688e95411e968fd2553663f6fe38468ad 0.0s
VERSIONS="core" SKIP_SQUASH=1 UPDATE_BASE= OS=rhel8 CLEAN_AFTER= DOCKER_BUILD_CONTEXT=. OPENSHIFT_NAMESPACES="" CUSTOM_REPO="" REGISTRY="""" /usr/bin/env bash common/tag.sh
Error: No such object:
make[1]: *** [core] Error 1
make: *** [build-serial] Error 2
Before running the build, I installed coreutils and md2man, and updated my PATH so the GNU utilities take priority.
We can get to the Cúram login screen, but when we attempt to log in, we just get a white screen. Looking in the logs of a producer pod, we see the error: [ERROR ] Exception is: [jcc][t4][2013][11249][4.28.11] Connection authorization failure occurred. Reason: User ID or Password invalid. ERRORCODE=-4214, SQLSTATE=28000 DSRA0010E: SQL State = 28000, Error Code = -4,214.
We believe that we have set up the relevant database properties correctly in our values.yaml file and have encrypted the password correctly. We are struggling to debug this further and get to the root of the problem. We are not sure what the Helm install scripts are doing with these properties; if we knew, we could dig around further. The log file from the producer pod is attached here:
producer.log
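To see what the install actually did with those properties, Helm can show both the supplied values and the rendered manifests (releasename is a placeholder for your release):

```shell
# Values the release was installed with (user-supplied overrides)
helm get values releasename

# Rendered Kubernetes manifests; grep for where the DB settings land
helm get manifest releasename | grep -i -B2 -A2 'db\|password'
```

Comparing the rendered output against what the producer pod's server.xml/datasource actually contains usually narrows down where the password gets mangled.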
I am trying to download the WebSphere Liberty image, following https://ibm.github.io/spm-kubernetes/02-build-images/setup_docker_context, and execute the following:
docker run --rm \
  -v $ANT_HOME:/tmp/ant \
  -v $SPM_HOME/dockerfiles/Liberty/content/release-stage:/work/dir \
  -v $SPM_HOME/dockerfiles/Liberty/content/release-stage/SetEnvironment.sh:/work/SetEnvironment.sh \
  -w /work/dir \
  -u root \
  -e ANT_HOME=/tmp/ant \
  -e WLP_HOME=/opt/ibm/wlp \
  websphere-liberty:19.0.0.12-full-java8-ibmjava \
  bash -c 'export PATH=$ANT_HOME/bin:$PATH:.; build.sh internal.update.crypto.jar'
There are a couple of issues when you then run the above. The underlying reason is that the /opt/ibm/java/lib folder is missing.
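A quick sanity check is to list that folder in the exact image tag used above; if the tag's Java layout differs, the path will simply not exist:

```shell
# Verify whether the IBM Java lib folder exists inside the image
docker run --rm websphere-liberty:19.0.0.12-full-java8-ibmjava \
  ls /opt/ibm/java/lib
```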
I am following the SPM Runbook and have built my Docker images. I am using Minikube and I'm trying to push the images to the local repository. When I do so I get the following error:
[vnc@bluffing1 ~]$ docker push $DOCKER_REGISTRY/$PROJECT/xmlserver:latest
The push refers to repository [minikube.local:5000/minikubemtess/xmlserver]
Get http://minikube.local:5000/v2/: dial tcp 172.17.0.2:5000: connect: connection refused
Any advice?
Background info:
OS: RHEL 7.9
Docker Version: 20.10.2
Minikube Version: v1.13.1
Ant Version: 1.10.6
Java Version: 1.8.0_201
Liberty Version: 20.0.0.9
Cúram SPM Version: 7.0.11
Other info from my environment:
[vnc@bluffing1 ~]$ env | egrep "DOCKER_REGISTRY|PROJECT"
PROJECT=minikubemtess
DOCKER_REGISTRY=minikube.local:5000
[vnc@bluffing1 ~]$ minikube ip
172.17.0.2
[vnc@bluffing1 ~]$ minikube addons list
|-----------------------------|----------|--------------|
| ADDON NAME | PROFILE | STATUS |
|-----------------------------|----------|--------------|
| ambassador | minikube | disabled |
| csi-hostpath-driver | minikube | disabled |
| dashboard | minikube | disabled |
| default-storageclass | minikube | enabled \u2705 |
| efk | minikube | disabled |
| freshpod | minikube | disabled |
| gcp-auth | minikube | disabled |
| gvisor | minikube | disabled |
| helm-tiller | minikube | disabled |
| ingress | minikube | enabled \u2705 |
| ingress-dns | minikube | disabled |
| istio | minikube | disabled |
| istio-provisioner | minikube | disabled |
| kubevirt | minikube | disabled |
| logviewer | minikube | disabled |
| metallb | minikube | disabled |
| metrics-server | minikube | disabled |
| nvidia-driver-installer | minikube | disabled |
| nvidia-gpu-device-plugin | minikube | disabled |
| olm | minikube | disabled |
| pod-security-policy | minikube | disabled |
| registry | minikube | enabled \u2705 |
| registry-aliases | minikube | disabled |
| registry-creds | minikube | disabled |
| storage-provisioner | minikube | enabled \u2705 |
| storage-provisioner-gluster | minikube | disabled |
| volumesnapshots | minikube | disabled |
|-----------------------------|----------|--------------|
[vnc@bluffing1 bin]$ cat /etc/docker/daemon.json
{
"insecure-registries": [
"172.17.0.2/16"
]
}
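One workaround (a sketch, assuming the enabled registry addon shown above) is to port-forward the in-cluster registry to localhost and push there, side-stepping the minikube.local:5000 route entirely:

```shell
# Forward the registry addon's service to localhost:5000
kubectl port-forward --namespace kube-system service/registry 5000:80 &

# Retag and push via the forwarded port
docker tag $DOCKER_REGISTRY/$PROJECT/xmlserver:latest \
  localhost:5000/$PROJECT/xmlserver:latest
docker push localhost:5000/$PROJECT/xmlserver:latest
```

Note also that the "connection refused" on 172.17.0.2:5000 suggests the registry is not actually exposed on the Minikube IP, so fixing the exposure (or using the forward above) matters more than the insecure-registries entry.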
Can you clarify the need for the SPM JMS Engine (on a virtual machine) in the SPM OpenShift reference architecture?
Issue:
The batch pod no longer starts with the following error:
MountVolume.SetUp failed for volume "debug-file" : configmap "spm-dev01-debug" not found
The batch/templates/cronjob.yaml file references {{ $.Release.Name }}-debug:
- name: debug-file
configMap:
name: {{ $.Release.Name }}-debug
However, there doesn't seem to be a configmap-debug.yaml file or anything else in the project for that volume to reference.
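As a stop-gap until the chart ships a template for it, a ConfigMap matching the expected name can be created by hand. The name below assumes a release called spm-dev01, and an empty data section is enough for the volume mount to succeed:

```yaml
# Hypothetical workaround manifest; apply with: kubectl apply -f debug-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: spm-dev01-debug   # must match {{ $.Release.Name }}-debug
data: {}
```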
I am following the SPM Runbook and want to install OpenShift CRC (see https://ibm.github.io/spm-kubernetes/prereq/openshift/codeready-containers). I clicked the latest release link to download CodeReady Containers, but the page caught me off guard: I was expecting to see the software listed with download options. Instead I had to poke around a bit before discovering that what I needed was under the "Sandbox" tab. We should update the documentation to clarify this and eliminate confusion for developers and customers.
It looks like this when you click the latest release link:
The installer and pull secret are actually under the Sandbox tab:
I am running an automated CI build for the SPM 7.0.11 codebase (project codebase) and am experiencing the following build error with the libertyEAR Ant target:
dispmsg:
[echo] 04:04:32 Starting buildEAR
[echo] Using properties file '/home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/project/properties/AppServer.properties'.
check.properties.exists:
[mkdir] Created dir: /home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/build/ear/temp
[mkdir] Created dir: /home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/build/ear/combined
[mkdir] Created dir: /home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/build/ear/combined/META-INF
[mkdir] Created dir: /home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/build/ear/temp/extrajars
[mkdir] Created dir: /home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/build/ear/WLP
BUILD FAILED
/home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/CuramSDEJ/bin/build.xml:147: The following error occurred while executing this line:
/home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/CuramSDEJ/bin/app_buildEAR.xml:170: Warning: Could not find file /home/jenkins/agent/workspace/masked-project/masked-project-spm-application-build-pipeline/EJBServer/tools/search/GlobalSearchServer_lib/gss.jar to copy.
As per the documentation, GSS is deprecated in the latest SPM releases, so why is it still referenced by the Ant build?
When I execute the helm install command to deploy SPM, the first job fails on connecting to DB2. See log below.
From my terminal, I can connect to DB2 successfully. A Google search suggests that DB2 be configured to enable TCP/IP via the DB2COMM configuration parameter, which is already set. From my local SPM installation, a build configtest also connects to the DB.
Any suggestions?
`Unable to locate tools.jar. Expected to find it in /opt/ibm/java/lib/tools.jar
Buildfile: /opt/ibm/Curam/release/CuramSDEJ/util/loadsql.xml
check.parameter.type:
check.db.type:
check.props.inside.file:
check.curam.environment.bindings.location.isset:
check.curam.environment.bindings.location.valid:
run.database.db2:
ora.use.servicename:
run.database.ora:
run.database.zos:
get.decrypted.db.password:
load:
targetDirectory:
BUILD FAILED
/opt/ibm/Curam/release/CuramSDEJ/util/loadsql.xml:40: The following error occurred while executing this line:
/opt/ibm/Curam/release/CuramSDEJ/util/loadsql.xml:53: com.ibm.db2.jcc.am.DisconnectNonTransientConnectionException: [jcc][t4][2043][11550][4.21.29] Exception java.net.ConnectException: Error opening socket to server localhost/127.0.0.1 on port 50,000 with message: Connection refused (Connection refused). ERRORCODE=-4499, SQLSTATE=08001
at com.ibm.db2.jcc.am.kd.a(kd.java:338)
at com.ibm.db2.jcc.am.kd.a(kd.java:435)
at com.ibm.db2.jcc.t4.ac.a(ac.java:440)
at com.ibm.db2.jcc.t4.ac.(ac.java:96)
at com.ibm.db2.jcc.t4.a.b(a.java:366)
at com.ibm.db2.jcc.t4.b.newAgent_(b.java:2076)
at com.ibm.db2.jcc.am.Connection.initConnection(Connection.java:812)
at com.ibm.db2.jcc.am.Connection.(Connection.java:754)
at com.ibm.db2.jcc.t4.b.(b.java:339)
at com.ibm.db2.jcc.DB2SimpleDataSource.getConnection(DB2SimpleDataSource.java:233)
at com.ibm.db2.jcc.DB2SimpleDataSource.getConnection(DB2SimpleDataSource.java:199)
at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:482)
at com.ibm.db2.jcc.DB2Driver.connect(DB2Driver.java:116)
at org.apache.tools.ant.taskdefs.JDBCTask.getConnection(JDBCTask.java:364)
at org.apache.tools.ant.taskdefs.SQLExec.getConnection(SQLExec.java:953)
at org.apache.tools.ant.taskdefs.SQLExec.execute(SQLExec.java:649)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:508)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:99)
at org.apache.tools.ant.Task.perform(Task.java:350)
at org.apache.tools.ant.Target.execute(Target.java:449)
at org.apache.tools.ant.Target.performTasks(Target.java:470)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1391)
at org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:36)
at org.apache.tools.ant.Project.executeTargets(Project.java:1254)
at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:437)
at org.apache.tools.ant.taskdefs.CallTarget.execute(CallTarget.java:106)
at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
at java.lang.reflect.Method.invoke(Method.java:508)
at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:99)
at org.apache.tools.ant.Task.perform(Task.java:350)
at org.apache.tools.ant.Target.execute(Target.java:449)
at org.apache.tools.ant.Target.performTasks(Target.java:470)
at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1391)
at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
at org.apache.tools.ant.Project.executeTargets(Project.java:1254)
at org.apache.tools.ant.Main.runBuild(Main.java:830)
at org.apache.tools.ant.Main.startAnt(Main.java:223)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:284)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:101)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:380)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:236)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:218)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.net.Socket.connect(Socket.java:682)
at com.ibm.db2.jcc.t4.w.run(w.java:49)
at java.security.AccessController.doPrivileged(AccessController.java:734)
at com.ibm.db2.jcc.t4.ac.a(ac.java:426)
... 42 more
Total time: 16 seconds`
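One thing the stack trace hints at: the job is connecting to localhost:50000, and inside a container localhost resolves to the pod itself, not to the machine where DB2 runs. That would explain why the terminal and local install connect fine while the job does not. A throwaway pod can confirm whether the DB host/port from values.yaml are reachable from the cluster (the hostname below is a placeholder):

```shell
# Test TCP reachability of the DB2 host from inside the cluster
kubectl run db2-check --rm -it --image=busybox --restart=Never -- \
  nc -zv db2.example.com 50000
```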
Issue:
When running MQ with the following properties:
mq:
version: 9.1.3.0
# Set to True if running MQ in HA mode
useConnectionNameList: true
tlsSecretName: 'spm-dev01-mq-secret'
queueManager:
name: 'QM1'
secret:
# name is the secret that contains the 'admin' user password and the 'app' user password to use for messaging
name: ''
# adminPasswordKey is the secret key that contains the 'admin' user password
adminPasswordKey: 'adminPasswordKey'
# appPasswordKey is the secret key that contains the 'app' user password
appPasswordKey: 'appPasswordKey'
metrics:
enabled: false
resources: {}
multiInstance:
cephEnabled: false
storageClassName: 'nfs'
nfsEnabled: true
nfsIP: 'fs-xxxxxxxx.efs.eu-west-2.amazonaws.com'
nfsFolder: 'spm-dev01'
nfsMountOptions:
- "nfsvers=4.1"
- "rsize=1048576"
- "wsize=1048576"
- "hard"
- "timeo=600"
- "retrans=2"
- "noresvport"
When the curam-mq and rest-mq pods start, they attempt to mount the AWS EFS file system, and Kubernetes (EKS) returns the following error:
Warning FailedMount 2m31s kubelet, ip-100-64-18-180.eu-west-2.compute.internal MountVolume.SetUp failed for volume "spm-dev01-curam-pv-qm" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/11e52fd5-fb8c-40b6-9cf7-b252f1f4e1ac/volumes/kubernetes.io~nfs/spm-dev01-curam-pv-qm --scope -- mount -t nfs -o hard,nfsvers=4.1,noresvport,retrans=2,rsize=1048576,timeo=600,wsize=1048576 fs-1c2e6eed.efs.eu-west-2.amazonaws.com:/spm-dev01/curam /var/lib/kubelet/pods/11e52fd5-fb8c-40b6-9cf7-b252f1f4e1ac/volumes/kubernetes.io~nfs/spm-dev01-curam-pv-qm
Output: Running scope as unit run-12552.scope.
mount.nfs: mounting fs-xxxxxx.efs.eu-west-2.amazonaws.com:/spm-dev01/curam failed, reason given by server: No such file or directory
Solution:
The only solution I've found for this is to manually mount the EFS filesystem on one of the worker nodes and create the directories using:
sudo mount -t nfs -o hard,nfsvers=4.1,noresvport,retrans=2,rsize=1048576,timeo=600,wsize=1048576 fs-xxxxx.efs.eu-west-2.amazonaws.com:/ /mnt/
sudo mkdir -p /mnt/spm-dev01/curam/logs
sudo mkdir -p /mnt/spm-dev01/curam/data
sudo mkdir -p /mnt/spm-dev01/rest/data
sudo mkdir -p /mnt/spm-dev01/rest/logs
I'm not sure if there's an easier way to do this on EKS, but this seems to be the only way for AWS EFS; other suggestions include an init container (see here) that mounts the filesystem and creates the paths.
Couldn't the file system just be mounted at / and any paths created by the pods themselves at runtime?
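The init-container suggestion could look roughly like this (a sketch; the volume and path names are assumptions based on the mounts above):

```yaml
# Hypothetical initContainer for the MQ pods that creates the expected
# queue-manager directories on the EFS root before MQ starts.
initContainers:
  - name: make-qm-dirs
    image: busybox
    command:
      - sh
      - -c
      - mkdir -p /mnt/efs/spm-dev01/curam/logs /mnt/efs/spm-dev01/curam/data
    volumeMounts:
      - name: efs-root      # a PV pointing at the EFS filesystem root
        mountPath: /mnt/efs
```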
Issue:
In mqservers/templates/statefulsets.yaml, the containers use the ibmcom/mq image from the public IBM repo, whereas the init containers use a custom image:
container:
containers:
- name: {{ $.Chart.Name }}-{{ $name }}
image: ibmcom/mq:{{ $.Values.global.mq.version }}
init-container:
initContainers:
- name: {{ $.Chart.Name }}-{{ $name }}-init
image: {{ include "mqserver.imageFullName" $.Values.global.images }}
Shouldn't these be the same? The instructions in https://ibm.github.io/spm-kubernetes/03-build-images/build_images don't mention building or pushing the mqserver image, and because of this I get errors pulling the init containers.
Solution:
Change the image to:
initContainers:
- name: {{ $.Chart.Name }}-{{ $name }}-init
image: ibmcom/mq:{{ $.Values.global.mq.version }}
Once this is done the init container pulls OK.
Looking at https://www.ibm.com/support/knowledgecenter/SS8S5A_7.0.11/com.ibm.curam.wlp.doc/Deployment_WLP/cWLPTestingDeployment.html, it says that to get the URL for the deployed app you should search for CWWKT0016I.
For me this gives:
[AUDIT ] CWWKT0016I: Web application available (default_host): https://rel6-apps-curam-producer-88b8cc888-rml4q:8443/Curam/
This fails to return anything.
I also tried using the IP given by 'minikube ip', i.e. https://192.168.99.104:8443/Curam/, but this gives a 403:
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "forbidden: User "system:anonymous" cannot get path "/Curam/"",
"reason": "Forbidden",
"details": {
},
"code": 403
}
I tried kubectl describe service rel6-web and kubectl describe service rel6-apps-curam to see if there was an Ingress IP as suggested by some googling, but that didn't yield anything
Attached is the output of kubectl logs for web, producer and consumer.
I can see in the consumer logs some issue with curamtimerdb (possibly a missed WLP config step), but I wouldn't have thought this would cause the basic access issue? Any thoughts on what to do next are much appreciated.
(I also exec'd into the actual consumer/producer containers to check their logs, but the kubectl version seemed to have better info, unless I was looking in the wrong place on WLP.)
Thanks
consumer.log
producer.log
web.log
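One thing worth checking: on Minikube, port 8443 on the Minikube IP is usually the Kubernetes API server itself, which is exactly where the system:anonymous 403 above comes from. A port-forward to the producer service reaches Liberty directly (the service name and port are assumptions based on the pod name above):

```shell
# Forward the producer service's HTTPS port to localhost
kubectl port-forward service/rel6-apps-curam 9443:8443
# then browse to https://localhost:9443/Curam/
```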
In V8, support was introduced for the ojdbc8 Oracle database driver. For more information, refer to the IBM Cúram Social Program Management 8.0.0 release notes.
The helm chart needs to be updated to use the correct naming convention.
While going through the SPM Runbook (https://ibm.github.io/spm-kubernetes/prereq/kubernetes/minikube), I found that some commands seem to be outdated or incorrect.
For example, in "minikube start --vm-driver=virtualbox --cpus 4 --memory 8G --insecure-registry "192.168.0.0/16" --disk-size='30G' --kubernetes-version v1.18.6", the --vm-driver flag is deprecated (use --driver instead), and there is no need to quote the disk-size value, etc.
If I am correct, we might need to update the document? Thanks.
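For reference, the equivalent command with current flags would be roughly:

```shell
# Same settings with the non-deprecated --driver flag and unquoted sizes
minikube start --driver=virtualbox --cpus 4 --memory 8g \
  --insecure-registry "192.168.0.0/16" --disk-size 30g \
  --kubernetes-version v1.18.6
```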
Issue:
When setting variables for MQ as follows:
mq:
version: 9.1.3.0
# Set to True if running MQ in HA mode
useConnectionNameList: true
tlsSecretName: 'spm-dev01-mq-secret'
The pods for mqserver-curam and mqserver-rest don't start correctly and throw the following error:
MountVolume.SetUp failed for volume "service-certs" : secret "spm-dev01-spm-dev01-mq-secret" not found
This seems to be caused by inconsistencies between apps/templates/deployment-consumer.yaml, apps/templates/deployment-producer.yaml and mqserver/templates/deployment.yaml. The apps producer/consumer deployment templates set the mq-certs secret volume as such:
{{- if $.Values.global.mq.tlsSecretName }}
- name: mq-certs
secret:
{{- if $.Values.global.mq.useConnectionNameList }}
secretName: {{ $.Values.global.mq.tlsSecretName }}
{{- else }}
secretName: {{ $.Release.Name }}-mq-secret
{{- end }}
{{- end}}
Whereas the MQ deployment.yaml sets service-certs as:
{{- if $.Values.global.mq.tlsSecretName }}
- name: service-certs
secret:
secretName: {{ $.Release.Name }}-{{ $.Values.global.mq.tlsSecretName }}
{{- end}}
This leads to the release name, in this case spm-dev01, being prefixed to service-certs but not to mq-certs. If tlsSecretName is left at the default of mq-secret, the opposite occurs and the consumer/producer pods fail to deploy.
Solution:
Changing mqserver/templates/deployment.yaml
and mqserver/templates/statefulset.yaml
from:
{{- if $.Values.global.mq.tlsSecretName }}
- name: service-certs
secret:
secretName: {{ $.Release.Name }}-{{ $.Values.global.mq.tlsSecretName }}
{{- end}}
to:
{{- if $.Values.global.mq.tlsSecretName }}
- name: service-certs
secret:
{{- if $.Values.global.mq.useConnectionNameList }}
secretName: {{ $.Values.global.mq.tlsSecretName }}
{{- else }}
secretName: {{ $.Release.Name }}-mq-secret
{{- end }}
{{- end}}
This fixes the issue, although I'm a bit unsure of some of the logic here. Shouldn't the if statements be the other way round? E.g.:
{{- if $.Values.global.mq.useConnectionNameList }}
- name: service-certs
secret:
{{- if $.Values.global.mq.tlsSecretName }}
secretName: {{ $.Values.global.mq.tlsSecretName }}
{{- else }}
secretName: {{ $.Release.Name }}-mq-secret
{{- end }}
{{- end}}
We are supporting ESDC's High Fidelity Prototype project, which aims to demonstrate Cúram in an OpenShift containerized environment. Our SI is following the runbook provided by PD to build and deploy the Cúram application. The following technical exception is encountered when a simple Universal Access application is submitted and then processed by the REST consumer. This is blocking the High Fidelity Prototype project, which has huge implications for restoring customer confidence in using Cúram for the longer term in upcoming ESDC projects.
Support Case WH00012069 is also raised to track this issue.
https://ibmwatsonhealth.force.com/mysupport/s/case/5001U00000lEOFZQA4/esdc-hfp-technical-exception-encountered-in-ibm-mq-with-openshift-containerized-environment?openCase=true
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.xa.XAException: The method 'xa_start' has failed with errorCode '-6'. com.ibm.tx.jta.impl.JTAXAResourceImpl.start 307" at ffdc_21.04.28_20.53.01.0.log
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.xa.XAException: The method 'xa_start' has failed with errorCode '-6'. com.ibm.tx.jta.impl.RegisteredResources.startRes 1053" at ffdc_21.04.28_20.53.01.1.log
[ERROR ] WTRN0078E: An attempt by the transaction manager to call start on a transactional resource has resulted in an error. The error code was XAER_PROTO. The exception stack trace follows: javax.transaction.xa.XAException: The method 'xa_start' has failed with errorCode '-6'.
at com.ibm.mq.jmqi.JmqiXAResource.start(JmqiXAResource.java:980)
at com.ibm.mq.connector.xa.XARWrapper.start(XARWrapper.java:680)
at com.ibm.ws.Transaction.JTA.JTAResourceBase.start(JTAResourceBase.java:121)
at [internal classes]
at com.ibm.mq.connector.inbound.AbstractWorkImpl.run(AbstractWorkImpl.java:210)
at com.ibm.ws.jca.inbound.security.JCASecurityContextService.runInInboundSecurityContext(JCASecurityContextService.java:49)
at [internal classes]
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.SystemException: XAResource start association error:XAER_PROTO com.ibm.tx.jta.impl.RegisteredResources.enlistResource 523" at ffdc_21.04.28_20.53.01.2.log
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.SystemException: XAResource start association error:XAER_PROTO com.ibm.tx.jta.TransactionImpl.enlistResource 2042" at ffdc_21.04.28_20.53.01.3.log
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.transaction.SystemException: XAResource start association error:XAER_PROTO com.ibm.ws.ejbcontainer.mdb.MessageEndpointBase.beforeDelivery 1244" at ffdc_21.04.28_20.53.01.4.log
[INFO ] Message delivery to an MDB 'curam.util.jms.MDBProxyDPEnactmentMDB_49ac78a4@99d46d8f(BeanId(CuramServerCode#coreinf-ejb.jar#DPEnactmentMDB, null))' failed with exception: 'beforeDelivery failure'.
[INFO ] FFDC1015I: An FFDC Incident has been created: "com.ibm.websphere.csi.CSITransactionRolledbackException: Transaction marked rollbackonly com.ibm.ejs.container.EJSContainer.postInvoke 2326" at ffdc_21.04.28_20.53.01.5.log
[INFO ] FFDC1015I: An FFDC Incident has been created: "javax.ejb.TransactionRolledbackLocalException: ; nested exception is: com.ibm.websphere.csi.CSITransactionRolledbackException: Transaction marked rollbackonly com.ibm.ws.ejbcontainer.mdb.MessageEndpointBase.afterDelivery 1280" at ffdc_21.04.28_20.53.01.6.log
[INFO ] WTRN0006W: Transaction 000001791A41120D00000001681E949CCC38DCA6F5D43D42714F62BDC24547D1E7EFE6F5000001791A41120D00000001681E949CCC38DCA6F5D43D42714F62BDC24547D1E7EFE6F500000001 has timed out after 180 seconds.
[INFO ] WTRN0124I: When the timeout occurred the thread with which the transaction is, or was most recently, associated was Thread[Default Executor-thread-6,5,Default Executor Thread Group]. The stack trace of this thread when the timeout occurred was:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:218)
com.ibm.ws.threading.internal.BoundedBuffer.waitGet_(BoundedBuffer.java:176)
com.ibm.ws.threading.internal.BoundedBuffer.take(BoundedBuffer.java:647)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1085)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
java.lang.Thread.run(Thread.java:822)
And here is what the MQ does:
2021-04-28T20:53:01.444Z AMQ6125E: An internal IBM MQ error has occurred.
2021-04-28T20:53:01.444Z AMQ6184W: An internal IBM MQ error has occurred on queue manager QM1.
After a successful helm install/deployment, the application is not accessible. The "Producer" and "Consumer" pods are in a Pending state, and examining their events shows insufficient CPU. We are using Fyre VMs that have 4 CPUs and 32 GB of memory. See the attached logs. Please advise, thanks!
Prereqs all followed (I believe)
Images built and pushed to Docker Registry ok
Helm charts prepared, packaged and pushed ok.
Now run:
helm install release2 local-development/spm -f ../static/resources/crc-values.yaml
in separate tab:
kubectl get pods -w
NAME READY STATUS RESTARTS AGE
release1-apps-apply-customsql-mbl2f 0/1 ImagePullBackOff 0 18m
release2-apps-apply-customsql-fc8pb 0/1 ImagePullBackOff 0 53s
releasename-apps-apply-customsql-2bm8l 0/1 ImagePullBackOff 0 22m
release2-apps-apply-customsql-fc8pb 0/1 ErrImagePull 0 56s
(note: releasename and release1 were also earlier failed attempts)
Viewing log:
kubectl logs -f pod/release1-apps-apply-customsql-mbl2f
Error from server (BadRequest): container "apply-customsql" in pod "release1-apps-apply-customsql-mbl2f" is waiting to start: trying and failing to pull image
OS = MacOSx Catalina 10.15.7
ChartMuseum installed locally and run like this:
chartmuseum --debug --port=8080 --storage="local" --storage-local-rootdir="./chartstorage"
CRC installed locally:
crc version
CodeReady Containers version: 1.15.0+e317bed
OpenShift version: 4.5.7 (embedded in binary)
Docker version 19.03.12
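To see the actual pull error behind the ImagePullBackOff, the pod's events are usually more informative than its logs (pod name taken from the listing above):

```shell
# Show only the Events section of the failing pod's description
kubectl describe pod release2-apps-apply-customsql-fc8pb | sed -n '/Events:/,$p'
```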