assisted-installer's People

Contributors

adriengentil, asalkeld, carbonin, crystalchun, danielerez, dependabot[bot], eifrach, eliorerz, empovit, eranco74, filanov, flaper87, javipolo, machacekondra, masayag, mkowalski, omertuc, openshift-bot, ori-amizur, oshercc, osherdp, paul-maidment, rollandf, rwsu, sacharya, slaviered, tsorya, ybettan, yevgeny-shnaidman, yuvigold

assisted-installer's Issues

Add a contributing guide?

Or a section in the README. Not clear from this repo that you're looking for contributions. :)

Note: this is a drive-by issue. I saw this after @flaper87 mentioned this repo in a podcast, and figured I would open this.

Assisted Installer doesn't detect Add-On nodes

Hi, I installed a cluster with AI, initially with 3 masters. The cluster state is now "installed". I then wanted to add an add-on node (a worker node), but it is not discovered in the AI GUI. I made sure the new node gets a DHCP address and has access to the public internet.

Is this expected behavior, and is creating add-on nodes still not supported? If so, how can I add additional nodes to the existing cluster?

serious bug with SNO?

Hi,
I've reinstalled SNO on a VM today from scratch. Installation went without a problem.

However, I see 2 weird things compared to yesterday:

  1. When trying to deploy any sample app (the basic Python sample, for example), I get the following in the log stream:
Cloning "https://github.com/elsony/devfile-sample-python-basic.git" ...
error: fatal: unable to access 'https://github.com/elsony/devfile-sample-python-basic.git/': SSL certificate problem: self signed certificate in certificate chain
  2. Clicking on "Add" and selecting "All services" shows only 4 services (Basic NodeJS, Basic Python, Basic Quarkus, Basic Spring Boot), all of the Devfile type. There are no other items in the catalog.

Should I open a bug in BZ about it?

bug with OCP 4.8.2 installed with assisted installer

Hi,
I've used the assisted installer to install OpenShift 4.8.2 on a VM running on CentOS 8 with KVM, using the SNO method.
The installation process went fine, there were no errors, and the monitor in cloud.redhat.com doesn't show any problems.

I tried to test a simple thing. I should mention that the SNO method is the default, nothing has been changed or tweaked from the stock install.

I used the oc command to create a project and test rails-postgresql-example. It fails due to "but the administrator has not configured the integrated container image registry".

Here are the logs. Should I post them in Bugzilla? (If so, under which component?)

$ oc new-project test124
Now using project "test124" on server "https://api.demo.hetzlabs.local:6443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app rails-postgresql-example

to build a new example application in Ruby. Or use kubectl to deploy a simple Kubernetes application:

    kubectl create deployment hello-node --image=k8s.gcr.io/serve_hostname

$ oc new-app rails-postgresql-example
--> Deploying template "openshift/rails-postgresql-example" to project test124

     Rails + PostgreSQL (Ephemeral)
     ---------
     An example Rails application with a PostgreSQL database. For more information about using this template, including OpenShift considerations, see https://github.com/sclorg/rails-ex/blob/master/README.md.
     
     WARNING: Any data stored will be lost upon pod destruction. Only use this template for testing.

     The following service(s) have been created in your project: rails-postgresql-example, postgresql.
     
     For more information about using this template, including OpenShift considerations, see https://github.com/sclorg/rails-ex/blob/master/README.md.

     * With parameters:
        * Name=rails-postgresql-example
        * Namespace=openshift
        * Memory Limit=512Mi
        * Memory Limit (PostgreSQL)=512Mi
        * Git Repository URL=https://github.com/sclorg/rails-ex.git
        * Git Reference=
        * Context Directory=
        * Application Hostname=
        * GitHub Webhook Secret=4cker8Iaya8IYMaKAcb5mIx5b5jy23yyHexlthkN # generated
        * Secret Key=4xngxigktgwccawjm8vx2u16u3j7heq1n7xuhccwwdhhx64vuga5o7f7d45n6fx8ooyexkub65qetedgm2kn0nc7onkx7tdgsbdkh4x3oehhj5m7jyjrx0w7rpon61o # generated
        * Application Username=openshift
        * Application Password=secret
        * Rails Environment=production
        * Database Service Name=postgresql
        * Database Username=userNMR # generated
        * Database Password=T47h18Ne # generated
        * Database Name=root
        * Maximum Database Connections=100
        * Shared Buffer Amount=12MB
        * Custom RubyGems Mirror URL=

--> Creating resources ...
    secret "rails-postgresql-example" created
    service "rails-postgresql-example" created
    route.route.openshift.io "rails-postgresql-example" created
    imagestream.image.openshift.io "rails-postgresql-example" created
    buildconfig.build.openshift.io "rails-postgresql-example" created
    deploymentconfig.apps.openshift.io "rails-postgresql-example" created
    service "postgresql" created
    deploymentconfig.apps.openshift.io "postgresql" created
--> Success
    Access your application via route 'rails-postgresql-example-test124.apps.demo.hetzlabs.local' 
    WARNING: No container image registry has been configured with the server. Automatic builds and deployments may not function.
    Build scheduled, use 'oc logs -f buildconfig/rails-postgresql-example' to track its progress.
    Run 'oc status' to view your app.

$ oc status
In project test124 on server https://api.demo.hetzlabs.local:6443

svc/postgresql - 172.30.147.134:5432
  dc/postgresql deploys openshift/postgresql:12-el8 
    deployment #1 deployed 3 minutes ago - 1 pod

http://rails-postgresql-example-test124.apps.demo.hetzlabs.local (svc/rails-postgresql-example)
  dc/rails-postgresql-example deploys istag/rails-postgresql-example:latest <-
    bc/rails-postgresql-example source builds https://github.com/sclorg/rails-ex.git on openshift/ruby:2.6-ubi8 
      build #1 new for 3 minutes (can't push to image)
    deployment #1 waiting on image or update

Errors:
  * bc/rails-postgresql-example is pushing to istag/rails-postgresql-example:latest, but the administrator has not configured the integrated container image registry.

1 error identified, use 'oc status --suggest' to see details.

I'm sure there is a simple fix for that, but I think it should be fixed in RH's OCP IMHO...
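
For context: on platforms without shareable object storage (bare metal, including SNO on a VM), the Image Registry Operator bootstraps itself as Removed, which is what `oc status` reports above. A minimal sketch of the commonly documented workaround follows, assuming a non-production cluster where ephemeral `emptyDir` storage is acceptable. The helper name `registry_patch_command` is hypothetical; the script only prints the `oc patch` command rather than running it.

```python
import json
import shlex

# Assumption: a non-production cluster where ephemeral (emptyDir) registry
# storage is acceptable; pushed images are lost when the registry pod restarts.
REGISTRY_PATCH = {
    "spec": {
        "managementState": "Managed",
        "storage": {"emptyDir": {}},
    }
}


def registry_patch_command(patch=REGISTRY_PATCH):
    """Return the `oc patch` invocation as a list of arguments."""
    return [
        "oc", "patch", "configs.imageregistry.operator.openshift.io", "cluster",
        "--type", "merge",
        "--patch", json.dumps(patch),
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command line instead of executing it.
    print(shlex.join(registry_patch_command()))
```

For anything beyond a test cluster, persistent storage would be the safer choice than `emptyDir`.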

[APPROVALNOTIFIER] This PR is **APPROVED**

This pull-request has been approved by: YuviGold

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Originally posted by @openshift-ci-robot in #201 (comment)

Which doc is right?

Looking at the instructions on Red Hat's cloud page after creating a SNO install, it asks the user to create 3 records in their DNS server: api, api-int, and a wildcard *.apps.<cluster_ID>.

However, looking at the OKD docs here, at the bottom of the page, it looks like there is a 4th record. I don't see it mentioned in the cloud.redhat.com pages.

So which one is right and how do I contact the docs team to fix it?

Thanks
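
As an illustration of the three records named above, here is a small sketch that spells out the full record names. It assumes the records are formed as `<name>.<cluster name>.<base domain>`; the function names and the example values `demo` / `hetzlabs.local` are illustrative only. The 4th record from the OKD docs is not included because the issue does not name it.

```python
def expected_dns_records(cluster_name, base_domain):
    """Return the three record names the cloud.redhat.com page asks for."""
    suffix = f"{cluster_name}.{base_domain}"
    return [f"api.{suffix}", f"api-int.{suffix}", f"*.apps.{suffix}"]


def probe_name(record, label="test"):
    """Turn a wildcard record into a concrete hostname that can be looked up."""
    if record.startswith("*."):
        return record.replace("*", label, 1)
    return record


if __name__ == "__main__":
    for record in expected_dns_records("demo", "hetzlabs.local"):
        print(record, "->", probe_name(record))
```

A wildcard record cannot be queried literally, so `probe_name` substitutes an arbitrary label; any hostname under `*.apps` should resolve to the same address.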

assisted-installer-controller Job does not apply Additional Root CA Trust Bundle

ISSUE:

When installing an OpenShift cluster that has a combination of:

  • Outbound Proxy
  • SSL Re-encryption with another trusted root CA (Cisco/McAfee/Squid proxy SSL MitM basically)

...the installation halts with the bootstrap node at installation stage "Waiting for controller: waiting for controller pod ready event" and the other two control plane nodes in Joined status.

CAUSE:

Once the assisted-installer-controller Job is created in the assisted-installer namespace, it correctly passes in the Outbound Proxy but the Additional Trust Bundles are not added as a mounted volume. If the Outbound Proxy performs SSL re-encryption then the Pod will fail with the following, even if the CA that is performing the re-encryption is applied as an additionalTrustBundle certificate:

time="2022-08-01T23:39:07Z" level=info msg="Start running Assisted-Controller. Configuration is:\n struct ControllerConfig {\n\tClusterID: \"6e08c390-3b8c-4009-9a41-605c7ff40f25\",\n\tURL: \"https://api.openshift.com\",\n\tPullSecretToken: <SECRET>,\n\tSkipCertVerification: false,\n\tCACertPath: \"\",\n\tNamespace: \"assisted-installer\",\n\tOpenshiftVersion: \"4.10.18\",\n\tHighAvailabilityMode: \"Full\",\n\tWaitForClusterVersion: true,\n\tMustGatherImage: \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:03a06499c3efae948535eb61c340efb8511d3ac45db1ca9fccfe5515e49a70ac\",\n\tDryRunEnabled: false,\n\tDryFakeRebootMarkerPath: \"\",\n\tDryRunClusterHostsPath: \"\",\n\tParsedClusterHosts: config.DryClusterHosts(nil),\n}"
W0801 23:39:07.295522       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0801 23:39:09.236233       1 request.go:601] Waited for 1.045203989s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/node.k8s.io/v1?timeout=32s
time="2022-08-01T23:39:09Z" level=info msg="Using proxy {HTTPProxy:http://192.168.42.31:3128/ HTTPSProxy:http://192.168.42.31:3128/ NoProxy:.cluster.local,.kemo.labs,.kemo.network,.svc,.svc.cluster.local,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.0.0/16,192.168.70.0/23,api-int.core-ocp.d70.lab.kemo.network,localhost} to set env-vars for installer-controller pod"
time="2022-08-01T23:39:09Z" level=info msg="Start waiting to be ready"
time="2022-08-01T23:39:09Z" level=info msg="Making sure service dns-default can reserve the .10 address"
time="2022-08-01T23:39:09Z" level=info msg="No service found with IP 172.30.0.10, attempt 1/45"
time="2022-08-01T23:39:10Z" level=warning msg="Failed to connect to assisted service" error="Get \"https://api.openshift.com/api/assisted-install/v2/clusters/6e08c390-3b8c-4009-9a41-605c7ff40f25?exclude-hosts=true\": x509: certificate signed by unknown authority"
time="2022-08-01T23:39:11Z" level=warning msg="Failed to connect to assisted service" error="Get \"https://api.openshift.com/api/assisted-install/v2/clusters/6e08c390-3b8c-4009-9a41-605c7ff40f25?exclude-hosts=true\": x509: certificate signed by unknown authority"
time="2022-08-01T23:39:12Z" level=warning msg="Failed to connect to assisted service" error="Get \"https://api.openshift.com/api/assisted-install/v2/clusters/6e08c390-3b8c-4009-9a41-605c7ff40f25?exclude-hosts=true\": x509: certificate signed by unknown authority"
time="2022-08-01T23:39:13Z" level=warning msg="Failed to connect to assisted service" error="Get \"https://api.openshift.com/api/assisted-install/v2/clusters/6e08c390-3b8c-4009-9a41-605c7ff40f25?exclude-hosts=true\": x509: certificate signed by unknown authority"
time="2022-08-01T23:39:14Z" level=warning msg="Failed to connect to assisted service" error="Get \"https://api.openshift.com/api/assisted-install/v2/clusters/6e08c390-3b8c-4009-9a41-605c7ff40f25?exclude-hosts=true\": x509: certificate signed by unknown authority"

The container uses the trust bundle from its own installed ca-certificates RPM, and thus has none of the additionalTrustBundle CA certificates that were added to the RHCOS system store.

You can manually force the installation to continue: modify the assisted-installer-controller-config ConfigMap to set .data.ca-cert-path to /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem, delete the assisted-installer-controller Job in the assisted-installer namespace, and then recreate it with the RHCOS system trusted root store mounted:

apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: assisted-installer-controller
    job-name: assisted-installer-controller
  name: assisted-installer-controller
  namespace: assisted-installer
spec:
  backoffLimit: 100
  completionMode: NonIndexed
  completions: 1
  parallelism: 1
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: assisted-installer-controller
        job-name: assisted-installer-controller
    spec:
      containers:
      - env:
        - name: CLUSTER_ID
          valueFrom:
            configMapKeyRef:
              key: cluster-id
              name: assisted-installer-controller-config
        - name: INVENTORY_URL
          valueFrom:
            configMapKeyRef:
              key: inventory-url
              name: assisted-installer-controller-config
        - name: PULL_SECRET_TOKEN
          valueFrom:
            secretKeyRef:
              key: pull-secret-token
              name: assisted-installer-controller-secret
        - name: CA_CERT_PATH
          valueFrom:
            configMapKeyRef:
              key: ca-cert-path
              name: assisted-installer-controller-config
              optional: true
        - name: SKIP_CERT_VERIFICATION
          valueFrom:
            configMapKeyRef:
              key: skip-cert-verification
              name: assisted-installer-controller-config
              optional: true
        - name: OPENSHIFT_VERSION
          value: 4.10.18
        - name: HIGH_AVAILABILITY_MODE
          valueFrom:
            configMapKeyRef:
              key: high-availability-mode
              name: assisted-installer-controller-config
              optional: true
        - name: CHECK_CLUSTER_VERSION
          valueFrom:
            configMapKeyRef:
              key: check-cluster-version
              name: assisted-installer-controller-config
              optional: true
        - name: MUST_GATHER_IMAGE
          valueFrom:
            configMapKeyRef:
              key: must-gather-image
              name: assisted-installer-controller-config
              optional: true
        image: registry.redhat.io/rhai-tech-preview/assisted-installer-reporter-rhel8:v1.0.0-238
        imagePullPolicy: IfNotPresent
        name: assisted-installer-controller
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - name: service-ca-cert-config
          mountPath: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/master: ""
      restartPolicy: OnFailure
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: assisted-installer-controller
      serviceAccountName: assisted-installer-controller
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      volumes:
        - name: service-ca-cert-config
          hostPath:
            path: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
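
The manual workaround above can be summarized as three `oc` invocations. The sketch below only builds and prints them. It assumes the Job manifest above has been saved to a file; the name `controller-job.yaml` and the helper `workaround_commands` are hypothetical.

```python
import json
import shlex

NAMESPACE = "assisted-installer"
CA_BUNDLE = "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"


def workaround_commands(job_manifest="controller-job.yaml"):
    """Return the three oc invocations, in order, as argument lists.

    job_manifest is a hypothetical file name holding the Job manifest.
    """
    # Merge patch that sets .data.ca-cert-path in the controller ConfigMap.
    patch = json.dumps({"data": {"ca-cert-path": CA_BUNDLE}})
    return [
        ["oc", "-n", NAMESPACE, "patch", "configmap",
         "assisted-installer-controller-config", "--type", "merge", "-p", patch],
        ["oc", "-n", NAMESPACE, "delete", "job", "assisted-installer-controller"],
        ["oc", "-n", NAMESPACE, "apply", "-f", job_manifest],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in workaround_commands():
        print(shlex.join(cmd))
```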

EXPECTED RESULT:

When Root CA certificates defined in the additionalTrustBundles spec are added to the RHCOS system trusted store and the store is updated, they are prepended to /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem. The Job's spec should mount that file as a volume, and the associated ConfigMap should point to it, by setting a default for .CACertPath when it is passed to the assisted-installer-controller-pod.yaml.template file.

The assisted-installer-controller Job Pod should continue as such:

time="2022-08-02T01:26:55Z" level=info msg="Start running Assisted-Controller. Configuration is:\n struct ControllerConfig {\n\tClusterID: \"ef4420c4-7d6b-4184-b746-e194e422b0fc\",\n\tURL: \"https://api.openshift.com\",\n\tPullSecretToken: <SECRET>,\n\tSkipCertVerification: false,\n\tCACertPath: \"/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem\",\n\tNamespace: \"assisted-installer\",\n\tOpenshiftVersion: \"4.10.18\",\n\tHighAvailabilityMode: \"Full\",\n\tWaitForClusterVersion: true,\n\tMustGatherImage: \"quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:03a06499c3efae948535eb61c340efb8511d3ac45db1ca9fccfe5515e49a70ac\",\n\tDryRunEnabled: false,\n\tDryFakeRebootMarkerPath: \"\",\n\tDryRunClusterHostsPath: \"\",\n\tParsedClusterHosts: config.DryClusterHosts(nil),\n}"
W0802 01:26:55.410855       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0802 01:26:56.465700       1 request.go:601] Waited for 1.014466487s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/k8s.ovn.org/v1?timeout=32s
time="2022-08-02T01:26:57Z" level=info msg="Using proxy {HTTPProxy:http://192.168.42.31:3128/ HTTPSProxy:http://192.168.42.31:3128/ NoProxy:.cluster.local,.kemo.labs,.kemo.network,.svc,.svc.cluster.local,10.128.0.0/14,127.0.0.1,172.30.0.0/16,192.168.0.0/16,192.168.70.0/23,api-int.core-ocp.d70.lab.kemo.network,localhost} to set env-vars for installer-controller pod"
time="2022-08-02T01:26:57Z" level=info msg="Using custom CA certificate: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"
time="2022-08-02T01:26:57Z" level=info msg="Start waiting to be ready"
time="2022-08-02T01:26:57Z" level=info msg="Making sure service dns-default can reserve the .10 address"
time="2022-08-02T01:26:57Z" level=info msg="Service dns-default has successfully taken IP 172.30.0.10"
time="2022-08-02T01:26:57Z" level=info msg="HackDNSAddressConflict finished"
time="2022-08-02T01:26:58Z" level=info msg="assisted-service is available"
time="2022-08-02T01:26:58Z" level=info msg="kube-apiserver is available"
time="2022-08-02T01:26:58Z" level=info msg="Sending ready event"
time="2022-08-02T01:26:58Z" level=info msg="monitor cluster installation status"
time="2022-08-02T01:26:58Z" level=info msg="Start sending logs"
time="2022-08-02T01:26:58Z" level=info msg="Waiting till all nodes will join and update status to assisted installer"
time="2022-08-02T01:26:58Z" level=info msg="Start approving CSRs"

PROPOSED FIX:

  • Set a default value of /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem for the CACertPath configuration variable in the src/config/config.go file.
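
The fallback logic behind this proposal can be illustrated as follows. This is a Python sketch of the idea only, not the project's Go code in src/config/config.go; the function name `resolve_ca_cert_path` is hypothetical, while the `CA_CERT_PATH` variable name comes from the Job manifest above.

```python
import os

DEFAULT_CA_CERT_PATH = "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem"


def resolve_ca_cert_path(env=None):
    """Return CA_CERT_PATH from env, falling back to the host bundle path."""
    env = os.environ if env is None else env
    # Treat an unset or empty value as "not configured"; the failing log
    # shows CACertPath as an empty string in exactly that situation.
    return env.get("CA_CERT_PATH") or DEFAULT_CA_CERT_PATH


if __name__ == "__main__":
    print(resolve_ca_cert_path({}))
    print(resolve_ca_cert_path({"CA_CERT_PATH": "/tmp/custom-ca.pem"}))
```

A default path alone would only help if the bundle is also mounted into the pod, as in the recreated Job shown earlier.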

Rejected build in assisted installer

Hi,

We were trying to install 4.6 on 3 bare-metal machines. Once it was up we faced a few other issues, which eventually led us to check and compare the cluster version, and we found that the cluster version is a rejected one.

Shouldn't the assisted installer pick only accepted builds from the pipelines [1]?

FYI our cluster version:-

NAME      VERSION      AVAILABLE   PROGRESSING   SINCE
version   4.6.0-fc.9   True        False         23h

[1]https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/

Regards,
Prajith
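
As a side note, the version string above carries a pre-release tag (`fc.9`). A minimal sketch for spotting such tags, assuming semantic-versioning style strings, is shown below. The helper name `prerelease_tag` is hypothetical, and the check says nothing about whether the release controller marked a build as accepted or rejected.

```python
import re

# Semantic-versioning style: MAJOR.MINOR.PATCH with an optional -prerelease tag.
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def prerelease_tag(version):
    """Return the pre-release tag (e.g. 'fc.9'), or None for a plain release."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return match.group(4)


if __name__ == "__main__":
    for v in ("4.6.0-fc.9", "4.10.18"):
        print(v, "->", prerelease_tag(v))
```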
