
rke2's Introduction

RKE2

RKE2, also known as RKE Government, is Rancher's next-generation Kubernetes distribution.

It is a fully conformant Kubernetes distribution that focuses on security and compliance within the U.S. Federal Government sector.

To meet these goals, RKE2 does the following:

For more information and detailed installation and operation instructions, please visit our docs.

Quick Start

Here's the extremely quick start:

curl -sfL https://get.rke2.io | sh -
systemctl enable rke2-server.service
systemctl start rke2-server.service
# Wait a bit
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml PATH=$PATH:/var/lib/rancher/rke2/bin
kubectl get nodes
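
The "Wait a bit" step can be made explicit with a small polling helper. This is only a sketch: `retry_until` is a hypothetical helper name, and the attempt count and delay in the example are arbitrary values, not recommendations from the RKE2 docs.

```shell
# retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made.
retry_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only pause when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (not run here), once KUBECONFIG and PATH are exported as above:
# retry_until 60 5 kubectl get nodes
```

The helper returns non-zero when the command never succeeds, so it can gate later steps in a provisioning script.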

For a bit more, check out our full quick start guide.

Installation

A full breakdown of installation methods and information can be found here.

Configuration File

The primary way to configure RKE2 is through its config file. Command line arguments and environment variables are also available, but RKE2 is installed as a systemd service and thus these are not as easy to leverage.

By default, RKE2 will launch with the values present in the YAML file located at /etc/rancher/rke2/config.yaml.

An example of a basic server config file is below:

# /etc/rancher/rke2/config.yaml
write-kubeconfig-mode: "0644"
tls-san:
  - "foo.local"
node-label:
  - "foo=bar"
  - "something=amazing"

In general, CLI arguments map to their respective YAML keys, with repeatable CLI arguments represented as YAML lists. An identical configuration using only CLI arguments is shown below:

rke2 server \
  --write-kubeconfig-mode "0644"    \
  --tls-san "foo.local"             \
  --node-label "foo=bar"            \
  --node-label "something=amazing"

It is also possible to use both a configuration file and CLI arguments. In these situations, values are loaded from both sources, but CLI arguments take precedence. For repeatable arguments such as --node-label, the CLI arguments overwrite all values in the list.

Finally, the location of the config file can be changed through either the CLI argument --config FILE (short form -c FILE) or the environment variable $RKE2_CONFIG_FILE.
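
As a rough illustration of a non-default config location, the sketch below writes the basic server config from the example above to a path of your choosing. `write_rke2_config` is a hypothetical helper and the /tmp path in the comments is only a placeholder.

```shell
# write_rke2_config PATH
# Writes the basic server config shown earlier to PATH, creating parent directories.
write_rke2_config() {
  mkdir -p "$(dirname "$1")"
  cat > "$1" <<'EOF'
write-kubeconfig-mode: "0644"
tls-san:
  - "foo.local"
node-label:
  - "foo=bar"
  - "something=amazing"
EOF
}

# Example (not run here): point the server at the file, first through the flag,
# then through the environment variable.
# write_rke2_config /tmp/rke2-demo/config.yaml
# rke2 server --config /tmp/rke2-demo/config.yaml
# RKE2_CONFIG_FILE=/tmp/rke2-demo/config.yaml rke2 server
```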

FAQ

Security

Security issues in RKE2 can be reported by sending an email to [email protected]. Please do not open security issues here.


rke2's Issues

Customize aws profile in provider.

A few issues found during the initial install of rke2:

  1. The AWS profile is hardcoded; I had to manually adjust the .tf files to set the correct profile.
provider "aws" {
  region  = "us-east-2"
  profile = "rancher-eng"
}
  2. With Terraform version 0.12.26, a warning is shown for the interpolation syntax. Terraform version 0.11 or earlier accepts the syntax below.
Warning: Interpolation-only expressions are deprecated

  on main.tf line 238, in resource "aws_lb_target_group_attachment" "rke2-nlb-attachement":
 238:   target_group_arn = "${aws_lb_target_group.rke2-master-nlb-tg.arn}"
  3. The build process fails while generating the kubeconfig, possibly due to reuse of the IP address for the EC2 instance.
Are you sure you want to continue connecting (yes/no/[fingerprint])? 
null_resource.get-kubeconfig (local-exec): Host key verification failed.
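
The interpolation-only warning goes away in Terraform 0.12 when the expression is written without the surrounding "${...}". A minimal sketch of a filter that rewrites such lines follows; `strip_interpolation_only` is a hypothetical helper, and it assumes the whole attribute value is a single interpolation at the end of the line.

```shell
# Reads Terraform source on stdin and rewrites
#   attr = "${expr}"   to   attr = expr
# Values that mix literal text and interpolation, e.g. "foo-${var.x}", are left alone.
strip_interpolation_only() {
  sed -E 's/= *"\$\{([^}"]+)\}" *$/= \1/'
}

# Example (not run here): preview the change for one file.
# strip_interpolation_only < main.tf | diff main.tf -
```

Reviewing the diff before overwriting anything is the safer route, since the pattern is deliberately narrow and does not understand HCL.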

Integrate Helm charts for addons to rke2

As per discussion with @ibuildthecloud:

rke2 will integrate Helm charts as CR manifests in the manifest directory. However, because rke2 uses a different supervisor port, the helm controller will not be able to download the charts, so the following changes will be added:

  • the helm controller will have a new spec field in the CRD called chartContent
  • the rke2 build process will download the tgz charts and add them to the YAML manifests as HelmChart CRs
  • the helm controller will create a ConfigMap holding the chartContent
  • the klipper helm job will mount this ConfigMap as a chart tgz and decode it
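
The build-time step in the list above could look roughly like the sketch below. It assumes the chart archive is stored base64-encoded on a single line (the issue only says the job "will decode it"), and `helmchart_manifest`, the kube-system namespace, and the file names in the comment are illustrative choices.

```shell
# helmchart_manifest NAME CHART_TGZ
# Prints a HelmChart manifest with the chart archive embedded in chartContent.
helmchart_manifest() {
  # Base64 on one line; the encoding is an assumption, see the note above.
  content=$(base64 < "$2" | tr -d '\n')
  cat <<EOF
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: $1
  namespace: kube-system
spec:
  chartContent: $content
EOF
}

# Example (not run here):
# helmchart_manifest rke2-canal rke2-canal.tgz > manifests/rke2-canal.yaml
```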

CPU usage 100%

Hi! I am using the command

curl https://raw.githubusercontent.com/rancher/rke2/master/install.sh | INSTALL_RKE2_VERSION=v0.0.1-alpha.5 sh -

to start a single-node cluster on an Ubuntu 20.04 VM (1 CPU, 2 GB RAM).

When the canal Helm chart is deployed, CPU usage is 100% and the logs are full of

Jul 03 13:32:49 rke2-1 systemd-udevd[3692]: calico_tmp_A: Failed to get link config: No such device
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 150 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 151 seen, reloading interface list
Jul 03 13:32:49 rke2-1 systemd-udevd[3698]: calico_tmp_B: Failed to get link config: No such device
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 151 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 151 seen, reloading interface list
Jul 03 13:32:49 rke2-1 systemd-networkd[569]: calico_tmp_B: Could not find device, waiting for device initialization: No such device
Jul 03 13:32:49 rke2-1 systemd-networkd[569]: calico_tmp_A: Could not find device, waiting for device initialization: No such device
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 151 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 150 seen, reloading interface list
Jul 03 13:32:49 rke2-1 systemd-networkd[569]: calico_tmp_B: Could not find device, waiting for device initialization: No such device
Jul 03 13:32:49 rke2-1 systemd-networkd[569]: calico_tmp_A: Could not find device, waiting for device initialization: No such device
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 150 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 152 seen, reloading interface list
Jul 03 13:32:49 rke2-1 systemd-networkd[569]: calico_tmp_B: Could not find device, waiting for device initialization: No such device
Jul 03 13:32:49 rke2-1 systemd-networkd[569]: calico_tmp_A: Could not find device, waiting for device initialization: No such device
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 152 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 153 seen, reloading interface list
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 153 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 153 seen, reloading interface list
Jul 03 13:32:49 rke2-1 systemd-udevd[3692]: calico_tmp_A: Failed to get link config: No such device
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: ERROR:Unknown interface index 153 seen even after reload
Jul 03 13:32:49 rke2-1 networkd-dispatcher[697]: WARNING:Unknown index 152 seen, reloading interface list
Jul 03 13:32:50 rke2-1 systemd-udevd[3698]: calico_tmp_B: Failed to get link config: No such device

In the calico-node container's log I see

2020-07-03T13:43:51.219280859Z stdout F 2020-07-03 13:43:51.217 [INFO][25660] int_dataplane.go 1258: Applying XDP actions did not succeed, disabling XDP error=failed to resync: cannot find XDP object "/usr/lib/calico/bpf/filter.o"
2020-07-03T13:43:51.311974047Z stdout F 2020-07-03 13:43:51.298 [INFO][25660] int_dataplane.go 778: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_B"
2020-07-03T13:43:51.312013473Z stdout F 2020-07-03 13:43:51.300 [INFO][25660] int_dataplane.go 778: Linux interface addrs changed. addrs=set.mapSet{} ifaceName="calico_tmp_A"
2020-07-03T13:43:51.325926965Z stdout F 2020-07-03 13:43:51.325 [WARNING][25660] int_dataplane.go 981: failed to wipe the XDP state error=cannot find XDP object "/usr/lib/calico/bpf/filter.o" try=0
2020-07-03T13:43:51.510189347Z stdout F 2020-07-03 13:43:51.509 [WARNING][25660] int_dataplane.go 981: failed to wipe the XDP state error=cannot find XDP object "/usr/lib/calico/bpf/filter.o" try=1
2020-07-03T13:43:51.724108398Z stdout F 2020-07-03 13:43:51.718 [WARNING][25660] int_dataplane.go 981: failed to wipe the XDP state error=cannot find XDP object "/usr/lib/calico/bpf/filter.o" try=2
2020-07-03T13:43:51.927056232Z stdout F 2020-07-03 13:43:51.913 [WARNING][25660] int_dataplane.go 981: failed to wipe the XDP state error=cannot find XDP object "/usr/lib/calico/bpf/filter.o" try=3
2020-07-03T13:43:52.102314442Z stdout F 2020-07-03 13:43:52.097 [WARNING][25660] int_dataplane.go 981: failed to wipe the XDP state error=cannot find XDP object "/usr/lib/calico/bpf/filter.o" try=4

Running kubectl -n kube-system set env ds/canal -c calico-node FELIX_XDPENABLED=false and then rebooting fixes the problem. It looks like the ranchertest/calico:v3.13.3 Docker image is missing the /usr/lib/calico/bpf/ directory:

docker run --rm ranchertest/calico:v3.13.3 ls -l /usr/lib/calico
ls: cannot access /usr/lib/calico: No such file or directory

Helm chart for cloud-controller-manager

We might close this later in favor of #142. We decided to provide flags that allow users to set up cloud providers.
However, it's possible we may need this later, for example for the VMware external CCM.

We'll need to have this sorted out before we decide whether this issue should be closed.

RKE2 Fails on Execution Looking for ETCD User in Non CIS Mode

Version:
rke2 version v0.0.1-alpha.6

Node OS:
Ubuntu 20.04

Issue:

  1. When INSTALL_RKE2_CIS_MODE is set to true, installation is successful.
  2. When INSTALL_RKE2_CIS_MODE is not passed to the install script, as below, installation fails with the message "unknown user etcd"
    INSTALL_RKE2_VERSION=v0.0.1-alpha.6 ./install.sh
Jul 07 23:07:16 ip-172-31-15-215 systemd[1]: Failed to start Rancher Kubernetes Engine v2.
Jul 07 23:07:21 ip-172-31-15-215 systemd[1]: rke2.service: Scheduled restart job, restart counter is at 6.
Jul 07 23:07:21 ip-172-31-15-215 systemd[1]: Stopped Rancher Kubernetes Engine v2.
Jul 07 23:07:21 ip-172-31-15-215 systemd[1]: Starting Rancher Kubernetes Engine v2...
Jul 07 23:07:21 ip-172-31-15-215 rke2[2299]: time="2020-07-07T23:07:21Z" level=warning msg="not running in CIS 1.5 mode"
Jul 07 23:07:21 ip-172-31-15-215 rke2[2299]: time="2020-07-07T23:07:21Z" level=info msg="Starting rke2 v0.0.1-alpha.6 (HEAD)"
Jul 07 23:07:21 ip-172-31-15-215 rke2[2299]: time="2020-07-07T23:07:21Z" level=fatal msg="starting kubernetes: preparing server: start cluster and https: user: unknown user etcd"
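
The fatal message comes from a user lookup, so a quick pre-flight check is to ask the system whether the account exists. A minimal sketch; `user_exists` is a hypothetical helper, not part of the install script.

```shell
# user_exists NAME: succeeds if NAME resolves to a user on this host.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# Example (not run here): report the state before starting the service.
# if user_exists etcd; then echo "etcd user present"; else echo "etcd user missing"; fi
```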

Error while running kubectl while using install script method of installation.

Version:
Rke2 v0.0.1-alpha.4

Describe the issue:

  1. kubectl errors with the message below instead of "command not found":
    rke2 kubectl get nodes --kubeconfig=/etc/rancher/rke2/rke2.yaml
    WARN[0000] not running in CIS 1.5 mode
    No help topic for 'kubectl'
  2. After installing kubectl, the default kubeconfig path needs to be passed explicitly.

kubectl get nodes --kubeconfig=/etc/rancher/rke2/rke2.yaml
NAME STATUS ROLES AGE VERSION
ip-172-31-13-222 Ready etcd,master 9m9s v1.18.4




Install rke2 using commit id is broken

Version:
rke2 v0.0.1-alpha.7

Describe the bug:
Install rke2 using commit id.

INSTALL_RKE2_COMMIT= ./install.sh

# INSTALL_RKE2_COMMIT=4ccaa37d20b38e7d95a1ccd577894d4689b36a84 ./install.sh
[INFO]  using commit 4ccaa37d20b38e7d95a1ccd577894d4689b36a84 as release
[INFO]  downloading hash https://storage.googleapis.com/rke2-ci-builds/rke2-4ccaa37d20b38e7d95a1ccd577894d4689b36a84.sha256sum
# 

Test New Kubelet argument --protect-kernel-defaults

A new argument has been added to the kubelet that needs to be set to true to comply with CIS 1.5 requirements. The work to accomplish this was done in #87.

To test

grep protect /var/lib/rancher/rke2//logs/kubelet.log 

This command should return a string with the argument and it being set to false.
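
To read the value out of the log instead of eyeballing the grep output, something like the following could work. It assumes the flag appears in the log as --protect-kernel-defaults=VALUE, with or without double quotes around the value; `protect_kernel_defaults_value` is a hypothetical helper.

```shell
# protect_kernel_defaults_value LOGFILE
# Prints the last value logged for --protect-kernel-defaults (true or false),
# or nothing when the flag does not appear in the file.
protect_kernel_defaults_value() {
  grep -oE -- '--protect-kernel-defaults="?(true|false)' "$1" |
    tail -n 1 | tr -d '"' | cut -d= -f2
}

# Example (not run here), with LOG set to the kubelet log path used above:
# [ "$(protect_kernel_defaults_value "$LOG")" = "false" ] && echo "flag is off"
```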

When second node is added console gets flooded

Version:
rke2 v0.0.1-alpha.4
Ubuntu 20.04

Issue:
The console gets flooded with the messages below. The node is added successfully.

To reproduce:

  1. Install node1 rke2 server
  2. Join node2 passing server and token

Additional info:

ERRO[1700] Failed to connect to proxy                    error="dial tcp 172.31.33.122:9345: connect: connection refused"
ERRO[1700] Remotedialer proxy error                      error="dial tcp 172.31.33.122:9345: connect: connection refused"
INFO[1705] Connecting to proxy                           url="wss://172.31.33.122:9345/v1-rke2/connect"
ERRO[1705] Failed to connect to proxy                    error="dial tcp 172.31.33.122:9345: connect: connection refused"
ERRO[1705] Remotedialer proxy error                      error="dial tcp 172.31.33.122:9345: connect: connection refused"
INFO[1710] Connecting to proxy                           url="wss://172.31.33.122:9345/v1-rke2/connect"

RPM Installer

Based on recent discussions with the Rancher Federal team and Will, a full RPM installer is a must for MVP.

If a supplemental RPM is needed, such as for SELinux policy, this is okay. In the best case, one RPM does everything (is this possible?).

This is waiting on internal eio issue #36 to be completed.

Stress Testing RKE2

We should do some load/stress testing of RKE2 as there may be a performance impact due to implemented crypto. It would be a good idea to do the following:

  • Test with upstream tests (similar to what Hussein has done in the past with K3s)
  • Test with Rancher with many clusters (harsher test, similar to what Dan has done in the past)

While testing, consider that we may see a performance impact compared to the k3s tests we have done. We should test when an alpha is available and try to complete this within a couple of weeks, well before the beta release.

nginx-ingress-controller service is in pending state

Version:

rke2 v0.0.1-alpha.4

Issue:
nginx-ingress-controller service is in pending state. Since we don't have servicelb, it should not be expected to be of type LoadBalancer.

kube-system   nginx-ingress-controller        LoadBalancer   10.43.5.161     <pending>     80:30782/TCP,443:30488/TCP   4h32m
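
To spot every service in this state at once, the service listing can be filtered on its TYPE and EXTERNAL-IP columns. A small sketch, assuming the default column layout of kubectl get svc -A; `pending_loadbalancers` is a hypothetical helper.

```shell
# pending_loadbalancers: reads `kubectl get svc -A --no-headers` output on stdin
# and prints namespace/name for LoadBalancer services without an external IP.
pending_loadbalancers() {
  awk '$3 == "LoadBalancer" && $5 == "<pending>" { print $1 "/" $2 }'
}

# Example (not run here):
# kubectl get svc -A --no-headers | pending_loadbalancers
```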

Go FIPS Support

Review Go's use of BoringCrypto. Determine what needs to be done to get a FIPS-compliant Go build going.

RHEL8 Support

This is a high-level task that encompasses the work required to support RHEL8. It expands on work done as part of #2.

Autodetect binaries; if they are not available in the container, bind-mount to the host w/ chroot.

Test image pull secrets

Based on 6/10/20 call with Rancher Federal team, there was some concern that image pull secrets do not work with containerd. We believe this is not the case but proposed to have QA briefly check this area to verify.

Second master node not able to join the master node

Version:
rke2 v0.0.1-alpha.4

Describe the bug:
Install first node

INSTALL_RKE2_VERSION=v0.0.1-alpha.4 ./install.sh

Join the second master. The node is available but has not joined the master.

INSTALL_RKE2_VERSION=v0.0.1-alpha.4 INSTALL_RKE2_EXEC='server' RKE2_URL='MasterIP:9345' RKE2_TOKEN='<TOKEN>' ./install.sh

Logs:

Jun 30 23:56:22 ip-172-31-1-120 rke2[448714]: time="2020-06-30T23:56:22Z" level=info msg="Shutting down /v1, Kind=Node workers"
Jun 30 23:56:22 ip-172-31-1-120 rke2[448714]: time="2020-06-30T23:56:22Z" level=info msg="Shutting down /v1, Kind=Secret workers"
Jun 30 23:56:22 ip-172-31-1-120 rke2[448714]: time="2020-06-30T23:56:22Z" level=fatal msg="server stopped: http: Server closed"

Support for etcd snapshot and restore

Add support for RKE2 snapshot, backup, and restore via Rancher via CLI.

  • Must also take automatic snapshots periodically
  • On by default
  • Controlled via config args
  • Supports S3 (this capability is needed, similar to RKE1); not needed for 1.0
  • Restore functionality folded into rke2 --cluster-reset: you restore by triggering a cluster-reset with a restore path arg specified

Test and Verify Etcd Runs as Etcd User

The functionality introduced in PR #56 needs to be validated. This work goes together with the install script changes that add CIS mode.

This can be done by:

./rke2 --profile=cis-1.5 server

If it runs, it thinks it succeeded.

To get the etcd process, run the command below.

ps aux | grep etcd

Check the pod manifest for a security context section that has the etcd user ID and group ID. Those IDs can be cross-checked against the output of cat /etc/passwd | grep etcd. To see the manifest:

cat /var/lib/rancher/rke2/agent/pod-manifests/etcd.yaml
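
The comparison between /etc/passwd and the manifest can be scripted. The sketch below assumes the manifest uses the standard runAsUser field in its security context; `passwd_uid` and `manifest_run_as_user` are hypothetical helper names.

```shell
# passwd_uid NAME [PASSWD_FILE]: prints the numeric uid recorded for NAME.
passwd_uid() {
  awk -F: -v u="$1" '$1 == u { print $3 }' "${2:-/etc/passwd}"
}

# manifest_run_as_user FILE: prints the first runAsUser value found in FILE.
manifest_run_as_user() {
  awk '$1 == "runAsUser:" { print $2; exit }' "$1"
}

# Example (not run here):
# m=/var/lib/rancher/rke2/agent/pod-manifests/etcd.yaml
# [ "$(passwd_uid etcd)" = "$(manifest_run_as_user "$m")" ] && echo "uid matches"
```

The same pattern works for the group by reading field 4 of the passwd entry and the runAsGroup key.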

Support for Rancher Logging v2

Make sure static pods go into the default logging v2 and also the supervisor process log.
Basically, we need to ensure all logs can get into Rancher logging v2.

Config file support for RKE2

Work should start on k3s first, then port this into RKE2. k3s-io/k3s#1505

Add support for flat config file which specifies flags to run binary with. Please reference the K3s issue for details. This issue is for tracking the work to port this over to RKE2

Create Helm chart for Nginx

Effort to get a helm chart for the nginx controller.
Nginx will not be FIPS-compiled. After much research this is a large effort and not feasible for MVP release.

Logs are flooded with TLS handshake error messages.

RKE2 version:
v0.0.1-alpha.4

Describe the issue:

  1. Run rke2 from the binary using the command
    rke2 server 2>&1 &
  2. Notice the logs being flooded with TLS handshake errors from the IP of the load balancer.
I0629 22:43:19.270728    1995 log.go:181] http: TLS handshake error from 172.31.7.168:48004: EOF
I0629 22:43:19.813270    1995 log.go:181] http: TLS handshake error from 172.31.7.168:46580: EOF
I0629 22:43:20.260827    1995 log.go:181] http: TLS handshake error from 172.31.7.168:60068: EOF
I0629 22:43:20.265857    1995 log.go:181] http: TLS handshake error from 172.31.7.168:41756: EOF
I0629 22:43:20.572638    1995 log.go:181] http: TLS handshake error from 172.31.7.168:63126: EOF
I0629 22:43:20.868685    1995 log.go:181] http: TLS handshake error from 172.31.7.168:48299: EOF
I0629 22:43:21.174996    1995 log.go:181] http: TLS handshake error from 172.31.7.168:28564: EOF
I0629 22:43:22.825398    1995 log.go:181] http: TLS handshake error from 172.31.7.168:24861: EOF
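
To confirm that a single source is responsible for the flood, the errors can be grouped by client address. A rough sketch that handles IPv4 addresses only; `handshake_errors_by_ip` is a hypothetical helper.

```shell
# handshake_errors_by_ip: reads log lines on stdin and prints "COUNT IP"
# for every source address that triggered a TLS handshake error.
handshake_errors_by_ip() {
  grep 'TLS handshake error from' |
    sed -E 's/.*TLS handshake error from ([0-9.]+):[0-9]+.*/\1/' |
    sort | uniq -c | awk '{ print $1, $2 }'
}

# Example (not run here):
# journalctl -u rke2-server.service | handshake_errors_by_ip
```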

Validating embedded etcd

RKE2 works only with the embedded etcd driver defined in the k3s repo (https://github.com/rancher/k3s/blob/master/pkg/etcd/etcd.go). The etcd driver is responsible for the following:

  • Registering the driver
  • Starting the controller (responsible for deleting nodes and updating etcd node information)
  • Starting the etcd node with a specific configuration; if the node is joining a cluster, the driver adds the etcd node as a member

Join etcd cluster:

To join the cluster you just need to start a new rke2 server and it will join the cluster automatically.

Remove etcd member

To remove an etcd member, remove the node from the cluster using kubectl:

kubectl delete node <node-name>

Reset cluster

In case of quorum loss, you can reset the cluster with the same data on the server by passing --cluster-reset to rke2. After it resets the cluster, remove the --cluster-reset flag and restart rke2.

Basic Test Scenarios

Test 1 (Join a new node)

  • start with 3 master nodes
  • add a new rke2 server node to cluster

expected

A new node should join the cluster and etcd member should be added

verify

You can exec into any etcd pod running in kube-system and verify using the following command:

etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt member list

Test 2 (Remove a node)

  • start with 3 master nodes
  • remove a node using kubectl delete node

expected

A node should be removed from k8s cluster as well as from etcd cluster as a member

verify

You can verify by exec-ing into any of the remaining etcd pods and running the following command to list the members:

etcdctl --cert /var/lib/rancher/rke2/server/tls/etcd/server-client.crt --key /var/lib/rancher/rke2/server/tls/etcd/server-client.key --endpoints https://127.0.0.1:2379 --cacert /var/lib/rancher/rke2/server/tls/etcd/server-ca.crt member list
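
Tests 1 and 2 run the same long command, so it may help to factor out the certificate paths and count the result. This is a sketch under the assumption that etcdctl prints one member per line in its default output format; `ETCD_TLS_DIR`, `etcd_member_list`, and `count_members` are hypothetical names.

```shell
# Directory holding the etcd client certificates used in the commands above.
ETCD_TLS_DIR=${ETCD_TLS_DIR:-/var/lib/rancher/rke2/server/tls/etcd}

# etcd_member_list: the member list command from the tests, with paths factored out.
etcd_member_list() {
  etcdctl \
    --cert "$ETCD_TLS_DIR/server-client.crt" \
    --key "$ETCD_TLS_DIR/server-client.key" \
    --cacert "$ETCD_TLS_DIR/server-ca.crt" \
    --endpoints https://127.0.0.1:2379 \
    member list
}

# count_members: counts the non-empty lines read from stdin.
count_members() {
  grep -c . || true
}

# Example (not run here): expect 4 after Test 1 and 2 after Test 2.
# etcd_member_list | count_members
```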

Test 3 (Reset cluster)

  • start 3 master nodes
  • shut down two master nodes
  • cluster should lose quorum
  • restart the first node with --cluster-reset
  • remove the failed nodes from the k8s cluster

Expected

cluster should come back up again

Test 4 (Restore functionality)

  • start 3 master nodes
  • shutdown two nodes
  • cluster should lose quorum
  • start the two failed nodes

Expected

cluster should restore quorum and come back up again

At build time scan for non-fips algorithms (or panic)

FIPS 140 does not permit some algorithms to be used. For example, MD5 may not be allowed.

We should determine a solution that either parses and scans through the Go code and alerts on an invalid algorithm, or halts the build process and panics when an invalid algorithm is detected.

This may just involve updating our shim or something to that effect (as we cannot touch the GoBoring library).
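
One very rough way to prototype the scan is a text search over Go import paths. The sketch below is not the approach the project settled on; the deny list (crypto/md5, crypto/rc4, crypto/des) is illustrative only, it catches direct imports but not vendored or transitive use, and `find_disallowed_crypto` is a hypothetical helper.

```shell
# find_disallowed_crypto DIR
# Lists Go files under DIR that import a package from the illustrative deny list.
find_disallowed_crypto() {
  find "$1" -type f -name '*.go' \
    -exec grep -lE '"crypto/(md5|rc4|des)"' {} + | sort
}

# Example build gate (not run here): halt when anything is reported.
# hits=$(find_disallowed_crypto .)
# if [ -n "$hits" ]; then echo "disallowed crypto import in: $hits" >&2; exit 1; fi
```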

Support for other linux distros and versions

RKE2 version:
v0.0.1-alpha.4

Node OS:
Ubuntu 18.04:

Describe the issue:
rke2 installation fails on Ubuntu 18.04.

Logs:

Using the rke2 binary, I now see this error: rke2: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by rke2)
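
The error means the binary was linked against a newer glibc than the host provides. A pre-flight check along the following lines could surface that before installation; the 2.28 threshold is taken from the error message, and `version_ge` and `host_glibc_version` are hypothetical helpers.

```shell
# version_ge A B: succeeds when dotted version A is greater than or equal to B.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# host_glibc_version: prints the glibc version of the current host.
host_glibc_version() {
  getconf GNU_LIBC_VERSION | awk '{ print $2 }'
}

# Example (not run here):
# version_ge "$(host_glibc_version)" 2.28 || echo "glibc too old for this rke2 build"
```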

Test INSTALL_RKE2_CIS_MODE Functionality in Install Script

Functionality introduced in PR #58 needs to be tested and verified.

Verify the etcd user has been created:

grep etcd /etc/passwd

To verify that the kernel parameters have been updated, run the commands below:

sysctl vm.panic_on_oom
sysctl kernel.panic
sysctl kernel.panic_on_oops
sysctl kernel.keys.root_maxbytes

Expected values:

vm.panic_on_oom=0
kernel.panic=10
kernel.panic_on_oops=1
kernel.keys.root_maxbytes=25000000
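
The four checks can be folded into one helper that compares each live value with its expected value. A minimal sketch; `check_sysctl` and the `SYSCTL_CMD` override are hypothetical, and the override exists only so the function can be tried without touching real kernel settings.

```shell
# check_sysctl KEY EXPECTED: compares the live value of KEY with EXPECTED.
# SYSCTL_CMD may name a replacement command or function for dry runs.
check_sysctl() {
  actual=$("${SYSCTL_CMD:-sysctl}" -n "$1")
  if [ "$actual" = "$2" ]; then
    echo "ok   $1=$actual"
  else
    echo "FAIL $1=$actual (expected $2)"
    return 1
  fi
}

# Example (not run here), using the expected values listed above:
# check_sysctl vm.panic_on_oom 0
# check_sysctl kernel.panic 10
# check_sysctl kernel.panic_on_oops 1
# check_sysctl kernel.keys.root_maxbytes 25000000
```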

Certified Image Pipeline

Epic covering the Drone pipeline and image builds, including the following:

  • GoBoring compilation for FIPS compliance
  • UBI7 base image
  • Vulnerability scanning via Trivy
  • Support for multiple architectures
  • Leveraging multi-stage builds with common build image base
  • Template project for creating new certified image pipelines

Note: Rancher Federal team to take this and STIG these images. Then, via their own private repo/pipeline publish the STIG'ed images.

Related K3s issue: k3s-io/k3s#1503

Autodetect binaries and bind-mount if needed

Autodetect binaries; if they are not available in the container, bind-mount to the host w/ chroot.

Rationale:
UBI8 seems to be missing many common binaries needed to get a kubernetes cluster up and running. This is our workaround for this.

rke2-uninstall.sh does not remove etcd user

Version:
version v0.0.1-alpha.6

Issue:
etcd user persists after running rke2-uninstall.sh, thus failing re-install of rke2 on the same node.

rke2 -v
-bash: /usr/local/bin/rke2: No such file or directory
cat /etc/passwd|grep etcd
etcd:x:997:997:ETCD Service User:/var/lib/rancher/rke2:/usr/sbin/nologin

Support Install Feature of using commit id

Installation using commit id fails at download

INSTALL_RKE2_COMMIT=1dd8d99d86daac97b2cf2a060288c86e0059e7b6 ./install.sh 
[INFO]  using commit 1dd8d99d86daac97b2cf2a060288c86e0059e7b6 as release
[INFO]  downloading hash https://storage.googleapis.com/rke2-ci-builds/rke2-1dd8d99d86daac97b2cf2a060288c86e0059e7b6.sha256sum
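
Since the installer stops right after the checksum download line, one way to narrow this down is to probe that URL directly. The sketch only builds the URL seen in the log; `ci_checksum_url` is a hypothetical helper and the curl probe in the comment is a suggestion, not what install.sh does.

```shell
# ci_checksum_url COMMIT: URL of the checksum file the installer tried to fetch.
ci_checksum_url() {
  echo "https://storage.googleapis.com/rke2-ci-builds/rke2-$1.sha256sum"
}

# Example (not run here): -f makes curl fail on HTTP errors, -I sends a HEAD request.
# curl -sfI "$(ci_checksum_url 1dd8d99d86daac97b2cf2a060288c86e0059e7b6)" >/dev/null \
#   || echo "checksum artifact missing for this commit"
```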
root@ip-172-31-4-195:~# 

A health care use case: feedback on RKE

Hello!

I'm excited about the future of RKE, though the current version does not yet fit into our use case.
I work at a French state hospital called APHP, short for "Assistance Publique - Hôpitaux de Paris".

We are a little light on system administration and development resources, so the ease of use of RKE was a great fit for us. Support for air-gapped installations, which is also really important for us, is there, so that's good too. We run behind an HTTP proxy for anything that goes outside, and our base systems run CentOS, FYI.

The current project is about making environments available for remote computation such as JupyterHub with strict confidentiality requirements.

We have users with different rights to a big data warehouse, so they must not step onto each other's permissions and access data they're not allowed to.

So we determined that Kubernetes orchestration mechanisms were great for resource management and fast to spin up and down as well, but the isolation between users in their own pods is insufficient.

We observed that the Kubernetes ecosystem is currently evolving towards better security, mainly through efforts driven by Red Hat, Google and Intel/OpenStack.

  • Red Hat is pushing unprivileged containers with podman, buildah, etc.
  • Google is pushing gVisor.
  • Intel/OpenStack is pushing Kata Containers.

And where's RKE in all this? Well RKE depends on Docker so it can't use gVisor, Kata Containers or any other custom runtime such as cri-o.

So here's me saying that I'd really love if RKE2 could support non-Docker deployments while keeping the ease of use!

Thanks a lot for the awesome work!

By the way, I wish we could fund you in some way. The process that leads to such a thing is complicated, but if I have a working prototype it'll be easier for me to justify it to the people who can do it. Again, it's not your responsibility to drive us to a working prototype, but know that with my rather limited knowledge of Golang (I'm a fast learner, though), I'm happy to help in any way I can.

And by the way x2, I'm currently experimenting with k3s with multi master HA deployments with dqlite but it's not quite there yet. I could also get Kata Containers running with k3s so that's good!

Leo

Support for Centos and RHEL

Node OS: Centos, RHEL

Issue:
RPMs are not available, as mentioned in issue #49.

Additional info:

rpm -i https://rpm.rancher.io/rke2-selinux-0.1.1-rc1.el7.noarch.rpm
curl: (22) The requested URL returned error: 404 Not Found
error: skipping https://rpm.rancher.io/rke2-selinux-0.1.1-rc1.el7.noarch.rpm - transfer failed
