remche / terraform-openstack-rke2

Deploy Kubernetes on OpenStack with RKE2

License: Mozilla Public License 2.0

HCL 85.74% Smarty 14.26%
kubernetes terraform openstack kubernetes-deployment terraform-module rke2 rancher

terraform-openstack-rke2's Introduction

terraform-openstack-rke2

Terraform module to deploy Kubernetes with RKE2 on OpenStack.

Unlike the RKE version, this module is not opinionated and lets you configure everything via the RKE2 configuration file.

Prerequisites

Features

  • HA controlplane
  • Multiple agent node pools
  • Upgrade mechanism

Examples

See examples directory.

Documentation

See USAGE.md for all available options.

Keypair

You can either specify an SSH key file from which a new keypair is generated, via ssh_key_file (default), or reuse an existing keypair via ssh_keypair_name.

Warning

The default configuration tries to use the SSH agent for SSH connections to the nodes. Add use_ssh_agent = false if you don't use one.
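
As an illustration, the two keypair options could be written as follows. This is a minimal sketch, not a complete configuration: the key path and keypair name are hypothetical, and the other module arguments are omitted (see USAGE.md).

```hcl
module "controlplane" {
  source = "remche/rke2/openstack"
  # ... other arguments omitted, see USAGE.md

  # Option A (default behaviour): generate a new keypair from a local key file.
  ssh_key_file = "~/.ssh/id_rsa" # hypothetical path

  # Option B: reuse a keypair that already exists in OpenStack.
  # ssh_keypair_name = "my-existing-keypair" # hypothetical name

  # Only needed when no SSH agent is available for node connections.
  use_ssh_agent = false
}
```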

Secgroup

You can define your own rules (e.g. limiting ports 22 and 6443 to an admin box).

secgroup_rules      = [ { "source" = "x.x.x.x", "protocol" = "tcp", "port" = 22 },
                        { "source" = "x.x.x.x", "protocol" = "tcp", "port" = 6443 },
                        { "source" = "0.0.0.0/0", "protocol" = "tcp", "port" = 80 },
                        { "source" = "0.0.0.0/0", "protocol" = "tcp", "port" = 443}
                      ]

Nodes affinity

You can set the affinity policy for the controlplane and for each node pool via server_group_affinity. The default is soft-anti-affinity.

Warning

soft-anti-affinity and soft-affinity need Compute service API 2.15 or above.
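
For clouds whose Compute API is older than 2.15, one option is to fall back to a hard policy. The sketch below assumes the standard OpenStack server group policy names (affinity, anti-affinity, soft-affinity, soft-anti-affinity) are accepted as values; the other module arguments are omitted.

```hcl
module "controlplane" {
  source = "remche/rke2/openstack"
  # ... other arguments omitted, see USAGE.md

  # Hard anti-affinity: scheduling fails if the servers cannot be placed
  # on distinct hosts, unlike the default soft-anti-affinity.
  server_group_affinity = "anti-affinity"
}
```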

Boot from volume

Some providers require instances to boot from an attached boot volume instead of the Nova ephemeral volume. To enable this feature, set the following variables in the configuration. You can use different values for server and agent nodes.

boot_from_volume = true
boot_volume_size = 20
boot_volume_type = "rbd-1"

Kubernetes version

You can specify the RKE2 version with the rke2_version variable. Refer to the RKE2 supported versions.

Upgrade by setting the target version via rke2_version and do_upgrade = true. It will upgrade the nodes one-by-one, server nodes first.

Warning

In-place upgrade mechanism is not battle-tested and relies on Terraform provisioners.
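
A minimal sketch of an upgrade run, using only the two variables named above. The version string is an example value, and the other module arguments are omitted.

```hcl
module "controlplane" {
  source = "remche/rke2/openstack"
  # ... other arguments omitted, see USAGE.md

  # Target version to move the cluster to (example value).
  rke2_version = "v1.24.7+rke2r1"

  # Triggers the in-place upgrade on the next apply.
  do_upgrade = true
}
```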

Addons

Set the manifests_path variable to point to the directory containing your manifests and HelmChart resources (see the JupyterHub example).

If you need a template step for your manifests, you can use manifests_gzb64 (see cinder-csi-plugin example).

Warning

Modifications made to manifests after cluster deployment won't have any effect.
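
The two addon variables could be combined roughly as follows. This is a sketch under assumptions: the template file name and the auth_url variable are hypothetical, and the map shape of manifests_gzb64 (manifest name to gzip+base64 content) is an assumption to check against USAGE.md and the cinder-csi-plugin example.

```hcl
variable "auth_url" {
  type = string # hypothetical value injected into the template
}

locals {
  # Render a hypothetical template, then gzip and base64 encode the result.
  cinder_manifest_gzb64 = base64gzip(templatefile("${path.module}/cinder-csi.yaml.tpl", {
    auth_url = var.auth_url
  }))
}

module "controlplane" {
  source = "remche/rke2/openstack"
  # ... other arguments omitted, see USAGE.md

  # Static manifests and HelmChart resources are read from this directory.
  manifests_path = "./manifests"

  # Assumed shape: map of manifest name to gzip+base64 encoded content.
  manifests_gzb64 = {
    "cinder-csi-plugin" = local.cinder_manifest_gzb64
  }
}
```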

Downscale

You need to manually drain and remove the node before downscaling a node pool.

You can tell the module to output kubernetes config by setting output_kubernetes_config = true.

Warning

Interpolating provider variables from module output is not the recommended way to achieve integration. See here and here.

Using a data source is recommended.

(Not recommended) You can use this module's output directly to populate the Terraform Kubernetes provider:

provider "kubernetes" {
  host                   = module.controlplane.kubernetes_config.host
  client_certificate     = module.controlplane.kubernetes_config.client_certificate
  client_key             = module.controlplane.kubernetes_config.client_key
  cluster_ca_certificate = module.controlplane.kubernetes_config.cluster_ca_certificate
}

The recommended way needs two apply operations and a properly configured terraform_remote_state data source:

provider "kubernetes" {
  host                   = data.terraform_remote_state.rke2.outputs.kubernetes_config.host
  client_certificate     = data.terraform_remote_state.rke2.outputs.kubernetes_config.client_certificate
  client_key             = data.terraform_remote_state.rke2.outputs.kubernetes_config.client_key
  cluster_ca_certificate = data.terraform_remote_state.rke2.outputs.kubernetes_config.cluster_ca_certificate
}
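
The two-apply layout could be sketched like this. It assumes the cluster configuration keeps its state in a local backend; the file names and the state path are hypothetical, and a remote backend would need its own backend and config values.

```hcl
# --- cluster/main.tf (first apply) ---
module "controlplane" {
  source = "remche/rke2/openstack"
  # ... other arguments omitted, see USAGE.md
  output_kubernetes_config = true
}

# Re-export the module output so that other configurations can read it
# from this configuration's state.
output "kubernetes_config" {
  value     = module.controlplane.kubernetes_config
  sensitive = true
}

# --- workloads/main.tf (second apply, separate configuration) ---
data "terraform_remote_state" "rke2" {
  backend = "local"
  config = {
    path = "../cluster/terraform.tfstate" # hypothetical path
  }
}
```

Splitting the configurations this way means the provider only reads values that already exist in state, which avoids configuring a provider from values that are unknown during the first plan.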

Node lifecycle Assumptions

Note

Changes to certain module arguments will intentionally not cause the recreation of instances.

To provide users with a better and more manageable experience, several arguments have been included in the instance's ignore_changes lifecycle. You must manually taint the instance to force the recreation of the resource:

terraform taint 'module.controlplane.module.server.openstack_compute_instance_v2.instance'

Proxy

You can specify a proxy via the proxy_url variable. Private address ranges are automatically excluded; you can add more addresses via the no_proxy variable. You might want to add your organization's DNS domain (that of the Keystone OpenStack API endpoint).
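
A minimal sketch of a proxy setup. The proxy URL and domain are placeholders, and passing no_proxy as a list of strings is an assumption; check USAGE.md for the actual type.

```hcl
module "controlplane" {
  source = "remche/rke2/openstack"
  # ... other arguments omitted, see USAGE.md

  # Placeholder proxy endpoint.
  proxy_url = "http://proxy.example.org:3128"

  # Extra exclusions on top of the private ranges the module already adds,
  # e.g. the domain of the Keystone API endpoint (assumed list type).
  no_proxy = [".example.org"]
}
```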

terraform-openstack-rke2's Issues

Sensitive outputs?

Very nice work! Question though, why are several of the outputs configured as sensitive? IPs and IDs aren't typically thought of as secrets.

floating_ip, internal_ip, subnet_id

Support load balancers to expose API

In our environment, IPs are expensive and it's a lot of effort to get DNS RR going. Because of this, we would like to create a loadbalancer to expose the K8s API instead of assigning lots of floating IPs.

This is kind of related to #53, although we don't want to reuse something existing, but rather let this module handle it.

(For HTTP/HTTPS, we will create a separate loadbalancer with an octavia controller to expose the actual services. This can happen after setting up the RKE cluster, so doesn't need to be a part of this, as you show in your examples.)

Would you like to see a PR and accept a contribution for this? (I'm already aware that you emphasize backward compatibility, so I will make sure of that if possible.)

Outputs

The module should optionally output variables to feed the Terraform Kubernetes provider.

Ability to use existing network

Hello,

I'm looking into using your terraform provider, however I'm actually looking for a way to use an existing network, complete with security groups and subnets. I can't really seem to find a way to do it without forking your setup.

Is this something you already can do, or something you have thought about adding?

Option for registries.yaml

We should be able to provide a registries.yaml file.
Using a templated (b64+gz) string is probably the best way as it might contain sensitive values.

config.toml template and base64

Hi,

First of all, thanks to your project, we are able to deploy nodes with rke2 on openstack.

However, FYI, we had some issues using terraform and terraform-openstack-rke2.

Indeed, the main.tf in the root folder does not contain containerd_config_file = filebase64(var.containerd_config_file), but containerd_config_file = var.containerd_config_file.
Is this the intended behaviour?
Should we instead pass the base64 value directly as a string in variables.tf?

Thanks,

Best regards,

Unregister node from RKE2 after agent deletion

Currently RKE2 keeps the agent as NotReady:

NAME                     STATUS     ROLES                       AGE   VERSION
rke-cluster-blue-001     Ready      <none>                      65m   v1.21.5+rke2r2
rke-cluster-green-001    NotReady   <none>                      65m   v1.21.5+rke2r2

Local Exec does not use specified ssh key

I'm testing the minimal example to deploy a minimal cluster on OpenStack.
The image I'm using is CentOS-8-GenericCloud-8.2.2004-20200611.2.x86_64, which requires logging in as the "centos" user and using sudo for most things.

When launching the setup I've got the following error:

│ Error: local-exec provisioner error

│ with module.controlplane.null_resource.write_kubeconfig[0],
│ on .terraform/modules/controlplane/kubernetes.tf line 15, in resource "null_resource" "write_kubeconfig":
│ 15: provisioner "local-exec" {

│ Error running command 'scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null centos@...:/etc/rancher/rke2/rke2-remote.yaml rke2.yaml': exit status 1.
│ Output: Warning: Permanently added '...' (ECDSA) to the list of known hosts.
│ Load key "/root/.ssh/id_rsa": invalid format

The script tries to load the private key from the default location, although I've specified a different location and I'm not using any SSH agent.

I think the test in https://github.com/remche/terraform-openstack-rke2/blob/master/kubernetes.tf#L16 should be inverted...

Here is my file:

module "controlplane" {
  source           = "remche/rke2/openstack"
  cluster_name     = var.cluster_name
  dns_servers      = var.dns_servers
  write_kubeconfig = true

  # image_name = "ubuntu-20.04-focal-x86_64"
  image_name       = "CentOS-8-GenericCloud-8.2.2004-20200611.2.x86_64"
  flavor_name      = "m4.medium"
  public_net_name  = "fips_pool1"
  ssh_key_file     = "/data/id_rsa"
  boot_volume_size = 20
  boot_from_volume = true
  use_ssh_agent    = false
  system_user      = "centos"
}

Cinder CSI won't start

The Cinder CSI example won't start out of the box:

$ kubectl get all -A | grep csi
kube-system   pod/helm-install-cinder-csi-plugin-pcsbm                     0/1     Completed          0               11m
kube-system   pod/openstack-cinder-csi-controllerplugin-856876dd97-tp6sx   5/6     CrashLoopBackOff   5 (79s ago)     10m
kube-system   pod/openstack-cinder-csi-nodeplugin-bc8ql                    2/3     CrashLoopBackOff   6 (61s ago)     10m
kube-system   pod/openstack-cinder-csi-nodeplugin-cdtsq                    2/3     CrashLoopBackOff   6 (81s ago)     10m
kube-system   pod/openstack-cinder-csi-nodeplugin-r2dp9                    2/3     CrashLoopBackOff   6 (41s ago)     10m
kube-system   pod/openstack-cinder-csi-nodeplugin-vc5w8                    2/3     CrashLoopBackOff   6 (81s ago)     10m
kube-system   daemonset.apps/openstack-cinder-csi-nodeplugin   4         4         0       4            0           <none>                   10m
kube-system   deployment.apps/openstack-cinder-csi-controllerplugin   0/1     1            0           10m
kube-system   replicaset.apps/openstack-cinder-csi-controllerplugin-856876dd97   1         1         0       10m
kube-system   job.batch/helm-install-cinder-csi-plugin                  1/1           76s        12m

Deployed with:

  • openstack-cinder-csi-2.27.1
  • rke2 version v1.25.11+rke2r1

"module.controlplane.null_resource.write_kubeconfig[0] (remote-exec): Waiting for rke2 to start" keeps looping for ever

I'm testing the minimal example to deploy a minimal cluster on OpenStack.
The image I'm using is CentOS-8-GenericCloud-8.2.2004-20200611.2.x86_64, which requires logging in as the "centos" user and using sudo for most things.

The setup seems to loop because the code at https://github.com/remche/terraform-openstack-rke2/blob/master/kubernetes.tf#L12 does not use sudo and thus cannot see /etc/rancher/rke2/rke2-remote.yaml, which has indeed been generated.

Here is my modified main.tf:

module "controlplane" {
  source           = "remche/rke2/openstack"
  cluster_name     = var.cluster_name
  dns_servers      = var.dns_servers
  write_kubeconfig = true

  # image_name = "ubuntu-20.04-focal-x86_64"
  image_name       = "CentOS-8-GenericCloud-8.2.2004-20200611.2.x86_64"
  flavor_name      = "m4.medium"
  public_net_name  = "fips_pool1"
  ssh_key_file     = "/data/id_rsa"
  boot_volume_size = 20
  boot_from_volume = true
  use_ssh_agent    = false
  system_user      = "centos"
}

Install script

Install script should not be run if :
1/ rke2 is installed
2/ rke2_version metadata is not set

Support creation of separated bastion host

Hi there!

First: thank you very much for putting this module online and maintaining it. Open Source is hard. 😺

In our environment, we would like a separate bastion host: we want to add more services alongside RKE, we prefer a separation of concerns, and we don't want so many SSH servers exposed.

I am hacking on an addition to this project which would allow the creation of a distinct bastion node before the RKE servers are created (I couldn't think of a way to do this outside the module, as the network information is not yet available before the server nodes are created).

Would you like to see a PR and accept a contribution?

document how upgrades work

This project is great!

Now we're trying to do some upgrades. Before applying this to our production cluster I'm trying to see how it works, but there is virtually no information about it, other than that there is a do_upgrade flag.

Setting the flag did trigger a shell execution:

Provisioning with 'local-exec'.
(local-exec): Executing: ["/bin/sh" "-c" "touch ./.terraform/tmp/rke2/upgrade-a25fdd70-8f4d-4b2e-8336-b0cb75389ab0-"]

but it does not appear to have done much.

The best test I have is to upgrade a six-week-old test cluster running 1.24.4+rke2r1.
I expect it to move to v1.24.7+rke2r1 (which is the version of a cluster that is a couple of days old).

cloud-init.yml.tpl: operator precedence wrong on JQ installation magic.

The construct
which jq 2>&1 > /dev/null || sudo curl -sfL $JQ_URL -o $JQ_BIN && sudo chmod +x $JQ_BIN
should correctly be
which jq 2>&1 > /dev/null || { sudo curl -sfL $JQ_URL -o $JQ_BIN && sudo chmod +x $JQ_BIN ; }

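In the shell, || and && have equal precedence and associate left to right, so the original parses as (which jq || curl) && chmod. If the "jq" binary is already available, "sudo chmod ..." is therefore still executed and fails.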
