projectsyn / component-rook-ceph

Commodore component to manage Rook.io rook-ceph operator, Ceph cluster, and CSI drivers

License: BSD 3-Clause "New" or "Revised" License

Makefile 12.48% Jsonnet 85.03% Go 2.49%
commodore-component rook rook-ceph csi-driver storage

component-rook-ceph's Introduction

Commodore Component: Rook Ceph

This is a Commodore Component for Rook Ceph.

This repository is part of Project Syn. For documentation on Project Syn and this component, see syn.tools.

Documentation

The rendered documentation for this component is available on the Commodore Components Hub.

Documentation for this component is written using Asciidoc and Antora. It can be found in the docs folder. We use the Divio documentation structure to organize our documentation.

Run the make docs-serve command in the root of the project, and then browse to http://localhost:2020 to see a preview of the current state of the documentation.

After writing the documentation, please use the make docs-vale command and correct any warnings raised by the tool.

Contributing and license

This library is licensed under BSD-3-Clause. For information about how to contribute, see CONTRIBUTING.

component-rook-ceph's People

Contributors

arska, bastjan, debakelorakel, glrf, haasad, kidswiss, megian, renovate[bot], simu, vshn-renovate


component-rook-ceph's Issues

Properly expose `storageClassDeviceSet` in component parameters

Context

Currently, the component offers parameters to configure some values of the single default storageClassDeviceSet entry in the CephCluster CR spec, cf.

ceph_cluster:
  name: cluster
  namespace: syn-rook-ceph-${rook_ceph:ceph_cluster:name}
  node_count: 3
  block_storage_class: localblock
  # Configure volume size here, if block storage PVs are provisioned
  # dynamically
  block_volume_size: 1
  # set to true if backing storage is SSD
  tune_fast_device_class: false
  # Control placement of osd pods.
  osd_placement: {}
  # Mark OSDs as portable (doesn't bind OSD to a host)
  osd_portable: false
and
- name: ${rook_ceph:ceph_cluster:name}
  count: ${rook_ceph:ceph_cluster:node_count}
  volumeClaimTemplates:
    - spec:
        storageClassName: ${rook_ceph:ceph_cluster:block_storage_class}
        volumeMode: Block
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: ${rook_ceph:ceph_cluster:block_volume_size}
  encrypted: true
  tuneFastDeviceClass: ${rook_ceph:ceph_cluster:tune_fast_device_class}
  placement: ${rook_ceph:ceph_cluster:osd_placement}
  portable: ${rook_ceph:ceph_cluster:osd_portable}

We should refactor this config so that the component simply provides a parameter ceph_cluster.storageClassDeviceSet which is used verbatim as the first entry of the cephClusterSpec value storage.storageClassDeviceSets. This would make it much easier to add additional configuration through the config hierarchy, e.g. annotations on the PVC template to force Rook to use a specific OSD device class (cf. https://github.com/rook/rook/blob/1db2ecf99b77394258c458ed6782ad26ebe8255b/deploy/examples/cluster-on-pvc.yaml#L123-L124)

In fact, we should probably also handle the pvcTemplates for the first storageClassDeviceSet as a separate parameter, since each storageClassDeviceSet has an array of pvcTemplates which is also not adjustable in the hierarchy.
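
A minimal sketch of what the refactored parameters could look like (ceph_cluster.storageClassDeviceSet and ceph_cluster.pvcTemplates are proposed names, not existing component parameters; the values shown mirror the current defaults):

ceph_cluster:
  # used verbatim as the first entry of storage.storageClassDeviceSets
  storageClassDeviceSet:
    name: ${rook_ceph:ceph_cluster:name}
    count: ${rook_ceph:ceph_cluster:node_count}
    encrypted: true
    tuneFastDeviceClass: false
    portable: false
    placement: {}
  # rendered into the device set's volumeClaimTemplates, so that e.g. PVC
  # annotations for a specific OSD device class can be added in the hierarchy
  pvcTemplates:
    - metadata:
        annotations: {}
      spec:
        storageClassName: localblock
        volumeMode: Block
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1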

Alternatives

  • Continue adding fields to ceph_cluster which allow users to set individual fields in the first entry of storage.storageClassDeviceSets.
  • Completely refactor the parameters to generate the list of storageClassDeviceSets from a map in the component parameters

Rook creating a mon canary deployment led to duplicate mon endpoint entries

The Rook operator somehow created a monitor canary deployment after the node was drained. This wasn't prevented by the following config:

$ kubectl -n syn-rook-ceph-cluster get cephcluster cluster -o jsonpath={.spec.mon.allowMultiplePerNode}
false

Because we run the monitors on the host network rather than the Kubernetes SDN, the monitor ports were already occupied and the mon canary deployment got stuck in the Pending state. As a result a new monitor was about to be deployed, and Rook added the monitor endpoint twice to the config map.

kubectl -n syn-rook-ceph-cluster get configmap rook-ceph-mon-endpoints -o jsonpath={.data.csi-cluster-config-json}
[{"clusterID":"syn-rook-ceph-cluster","monitors":["172.18.200.162:6789","172.18.200.146:6789","172.18.200.132:6789","172.18.200.132:6789"],"namespace":""}]

This caused Ceph components to crash, because the number of monitor endpoints was wrong:

FAILED ceph_assert(addr_mons.count(a) == 0)

Removing the duplicate IPs from the configmap rook-ceph-mon-endpoints and updating maxMonId to n-1 resolved the issue, and the components started without problems.

Edit (@bastjan): the identifier-to-id mapping is <idchar> - 'a', i.e. mon d has id 3.
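
A hedged outline of the manual cleanup described above (the key names come from the rook-ceph-mon-endpoints configmap; verify against the actual cluster state before editing):

$ kubectl -n syn-rook-ceph-cluster edit configmap rook-ceph-mon-endpoints
# - drop the duplicated IP from both the "data" and the "csi-cluster-config-json" keys
# - set maxMonId to n-1, where n is the number of distinct mons that remain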

This ended up in:

Ceph stays in Warning state due to 'X OSDs or CRUSH {nodes, device-classes} have NOOUT flags set' 

This couldn't be reproduced with Rook v1.9.10.

Potential related issues:

Steps to Reproduce the Problem

  1. Use Rook version 1.6.5
  2. Drain a node

Resolved in the component version v3.4.1 #88.

Reduce or remove MON and OSD alerts during maintenance

Context

Maintenance causes MONs and OSDs to be restarted.
This is a regular process and not an issue, as long as only an acceptable number of components are down at the same time.

Currently we get P1 alerts for MONs and OSDs that are down because of the regular maintenance process.
This misleads the operator, because the alert is not actionable and recovers automatically as the maintenance progresses.

Implementation idea

  • Relax the alerts so they're P3 rather than P1. This still causes noise.
  • Relax the time a MON or OSD can be down before an alert fires. This increases the delay in a real incident.
  • Find a way to only count MON and OSD downtime when more instances are down than the minimum needed to keep the service healthy (see the sketch below).
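
A hedged sketch of the third idea as a Prometheus alert rule that only fires once the mon quorum is actually at risk (the expression, duration, and severity are illustrative, not the component's current rules):

- alert: CephMonQuorumAtRisk
  # With N mons, quorum needs floor(N/2) + 1 members; fire only once the
  # number of mons still in quorum has dropped to that minimum or below.
  expr: count(ceph_mon_quorum_status == 1) <= floor(count(ceph_mon_metadata) / 2) + 1
  for: 15m
  labels:
    severity: warning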

Reevaluate default resource requests

Context

We've seen that the Ceph cluster doesn't really use the resources we request during normal operation, requiring us to provision relatively large storage nodes, which are then mostly idle. This is not great for financial reasons. We should reevaluate the component's default resource requests and limits based on actual usage numbers from production environments.

Since Rook now provides a rook-ceph-cluster Helm chart which defines default resource requests for the Ceph components, we should also check whether those defaults are suitable for us. See https://github.com/rook/rook/blob/1ae867049b49079b76696e68ee9b8f30216528bd/deploy/charts/rook-ceph-cluster/values.yaml#L233-L289
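
For reference, the chart structures these requests per daemon under cephClusterSpec.resources; a hedged example of what a reduced override could look like in that shape (the numbers below are placeholders, not recommendations):

cephClusterSpec:
  resources:
    mon:
      requests:
        cpu: 500m
        memory: 1Gi
    osd:
      requests:
        cpu: "1"
        memory: 4Gi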

Alternatives

Don't do anything.

Custom labels for rook-ceph alerts

Context

To identify OnCall-relevant alerts, we use alert labels which are then used for Opsgenie alert routing. To route rook-ceph alerts to OnCall, the component needs to support custom alert labels.
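
A hypothetical illustration of the desired end state: a rook-ceph alert rule carrying an extra routing label (the label name syn_team, its value, and the exact expression are placeholders, not existing component configuration):

- alert: CephMonDown
  expr: ceph_health_detail{name="MON_DOWN"} == 1
  labels:
    severity: critical
    # custom label consumed by the Opsgenie routing rules
    syn_team: oncall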

Upgrade to Rook v1.9

Context

Upgrade the component to use Rook v1.9 by default. Please note that Rook has updated the bundled Prometheus alerts for 1.9, and we'll need to ensure the runbooks bundled with the component are updated as part of this issue. With the upgrade to Rook 1.9, we should also upgrade the default ceph-csi version to v3.6 as that's the default version which ships with Rook 1.9.

Out of scope

  • Upgrade to Ceph v17

Acceptance criteria

  • Component installs Rook 1.9 by default
  • Alert runbooks are upgraded to match the new set of alerts shipped by Rook

Improve metrics scraping configuration for CephCSI drivers

Summary

Review and improve metrics scraping config for Ceph CSI drivers

Background

The current implementation for the metrics scraping config for the CSI drivers hasn't been reviewed or updated for Rook (and associated CephCSI) upgrades. Rook 1.9 / CephCSI 3.6 moved one of the metrics endpoints to an optional side-car container which can be enabled via Helm value enableLiveness of the Rook operator Helm chart (cf. #90). Additionally, it appears that the CephFS grpc metrics endpoint is deprecated, cf.

$ kubectl -n syn-rook-ceph-operator logs csi-cephfsplugin-zg9dc csi-cephfsplugin 
W1005 09:09:33.674333 1754966 driver.go:150] EnableGRPCMetrics is deprecated
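
A hedged sketch of enabling the liveness sidecar mentioned above through the Rook operator Helm chart values (assuming the component passes these values through to the chart):

csi:
  # deploy the liveness metrics sidecar alongside the CSI plugin pods
  enableLiveness: true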

Goal

We understand the available metrics for the Ceph CSI drivers and provide an appropriate default config in the component.

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Location: renovate.json
Error type: The renovate configuration file contains some invalid settings
Message: Invalid configuration option: packageRules[0].customChangelogUrl

Default Kubelet and Ceph available-disk thresholds do not match

By default the Kubelet ships with imageGCHighThresholdPercent (default 85) and imageGCLowThresholdPercent (default 80). This means the image garbage collector only starts dropping images once less than 15% of the disk is free.

Ceph has a default threshold of 30% free disk space, below which the monitors start complaining about not enough disk space with HEALTH_WARN.

This leads to flapping alerts, because the Ceph alert is triggered while the Kubelet hasn't started the cleanup yet.

Steps to Reproduce the Problem

  1. Install the component rook-ceph
  2. Wait until a Kubernetes node uses more than 70% of its disk for container images

Actual Behavior

The Ceph mon starts to complain with a HEALTH_WARN.

Expected Behavior

The Ceph mon never complains, because the Kubelet image garbage collector cleans up before the Ceph threshold is reached.
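
A minimal KubeletConfiguration sketch that would align the two thresholds, assuming image GC should start before Ceph's 30%-free warning (the exact percentages are placeholders to be tuned):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# start image GC at 65% disk usage (35% free), i.e. before Ceph warns at 30% free
imageGCHighThresholdPercent: 65
imageGCLowThresholdPercent: 60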

Rook incorrectly creates new MON deployments during maintenance

Sometimes during cluster maintenance (on cloudscale.ch), the Rook-Ceph operator creates new mon deployments when a storage node is marked as unschedulable, instead of just waiting for the node to come back after maintenance.

Possible root causes

One configuration which can cause the observed issues is:

kubectl --as=cluster-admin -n syn-rook-ceph-cluster patch cephcluster cluster --type=json \
  -p '[{
    "op": "replace",
    "path": "/spec/healthCheck/daemonHealth/mon",
    "value": {
      "disabled": false,
      "interval": "10s",
      "timeout": "10s"
    }
  }]'

This configures the operator to treat mons as failed after 10 seconds (down from the default 10 minutes). The config is intended to be used when replacing storage nodes (see e.g. https://kb.vshn.ch/oc4/how-tos/cloudscale/replace-storage-node.html#_remove_the_old_mon) and should be reverted once the mon has been moved to the new storage node. During maintenance, this config causes the operator to treat mons on cordoned nodes as failed after 10s, which triggers the creation of a replacement mon.
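
A hypothetical revert once the mon has been moved, dropping the override so the operator falls back to its default health check timeouts:

$ kubectl --as=cluster-admin -n syn-rook-ceph-cluster patch cephcluster cluster --type=json \
  -p '[{
    "op": "remove",
    "path": "/spec/healthCheck/daemonHealth/mon"
  }]'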

Steps to Reproduce the Problem

TBD: some combination of cordoning, draining, and restarting nodes while observing the Rook operator create an unnecessary new mon.

Actual Behavior

A new mon gets created on a node which already has a mon and is added to the monmap configmap. This causes lots of issues, because the resulting mon configuration can't work: the mons bind to host ports in our setup.

Expected Behavior

No new mon is created when a node is unschedulable due to node maintenance or similar.
