
csi-driver-lvm's Introduction

metal-stack

We believe Kubernetes runs best on bare metal; this is all about providing metal as a service.


csi-driver-lvm's Issues

How to limit the nodes that csi-driver-lvm tries to provision a volume to?

I have different node types, some with disks eligible for CSI and some without. The disks for CSI have a different name, so I am using this config:

devicePattern: /dev/nvme[1-3]n1

That worked well while 90% of my nodes had NVMe disks, but now the ratio is 50/50 and I see a lot of errors like

failed to provision volume with StorageClass "csi-driver-lvm-linear": rpc error: code = ResourceExhausted desc = volume creation failed

In the plugin logs I see:

csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:37.151506       1 controllerserver.go:126] creating volume pvc-facd5e9e-dda2-4bb0-94a5-0ec724e823a6 on node: worker-h1-1081261-183
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:37.151530       1 lvm.go:243] start provisionerPod with args:[createlv --lvsize 1073741824 --devices /dev/nvme[1-3]n1 --lvmtype linear --lvname pvc-facd5e9e-dda2-4bb0-94a5-0ec724e823a6 --vgname csi-lvm]
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:37.178143       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:38.184603       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:39.192834       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:40.199854       1 lvm.go:395] provisioner pod status:Pending
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:41.206348       1 lvm.go:395] provisioner pod status:Running
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:42.212716       1 lvm.go:395] provisioner pod status:Running
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin I1017 09:13:43.220177       1 lvm.go:385] provisioner pod terminated with failure
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin E1017 09:13:43.235743       1 controllerserver.go:142] error creating provisioner pod :rpc error: code = ResourceExhausted desc = volume creation failed
csi-driver-lvm-plugin-bvp86 csi-driver-lvm-plugin E1017 09:13:43.235783       1 server.go:114] GRPC error: rpc error: code = ResourceExhausted desc = volume creation failed

Node worker-h1-1081261-183 doesn't have NVMe disks, so it makes sense that provisioning fails.

It looks like the plugin jumps from node to node trying to provision the volume, and that takes time. It would be nice if I could limit it to nodes that carry a certain label, so it does not waste time on nodes that will definitely fail.

PS: I've tried to schedule the plugin DaemonSet only on the nodes with NVMe disks, but it doesn't help; the plugin still tries all available nodes.
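A sketch of a possible workaround, assuming the chart's default WaitForFirstConsumer storage classes are in use (they are shown elsewhere on this page): the volume is provisioned on whichever node the consuming pod gets scheduled to, so constraining the workload itself to labeled nodes should keep provisioning away from nodes without NVMe disks. The label and claim names below are hypothetical:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-lvm-volume
spec:
  nodeSelector:
    storage.example.com/nvme: "true"   # hypothetical label applied only to nodes with eligible NVMe disks
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-lvm-pvc            # hypothetical PVC using the csi-driver-lvm-linear storage class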

provide a way/tool to migrate volumes from csi-lvm to csi-driver-lvm

Migrate volume from csi-lvm to csi-driver-lvm

It may sooner or later be necessary to provide a way to migrate volumes from csi-lvm to csi-driver-lvm.

Idea: provide a tool, like others do, e.g.: tridenctl upgrade volume <name-of-trident-volume>

The following manually executed flow works:
(DO NOT USE THIS EXAMPLE. It might be full of bugs.)
These steps would better be done by a CLI migration tool.

0.) make sure volume is not attached to a pod (scale sts to replicas=0 or delete a standalone pod)

1.) get the name of the node the volume is mounted on:

k get pvc lvm-pvc-linear -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}' && echo
shoot--ph2j95--mwen2-default-worker-579867dc45-w6r6c

2.) find the pv for the given pvc:

k get pvc lvm-pvc-linear
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
lvm-pvc-linear   Bound    pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281   10Mi       RWO            csi-lvm        2m7s

3.) set reclaimPolicy to retain:
k patch pv pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281 -p '{"spec": {"persistentVolumeReclaimPolicy": "Retain"}}'

k get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS   REASON   AGE
pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281   10Mi       RWO            Retain           Bound    default/lvm-pvc-linear   csi-lvm                 4m25s

4.) delete old pvc and replace with new pvc

 apiVersion: v1
 kind: PersistentVolumeClaim
 metadata:
   name: lvm-pvc-linear
   namespace: default
-  annotations:
-    csi-lvm.metal-stack.io/type: "linear"
 spec:
   accessModes:
     - ReadWriteOnce
-  storageClassName: csi-lvm
+  storageClassName: csi-lvm-sc-linear
   resources:
     requests:
-      storage: 10Mi
+      storage: 1Mi  # small dummy value, will be resized later

5.) force the actual creation of the volume on the target node (i.e. create a pod on that node which claims the pvc), e.g.

apiVersion: batch/v1
kind: Job
metadata:
  name: mount
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchFields:
              - key: metadata.name
                operator: In
                values:
                - shoot--ph2j95--mwen2-default-worker-579867dc45-w6r6c
      serviceAccount: csi-lvmplugin
      serviceAccountName: csi-lvmplugin
      containers:
      - name: busybox
        image: busybox
        command: ["sleep",  "10"]
        volumeMounts:
        - name: dummy
          mountPath: /dummy
      restartPolicy: Never
      volumes:
      - name: dummy
        persistentVolumeClaim:
          claimName: lvm-pvc-linear

6.) get volume name of the newly created volume:

 k get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM                    STORAGECLASS        REASON   AGE
pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281   10Mi       RWO            Retain           Released   default/lvm-pvc-linear   csi-lvm                      41m
pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a   1Mi        RWO            Delete           Bound      default/lvm-pvc-linear   csi-lvm-sc-linear            16m

7.) rename the old lv to the newly created one, and change tags

  umount /tmp/pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281 && \
  lvremove -y csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a && \
  lvrename csi-lvm/pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281 csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a && \
  lvchange --deltag lv.metal-stack.io/csi-lvm csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a && \
  lvchange --addtag vg.metal-stack.io/csi-lvm-driver csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a

For demonstration purposes, this can be done by a job. Another way would be to exec into the corresponding csi-lvm-reviver pod.

apiVersion: batch/v1
kind: Job
metadata:
  name: rename-lvm-pv
spec:
  backoffLimit: 0
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchFields:
              - key: metadata.name
                operator: In
                values:
                - shoot--ph2j95--mwen2-default-worker-579867dc45-w6r6c
      containers:
      - name: rename-lvm-pv
        image: metalstack/lvmplugin:latest
        command: ["/bin/sh", "-c"]
        args: [ "umount /tmp/pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281 && \
                 lvremove -y csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a && \
                 lvrename csi-lvm/pvc-47b80ab6-e98f-4ff1-bdc1-f4f739e2e281 csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a && \
                 lvchange --deltag lv.metal-stack.io/csi-lvm csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a && \
                 lvchange --addtag vg.metal-stack.io/csi-lvm-driver csi-lvm/pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a" ]
        securityContext:
          privileged: true
        terminationMessagePath: /termination.log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /dev
          mountPropagation: Bidirectional
          name: dev-dir
        - mountPath: /lib/modules
          name: mod-dir
        - mountPath: /etc/lvm/backup
          mountPropagation: Bidirectional
          name: lvmbackup
        - mountPath: /etc/lvm/cache
          mountPropagation: Bidirectional
          name: lvmcache
        - mountPath: /run/lock/lvm
          mountPropagation: Bidirectional
          name: lvmlock
        - mountPath: /tmp/csi-lvm
          mountPropagation: Bidirectional
          name: data
      restartPolicy: Never
      serviceAccount: csi-lvmplugin
      serviceAccountName: csi-lvmplugin
      volumes:
      - hostPath:
          path: /dev
          type: Directory
        name: dev-dir
      - hostPath:
          path: /lib/modules
          type: Directory
        name: mod-dir
      - hostPath:
          path: /etc/lvm/backup
          type: DirectoryOrCreate
        name: lvmbackup
      - hostPath:
          path: /etc/lvm/cache
          type: DirectoryOrCreate
        name: lvmcache
      - hostPath:
          path: /run/lock/lvm
          type: DirectoryOrCreate
        name: lvmlock
      - hostPath:
          path: /tmp/csi-lvm
          type: DirectoryOrCreate
        name: data

8.) "resize" the pvc entry back to its original size (doesn't actually change anything, since the volume on disk already has this size)
k patch pvc lvm-pvc-linear -p '{"spec": {"resources": {"requests": { "storage": "10Mi"}}}}'

9.) start old pod/sts with updated pvc (now has the new storage class but the original content)

10.) delete the pv entry of the old, unused csi-lvm volume

k delete pv pvc-9cba4237-1e56-4f2a-9b49-ca7f94bfdb6a

Operator documentation + resize + adding new volumes

I realized that I created a volume far too small for my needs. I created a second one and added it to the devicePattern option of the Helm chart; however, it does not seem to be taken into account. It would be nice to have some operator documentation. I could contribute to this section as well, since I think it can help others, but for the moment I need guidance on how to proceed. Can someone point me in the right direction (also to help me troubleshoot)? Maybe it is documented somewhere, but I could not find it.
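For reference, a hedged sketch of Helm values with a devicePattern covering both disks (device names are illustrative); whether the existing volume group is actually extended with the newly matched device is exactly the kind of behavior the requested operator documentation should spell out:

lvm:
  devicePattern: /dev/sd[bc]   # illustrative: matches both /dev/sdb and /dev/sdc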

Feature Request: hashicorp nomad support

Hello,

Recently, I began to use Nomad for my personal homelab.
Nomad supports CSI plugins for storage.

Actually, I am not familiar enough with Nomad and the CSI spec to create the CSI plugin myself.

So, can you write a doc on using your CSI plugin with Nomad, or at least provide some guidelines to help me implement it?

Some links to ease your discovery of Nomad, if you accept to help me:

I've also looked here for some inspiration before opening this feature request:

Can csi-driver-lvm work with existing vg?

I have an existing VG where I want csi-driver-lvm to create LVs for provisioning. Is it possible to omit the devicePattern and give an existing vgName? I don't want csi-driver-lvm to manage the VG; it should use the existing VG instead. I already have PVs allocated in the VG and I use some of the disk space in that VG for other mounts, so I can't give raw devices to csi-driver-lvm. Can someone help me understand the behavior of csi-driver-lvm if I give an existing vgName and an empty devicePattern?
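For reference, the Helm chart does expose the volume group name (see the values.yaml excerpt quoted later on this page), so the configuration being asked about might look like the sketch below; whether an empty devicePattern is supported is exactly the open question here:

lvm:
  vgName: my-existing-vg   # illustrative name of the pre-existing volume group
  devicePattern: ""        # intentionally empty; whether this is supported is the question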

failure if logical volume is missing

If, for some reason, the logical volume is missing on the node, the controller tries to delete it again and again:

delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete I0211 13:14:29.716839       1 main.go:38] starting csi-lvmplugin-provisioner
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete I0211 13:14:29.717050       1 deletelv.go:45] delete lv pvc-5767ef75-314c-420d-9783-5b1db2b78c44 vg:csi-lvm
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete I0211 13:14:30.011312       1 lvm.go:550] unable to list existing volumes:exit status 5
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete F0211 13:14:30.011351       1 deletelv.go:27] Error deleting lv: unable to delete lv: logical volume pvc-5767ef75-314c-420d-9783-5b1db2b78c44 does not exist output:
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete goroutine 1 [running]:
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete k8s.io/klog/v2.stacks(0xc000010001, 0xc00046e000, 0xa5, 0x1ba)
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete    /go/pkg/mod/k8s.io/klog/[email protected]/klog.go:996 +0xb9
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete k8s.io/klog/v2.(*loggingT).output(0x1db9420, 0xc000000003, 0x0, 0x0, 0xc00045e0e0, 0x1d2990c, 0xb, 0x1b, 0x0)
delete-pvc-5767ef75-314c-420d-9783-5b1db2b78c44 csi-lvmplugin-delete    /go/pkg/mod/k8s.io/klog/[email protected]/klog.go:945 +0x191
csi-driver-lvm      56m         Normal    Started                  pod/delete-pvc-b02b02e9-c48a-44de-aed5-fdc4febcc0b3                Started container csi-lvmplugin-delete
csi-driver-lvm      51m         Normal    Started                  pod/delete-pvc-b02b02e9-c48a-44de-aed5-fdc4febcc0b3                Started container csi-lvmplugin-delete
csi-driver-lvm      46m         Normal    Started                  pod/delete-pvc-b02b02e9-c48a-44de-aed5-fdc4febcc0b3                Started container csi-lvmplugin-delete
csi-driver-lvm      41m         Normal    Started                  pod/delete-pvc-b02b02e9-c48a-44de-aed5-fdc4febcc0b3                Started container csi-lvmplugin-delete
csi-driver-lvm      36m         Normal    Started                  pod/delete-pvc-b02b02e9-c48a-44de-aed5-fdc4febcc0b3                Started container csi-lvmplugin-delete

Does not start on Kubernetes >= v1.27.0

With this error:

Warning  FailedCreate  12m (x56 over 10h)  statefulset-controller  create Pod csi-driver-lvm-controller-0 in StatefulSet csi-driver-lvm-controller failed error: pods "csi-driver-lvm-controller-0" is forbidden: violates PodSecurity "restricted:latest": privileged (containers "csi-attacher", "csi-provisioner", "csi-resizer" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (containers "csi-attacher", "csi-provisioner", "csi-resizer" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (containers "csi-attacher", "csi-provisioner", "csi-resizer" must set securityContext.capabilities.drop=["ALL"]), restricted volume types (volume "socket-dir" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or containers "csi-attacher", "csi-provisioner", "csi-resizer" must set securityContext.runAsNonRoot=true), seccompProfile (pod or containers "csi-attacher", "csi-provisioner", "csi-resizer" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")

implement GET_VOLUME_STATS capability

implement csi.NodeServiceCapability_RPC_GET_VOLUME_STATS capability, to get stats like

kubelet_volume_stats_available_bytes{...}
kubelet_volume_stats_used_bytes{...}

from kubelet.

func (ns *nodeServer) NodeGetVolumeStats(ctx context.Context, in *csi.NodeGetVolumeStatsRequest) (*csi.NodeGetVolumeStatsResponse, error) {

Helm installation from README doesn't work

Hi, I'm trying to install this tool using the command in README.md, but the command fails.

I'm new to Helm so it may be a problem on my side.

$ helm version
version.BuildInfo{Version:"v3.13.2", GitCommit:"2a2fb3b98829f1e0be6fb18af2f6599e0f4e8243", GitTreeState:"clean", GoVersion:"go1.20.10"}
$ helm install --repo https://helm.metal-stack.io mytest helm/csi-driver-lvm --set lvm.devicePattern='/dev/nvme[0-9]n[0-9]'
Error: INSTALLATION FAILED: chart "helm/csi-driver-lvm" not found in https://helm.metal-stack.io repository

[BUG] Example files provisioner name inconsistent

There is 1 minor inconsistency in the examples directory

file: csi-storageclass-*.yaml

StorageClass Provisioner
Chart: lvm.csi.metal-stack.io
vs
example: lvm.csi.k8s.io

Since people will most likely deploy it via the Helm chart, perhaps it would be best to remove them from the examples directory?
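For reference, a storage class matching what the Helm chart installs would use the lvm.csi.metal-stack.io provisioner (class name and binding settings follow the chart output shown elsewhere on this page; the type parameter is an assumption):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-driver-lvm-linear
provisioner: lvm.csi.metal-stack.io     # the chart's provisioner name, not lvm.csi.k8s.io
parameters:
  type: linear                          # assumed parameter selecting the LVM type
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true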

review requested

Since "metal-stack" has now more public visibility, please review before we move this to metal-stack.

PVCs pending with WaitForFirstConsumer on fresh install

Not sure if this is a bug report or a support request, but in any case I can't spot what's going awry.

Fresh install of microk8s 1.23 and csi-driver-lvm v0.4.1 via the Helm chart at https://github.com/metal-stack/helm-charts/tree/master/charts/csi-driver-lvm (which supports StorageClass under storage.k8s.io/v1).

# Deploy CSI driver
$ cat values.yaml
lvm:
  devicePattern: /dev/sdb
rbac:
  pspEnabled: false
$ helm upgrade --install --create-namespace -n storage -f values.yaml csi-driver-lvm ./helm-charts/charts/csi-driver-lvm/

# Storage classes created
$ kubectl get storageclass
NAME                              PROVISIONER              RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-driver-lvm-striped            lvm.csi.metal-stack.io   Delete          WaitForFirstConsumer   true                   27m
csi-driver-lvm-mirror             lvm.csi.metal-stack.io   Delete          WaitForFirstConsumer   true                   27m
csi-driver-lvm-linear (default)   lvm.csi.metal-stack.io   Delete          WaitForFirstConsumer   true                   27m

# Create a test PVC
$ cat pvc-test.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test
  namespace: default
spec:
  storageClassName: csi-driver-lvm-linear
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: "2Gi"

$ kubectl apply -f pvc-test.yaml
$ kubectl describe -n default pvc/test
Name:          test
Namespace:     default
StorageClass:  csi-driver-lvm-linear
Status:        Pending
Volume:
Labels:        <none>
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type    Reason                Age               From                         Message
  ----    ------                ----              ----                         -------
  Normal  WaitForFirstConsumer  4s (x4 over 42s)  persistentvolume-controller  waiting for first consumer to be created before binding

The first sign of trouble comes from the plugin pod, which raises a couple of errors:

$ kubectl -n storage logs csi-driver-lvm-plugin-9bqb4 -c csi-driver-lvm-plugin
2022/02/05 20:02:01 unable to configure logging to stdout:no such flag -logtostderr
I0205 20:02:01.834133       1 lvm.go:108] pullpolicy: IfNotPresent
I0205 20:02:01.834139       1 lvm.go:112] Driver: lvm.csi.metal-stack.io
I0205 20:02:01.834142       1 lvm.go:113] Version: dev
I0205 20:02:01.873219       1 lvm.go:411] unable to list existing volumegroups:exit status 5
I0205 20:02:01.873250       1 nodeserver.go:51] volumegroup: csi-lvm not found
I0205 20:02:02.119070       1 nodeserver.go:58] unable to activate logical volumes:  Volume group "csi-lvm" not found
  Cannot process volume group csi-lvm
 exit status 5
I0205 20:02:02.120111       1 controllerserver.go:259] Enabling controller service capability: CREATE_DELETE_VOLUME
I0205 20:02:02.120295       1 server.go:95] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}

Over on the k8s node, /dev/sdb does exist per lvm.devicePattern:

$ blockdev --getsize64 /dev/sdb
32212254720

While the documentation doesn't say this is necessary, I didn't see any indication in the code that pvcreate is called. So I figured perhaps that was the problem and created the physical volume explicitly (which also demonstrates that the LVM command-line tools are functional on the host):

# On k8s host
$ pvcreate /dev/sdb
  Physical volume "/dev/sdb" successfully created.

# On client
$ kubectl -n storage rollout restart ds/csi-driver-lvm-plugin

No change; still the Volume group "csi-lvm" not found errors in the plugin pod logs. OK, this ostensibly shouldn't be necessary, but let's create it manually:

# On k8s host
$ vgcreate csi-lvm /dev/sdb
  Volume group "csi-lvm" successfully created
$ vgs
  VG      #PV #LV #SN Attr   VSize   VFree
  csi-lvm   1   0   0 wz--n- <30.00g <30.00g

# On client
$ kubectl -n storage rollout restart ds/csi-driver-lvm-plugin

This has addressed the errors from the plugin logs:

INFO: defaulting to container "csi-driver-lvm-plugin" (has: node-driver-registrar, csi-driver-lvm-plugin, liveness-probe)
2022/02/05 20:23:53 unable to configure logging to stdout:no such flag -logtostderr
I0205 20:23:53.656589       1 lvm.go:108] pullpolicy: IfNotPresent
I0205 20:23:53.656596       1 lvm.go:112] Driver: lvm.csi.metal-stack.io
I0205 20:23:53.656598       1 lvm.go:113] Version: dev
I0205 20:23:53.738596       1 controllerserver.go:259] Enabling controller service capability: CREATE_DELETE_VOLUME
I0205 20:23:53.738891       1 server.go:95] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}

But that didn't fix the pending PVC, even after recreating it:

$ kubectl describe -n default pvc/test
Name:          test
Namespace:     default
StorageClass:  csi-driver-lvm-linear
Status:        Pending
Volume:
Labels:        <none>
Annotations:   <none>
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Used By:       <none>
Events:
  Type    Reason                Age               From                         Message
  ----    ------                ----              ----                         -------
  Normal  WaitForFirstConsumer  4s (x2 over 16s)  persistentvolume-controller  waiting for first consumer to be created before binding

Hopefully it's clear where things have gone wrong. :)

Thanks!
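For reference: with volumeBindingMode WaitForFirstConsumer, a PVC is expected to stay Pending until a pod that mounts it is scheduled, so a minimal consumer pod (image and names are illustrative) is the quickest way to trigger binding and surface any real provisioning error:

apiVersion: v1
kind: Pod
metadata:
  name: test-consumer
  namespace: default
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test    # the PVC created above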

Multiple deprecation warnings during deployment

I was testing the https://metal-stack.github.io/helm-charts/csi-driver-lvm-0.4.0.tgz deployment on Kubernetes v1.21 and noticed the logs contain these warnings:

policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
storage.k8s.io/v1beta1 CSIDriver is deprecated in v1.19+, unavailable in v1.22+; use storage.k8s.io/v1 CSIDriver
apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition


Read-only filesystems are not supported

I am setting this up on Talos OS v1.4.0 & K8s v1.26.4. On this OS, /etc is not writable.

I had to make this change to the csi-driver-lvm chart so that it installs:

diff --git a/charts/csi-driver-lvm/templates/plugin.yaml b/charts/csi-driver-lvm/templates/plugin.yaml
index b347cfa..c1b0dc0 100644
--- a/charts/csi-driver-lvm/templates/plugin.yaml
+++ b/charts/csi-driver-lvm/templates/plugin.yaml
@@ -61,10 +61,10 @@ metadata:
 spec:
   allowedHostPaths:
   - pathPrefix: /lib/modules
-  - pathPrefix: /etc/lvm/cache
+  - pathPrefix: {{ .Values.lvm.hostWritePath }}etc/lvm/cache
   - pathPrefix: {{ .Values.kubernetes.kubeletPath }}/plugins/{{ .Values.lvm.storageClassStub }}
-  - pathPrefix: /etc/lvm/backup
-  - pathPrefix: /run/lock/lvm
+  - pathPrefix: {{ .Values.lvm.hostWritePath }}etc/lvm/backup
+  - pathPrefix: /var/run/lock/lvm
   - pathPrefix: {{ .Values.kubernetes.kubeletPath }}/plugins
   - pathPrefix: {{ .Values.kubernetes.kubeletPath }}/plugins_registry
   - pathPrefix: /dev
@@ -267,15 +267,15 @@ spec:
           path: /lib/modules
         name: mod-dir
       - hostPath:
-          path: /etc/lvm/backup
+          path: {{ .Values.lvm.hostWritePath }}etc/lvm/backup
           type: DirectoryOrCreate
         name: lvmbackup
       - hostPath:
-          path: /etc/lvm/cache
+          path: {{ .Values.lvm.hostWritePath }}etc/lvm/cache
           type: DirectoryOrCreate
         name: lvmcache
       - hostPath:
-          path: /run/lock/lvm
+          path: {{ .Values.lvm.hostWritePath }}/run/lock/lvm
           type: DirectoryOrCreate
         name: lvmlock
 ---
diff --git a/charts/csi-driver-lvm/values.yaml b/charts/csi-driver-lvm/values.yaml
index e954dfb..f95598c 100644
--- a/charts/csi-driver-lvm/values.yaml
+++ b/charts/csi-driver-lvm/values.yaml
@@ -3,6 +3,9 @@ lvm:
   # This one you should change
   devicePattern: /dev/nvme[0-9]n[0-9]
 
+  # You will want to change this for read-only filesystems, e.g. Talos OS
+  hostWritePath: /
+
   # these are primariliy for testing purposes
   vgName: csi-lvm
   driverName: lvm.csi.metal-stack.io

Note:
I can submit the above patch as a PR to https://github.com/metal-stack/helm-charts, no worries.

Once I got csi-driver-lvm installed on K8s on Talos OS, I started hitting issues creating the PVC:

[screenshot of the PVC error omitted]

I tracked this issue down to these hard-coded /etc/lvm/backup, /etc/lvm/cache & /run/lock/lvm paths in the controller:

{
  Name: "lvmbackup",
  VolumeSource: v1.VolumeSource{
    HostPath: &v1.HostPathVolumeSource{
      Path: "/etc/lvm/backup",
      Type: &hostPathType,
    },
  },
},
{
  Name: "lvmcache",
  VolumeSource: v1.VolumeSource{
    HostPath: &v1.HostPathVolumeSource{
      Path: "/etc/lvm/cache",
      Type: &hostPathType,
    },
  },
},
{
  Name: "lvmlock",
  VolumeSource: v1.VolumeSource{
    HostPath: &v1.HostPathVolumeSource{
      Path: "/run/lock/lvm",
      Type: &hostPathType,
    },
  },
},

Before I submit a PR that fixes it, I am wondering whether this is something you would be interested in accepting. I was thinking of adding a hostWritePath to the volumeAction and then wiring it through. Would you suggest a different approach? FTR:

type volumeAction struct {
  action           actionType
  name             string
  nodeName         string
  size             int64
  lvmType          string
  devicesPattern   string
  provisionerImage string
  pullPolicy       v1.PullPolicy
  kubeClient       kubernetes.Clientset
  namespace        string
  vgName           string
}

Thanks for a great LVM CSI! I tried a few other ones before I settled on this one. It's so close to what I am looking for. 💪


cc @smira @andrewrynhard
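For anyone else on Talos, a hedged sketch of the values this patch enables (the path is illustrative; since the templates concatenate hostWritePath directly with etc/lvm/..., a trailing slash is assumed):

lvm:
  devicePattern: /dev/sdb
  hostWritePath: /var/    # illustrative writable prefix on Talos; yields /var/etc/lvm/backup, /var/etc/lvm/cache, ...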

Support multiple filesystems

It would be nice to add support for multiple filesystems.
The filesystem type could be specified in the StorageClass, the same way as the LVM type.
Filesystem operations could be grouped behind a filesystem interface.
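For illustration, one possible shape for this (not implemented today, which is what this issue asks for): many CSI drivers pick up the filesystem from the standard csi.storage.k8s.io/fstype storage class parameter, which the external-provisioner passes on to the driver, e.g.:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-driver-lvm-linear-xfs    # illustrative name
provisioner: lvm.csi.metal-stack.io
parameters:
  type: linear                       # assumed LVM type parameter
  csi.storage.k8s.io/fstype: xfs     # requested filesystem; driver support for this is what the issue proposes
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer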

helm install CSIDriver.spec validation error

# helm install mytest helm/csi-driver-lvm --set lvm.devicePattern='/dev/loop[0-9]'
Error: unable to build kubernetes objects from release manifest: error validating "": error validating data: ValidationError(CSIDriver.spec): unknown field "volumeLifecycleModes" in io.k8s.api.storage.v1beta1.CSIDriverSpec
 # kc version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:58:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.8", GitCommit:"211047e9a1922595eaa3a1127ed365e9299a6c23", GitTreeState:"clean", BuildDate:"2019-10-15T12:02:12Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

Not deleting pvc with claim policy = Delete on pod termination

When terminating a StatefulSet that was successfully provisioned by csi-driver-lvm, the associated PVC remains bound and is not deleted by the driver. We use Karpenter, and the new pod cannot be scheduled because the PVC stays bound with the old node's DNS label, a node that no longer exists.

incompatible with provisioner "development-provisioner", daemonset overhead={"cpu":"610m","ephemeral-storage":"1280Mi","memory":"1334Mi","pods":"9"}, incompatible requirements, label "topology.lvm.csi/node" does not have known values
helm chart version:
name: csi-driver-lvm
version: 0.5.4
description: local persistend storage for lvm
appVersion: v0.5.2
apiVersion: v1
keywords:
- storage
- block-storage
- volume
home: https://metal-stack.io
sources:
- https://github.com/metal-stack

Kubernetes 1.22 and v1beta1 versus v1

I'm trying to run this in OpenShift 4.9 and am running into some issues due to the usage of v1beta1 APIs instead of v1. The Helm chart was easy to update to v1; however, once the driver is running I see in the lvm-provisioner log that it expects to find CSINode on v1beta1 instead of v1:

E1021 19:09:47.474514 1 reflector.go:156] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.CSINode: the server could not find the requested resource

Any thoughts on how to work around this?

tags on volumegroups

It seems that tags cannot be applied:

docker run -it --rm golang:1.15-alpine                                                                                                                                                                                               
/go # apk add lvm2                                                                                                                                                                                                                                                                        
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/main/x86_64/APKINDEX.tar.gz                                                                                                                                                                                                              
fetch http://dl-cdn.alpinelinux.org/alpine/v3.12/community/x86_64/APKINDEX.tar.gz
(1/6) Installing libaio (0.3.112-r1)
(2/6) Installing libblkid (2.35.2-r0)
(3/6) Installing device-mapper-libs (2.02.186-r1)
(4/6) Installing device-mapper-event-libs (2.02.186-r1)
(5/6) Installing lvm2-libs (2.02.186-r1)
(6/6) Installing lvm2 (2.02.186-r1)
Executing busybox-1.31.1-r19.trigger
OK: 12 MiB in 21 packages
/go # vgcreate -h
  vgcreate - Create a volume group

  vgcreate VG_new PV ...
        [ -A|--autobackup y|n ]
        [ -c|--clustered y|n ]
        [ -l|--maxlogicalvolumes Number ]
        [ -p|--maxphysicalvolumes Number ]
        [ -M|--metadatatype lvm2 ]
        [ -s|--physicalextentsize Size[m|UNIT] ]
        [ -f|--force ]
        [ -Z|--zero y|n ]
        [    --addtag Tag ]

But in https://github.com/metal-stack/csi-driver-lvm/blob/master/pkg/lvm/lvm.go#L481 there is --add-tag?

Ephemeral Volume support

Some use cases require more scratch filesystem space for the application than is available via emptyDir.
For these types of applications, Kubernetes supports Ephemeral Volumes since 1.16:

csi-driver-host-path does support this: https://github.com/kubernetes-csi/csi-driver-host-path/blob/master/docs/example-ephemeral.md

From reading our code, this seems to be supported as well, but testing it and collecting the required steps in an example-ephemeral.md would be required.
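A hedged sketch of what such an example-ephemeral.md could show, using the generic ephemeral volume API (Kubernetes 1.21+) with one of the chart's storage classes; all names are illustrative and this is untested, which is exactly the open task:

apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    ephemeral:
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: csi-driver-lvm-linear
          resources:
            requests:
              storage: 10Gi    # scratch space beyond what emptyDir can offer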

unable to activate logical volumes: Volume group "csi-lvm" not found

Hello,

I am testing out a local setup with minikube and KubeVirt, and I want to use csi-driver-lvm on a freshly formatted disk, but once I set everything up I eventually get some failures:

$ k logs csi-driver-lvm-plugin-f8rvp csi-driver-lvm-plugin 
2021/10/22 12:28:45 unable to configure logging to stdout:no such flag -logtostderr
I1022 12:28:45.692895       1 lvm.go:115] pullpolicy: IfNotPresent
I1022 12:28:45.692903       1 lvm.go:119] Driver: lvm.csi.metal-stack.io 
I1022 12:28:45.692907       1 lvm.go:120] Version: dev
I1022 12:28:45.780598       1 lvm.go:418] unable to list existing volumegroups:exit status 5
I1022 12:28:45.780621       1 nodeserver.go:51] volumegroup: csi-lvm not found
I1022 12:28:45.988701       1 nodeserver.go:58] unable to activate logical volumes:  Volume group "csi-lvm" not found
  Cannot process volume group csi-lvm
 exit status 5
I1022 12:28:45.989663       1 controllerserver.go:272] Enabling controller service capability: CREATE_DELETE_VOLUME
I1022 12:28:45.989911       1 server.go:95] Listening for connections on address: &net.UnixAddr{Name:"//csi/csi.sock", Net:"unix"}

Here are the steps that I take:

  1. Format the disk: fdisk /dev/sdb -> d -> n -> p -> 1 -> Select Sectors -> Format -> w save
  2. Set up the helm repo:
helm repo add metal-stack https://helm.metal-stack.io
  3. Install everything:
helm install mytest metal-stack/csi-driver-lvm --set lvm.devicePattern='/dev/sdb1'
  4. Get the pods:
 k get po
NAME                          READY   STATUS    RESTARTS   AGE
csi-driver-lvm-controller-0   3/3     Running   0          7m33s
csi-driver-lvm-plugin-v4jfd   3/3     Running   0          9m14s
  5. Check vgs, pvs, create the example pods, check the storageclasses:
$ kubectl apply -f examples/csi-pvc-raw.yaml
kubectl apply -f examples/csi-pod-raw.yaml
kubectl apply -f examples/csi-pvc.yaml
kubectl apply -f examples/csi-app.yaml
persistentvolumeclaim/pvc-raw unchanged
pod/pod-raw configured
persistentvolumeclaim/csi-pvc unchanged
pod/my-csi-app configured
$ k get pvc
NAME      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS        AGE
csi-pvc   Pending                                      csi-lvm-sc-linear   11m
pvc-raw   Pending                                      csi-lvm-sc-linear   11m
$ k get storageclasses.storage.k8s.io 
NAME                     PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-driver-lvm-linear    lvm.csi.metal-stack.io     Delete          WaitForFirstConsumer   true                   17m
csi-driver-lvm-mirror    lvm.csi.metal-stack.io     Delete          WaitForFirstConsumer   true                   17m
csi-driver-lvm-striped   lvm.csi.metal-stack.io     Delete          WaitForFirstConsumer   true                   17m
standard (default)       k8s.io/minikube-hostpath   Delete          Immediate              false                  53m
$ k get pvs
error: the server doesn't have a resource type "pvs"
$ k get pvs
error: the server doesn't have a resource type "pvs"
$ k get pv
No resources found
$ k get pvc
NAME      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS        AGE
csi-pvc   Pending                                      csi-lvm-sc-linear   11m
pvc-raw   Pending                                      csi-lvm-sc-linear   11m
$ k get storageclasses.storage.k8s.io 
NAME                     PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
csi-driver-lvm-linear    lvm.csi.metal-stack.io     Delete          WaitForFirstConsumer   true                   17m
csi-driver-lvm-mirror    lvm.csi.metal-stack.io     Delete          WaitForFirstConsumer   true                   17m
csi-driver-lvm-striped   lvm.csi.metal-stack.io     Delete          WaitForFirstConsumer   true                   17m
standard (default)       k8s.io/minikube-hostpath   Delete          Immediate              false                  53m

Any help would be of great assistance, thanks.
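One detail visible in the output above: the example PVCs reference the storage class csi-lvm-sc-linear, while the classes created by the chart are csi-driver-lvm-linear/-mirror/-striped, so (as an assumption about the cause) the PVCs may simply be pointing at a non-existent class. A PVC against one of the installed classes would look like:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-driver-lvm-linear   # one of the classes the chart actually created
  resources:
    requests:
      storage: 1Gi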

Dockerfile dependency packages questions

Hi Team,

When reading the Dockerfile, I found that lvm2-extra and e2fsprogs-extra are required as dependencies. I can see they are all installed by apk, but these packages are not available in my package manager (tdnf), so I want to ask what these packages are used for, and whether they are necessary for csi-driver-lvm to be built and run.

Thanks,
liulanze@

csi-driver-lvm-plugin fails to mount /lib/modules directory

After deployment, the csi-driver-lvm-plugin pods are unable to mount the /lib/modules directory. I found it was a symlink, which I expect should work, so I changed it to the full actual path to test, but I am still hitting the same error:

MountVolume.SetUp failed for volume "mod-dir" : hostPath type check failed: /usr/lib/modules is not a directory

The directory does exist on the host systems.

Versions of things:
OS: Ubuntu 20.04.1 LTS
Kubernetes: v1.19.7 ( deployed via Rancher )

Deployed using helm chart version

Provisioner pod created in default namespace

Talos clusters use Pod Security Standards by default and do not allow the creation of privileged pods. To create privileged pods in a namespace, you need to add special labels to the namespace:

pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/enforce-version: latest

Currently the provisioner pod is created in the default namespace:

        provisionerPod := &v1.Pod{
                ObjectMeta: metav1.ObjectMeta{
                        Name: string(va.action) + "-" + va.name,
                },

Since the provisioner pod is privileged, please create it in the csi-driver-lvm namespace, so that these labels do not have to be added to the default namespace.

Resize failing despite free VG space

Hi,
I set this up on multiple identical nodes which all have 6TB VGs. I had some PVCs at 512GB and wanted to resize them all to 1TB but I always end up with this error:

csi-driver-lvm-plugin-6l8pq csi-driver-lvm-plugin E0510 12:41:07.171962 1 server.go:114] GRPC error: rpc error: code = OutOfRange desc = Requested capacity 1099511627776 exceeds maximum allowed 1099511627776

The same number appears no matter what size I choose.

My VGs are all half empty:
VG #PV #LV #SN Attr VSize VFree
csi-lvm 3 5 0 wz--n- <5.24t <3.49t

Any ideas where this is coming from?

Support disk type identifying

Identify and group disks not only by device pattern but also by disk type (ssd, hdd, nvme),
so we can use disks of a similar type in one LVM volume group.
A device pattern doesn't cover every case: /dev/sda can be an HDD while /dev/sdb is an SSD.

[Enhancment] Support for LVM-HA

Hey there,

I'd like to suggest a feature: are there any plans to add LVM-HA / clustered LVM support? That would be incredibly useful, as it opens the CSI up for many more use cases!

Thank you!

Feel free to just close the issue, if this request is out of scope.
