kubernetes-sigs / blob-csi-driver

Azure Blob Storage CSI driver

License: Apache License 2.0

Makefile 1.01% Shell 7.71% Go 83.71% Dockerfile 0.50% Python 1.13% Mustache 5.92%
k8s-sig-cloud-provider

blob-csi-driver's People

Contributors

aarongalang, adrosa, andyzhangx, ashishranjan738, avoltz, bishal7679, boddumanohar, chyin6, cvvz, dependabot[bot], gossion, guilhem, harshika-kashyap, invidian, justin-jin, k8s-ci-robot, lizebang, martinforreal, mayankshah1607, nlamirault, raffo, rhummelmose, rjsadow, robin-wayve, rui-tang, sakuralbj, satr, songjiaxun, umagnus, zeromagic


blob-csi-driver's Issues

E2E example with persistentVolumeReclaimPolicy: Delete doesn't work

Creating a PV with persistentVolumeReclaimPolicy: Delete fails with the error:
Error getting deleter volume plugin for volume "pv-blobfuse": no volume plugin matched

What you expected to happen:
PV should get deleted and the backing container should also get deleted.

How to reproduce it:

  • Use the E2E example with SAS token.
  • Create PV with persistentVolumeReclaimPolicy: Delete
  • Delete the PVC.

Anything else we need to know?:
The SAS token has all the required RBAC permissions.

Environment:

  • CSI Driver version: Master
  • Kubernetes version (use kubectl version): v1.15.10
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a): Linux my-nginx-f969cff84-2qprc 4.15.0-1071-azure #76-Ubuntu SMP Wed Feb 12 03:02:44 UTC 2020 x86_64 GNU/Linux
  • Install tools:
  • Others:

build blobfuse binary in Dockerfile

What happened:
The error is as follows; the deployment script needs to change so the blobfuse binary is built inside the Dockerfile:

  Type     Reason                  Age                   From                               Message
  ----     ------                  ----                  ----                               -------
  Normal   Scheduled               10m                   default-scheduler                  Successfully assigned default/nginx-blobfuse to k8s-agentpool-36598153-0
  Normal   SuccessfulAttachVolume  10m                   attachdetach-controller            AttachVolume.Attach succeeded for volume "pvc-74b46958-d912-11e9-ac82-000d3a00587d"
  Warning  FailedMount             2m34s (x12 over 10m)  kubelet, k8s-agentpool-36598153-0  MountVolume.SetUp failed for volume "pvc-74b46958-d912-11e9-ac82-000d3a00587d" : rpc error: code = Unknown desc = Mount failed with error: exit status 1, output: /usr/blob/blobfuse: /usr/lib/x86_64-linux-gnu/libcurl.so.4: version `CURL_OPENSSL_4' not found (required by /usr/blob/blobfuse)
/usr/blob/blobfuse: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /usr/blob/blobfuse)
/usr/blob/blobfuse: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.22' not found (required by /usr/blob/blobfuse)
  Warning  FailedMount  2m (x4 over 8m48s)  kubelet, k8s-agentpool-36598153-0  Unable to mount volumes for pod "nginx-blobfuse_default(955923f1-d912-11e9-ac82-000d3a00587d)": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-blobfuse". list of unmounted volumes=[blobfuse01]. list of unattached volumes=[blobfuse01 default-token-r5wll]
  Warning  FailedMount  19s                 kubelet, k8s-agentpool-36598153-0  MountVolume.SetUp failed for volume "pvc-74b46958-d912-11e9-ac82-000d3a00587d" : rpc error: code = Unknown desc = Mount failed with error: exit status 1, output: /usr/blob/blobfuse: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /usr/lib/x86_64-linux-gnu/libgnutls.so.30)
/usr/blob/blobfuse: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.25' not found (required by /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1)
/usr/blob/blobfuse: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.27' not found (required by /usr/lib/x86_64-linux-gnu/libgssapi_krb5.so.2)
/usr/blob/blobfuse: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.26' not found (required by /usr/lib/x86_64-linux-gnu/libp11-kit.so.0)


enable readOnly flag

What happened:
Currently the volume attribute readOnly is ignored; the driver needs to respect this flag.
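A minimal sketch of how the driver could honor the flag, assuming the mount options are assembled as a string slice before the mount call (the helper name `ensureReadOnly` is illustrative, not the driver's actual API):

```go
package main

import "fmt"

// ensureReadOnly appends the "ro" mount flag when the CSI request carries
// readonly=true, avoiding a duplicate if callers already passed it.
// The helper name is illustrative, not the driver's actual API.
func ensureReadOnly(options []string, readOnly bool) []string {
	if !readOnly {
		return options
	}
	for _, o := range options {
		if o == "ro" {
			return options
		}
	}
	return append(options, "ro")
}

func main() {
	fmt.Println(ensureReadOnly([]string{"nosuid", "nodev"}, true))
	// → [nosuid nodev ro]
}
```

In NodePublishVolume the driver would feed the request's readonly field through a helper like this before invoking blobfuse.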


collect Prometheus metrics from this driver

Describe the solution you'd like in detail

This driver uses the same Azure cloud provider library as azuredisk, so we can open a port and collect Prometheus metrics from it.

curl http://localhost:10252/metrics | grep cloudprovider_azure_api_request
cloudprovider_azure_api_request_duration_seconds_sum{request="vmssvm_create_or_update",resource_group="andy-vmss1141",source="attach_disk",subscription_id="b9d2281e-dcd5-4dfd-9a97-xxx"} 40.985180089
cloudprovider_azure_api_request_duration_seconds_count{request="vmssvm_create_or_update",resource_group="andy-vmss1141",source="attach_disk",subscription_id="b9d2281e-dcd5-4dfd-9a97-xxx"} 2
cloudprovider_azure_api_request_duration_seconds_sum{request="vmssvm_create_or_update",resource_group="andy-vmss1141",source="detach_disk",subscription_id="b9d2281e-dcd5-4dfd-9a97-xxx"} 40.933383735
cloudprovider_azure_api_request_duration_seconds_count{request="vmssvm_create_or_update",resource_group="andy-vmss1141",source="detach_disk",subscription_id="b9d2281e-dcd5-4dfd-9a97-xxx"} 2
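As a minimal sketch of the text exposition format those samples use (pure stdlib, since the client_golang library may not be vendored; `metricLine` is an illustrative helper, not driver code):

```go
package main

import (
	"fmt"
	"strings"
)

// metricLine renders one sample in the Prometheus text exposition format,
// mirroring the cloudprovider_azure_api_request_* series above. Labels are
// ordered pairs so the output is deterministic. In the real driver the
// client_golang registry and an HTTP /metrics handler would produce this.
func metricLine(name string, labels [][2]string, value float64) string {
	parts := make([]string, 0, len(labels))
	for _, l := range labels {
		parts = append(parts, fmt.Sprintf("%s=%q", l[0], l[1]))
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(parts, ","), value)
}

func main() {
	fmt.Println(metricLine(
		"cloudprovider_azure_api_request_duration_seconds_count",
		[][2]string{{"request", "vmssvm_create_or_update"}, {"source", "attach_disk"}},
		2,
	))
}
```

Serving such lines under a /metrics path is what lets the `curl … | grep` command above work.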


install blobfuse binary in Docker container

What happened:
Azure/azure-storage-fuse#250
currently we use a workaround: copy the blobfuse binary from an Ubuntu VM


Patch sidecars for CVE-2019-11255

What happened:
check kubernetes/kubernetes#85233 for details

Longer term, upgrade your CSI driver with patched versions of the affected sidecars. Fixes are available in the following sidecar versions:

external-provisioner:
v0.4.3
v1.0.2
v1.2.2
v1.3.1
v1.4.0

external-snapshotter:
v0.4.2
v1.0.2
v1.2.2

external-resizer:
v0.3.0


migrate current repo to kubernetes-sigs org


/assign

SAS token support in static provisioning scenario

Describe the solution you'd like in detail

blobfuse already supports SAS tokens, so we could add SAS token support in the static provisioning scenario.


switch azure cloud provider to 1.14


/assign

add MSI e2e test


add sovereign cloud support

What happened:
Sovereign clouds are currently not supported; blocking issue:
Azure/azure-storage-fuse#199 (comment)


OWNERS and OWNERS_ALIASES files are invalid and causing errors.

  1. The OWNERS_ALIASES file isn't a valid format and doesn't parse at all.
{
  base: "master"   
  client: "repoowners"   
  component: "hook"   
  error: "error unmarshaling JSON: while decoding JSON: json: cannot unmarshal array into Go struct field .aliases of type map[string][]string"   
  file: "prow/repoowners/repoowners.go:349"   
  func: "k8s.io/test-infra/prow/repoowners.loadAliasesFrom"   
  level: "error"   
  msg: "Failed to unmarshal aliases from "/tmp/git558231759/OWNERS_ALIASES". Using empty alias map."   
  org: "kubernetes-sigs"   
  repo: "blobfuse-csi-driver"   
}

Here is an example of a valid OWNERS_ALIASES file: https://github.com/kubernetes/test-infra/blob/master/OWNERS_ALIASES

  2. The OWNERS file only includes @andyzhangx, which makes it impossible to approve a PR authored by @andyzhangx. e.g. this PR (#82) had to be merged manually instead of letting Tide merge it, as is policy.
{
  author: "andyzhangx"   
  component: "hook"   
  event-GUID: "e3731080-20c0-11ea-9a5f-9d203b6a99db"   
  event-type: "pull_request"   
  file: "prow/plugins/blunderbuss/blunderbuss.go:238"   
  func: "k8s.io/test-infra/prow/plugins/blunderbuss.handle"   
  level: "warning"   
  msg: "Not enough reviewers found in OWNERS files for files touched by this PR. 0/2 reviewers found."   
  org: "kubernetes-sigs"   
  plugin: "blunderbuss"   
  pr: 82   
  repo: "blobfuse-csi-driver"   
  url: "https://github.com/kubernetes-sigs/blobfuse-csi-driver/pull/82"   
}

These are both causing Prow errors that are visible to oncall, so please address them at your earliest convenience.
/assign @andyzhangx

fix sanity test failures

What happened:
Follow https://github.com/csi-driver/blobfuse-csi-driver/tree/master/test/sanity and you will get the following sanity test failures:

Summarizing 6 Failures:

[Fail] Controller Service CreateVolume [It] should not fail when requesting to create a volume with already existing name and same capacity.
/root/go/src/github.com/csi-driver/blobfuse-csi-driver/vendor/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:371

[Fail] Controller Service CreateVolume [It] should fail when requesting to create a volume with already existing name and different capacity.
/root/go/src/github.com/csi-driver/blobfuse-csi-driver/vendor/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:413

[Fail] Controller Service CreateVolume [It] should not fail when creating volume with maximum-length name
/root/go/src/github.com/csi-driver/blobfuse-csi-driver/vendor/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:491

[Fail] Controller Service DeleteVolume [It] should return appropriate values (no optional values added)
/root/go/src/github.com/csi-driver/blobfuse-csi-driver/vendor/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:570

[Fail] Controller Service ValidateVolumeCapabilities [It] should return appropriate values (no optional values added)
/root/go/src/github.com/csi-driver/blobfuse-csi-driver/vendor/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:642

[Fail] Node Service [It] should work
/root/go/src/github.com/csi-driver/blobfuse-csi-driver/vendor/github.com/kubernetes-csi/csi-test/pkg/sanity/node.go:375

Ran 26 of 57 Specs in 1.100 seconds
FAIL! -- 20 Passed | 6 Failed | 0 Pending | 31 Skipped
--- FAIL: TestSanity (1.11s)


check whether static provisioning works


VHD disk feature on blobfuse

Describe the solution you'd like in detail

Create a VHD disk in the blob storage container directly, then blobfuse-mount it to the local VM; this way disk attach/detach would cost less than 1s.


add negative mountOptions e2e test

Describe the solution you'd like in detail

add mountOptions e2e test including:


support one PV mounted by multiple pods on one node

Describe the solution you'd like in detail

The current design is that every pod using a blobfuse volume does its own blobfuse mount on the node, which wastes resources when many pods on the same node use the same blobfuse volume. We could improve this in the next release; it looks like the code change is small.

// NodeStageVolume mount the volume to a staging path
// todo: we may implement this for blobfuse
// The reason that mounting is a two step operation is
// because Kubernetes allows you to use a single volume by multiple pods.
// This is allowed when the storage system supports it or if all pods run on the same node.

https://github.com/kubernetes-sigs/blobfuse-csi-driver/blob/7e65a2b49dc8ecd10995304ca95953fddadf5f90/pkg/blobfuse/nodeserver.go#L161-L175
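The two-step flow described in that comment could be sketched as follows: stage once per volume, then bind-mount per pod. The `mounter` interface and function names here are illustrative, not the driver's real types:

```go
package main

import "fmt"

// mounter abstracts the mount operation so the flow can be shown without
// root privileges; the interface is illustrative, not the driver's real type.
type mounter interface {
	Mount(source, target, fstype string, options []string) error
}

// nodeStage performs the single blobfuse mount at the per-volume staging path.
func nodeStage(m mounter, container, stagingPath string) error {
	return m.Mount(container, stagingPath, "blobfuse", nil)
}

// nodePublish bind-mounts the staging path into each pod's target path,
// so N pods on the node share one fuse mount instead of N.
func nodePublish(m mounter, stagingPath, targetPath string) error {
	return m.Mount(stagingPath, targetPath, "", []string{"bind"})
}

// logMounter records mount calls instead of performing them.
type logMounter struct{ calls []string }

func (l *logMounter) Mount(src, tgt, fs string, opts []string) error {
	l.calls = append(l.calls, fmt.Sprintf("%s -> %s %v", src, tgt, opts))
	return nil
}

func main() {
	m := &logMounter{}
	nodeStage(m, "container1", "/staging/vol1")
	nodePublish(m, "/staging/vol1", "/pods/a/vol1")
	nodePublish(m, "/staging/vol1", "/pods/b/vol1")
	for _, c := range m.calls {
		fmt.Println(c)
	}
}
```

With this split, two pods on one node trigger one blobfuse mount and two cheap bind mounts, which is the resource saving the issue asks for.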


volume resize support

Describe the solution you'd like in detail

An Azure storage container supports arbitrary sizes, so this driver should support volume resize.


a few issues in keyvault scenario

What happened:

  1. I got the following error when trying to follow the guide: https://github.com/csi-driver/blobfuse-csi-driver/blob/master/deploy/example/keyvault/README.md
kubelet, k8s-agentpool-32686255-0  MountVolume.SetUp failed for volume "pv-blobfuse" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Unknown desc = failed to use vaultURL(/subscriptions/b9d2281e-dcd5-4dfd-9a97-0d50377cdf76/resourcegroups/andy-mg1160alpha3/providers/Microsoft.KeyVault/vaults/andytestx), sercretName(sastoken), secretVersion() to get secret: keyvault.BaseClient#GetSecret: Failure preparing request: StatusCode=0 -- Original Error: autorest: No scheme detected in URL /subscriptions/xxx/resourcegroups/andy-mg1160alpha3/providers/Microsoft.KeyVault/vaults/andytestx
  2. The following secret config is not necessary:
    https://github.com/csi-driver/blobfuse-csi-driver/blob/f554f91d2ed1c09de682e106565a99961067f11e/deploy/example/keyvault/pv-blobfuse-csi-keyvault.yaml#L21-L23

My config is like this:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-blobfuse
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain #If set as "Delete" container would be removed after pvc deletion
  csi:
    driver: blobfuse.csi.azure.com
    readOnly: false
    volumeHandle: arbitrary-volumeid
    volumeAttributes:
      containerName: public
      storageAccountName: fde1626e2ba5e11e999ae36
      keyVaultURL: /subscriptions/xxx/resourcegroups/andy-mg1160alpha3/providers/Microsoft.KeyVault/vaults/andytestx
      keyVaultSecretName: sastoken


/assign @ZeroMagic

[test] set e2e test CI


restarting the csi-blobfuse-node daemonset makes current blobfuse mounts unavailable

What happened:

  1. install blobfuse csi driver and run a nginx-blobfuse pod example
  2. kubectl delete po csi-blobfuse-node-8ttf5 -n kube-system would make current blobfuse mount inaccessible
  • workaround
    delete current nginx-blobfuse pod and create a new nginx-blobfuse pod
$ kubectl exec -it nginx-blobfuse bash
root@nginx-blobfuse:/# df -h
df: /mnt/blobfuse: Transport endpoint is not connected
Filesystem      Size  Used Avail Use% Mounted on
overlay          29G   15G   15G  50% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.4G     0  3.4G   0% /sys/fs/cgroup
/dev/sda1        29G   15G   15G  50% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           3.4G   12K  3.4G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           3.4G     0  3.4G   0% /proc/acpi
tmpfs           3.4G     0  3.4G   0% /proc/scsi
tmpfs           3.4G     0  3.4G   0% /sys/firmware


$ mount | grep blobfuse
blobfuse on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0433847e-03fd-422f-b053-5534510eb338/globalmount type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_read=131072)
blobfuse on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0433847e-03fd-422f-b053-5534510eb338/globalmount type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_read=131072)
blobfuse on /var/lib/kubelet/pods/f5f56d79-553e-416d-a852-4ef8224e6422/volumes/kubernetes.io~csi/pvc-0433847e-03fd-422f-b053-5534510eb338/mount type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_read=131072)
blobfuse on /var/lib/kubelet/pods/f5f56d79-553e-416d-a852-4ef8224e6422/volumes/kubernetes.io~csi/pvc-0433847e-03fd-422f-b053-5534510eb338/mount type fuse (rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_read=131072)
azureuser@k8s-agentpool-10150444-0:~$ sudo ls /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0433847e-03fd-422f-b053-5534510eb338/globalmount
ls: cannot access '/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-0433847e-03fd-422f-b053-5534510eb338/globalmount': Transport endpoint is not connected

Environment:

  • CSI Driver version: v0.5.0

Set priority class to system critical

Describe the solution you'd like in detail

add priorityClassName: system-cluster-critical and priorityClassName: system-node-critical to controller and node YAMLs respectively

Make the driver high priority and less likely to be evicted


Support customer specified MSI endpoint


add retry for DeleteVolume

What happened:
The DeleteVolume operation may fail intermittently; we need to add a retry for this operation:
https://github.com/csi-driver/blobfuse-csi-driver/blob/6cf17ce8561c7d4f4a634ac6336db3e18cd766bb/pkg/blobfuse/controllerserver.go#L160

I0709 06:45:43.898351    9057 utils.go:112] GRPC call: /csi.v1.Controller/DeleteVolume
I0709 06:45:43.898380    9057 utils.go:113] GRPC request: volume_id:"andy-ci#blobfuseci#citest-1562654687" 
I0709 06:45:44.015941    9057 controllerserver.go:152] deleting container(citest-1562654687) rg(andy-ci) account(blobfuseci) volumeID(andy-ci#blobfuseci#citest-1562654687)
E0709 06:45:44.053696    9057 utils.go:117] GRPC error: failed to delete container(citest-1562654687) on account(blobfuseci), error: storage: service returned error: StatusCode=409, ErrorCode=ContainerBeingDeleted, ErrorMessage=The specified container is being deleted. Try operation later.
RequestId:9f31f89a-301e-001f-4d21-3670be000000
Time:2019-07-09T06:45:44.0363876Z, RequestInitiated=Tue, 09 Jul 2019 06:45:43 GMT, RequestId=9f31f89a-301e-001f-4d21-3670be000000, API Version=2016-05-31, QueryParameterName=, QueryParameterValue=
failed to delete container(citest-1562654687) on account(blobfuseci), error: storage: service returned error: StatusCode=409, ErrorCode=ContainerBeingDeleted, ErrorMessage=The specified container is being deleted. Try operation later.
RequestId:9f31f89a-301e-001f-4d21-3670be000000
Time:2019-07-09T06:45:44.0363876Z, RequestInitiated=Tue, 09 Jul 2019 06:45:43 GMT, RequestId=9f31f89a-301e-001f-4d21-3670be000000, API Version=2016-05-31, QueryParameterName=, QueryParameterValue=
Please use -h,--help for more information
Makefile:33: recipe for target 'integration-test' failed
make: *** [integration-test] Error 2

Anything else we need to know?:
The implementation could refer to https://github.com/kubernetes/kubernetes/blob/c6eb9a8ed51f5c63cb351e2a4c13494bf5c303a2/staging/src/k8s.io/legacy-cloud-providers/azure/azure_backoff.go#L427-L435, and since it is a 409 error, it should be retried:
https://github.com/kubernetes/kubernetes/blob/262e59b2c04bf71ec4e56fdc2f3a91a5878336f4/staging/src/k8s.io/legacy-cloud-providers/azure/azure_backoff.go#L611

waiting for PR: kubernetes/kubernetes#79981 merged


secret value should not be exposed in logs

What happened:
Currently secret values are exposed in logs, and they should not be:

I0421 02:40:19.501807       1 utils.go:112] GRPC response: capabilities:<rpc:<> >
I0421 02:40:19.506657       1 utils.go:106] GRPC call: /csi.v1.Node/NodePublishVolume
I0421 02:40:19.506674       1 utils.go:107] GRPC request: volume_id:"arbitrary-volumeid" target_path:"/var/lib/kubelet/pods/cc9552fd-63de-11e9-8b22-000d3a004ffb/volumes/kubernetes.io~csi/pv-blobfuse/mount" volume_capability:<mount:<> access_mode:<mode:MULTI_NODE_MULTI_WRITER > > secrets:<key:"accountkey" value:"xxx" > secrets:<key:"accountname" value:"andyacidiag" > volume_context:<key:"containerName" value:"test" > volume_context:<key:"resourceGroup" value:"andy-aci" > volume_context:<key:"storageAccount" value:"andyacidiag" >

We could disable this logging entirely, but I would rather filter out only the secret values; the rest of the request info is useful for debugging:
https://github.com/csi-driver/blobfuse-csi-driver/blob/e7b3a4c2dc4f54a3d033ec55dc282613f856225f/pkg/csi-common/utils.go#L107
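One way to filter only the secret values, assuming the request is logged as the textual protobuf shown above (the regex and helper name are a sketch, not the driver's actual implementation):

```go
package main

import (
	"fmt"
	"regexp"
)

// secretRE matches the value field of each secrets:<key:"..." value:"...">
// entry in the logged GRPC request string; only that value is masked, so the
// rest of the request stays available for debugging.
var secretRE = regexp.MustCompile(`(secrets:<key:"[^"]*" value:")[^"]*(")`)

func stripSecrets(req string) string {
	return secretRE.ReplaceAllString(req, `${1}****${2}`)
}

func main() {
	req := `volume_id:"vol" secrets:<key:"accountkey" value:"supersecret" > volume_context:<key:"containerName" value:"test" >`
	fmt.Println(stripSecrets(req))
}
```

This masks the accountkey value while the volume_id and volume_context fields log unchanged.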


add Pod Identity support

Describe the solution you'd like in detail

add support for Pod Identity to enable finer-grained identity scoping as an alternative to using the cluster's identity. We do see cases where Azure resources are closely associated with the cluster, but we also see customers partitioning AKS clusters using namespaces, and in those cases it's more likely that RBAC grants to resources like storage/key vault would be to Managed Identities scoped to a Kubernetes namespace.


MSI support

Describe the solution you'd like in detail

https://github.com/Azure/azure-storage-fuse#config-file-options says MSI auth is only supported from 1.2.0, while the current latest release is 1.1.1:

Managed Identity auth: (Only available for 1.2.0 or above)

identityClientId: If a MI endpoint is specified, this is the only parameter used, in the form of the Secret header. Only one of these three parameters are needed if multiple identities are present on the system.
identityObjectId: Only one of these three parameters are needed if multiple identities are present on the system.
identityResourceId: Only one of these three parameters are needed if multiple identities are present on the system.
msiEndpoint: Specifies a custom managed identity endpoint, as IMDS may not be available under some scenarios. Uses the identityClientId parameter as the Secret header.


Storageclass credentials with statefulset?

How can I pass Azure storage account credentials for a storageclass if I want to use a statefulset and volumeClaimTemplates? (https://github.com/kubernetes-sigs/blobfuse-csi-driver/blob/master/deploy/example/statefulset.yaml)
Thanks

Is the /etc/kubernetes/azure.json file the only way?

EDIT:
Solved with:

az role assignment create --assignee $(az aks show -g myRG -n myAKS --query servicePrincipalProfile.clientId -o tsv) --role "Storage Account Contributor" --scope mySA

and

kubectl.exe rollout restart daemonset csi-blobfuse-node -n kube-system
kubectl.exe rollout restart deployment csi-blobfuse-controller -n kube-system

^ not sure if it's needed to restart both.

Creating a blobfuse volume with a new storage account leads to an error

What happened:
Creating a blobfuse volume with a new storage account leads to an error:

# k describe pvc persistent-storage-statefulset-blobfuse-0
Name:          persistent-storage-statefulset-blobfuse-0
Namespace:     default
StorageClass:  blobfuse.csi.azure.com
Status:        Pending
Volume:
Labels:        app=nginx
Annotations:   volume.beta.kubernetes.io/storage-class: blobfuse.csi.azure.com
               volume.beta.kubernetes.io/storage-provisioner: blobfuse.csi.azure.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode:    Filesystem
Mounted By:    statefulset-blobfuse-0
Events:
  Type     Reason                 Age                From                                                                                                  Message
  ----     ------                 ----               ----                                                                                                  -------
  Warning  ProvisioningFailed     16s                blobfuse.csi.azure.com_csi-blobfuse-controller-64d6bfc79d-vf7w4_f4a0b74f-045f-447c-adcb-b35a614c1580  failed to provision volume with StorageClass "blobfuse.csi.azure.com": rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Normal   ExternalProvisioning   11s (x3 over 26s)  persistentvolume-controller                                                                           waiting for a volume to be created, either by external provisioner "blobfuse.csi.azure.com" or manually created by system administrator
  Warning  ProvisioningFailed     8s (x3 over 15s)   blobfuse.csi.azure.com_csi-blobfuse-controller-64d6bfc79d-vf7w4_f4a0b74f-045f-447c-adcb-b35a614c1580  failed to provision volume with StorageClass "blobfuse.csi.azure.com": rpc error: code = Unknown desc = could not get storage key for storage account : could not get storage key for storage account fuse372e1d799fb14a6ca5e: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: {"error":{"code":"StorageAccountIsNotProvisioned","message":"The storage account provisioning state must be 'Succeeded' before executing the operation."}}
  Normal   Provisioning           0s (x5 over 26s)   blobfuse.csi.azure.com_csi-blobfuse-controller-64d6bfc79d-vf7w4_f4a0b74f-045f-447c-adcb-b35a614c1580  External provisioner is provisioning volume for claim "default/persistent-storage-statefulset-blobfuse-0"
  Normal   ProvisioningSucceeded  0s                 blobfuse.csi.azure.com_csi-blobfuse-controller-64d6bfc79d-vf7w4_f4a0b74f-045f-447c-adcb-b35a614c1580  Successfully provisioned volume pvc-704d37a8-862b-11ea-a0db-8e7fc28d9e05
I0424 12:59:34.797516       1 utils.go:112] GRPC call: /csi.v1.Controller/CreateVolume
I0424 12:59:34.797540       1 utils.go:113] GRPC request: name:"pvc-704d37a8-862b-11ea-a0db-8e7fc28d9e05" capacity_range:<required_bytes:107374182400 > volume_capabilities:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > parameters:<key:"skuName" value:"Standard_LRS" >
I0424 12:59:34.892021       1 utils.go:112] GRPC call: /csi.v1.Identity/Probe
I0424 12:59:34.892170       1 utils.go:113] GRPC request:
I0424 12:59:34.892216       1 utils.go:119] GRPC response: ready:<value:true >
I0424 12:59:35.210625       1 azure_storageaccount.go:121] azure - no matching account found, begin to create a new account fuse372e1d799fb14a6ca5e in resource group mc_andy-msi1148_andy-msi1148_uksouth, location: uksouth, accountType: Standard_LRS, accountKind: StorageV2
I0424 12:59:45.804757       1 utils.go:112] GRPC call: /csi.v1.Controller/CreateVolume
I0424 12:59:45.804788       1 utils.go:113] GRPC request: name:"pvc-704d37a8-862b-11ea-a0db-8e7fc28d9e05" capacity_range:<required_bytes:107374182400 > volume_capabilities:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > parameters:<key:"skuName" value:"Standard_LRS" >
I0424 12:59:45.850156       1 azure_storageaccount.go:103] found a matching account fuse372e1d799fb14a6ca5e type Standard_LRS location uksouth
I0424 12:59:45.903897       1 azure_storageaccountclient.go:185] Received error in storageaccount.listkeys.request: resourceID: /subscriptions/b9d2281e-dcd5-4dfd-9a97-0d50377cdf76/resourceGroups/mc_andy-msi1148_andy-msi1148_uksouth/providers/Microsoft.Storage/storageAccounts/fuse372e1d799fb14a6ca5e, error: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: {"error":{"code":"StorageAccountIsNotProvisioned","message":"The storage account provisioning state must be 'Succeeded' before executing the operation."}}
E0424 12:59:45.903948       1 utils.go:117] GRPC error: could not get storage key for storage account : could not get storage key for storage account fuse372e1d799fb14a6ca5e: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: {"error":{"code":"StorageAccountIsNotProvisioned","message":"The storage account provisioning state must be 'Succeeded' before executing the operation."}}
I0424 12:59:47.919377       1 utils.go:112] GRPC call: /csi.v1.Controller/CreateVolume
I0424 12:59:47.919393       1 utils.go:113] GRPC request: name:"pvc-704d37a8-862b-11ea-a0db-8e7fc28d9e05" capacity_range:<required_bytes:107374182400 > volume_capabilities:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > parameters:<key:"skuName" value:"Standard_LRS" >
I0424 12:59:47.961418       1 azure_storageaccount.go:103] found a matching account fuse372e1d799fb14a6ca5e type Standard_LRS location uksouth
I0424 12:59:48.015122       1 azure_storageaccountclient.go:185] Received error in storageaccount.listkeys.request: resourceID: /subscriptions/b9d2281e-dcd5-4dfd-9a97-0d50377cdf76/resourceGroups/mc_andy-msi1148_andy-msi1148_uksouth/providers/Microsoft.Storage/storageAccounts/fuse372e1d799fb14a6ca5e, error: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: {"error":{"code":"StorageAccountIsNotProvisioned","message":"The storage account provisioning state must be 'Succeeded' before executing the operation."}}
E0424 12:59:48.015166       1 utils.go:117] GRPC error: could not get storage key for storage account : could not get storage key for storage account fuse372e1d799fb14a6ca5e: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: Retriable: false, RetryAfter: 0s, HTTPStatusCode: 409, RawError: {"error":{"code":"StorageAccountIsNotProvisioned","message":"The storage account provisioning state must be 'Succeeded' before executing the operation."}}
I0424 12:59:52.025504       1 utils.go:112] GRPC call: /csi.v1.Controller/CreateVolume
I0424 12:59:52.025526       1 utils.go:113] GRPC request: name:"pvc-704d37a8-862b-11ea-a0db-8e7fc28d9e05" capacity_range:<required_bytes:107374182400 > volume_capabilities:<mount:<fs_type:"ext4" > access_mode:<mode:SINGLE_NODE_WRITER > > parameters:<key:"skuName" value:"Standard_LRS" >

What you expected to happen:

How to reproduce it:

Anything else we need to know?:

Environment:

  • CSI Driver version:
  • Kubernetes version (use kubectl version):
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

switch azure cloud provider to 1.14

Is your feature request related to a problem?/Why is this needed

Describe the solution you'd like in detail

Describe alternatives you've considered

Additional context

/assign

Possible CA problem in blobfuse

Sorry to post this here, but from my work computer I cannot reach any .io domain.

Server OS: Red Hat Enterprise Linux Server release 7.6 (Maipo)
blobfuse version: blobfuse.x86_64 1.0.2-1

In our configuration we are not connecting directly from our server to the Azure storage. We are going through a proxy/load balancer that intercepts the traffic and then forwards it to Azure. When we try to connect we get the following entries for blobfuse in /var/log/messages:

Aug 12 15:02:55 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 1 time with errno = -1352207232.
Aug 12 15:10:06 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 2 time with errno = 1932808281.
Aug 12 15:17:39 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 3 time with errno = 1932808281.
Aug 12 15:24:33 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 4 time with errno = 1932808281.
Aug 12 15:31:44 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 5 time with errno = 1932808281.
Aug 12 15:38:36 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 6 time with errno = 1932808281.
Aug 12 15:46:16 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 7 time with errno = 1932808281.
Aug 12 15:53:09 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 8 time with errno = 1932808281.
Aug 12 15:59:49 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 9 time with errno = 1932808281.
Aug 12 16:06:40 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 10 time with errno = 1932808281.
Aug 12 16:14:04 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 11 time with errno = 1932808281.
Aug 12 16:20:49 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 12 time with errno = 64.
Aug 12 16:27:25 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 13 time with errno = 64.
Aug 12 16:34:15 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 14 time with errno = 64.
Aug 12 16:40:59 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 15 time with errno = 64.
Aug 12 16:47:38 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 16 time with errno = 64.
Aug 12 16:54:05 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 17 time with errno = 64.
Aug 12 17:01:14 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 18 time with errno = 64.
Aug 12 17:07:57 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 19 time with errno = 64.
Aug 12 17:15:41 radfw6 blobfuse[60122]: list_blobs_hierarchical failed for the 20 time with errno = 64.
Aug 12 17:15:41 radfw6 blobfuse[60122]: Failed to list blobs under directory /mnt/ramdisk/root/ on the service during readdir operation. errno = 64.

When we do a network capture we see the following error:
10 0.222327 ww.xx.60.101 yy.zz.95.130 TLSv1.2 75 Alert (Level: Fatal, Description: Unknown CA)

After that line we see a connection reset.

I tried to look up the errors from the syslog, but did not have much luck. The error in the network capture seems to indicate that blobfuse does not trust the certificate authority that signed the certificate on the proxy. blobfuse is written in C++ and uses libcurl, which in turn uses OpenSSL, so I added the CA information to the OpenSSL certs directory; I can now verify the proxy certificates with OpenSSL, but I still can't get blobfuse to connect. So my questions are:

  • Is this just a problem with the Certificate Authority that signed the certificate?
  • Where does blobfuse store certificate information?
  • How do I add additional Certificate Authorities to blobfuse?

Any help would be greatly appreciated.

Creating one blobfuse PVC sometimes results in two containers being created

What happened:
With the blobfuse CSI storage class below:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: blobfuse.csi.azure.com
provisioner: blobfuse.csi.azure.com
parameters:
  skuName: Standard_LRS  #available values: Standard_LRS, Standard_GRS, Standard_RAGRS
reclaimPolicy: Retain #If set as "Delete" container would be removed after pvc deletion
volumeBindingMode: Immediate

Per the logs below, the first CreateVolume request takes about 18s (including creating a new storage account), which causes the external provisioner to time out and issue a second CreateVolume request for the same volume name.

I0423 07:20:45.195925       1 utils.go:106] GRPC call: /csi.v1.Controller/CreateVolume
I0423 07:20:45.195942       1 utils.go:107] GRPC request: name:"pvc-4a92e455-6598-11e9-9989-0e3cd97702b7" capacity_range:<required_bytes:10737418240 > volume_capabilities:<mount:<fs_type:"ext4" mount_flags:"--file-cache-timeout-in-seconds=120" mount_flags:"--use-https=true" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > parameters:<key:"skuName" value:"Standard_LRS" >
I0423 07:20:45.606774       1 azure_storageaccount.go:121] azure - no matching account found, begin to create a new account fuse4f945a18659811e99ef in resource group mc_andy-aks1135_andy-aks1135_eastus2, location: eastus2, accountType: Standard_LRS, accountKind: StorageV2
I0423 07:21:01.625287       1 utils.go:106] GRPC call: /csi.v1.Identity/GetPluginInfo
I0423 07:21:01.625305       1 utils.go:107] GRPC request:
I0423 07:21:01.625391       1 identityserver.go:32] Using default GetPluginInfo
I0423 07:21:01.625396       1 utils.go:112] GRPC response: name:"blobfuse.csi.azure.com" vendor_version:"v0.1.0-alpha"
I0423 07:21:01.626609       1 utils.go:106] GRPC call: /csi.v1.Identity/Probe
I0423 07:21:01.626619       1 utils.go:107] GRPC request:
I0423 07:21:01.626642       1 utils.go:112] GRPC response: ready:<value:true >
I0423 07:21:03.478535       1 controllerserver.go:98] begin to create container(pvc-fuse-dynamic-5a3b5ddd-6598-11e9-9ef8-000d3a0400a1) on account(fuse4f945a18659811e99ef) type(Standard_LRS) rg(mc_andy-aks1135_andy-aks1135_eastus2) location() size(10)
I0423 07:21:03.598393       1 controllerserver.go:119] create container pvc-fuse-dynamic-5a3b5ddd-6598-11e9-9ef8-000d3a0400a1 on storage account fuse4f945a18659811e99ef successfully
I0423 07:21:03.598410       1 utils.go:112] GRPC response: volume:<capacity_bytes:10737418240 volume_id:"mc_andy-aks1135_andy-aks1135_eastus2#fuse4f945a18659811e99ef#pvc-fuse-dynamic-5a3b5ddd-6598-11e9-9ef8-000d3a0400a1" volume_context:<key:"skuName" value:"Standard_LRS" > >
I0423 07:21:05.195464       1 utils.go:106] GRPC call: /csi.v1.Controller/CreateVolume
I0423 07:21:05.195499       1 utils.go:107] GRPC request: name:"pvc-4a92e455-6598-11e9-9989-0e3cd97702b7" capacity_range:<required_bytes:10737418240 > volume_capabilities:<mount:<fs_type:"ext4" mount_flags:"--file-cache-timeout-in-seconds=120" mount_flags:"--use-https=true" > access_mode:<mode:MULTI_NODE_MULTI_WRITER > > parameters:<key:"skuName" value:"Standard_LRS" >
I0423 07:21:05.238612       1 azure_storageaccount.go:103] found a matching account fuse4f945a18659811e99ef type Standard_LRS location eastus2
I0423 07:21:05.270246       1 controllerserver.go:98] begin to create container(pvc-fuse-dynamic-5b4cc2b5-6598-11e9-9ef8-000d3a0400a1) on account(fuse4f945a18659811e99ef) type(Standard_LRS) rg(mc_andy-aks1135_andy-aks1135_eastus2) location() size(10)
I0423 07:21:05.276393       1 controllerserver.go:119] create container pvc-fuse-dynamic-5b4cc2b5-6598-11e9-9ef8-000d3a0400a1 on storage account fuse4f945a18659811e99ef successfully

fix solution

  • option#1
  1. search for an existing container by name when containerName is empty, instead of always generating a new one
  2. take a per-volume lock at the start of CreateVolume and release it when CreateVolume returns
  • option#2
  1. derive a unique, deterministic container name from each CreateVolume request, so a retried request maps to the same container

Workaround
Use the storage class below with containerName specified:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: blobfuse.csi.azure.com
provisioner: blobfuse.csi.azure.com
parameters:
  skuName: Standard_LRS  #available values: Standard_LRS, Standard_GRS, Standard_RAGRS
  containerName: test #if container "test" does not exist, the driver will create it
reclaimPolicy: Retain #If set as "Delete" container would be removed after pvc deletion
volumeBindingMode: Immediate

BTW, we also need to check whether the Azure File CSI driver has the same issue.

What you expected to happen:

How to reproduce it:

Anything else we need to know?:

Environment:

  • CSI Driver version:
  • Kubernetes version (use kubectl version):
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

/assign
