Comments (10)

gman0 commented on August 14, 2024

@afgane, you can follow the discussion of this issue on the CVMFS forum: https://cernvm-forum.cern.ch/t/cannot-mount-malformed-url/292

from cvmfs-csi.

gman0 commented on August 14, 2024

Hi @afgane, thanks for reporting. Would it be possible for you to build the driver from master and deploy that? For convenience I've pre-built the image and chart if you want to use that:

  • Image: registry.cern.ch/rvasek/cvmfs-csi:master-869e661f
  • Chart:
    • helm repo add rvasek-cern https://registry.cern.ch/chartrepo/rvasek
    • helm repo update
    • helm upgrade --install <release name> rvasek-cern/cvmfs-csi --version v2.1.1-master-869e661f

Please note that upgrading between v2 versions of the driver still requires you to perform step 4 (and only step 4) described in the upgrading docs.

After deploying, the node plugin Pods run a container named automount that provides more details about the autofs/cvmfs mounts, which would help in troubleshooting this further (as you noted, the current latest release doesn't offer these logs).

"If it's helpful for debugging purposes, we can also give access to one of our temp dev clusters where the issue has occurred."

Let's try getting the logs from the automount container first and then we can see how to continue. This would definitely help though, thanks!
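
For reference, a minimal sketch of pulling those logs with kubectl. The container name automount comes from the comment above; the pod name and the namespace are placeholders (assumptions) that depend on how the chart was installed.

```shell
#!/bin/sh
# Print the logs of the "automount" container of one node plugin pod.
#   $1: node plugin pod name (placeholder; look it up with `kubectl get pods`)
#   $2: namespace the driver was installed into (assumed "default" if omitted)
automount_logs() {
    pod="$1"
    ns="${2:-default}"
    kubectl logs "$pod" --container automount --namespace "$ns"
}

# Usage (hypothetical pod name and namespace):
#   automount_logs cvmfs-csi-nodeplugin-abcde kube-system > automount.log
```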

afgane commented on August 14, 2024

Thank you for the suggestion and packaging the dev version of the chart. We've deployed it and are now seeing the automount log messages. I've posted the entire log in this gist, but here is the seemingly most relevant part:

I0301 19:34:40.002312 32376 automount.go:142] automount[32436]: >> (catalog) Initialize catalog [03-01-2023 19:34:39 UTC]
I0301 19:34:40.002443 32376 automount.go:142] automount[32436]: >> (cache) unable to read local checksum [03-01-2023 19:34:39 UTC]
I0301 19:34:40.003479 32376 automount.go:142] automount[32436]: >> (download) escaped /.cvmfspublished to /.cvmfspublished [03-01-2023 19:34:39 UTC]
I0301 19:34:40.003488 32376 automount.go:142] automount[32436]: >> (download) Verify downloaded url /.cvmfspublished, proxy DIRECT (curl error 3) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.003492 32376 automount.go:142] automount[32436]: >> (download) download failed (error 2 - malformed URL) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.003508 32376 automount.go:142] automount[32436]: >> (cvmfs) failed to download repository manifest (2 - malformed URL) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.003822 32376 automount.go:142] automount[32436]: >> (cache) failed to fetch manifest (1 - failed to download) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.004095 32376 automount.go:142] automount[32436]: >> (cache) miss /cvmfs-aliencache/00/00000000000000000000000000000000000000 (-2) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.004551 32376 automount.go:142] automount[32436]: >> (cache) miss /cvmfs-aliencache/00/00000000000000000000000000000000000000 (-2) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.004936 32376 automount.go:142] automount[32436]: >> (cache) downloading file catalog at data.galaxyproject.org:/ [03-01-2023 19:34:39 UTC]
I0301 19:34:40.005001 32376 automount.go:142] automount[32436]: >> (cache) start transaction on /cvmfs-aliencache/txn/fetchW5PkFn has result 15 [03-01-2023 19:34:39 UTC]
I0301 19:34:40.005538 32376 automount.go:142] automount[32436]: >> (cache) miss: file catalog at data.galaxyproject.org:/ /data/00/00000000000000000000000000000000000000C [03-01-2023 19:34:39 UTC]
I0301 19:34:40.006020 32376 automount.go:142] automount[32436]: >> (download) escaped /data/00/00000000000000000000000000000000000000C to /data/00/00000000000000000000000000000000000000C [03-01-2023 19:34:39 UTC]
I0301 19:34:40.006561 32376 automount.go:142] automount[32436]: >> (download) Verify downloaded url /data/00/00000000000000000000000000000000000000C, proxy DIRECT (curl error 3) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.006708 32376 automount.go:142] automount[32436]: >> (download) download failed (error 2 - malformed URL) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.007249 32376 automount.go:142] automount[32436]: >> (cache) failed to fetch file catalog at data.galaxyproject.org:/ (hash: 0000000000000000000000000000000000000000, error 2 [malformed URL]) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.007555 32376 automount.go:142] automount[32436]: >> (cache) abort /cvmfs-aliencache/txn/fetchW5PkFn [03-01-2023 19:34:39 UTC]
I0301 19:34:40.007955 32376 automount.go:142] automount[32436]: >> (catalog) failed to load catalog '' (3 - failed to load catalog) [03-01-2023 19:34:39 UTC]
I0301 19:34:40.007994 32376 automount.go:142] automount[32436]: >> (catalog) failed to initialize root catalog [03-01-2023 19:34:39 UTC]
I0301 19:34:40.008228 32376 automount.go:142] automount[32436]: >> Failed to initialize root file catalog (16 - file catalog failure)
I0301 19:34:40.008511 32376 automount.go:142] automount[32436]: >> (cache) unpinning / unloading all catalogs [03-01-2023 19:34:39 UTC]
I0301 19:34:40.009217 32376 automount.go:142] automount[32436]: mount(generic): failed to mount data.galaxyproject.org (type cvmfs) on /cvmfs/data.galaxyproject.org
I0301 19:34:40.009381 32376 automount.go:142] automount[32436]: failed to mount /cvmfs/data.galaxyproject.org

This is the extraConfigMaps entry we're using:

extraConfigMaps:
    cvmfs-csi-config-d:
      data.galaxyproject.org.conf: |
        CVMFS_SERVER_URL="http://cvmfs1-iu0.galaxyproject.org/cvmfs/@fqrn@;http://cvmfs1-tacc0.galaxyproject.org/cvmfs/@fqrn@;http://cvmfs1-psu0.galaxyproject.org/cvmfs/@fqrn@;http://cvmfs1-mel0.gvl.org.au/cvmfs/@fqrn@;http://cvmfs1-ufr0.galaxyproject.eu/cvmfs/@fqrn@"
        CVMFS_PUBLIC_KEY="/etc/cvmfs/config.d/data.galaxyproject.org.pub"
        CVMFS_HTTP_PROXY=DIRECT
      data.galaxyproject.org.pub: |
        -----BEGIN PUBLIC KEY-----
        MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA5LHQuKWzcX5iBbCGsXGt
        6CRi9+a9cKZG4UlX/lJukEJ+3dSxVDWJs88PSdLk+E25494oU56hB8YeVq+W8AQE
        3LWx2K2ruRjEAI2o8sRgs/IbafjZ7cBuERzqj3Tn5qUIBFoKUMWMSIiWTQe2Sfnj
        GzfDoswr5TTk7aH/FIXUjLnLGGCOzPtUC244IhHARzu86bWYxQJUw0/kZl5wVGcH
        maSgr39h1xPst0Vx1keJ95AH0wqxPbCcyBGtF1L6HQlLidmoIDqcCQpLsGJJEoOs
        NVNhhcb66OJHah5ppI1N3cZehdaKyr1XcF9eedwLFTvuiwTn6qMmttT/tHX7rcxT
        owIDAQAB
        -----END PUBLIC KEY-----

We also had to repeat the default contents of the cvmfs-csi-default-local ConfigMap and set CVMFS_HTTP_PROXY=DIRECT there, because without it we were seeing the default CERN proxy being used even though we had set the proxy for this particular file system.

Overall, there are a couple of mentions of malformed URL in the logs but I'm not sure which URL that is. As I mentioned initially, this same configuration works sometimes (although since upgrading to 2.1.1 and now the dev version, it's consistently happening when the first Job tries to mount the file system).
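
As a rough aid for reading such logs: curl error 3 is libcurl's CURLE_URL_MALFORMAT, and the "Verify downloaded url" lines in the excerpt print only the request path, so the excerpt alone does not show which server host was involved. The sketch below, whose function name failed_downloads is made up for illustration, pulls the path, proxy and curl error code out of those lines and skips downloads that report curl error 0.

```shell
#!/bin/sh
# Read automount log lines on stdin and print one summary line per
# "Verify downloaded url" entry whose curl error code is not 0.
failed_downloads() {
    sed -n \
        -e '/(curl error 0)/d' \
        -e 's/.*Verify downloaded url \([^,]*\), proxy \([^ ]*\) (curl error \([0-9]*\)).*/path=\1 proxy=\2 curl_error=\3/p'
}

# Usage, assuming the log was saved to automount.log:
#   failed_downloads < automount.log | sort | uniq -c
```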

afgane commented on August 14, 2024

Hi @gman0, just checking in to see if you had a chance to look at the logs I posted last week and if you have advice or ideas on what might be wrong?

gman0 commented on August 14, 2024

Hi @afgane. Could you please try cleaning the alien cache or re-deploying the driver without it for the moment?

https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#alien-cache

"It is safe to have the alien directory shared by multiple CernVM-FS processes and it is safe to unlink files from the alien cache directory anytime."
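
One possible way to do the cleaning, as a sketch only. It assumes the alien cache is a directory reachable from a shell; the logs earlier in the thread show /cvmfs-aliencache inside the container, but where that directory lives on your nodes or volume is deployment-specific. The function name clean_alien_cache is hypothetical. It removes regular files and leaves the directory layout in place.

```shell
#!/bin/sh
# Remove all regular files below the given alien cache directory.
#   $1: path to the alien cache directory (must exist)
clean_alien_cache() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # Delete files only; subdirectories are kept.
    find "$dir" -type f -exec rm -f {} +
}

# Usage (path taken from the log excerpt; adjust to your setup):
#   clean_alien_cache /cvmfs-aliencache
```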

afgane commented on August 14, 2024

I've redeployed it all on a new VM/cluster with the alien cache disabled, but we are still seeing the same outcome. I've posted the full automount log in the original gist as an additional file.

afgane commented on August 14, 2024

It's been a while since there has been any activity here or on the linked help forum, so I wanted to check in again: are there any updates from your side?

From our side, as linked in the associated PR just above, manually restarting the nodeplugin pods after all the app services have started seemed to make the initial issue go away. However, at seemingly random intervals the issue is now coming back. Restarting the nodeplugin pods once again makes the issue go away for a while but that's not a sustainable solution.
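
The restart workaround described here could be scripted roughly as follows. This is a sketch under assumptions: it assumes the node plugin runs as a DaemonSet, and the DaemonSet name and namespace are placeholders to look up in your cluster. It is a stopgap, not a fix.

```shell
#!/bin/sh
# Restart all node plugin pods and wait until the rollout has finished.
#   $1: name of the node plugin DaemonSet (placeholder, check `kubectl get daemonsets`)
#   $2: namespace the driver was installed into (assumed "default" if omitted)
restart_nodeplugin() {
    ds="$1"
    ns="${2:-default}"
    kubectl rollout restart "daemonset/$ds" --namespace "$ns" &&
        kubectl rollout status "daemonset/$ds" --namespace "$ns"
}

# Usage (hypothetical names):
#   restart_nodeplugin cvmfs-csi-nodeplugin kube-system
```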

gman0 commented on August 14, 2024

Hi @afgane, are you adding the cvmfs-csi-config-d ConfigMap into nodeplugin.extraVolumes and nodeplugin.automount.extraVolumeMounts?

Without doing that, I can confirm I'm seeing log output identical to yours.

I got it working with the following config:

# cvmfs.helm.yaml
# Create an extra "cvmfs-csi-config-d" ConfigMap and mount it in nodeplugin's automount container.

extraConfigMaps:
    cvmfs-csi-config-d:
      data.galaxyproject.org.conf: |
        CVMFS_SERVER_URL="http://cvmfs1-iu0.galaxyproject.org/cvmfs/@fqrn@;http://cvmfs1-tacc0.galaxyproject.org/cvmfs/@fqrn@;http://cvmfs1-psu0.galaxyproject.org/cvmfs/@fqrn@;http://cvmfs1-mel0.gvl.org.au/cvmfs/@fqrn@;http://cvmfs1-ufr0.galaxyproject.eu/cvmfs/@fqrn@"
        CVMFS_PUBLIC_KEY="/etc/cvmfs/config.d/data.galaxyproject.org.pub"
        CVMFS_HTTP_PROXY=DIRECT
      data.galaxyproject.org.pub: |
        -----BEGIN PUBLIC KEY-----
        MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA5LHQuKWzcX5iBbCGsXGt
        6CRi9+a9cKZG4UlX/lJukEJ+3dSxVDWJs88PSdLk+E25494oU56hB8YeVq+W8AQE
        3LWx2K2ruRjEAI2o8sRgs/IbafjZ7cBuERzqj3Tn5qUIBFoKUMWMSIiWTQe2Sfnj
        GzfDoswr5TTk7aH/FIXUjLnLGGCOzPtUC244IhHARzu86bWYxQJUw0/kZl5wVGcH
        maSgr39h1xPst0Vx1keJ95AH0wqxPbCcyBGtF1L6HQlLidmoIDqcCQpLsGJJEoOs
        NVNhhcb66OJHah5ppI1N3cZehdaKyr1XcF9eedwLFTvuiwTn6qMmttT/tHX7rcxT
        owIDAQAB
        -----END PUBLIC KEY-----

nodeplugin:
  extraVolumes:
  - name: etc-cvmfs-default-conf
    configMap:
      name: cvmfs-csi-default-local
  - name: etc-cvmfs-config-d
    configMap:
      name: cvmfs-csi-config-d

  automount:
    extraVolumeMounts:
    - name: etc-cvmfs-default-conf
      mountPath: /etc/cvmfs/default.local
      subPath: default.local
    - name: etc-cvmfs-config-d
      mountPath: /etc/cvmfs/config.d

helm upgrade --install <release name> rvasek-cern/cvmfs-csi --version v2.1.1-master-869e661f -f cvmfs.helm.yaml

# Demo pod.
apiVersion: v1
kind: Pod
metadata:
  name: cvmfs-galaxy
spec:
  volumes:
  - name: cvmfs
    persistentVolumeClaim:
      claimName: cvmfs
  containers:
  - name: cvmfs-galaxy
    image: alpine
    command: [sleep, inf]
    volumeMounts:
    - name: cvmfs
      mountPath: /cvmfs/data.galaxyproject.org
      subPath: data.galaxyproject.org
      mountPropagation: HostToContainer

$ kubectl exec cvmfs-galaxy -- ls -l /cvmfs/data.galaxyproject.org
total 9
drwxr-xr-x  210 999      997           4096 Apr 21  2022 byhand
drwxr-xr-x   18 999      997           4096 Nov 24  2020 managed
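
To check whether the extra ConfigMap actually reached the automount container, something like the sketch below may help. The mount path /etc/cvmfs/config.d and the container name automount come from the config above; the pod name and namespace are placeholders. If the mount is in place, the listing would be expected to include data.galaxyproject.org.conf and data.galaxyproject.org.pub.

```shell
#!/bin/sh
# List the per-repository CVMFS config files visible inside the automount container.
#   $1: node plugin pod name (placeholder; look it up with `kubectl get pods`)
#   $2: namespace the driver was installed into (assumed "default" if omitted)
list_cvmfs_config_d() {
    pod="$1"
    ns="${2:-default}"
    kubectl exec "$pod" --container automount --namespace "$ns" -- ls -l /etc/cvmfs/config.d
}

# Usage (hypothetical pod name and namespace):
#   list_cvmfs_config_d cvmfs-csi-nodeplugin-abcde kube-system
```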

gman0 commented on August 14, 2024

This is most likely a regression in the Helm chart. Normally users shouldn't have to do this (the docs say so too: https://github.com/cvmfs-contrib/cvmfs-csi/blob/master/docs/how-to-use.md#adding-cvmfs-repository-configuration). This will be fixed in the next point release. Sorry for the trouble!

#72

afgane commented on August 14, 2024

Thanks for tracking this down, and sorry for the delay in applying the fix. I just upgraded our chart to use your v2.1.2, and we'll let it run our automated tests for a while to monitor the mount issue.
