Comments (8)
I have been using a GCS bucket as a shared drive for a while now. You will need to include more mount options:
"implicit-dirs,uid=1000,gid=100"
Also, make sure the workload identity has permission to access the bucket ("Storage Object Admin" role).
You may also need the following settings
singleuser:
  networkPolicy:
    egressAllowRules:
      cloudMetadataServer: true
from zero-to-jupyterhub-k8s.
@jdbates wasn't this configuration enough for you? If not, are you also using Cilium (part of GCP's dataplane v2)?
singleuser:
  networkPolicy:
    egressAllowRules:
      cloudMetadataServer: true
@consideRatio After further testing, it looks like just using
singleuser:
  networkPolicy:
    egressAllowRules:
      cloudMetadataServer: true
is sufficient. As far as my environment, I'm using a GKE Autopilot cluster with whatever the default configuration is.
Thank you for opening your first issue in this project! Engagement like this is essential for open source projects!
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi!
Welcome to the Jupyter community!
Thank you for the help, @vizeit. The networkPolicy was the culprit: apparently the default settings were blocking the gcsfuse sidecar container from connecting to the bucket. As a result, the sidecar never finished mounting the bucket, so the notebook container never finished spawning, and that is what caused the "context deadline exceeded" errors I was receiving.
I was able to connect successfully using:
singleuser:
  networkPolicy:
    enabled: false
Obviously this is not ideal; I'll need to figure out how to specify the exact egress rule I need (maybe just allow the IP of the bucket). Also, the additional mountOptions ("uid=1000,gid=100") did not seem to matter for my specific use case.
This issue can be closed, since it turned out to be a configuration issue and not a bug. However, it would be nice to have a little more documentation surrounding this, since none of the error messages I received were useful in diagnosing the problem.
@jdbates I think you will need all the mount options I mentioned in my previous comment; try to write or save something to the shared drive without the uid and gid mount options. If you browse through the closed issues related to GKE Autopilot in this repo, you may get a better understanding of Dataplane V2.
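For context, here is a minimal sketch of where those mount options would go, reusing the volume definition posted elsewhere in this thread. The bucket name is that poster's; uid=1000 and gid=100 are assumed to match the default `jovyan` user and `users` group of the Jupyter Docker Stacks images, so adjust them if your image runs as a different user.

```yaml
singleuser:
  storage:
    extraVolumes:
      - name: gcs-shared
        csi:
          driver: gcsfuse.csi.storage.gke.io
          volumeAttributes:
            bucketName: scg-datascience-shared  # replace with your own bucket
            # implicit-dirs: show directories that exist only as object-name prefixes
            # uid/gid: make mounted files owned by the notebook user (assumed 1000:100)
            mountOptions: "implicit-dirs,uid=1000,gid=100"
    extraVolumeMounts:
      - name: gcs-shared
        mountPath: /home/shared
```

As far as I understand, without uid and gid the mount is typically owned by root, so reading may work while writes from the notebook user fail.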
As of right now, the following config seems to be working for me:
singleuser:
  image:
    name: jupyter/datascience-notebook
    tag: latest
  cmd: null
  startTimeout: 600
  storage:
    dynamic:
      storageClass: premium-rwo
    extraVolumes:
      - name: gcs-shared
        csi:
          driver: gcsfuse.csi.storage.gke.io
          volumeAttributes:
            bucketName: scg-datascience-shared
            mountOptions: "implicit-dirs"
    extraVolumeMounts:
      - name: gcs-shared
        mountPath: /home/shared
  extraFiles:
    jupyter_notebook_config.json:
      mountPath: /etc/jupyter/jupyter_notebook_config.json
      data:
        MappingKernelManager:
          cull_idle_timeout: 3600 # default: 0
          cull_interval: 300 # default: 300
          cull_connected: true # default: false
          cull_busy: false # default: false
  serviceAccountName: gcsfuse
  extraAnnotations:
    gke-gcsfuse/volumes: "true"
  cloudMetadata:
    blockWithIptables: false
  networkPolicy:
    egressAllowRules:
      cloudMetadataServer: true
I'll keep an eye on this and check whether or not the "uid=1000,gid=100" mountOptions matter, but as of right now they don't seem to be affecting anything.
Also, I can't believe I didn't see the issue you had posted, @vizeit. It would have saved me a week's worth of trouble if I had. Closing this, since it's resolved and a duplicate.
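The `serviceAccountName: gcsfuse` setting in the config above assumes that a Kubernetes service account with that name exists in the JupyterHub namespace and is linked to a Google service account that can access the bucket. A sketch of one common way to set up that link with Workload Identity follows; the namespace, Google service account name, and project ID are placeholders, not values from this thread.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gcsfuse      # must match singleuser.serviceAccountName
  namespace: jhub    # placeholder: the namespace JupyterHub is installed in
  annotations:
    # Placeholder Google service account; it needs a role such as
    # Storage Object Admin on the bucket.
    iam.gke.io/gcp-service-account: GSA_NAME@PROJECT_ID.iam.gserviceaccount.com
```

The Google service account also needs an IAM binding that lets this Kubernetes service account act as it (the `roles/iam.workloadIdentityUser` role); that part is configured in Google Cloud IAM and is not shown here.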
I have described the detailed steps in my post, in case anyone wants to fully set up a GCS bucket as a shared drive with zero-to-jupyterhub:
https://www.vizeit.com/gcs-bucket-with-jupyterhub-on-gke/