Comments (31)
Hmm, not sure what might be causing this issue. I don't think we currently have a way of passing the --debug flag down to the Juju commands from microk8s.enable kubeflow, which would help debug this (though @ktsakalozos may correct me there), so I'll need to work on getting that in there. In the meantime, can you try running juju --debug deploy kubeflow to see if that outputs anything useful?
from bundle-kubeflow.
I re-tested today, and microk8s.enable kubeflow now finishes running.
However, when I try to create a notebook, creation fails.
BTW, Kubeflow 0.6.2 has been released, and 1.15/edge/kubeflow currently still uses Kubeflow v0.5.
The steps in https://ubuntu.com/kubeflow/install properly install Kubeflow 0.6.
@ycheng: Can you list the steps you took to create the notebook, and how it failed? Additionally, can you attach output from these commands?
microk8s.kubectl logs --tail 1000 --all-containers -l juju-app=jupyter-controller
microk8s.kubectl logs --tail 1000 --all-containers -l juju-app=jupyter-web
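A minimal sketch of saving both logs to files so they can be attached. The helper name and the KUBECTL override are hypothetical, and the -n kubeflow flag is an assumption based on the namespace used elsewhere in this thread; the original commands do not pass a namespace.

```shell
# Hypothetical helper: write recent logs for each Juju application to <app>.log.
# KUBECTL may be overridden; it defaults to the microk8s.kubectl command used above.
collect_kubeflow_logs() {
  for app in "$@"; do
    # -n kubeflow is an assumption; remove it to match the original commands exactly.
    "${KUBECTL:-microk8s.kubectl}" logs --tail 1000 --all-containers \
      -n kubeflow -l "juju-app=${app}" > "${app}.log" 2>&1 \
      || echo "no logs collected for ${app}" >&2
  done
}
```

Usage: collect_kubeflow_logs jupyter-controller jupyter-web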
installation:
snap core: r7396
microk8s: v1.15.3, r802, channel: 1.15/edge/kubeflow
Steps
microk8s.reset
microk8s.enable kubeflow
microk8s.kubectl get po -n kubeflow => make sure all pods are Running.
kubectl get svc -n kubeflow | grep ambassador
=> get the IP of ambassador, then open http://ip/ in a browser to reach the main UI.
Choose Notebooks from the left-side menu
Click New Server
Fill in the server name, nothing else, and click "Spawn" at the bottom of the page.
The newly created server shows "No Status Available".
Both commands ("microk8s.kubectl logs ....") output nothing.
So I took 0.log from /var/log/pods/kubeflow_jupyter-controller-operator-0_d4811256-5f33-4613-9068-4792c179c3ae/juju-operator/ as jupyter-controller.log
and 0.log from /var/log/pods/kubeflow_jupyter-web-7979d96ff9-2z58r_c39c2161-4c9f-4aac-b933-4d560bbfc978/jupyterhub/ as jupyter-web.log
jupyter-web.log
jupyter-controller.log
2019-09-20 21-01-52_screenshot
@ycheng: can you try switching microk8s to the 1.16/edge/kubeflow channel and trying again?
@knkski, I just tried today with microk8s r946. It now needs a username and password to log in.
Do you know what the defaults are?
@ycheng: you can find the username / password to log into the kubeflow dashboard with these two commands:
juju config ambassador-auth username
juju config ambassador-auth password
@ycheng: Is this working for you then? Or is it still hanging when you run microk8s.enable kubeflow? If it is still an issue for you, can you switch to the latest version of microk8s edge with sudo snap switch microk8s --channel edge && sudo snap refresh microk8s, and then post the output from KUBEFLOW_DEBUG=true microk8s.enable kubeflow?
@knkski I can log in now. When I try to create a notebook, it shows an error message:
Warning! notebooks.kubeflow.org is forbidden: User "system:serviceaccount:kubeflow:default" cannot list resource "notebooks" in API group "kubeflow.org" in the namespace "kubeflow"
@ycheng: Sorry about that. I've got a fix in the edge bundle, but in the meantime, you could try running microk8s.disable rbac, which should fix that issue.
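One way to confirm whether the permission from the error message is actually missing is kubectl's auth can-i subcommand, impersonating the service account named in the error. This is only a sketch; the function name and the KUBECTL override are hypothetical.

```shell
# Hypothetical helper: ask the API server whether the service account from the
# error message may list notebooks in the kubeflow namespace.
# Prints "yes" or "no"; the exit status is non-zero when the answer is "no".
can_list_notebooks() {
  "${KUBECTL:-microk8s.kubectl}" auth can-i list notebooks.kubeflow.org \
    --as=system:serviceaccount:kubeflow:default -n kubeflow
}
```

Running can_list_notebooks before and after microk8s.disable rbac should show whether the workaround changed the answer.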
@knkski hi, I reinstalled microk8s from edge and got microk8s r1056 + core r8038.
microk8s.enable kubeflow failed; the log is attached.
@ycheng we recently (yesterday) pushed a patch [1] to address this. Could you try reinstalling from edge?
microk8s r1071:
03:42:21 INFO juju.util.exec exec.go:209 run result: exit status 1
ERROR The microk8s user group is created during the microk8s snap installation.
Users in that group are granted access to microk8s commands and this
is needed for Juju to be able to interact with microk8s.
Add yourself to that group before trying again:
sudo usermod -a -G microk8s root
03:42:21 DEBUG cmd supercommand.go:519 error stack:
/build/juju/parts/juju/go/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:337: The microk8s user group is created during the microk8s snap installation.
Users in that group are granted access to microk8s commands and this
is needed for Juju to be able to interact with microk8s.
Add yourself to that group before trying again:
sudo usermod -a -G microk8s root
/build/juju/parts/juju/go/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:286:
/build/juju/parts/juju/go/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:996:
/build/juju/parts/juju/go/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:575:
Command '('microk8s-juju.wrapper', '--debug', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow
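The error above asks for membership in the microk8s group. A small sketch of checking for that before retrying; the helper name is hypothetical. Note that a newly added group normally takes effect only in a new login session.

```shell
# has_group GROUP_LIST GROUP: succeed if GROUP appears as a whole word in the
# space-separated GROUP_LIST (the format printed by `id -nG`).
has_group() {
  case " $1 " in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (not executed here):
#   has_group "$(id -nG)" microk8s || sudo usermod -a -G microk8s "$USER"
```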
@wallyworld already has a fix for this issue and it should be available soon.
It seems microk8s r1077 still fails with the same error.
Did the test pass for you?
@ycheng the error you see is from the juju client. The microk8s.enable kubeflow addon currently uses the juju client from the snap edge channel (https://github.com/ubuntu/microk8s/blob/master/microk8s-resources/actions/enable.juju.sh#L13). @wallyworld may know more about when the fix will land there, or whether we should be using a different channel. Thanks.
@ycheng: Are you still running into this issue?
hi all, I am getting this error:
Revoked:false Label:admin Invalid:false InvalidReason:}]}
18:45:28 INFO juju.util.exec exec.go:209 run result: exit status 1
ERROR microk8s:
running: false
18:45:28 DEBUG cmd supercommand.go:519 error stack:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:384: microk8s:
running: false
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:349:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:286:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:996:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:575:
Command '('microk8s-juju.wrapper', '--debug', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow
Has anyone fixed this?
@ricpet: This may be a race condition. Can you try running microk8s.disable kubeflow; microk8s.enable kubeflow to see if you run into the same error?
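If the failure really is a race against MicroK8s startup, one option is to wait until MicroK8s reports ready before retrying. The sketch below assumes the --wait-ready and --timeout options of microk8s.status are available in the installed revision; the function name and the 120-second default are hypothetical.

```shell
# Hypothetical helper: wait for MicroK8s to report ready, then disable and
# re-enable the kubeflow addon. $1 is the wait timeout in seconds.
reenable_kubeflow() {
  # Stop early if MicroK8s does not become ready within the timeout.
  microk8s.status --wait-ready --timeout "${1:-120}" || return 1
  microk8s.disable kubeflow
  microk8s.enable kubeflow
}
```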
@knkski thanks for your reply. I tried that, but nothing changed; it is still the same issue.
I actually managed to fix that issue (there was some conflict with an old installation). However, I now get this:
Kubeflow could not be enabled:
Creating Juju controller "uk8s" on microk8s/localhost
Creating k8s resources for controller "controller-uk8s"
ERROR failed to bootstrap model: creating controller stack for controller: creating statefulset for controller: timed out waiting for controller pod: pending: -
Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow
@ricpet: What was the fix involved in the previous error you were running into?
I haven't seen this new error before. Can you try again with the KUBEFLOW_DEBUG=true environment variable set? Offhand, it looks like a networking issue, which you might have if you're running behind a proxy/firewall/etc.
hi @knkski the error is actually the same as before:
19:09:21 INFO juju.util.exec exec.go:209 run result: exit status 1
ERROR microk8s:
running: false
19:09:21 DEBUG cmd supercommand.go:519 error stack:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:384: microk8s:
running: false
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:349:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:286:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:996:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:575:
Command '('microk8s-juju.wrapper', '--debug', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Just to add more context: I am using Ubuntu 18.04 (desktop). I installed microk8s following https://microk8s.io/ and Kubeflow following https://www.kubeflow.org/docs/other-guides/virtual-dev/getting-started-multipass/. When I enable kubeflow I get:
Enabling dns...
[sudo] password for USER:
Enabling storage...
Enabling dashboard...
Enabling ingress...
Enabling rbac...
Enabling juju...
Deploying Kubeflow...
Kubeflow could not be enabled:
ERROR microk8s:
running: false
Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow
@ricpet it is possible that the machine you are deploying Kubeflow on is running out of memory and the OS killed the apiserver while Kubeflow was coming up. What are the specs of the machine (virtual or not) where MicroK8s runs? The microk8s.inspect tarball has information we would need to debug this case. Thanks.
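For a quick look at memory headroom before collecting the full inspection tarball, something like the following may help. It is only a sketch: the function name is hypothetical, and it assumes a Linux kernel that exposes a MemAvailable line in /proc/meminfo.

```shell
# Hypothetical helper: print available memory in MiB.
# $1 is the meminfo file to read (defaults to /proc/meminfo, values in kB).
mem_available_mib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Running mem_available_mib while microk8s.enable kubeflow is in progress gives a rough idea of whether memory is being exhausted.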
hi @ktsakalozos thanks for your help. The specs of my machine are:
- Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
- RAM 16 G
The output of microk8s.inspect is:
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-flanneld is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-apiserver is running
Service snap.microk8s.daemon-apiserver-kicker is running
Service snap.microk8s.daemon-proxy is running
Service snap.microk8s.daemon-kubelet is running
Service snap.microk8s.daemon-scheduler is running
Service snap.microk8s.daemon-controller-manager is running
Service snap.microk8s.daemon-etcd is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy current linux distribution to the final report tarball
Copy openSSL information to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
WARNING: IPtables FORWARD policy is DROP. Consider enabling traffic forwarding with: sudo iptables -P FORWARD ACCEPT
The change can be made persistent with: sudo apt-get install iptables-persistent
WARNING: Docker is installed.
File "/etc/docker/daemon.json" does not exist.
You should create it and add the following lines:
{
"insecure-registries" : ["localhost:32000"]
}
and then restart docker with: sudo systemctl restart docker
Building the report tarball
Report tarball is at /var/snap/microk8s/1173/inspection-report-20200226_104633.tar.gz
@ricpet: Can you attach the tarball that microk8s.inspect generated? It looks like it put it at /var/snap/microk8s/1173/inspection-report-20200226_104633.tar.gz
@knkski here you go https://www.dropbox.com/s/haby9tzpudx4l8m/inspection-report-20200226_104633.tar.gz?dl=0
Same issue here
Ubuntu 18
Revoked:false Label:admin Invalid:false InvalidReason:}]}
13:52:10 INFO juju.util.exec exec.go:209 run result: exit status 1
ERROR microk8s:
running: false
@ricpet, @mikejmills: If you switch to the edge version of microk8s, it includes a fix for this error:
# If you don't have it installed
sudo snap install microk8s --classic --edge
# If you have it installed
sudo snap switch microk8s --channel edge
sudo snap refresh microk8s
Note that with the edge version, you'll have to use the edge kubeflow bundle, so you'll need to enable microk8s like this:
KUBEFLOW_CHANNEL=edge microk8s.enable kubeflow
That requirement should disappear when microk8s 1.18 hits stable, which is targeted for this Thursday (March 26th).
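The two cases above can be folded into one helper that picks the right command based on whether the snap is already installed. This is a sketch only; the function name is hypothetical, and it relies on snap list exiting non-zero when the named snap is not installed.

```shell
# Hypothetical helper: move MicroK8s to the edge channel, installing it first
# if it is not present.
microk8s_to_edge() {
  if snap list microk8s >/dev/null 2>&1; then
    # Already installed: switch channel, then refresh to pull the new revision.
    sudo snap switch microk8s --channel edge
    sudo snap refresh microk8s
  else
    sudo snap install microk8s --classic --edge
  fi
}
```

After microk8s_to_edge, Kubeflow would still be enabled with KUBEFLOW_CHANNEL=edge microk8s.enable kubeflow as described above.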