Comments (31)

knkski commented on June 11, 2024

Hmm, not sure what might be causing this issue. I don't think we currently have a way of passing the --debug flag down to the Juju commands from microk8s.enable kubeflow, which would help debug this (though @ktsakalozos may correct me there), so I'll need to work on getting that in there. In the meantime, can you try running juju --debug deploy kubeflow, to see if that outputs anything useful?

from bundle-kubeflow.

ycheng commented on June 11, 2024

I re-tested today, and microk8s.enable kubeflow now finishes running.

However, when I try to create a notebook, creation fails.

BTW, Kubeflow 0.6.2 has been released, but 1.15/edge/kubeflow currently still uses Kubeflow v0.5.
The steps in https://ubuntu.com/kubeflow/install properly install Kubeflow 0.6.

knkski commented on June 11, 2024

@ycheng: Can you list the steps you took to create the notebook, and how it failed? Additionally, can you attach output from these commands?

microk8s.kubectl logs --tail 1000 --all-containers -l juju-app=jupyter-controller
microk8s.kubectl logs --tail 1000 --all-containers -l juju-app=jupyter-web
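
To make the output easier to attach, the two commands above could be wrapped in a small script that writes one file per app. This is only a sketch: the collect_logs name, the KUBECTL and NAMESPACE variables, and the output file names are made up for illustration. It also passes -n with the kubeflow namespace, on the assumption that the pods live there rather than in the default namespace.

```shell
#!/bin/sh
# Hypothetical helper, not part of microk8s or the bundle.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-microk8s.kubectl}"
# Assumption: the Kubeflow pods run in the "kubeflow" namespace.
NAMESPACE="${NAMESPACE:-kubeflow}"

collect_logs() {
    # $1 = directory to write logs into; remaining args = juju-app label values
    outdir="$1"
    shift
    mkdir -p "$outdir"
    for app in "$@"; do
        # One file per app, with stderr captured too so errors are not lost
        $KUBECTL logs -n "$NAMESPACE" --tail 1000 --all-containers \
            -l "juju-app=$app" > "$outdir/$app.log" 2>&1
    done
}

# Example:
#   collect_logs ./kubeflow-logs jupyter-controller jupyter-web
```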

ycheng commented on June 11, 2024

@knkski:

installation:
snap core: r7396
microk8s: v1.15.3, r802, channel: 1.15/edge/kubeflow

Steps
microk8s.reset
microk8s.enable kubeflow
microk8s.kubectl get po -n kubeflow => make sure all pods are Running.
kubectl get svc -n kubeflow | grep ambassador
=> get the IP of ambassador, then open http://ip/ in a browser to reach the main UI.
Choose Notebooks from the left side menu
Click New Server
Fill in the server name, nothing else, and click "Spawn" at the bottom of the page.
"No Status Available" is shown for the newly created server

Both commands ("microk8s.kubectl logs ....") output nothing.

I copied 0.log from /var/log/pods/kubeflow_jupyter-controller-operator-0_d4811256-5f33-4613-9068-4792c179c3ae/juju-operator/ as jupyter-controller.log
and 0.log from /var/log/pods/kubeflow_jupyter-web-7979d96ff9-2z58r_c39c2161-4c9f-4aac-b933-4d560bbfc978/jupyterhub/ as jupyter-web.log

jupyter-web.log
jupyter-controller.log
2019-09-20 21-01-52_screenshot

knkski commented on June 11, 2024

@ycheng: can you try switching microk8s to the 1.16/edge/kubeflow channel and trying again?

ycheng commented on June 11, 2024

@knkski, I just tried today. microk8s is r946. It needs a username and password to log in.
Do you know what the defaults are?

knkski commented on June 11, 2024

@ycheng: you can find the username / password to log into the kubeflow dashboard with these two commands:

juju config ambassador-auth username
juju config ambassador-auth password

knkski commented on June 11, 2024

@ycheng: Is this working for you then? Or is it still hanging when you run microk8s.enable kubeflow? If it is still an issue for you, can you switch to the latest version of microk8s edge with sudo snap switch microk8s --channel edge && sudo snap refresh microk8s, and then post the output from KUBEFLOW_DEBUG=true microk8s.enable kubeflow?

ycheng commented on June 11, 2024

@knkski I can log in now. When I try to create a notebook, it shows an error message:

Warning!notebooks.kubeflow.org is forbidden: User "system:serviceaccount:kubeflow:default" cannot list resource "notebooks" in API group "kubeflow.org" in the namespace "kubeflow"

knkski commented on June 11, 2024

@ycheng: Sorry about that. I've got a fix in the edge bundle, but in the meantime, you could try running microk8s.disable rbac, which should fix that issue.
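
Before disabling RBAC, it may be worth confirming that the denial really is an RBAC one. kubectl auth can-i can impersonate the service account named in the error message. The sketch below is illustrative only: the can_list_notebooks name and the KUBECTL override are made up, and it assumes the account running it is allowed to impersonate service accounts.

```shell
#!/bin/sh
# Hypothetical helper, not part of microk8s or the bundle.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-microk8s.kubectl}"

can_list_notebooks() {
    # $1 = namespace, $2 = service account name
    # Asks the API server whether that service account may list
    # the "notebooks" resource in the kubeflow.org API group.
    $KUBECTL auth can-i list notebooks.kubeflow.org \
        --namespace "$1" \
        --as "system:serviceaccount:$1:$2"
}

# Example (kubectl answers "yes" or "no"):
#   can_list_notebooks kubeflow default
```

If the answer is "no" with RBAC enabled and "yes" after microk8s.disable rbac, that would be consistent with the error above.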

ycheng commented on June 11, 2024

@knkski hi, I reinstalled microk8s from edge and got microk8s r1056 + core r8038

microk8s.enable kubeflow failed; the log is attached.

microk8s-edge-1056.log

ktsakalozos commented on June 11, 2024

@ycheng we recently (yesterday) pushed a patch [1] to address this. Could you try reinstalling from edge?

[1] canonical/microk8s#793

ycheng commented on June 11, 2024

microk8s r1071:

03:42:21 INFO juju.util.exec exec.go:209 run result: exit status 1
ERROR The microk8s user group is created during the microk8s snap installation.
Users in that group are granted access to microk8s commands and this
is needed for Juju to be able to interact with microk8s.

Add yourself to that group before trying again:
sudo usermod -a -G microk8s root

03:42:21 DEBUG cmd supercommand.go:519 error stack:
/build/juju/parts/juju/go/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:337: The microk8s user group is created during the microk8s snap installation.
Users in that group are granted access to microk8s commands and this
is needed for Juju to be able to interact with microk8s.

Add yourself to that group before trying again:
sudo usermod -a -G microk8s root

/build/juju/parts/juju/go/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:286:
/build/juju/parts/juju/go/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:996:
/build/juju/parts/juju/go/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:575:

Command '('microk8s-juju.wrapper', '--debug', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow
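
The error text asks for the invoking user (root in this log) to be in the microk8s group. A quick way to check membership is to test the output of id -nG. This is a minimal sketch: the has_group helper is a made-up name, and it only checks membership, it does not change it. Note that a group added with usermod only takes effect in a new login session.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a group name appears in a
# space-separated group list such as the output of `id -nG`.
has_group() {
    # $1 = space-separated group list, $2 = group name to look for
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   if has_group "$(id -nG root)" microk8s; then
#       echo "root is in the microk8s group"
#   else
#       echo "root is NOT in the microk8s group"
#   fi
```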

ktsakalozos commented on June 11, 2024

@wallyworld already has a fix for this issue and it should be available soon.

ycheng commented on June 11, 2024

It seems microk8s r1077 still fails with the same error.
Did it pass when you tested it?

ktsakalozos commented on June 11, 2024

@ycheng the error you see is from the juju client. For now, the microk8s.enable kubeflow addon uses the juju client from the snap edge channel (https://github.com/ubuntu/microk8s/blob/master/microk8s-resources/actions/enable.juju.sh#L13). @wallyworld may know more about when the fix will land there or whether we should be using a different channel. Thanks.

knkski commented on June 11, 2024

@ycheng: Are you still running into this issue?

ricpet commented on June 11, 2024

hi all, I am getting this error:

Revoked:false Label:admin Invalid:false InvalidReason:}]}
18:45:28 INFO  juju.util.exec exec.go:209 run result: exit status 1
ERROR microk8s:
  running: false

18:45:28 DEBUG cmd supercommand.go:519 error stack:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:384: microk8s:
  running: false

/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:349:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:286:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:996:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:575:

Command '('microk8s-juju.wrapper', '--debug', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow

anyone fixed this?

knkski commented on June 11, 2024

@ricpet: This may be a race condition. Can you try running microk8s.disable kubeflow; microk8s.enable kubeflow to see if you run into the same error?
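
If it is a race, repeating the disable/enable cycle a few times with a pause in between may get past it. The retry function below is a generic sketch, not something microk8s ships, and the example assumes microk8s.enable exits non-zero when enabling fails.

```shell
#!/bin/sh
# Hypothetical helper: run a command until it succeeds or the
# attempt limit is reached.
retry() {
    # $1 = max attempts, $2 = seconds to sleep between attempts,
    # remaining args = the command to run
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1    # give up after the last allowed attempt
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example: up to 3 attempts, 30 seconds apart
#   retry 3 30 sh -c 'microk8s.disable kubeflow; microk8s.enable kubeflow'
```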

ricpet commented on June 11, 2024

@knkski thanks for your reply. I tried but nothing, still the same issue.

ricpet commented on June 11, 2024

I actually managed to fix that issue (there was some conflict with an old installation), however right now, I get this:

Kubeflow could not be enabled:
Creating Juju controller "uk8s" on microk8s/localhost
Creating k8s resources for controller "controller-uk8s"
ERROR failed to bootstrap model: creating controller stack for controller: creating statefulset for controller: timed out waiting for controller pod: pending:  -

Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow

knkski commented on June 11, 2024

@ricpet: What was the fix involved in the previous error you were running into?

I haven't seen this new error before. Can you try again with the KUBEFLOW_DEBUG=true environment variable set? Offhand, it looks like a networking issue, which you might have if you're running behind a proxy/firewall/etc.

ricpet commented on June 11, 2024

hi @knkski the error is actually the same as before:

19:09:21 INFO  juju.util.exec exec.go:209 run result: exit status 1
ERROR microk8s:
  running: false

19:09:21 DEBUG cmd supercommand.go:519 error stack:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:384: microk8s:
  running: false

/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:349:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/caas/kubernetes/provider/cloud.go:286:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:996:
/var/lib/jenkins/workspace/BuildJuju-centos-amd64/_build/src/github.com/juju/juju/cmd/juju/commands/bootstrap.go:575:

Command '('microk8s-juju.wrapper', '--debug', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1

ricpet commented on June 11, 2024

just to add more context... I am using Ubuntu 18.04 (desktop) and I installed microk8s following this link https://microk8s.io/ and kubeflow using (https://www.kubeflow.org/docs/other-guides/virtual-dev/getting-started-multipass/). When I enable kubeflow I get:

Enabling dns...
[sudo] password for USER:
Enabling storage...
Enabling dashboard...
Enabling ingress...
Enabling rbac...
Enabling juju...
Deploying Kubeflow...
Kubeflow could not be enabled:
ERROR microk8s:
  running: false


Command '('microk8s-juju.wrapper', 'bootstrap', 'microk8s', 'uk8s')' returned non-zero exit status 1
Failed to enable kubeflow

ktsakalozos commented on June 11, 2024

@ricpet it is possible that the machine you are deploying kubeflow on is running out of memory and the OS killed the apiserver while kubeflow was coming up. What are the specs of the machine (virtual or not) that MicroK8s runs on? The microk8s.inspect tarball has the information we would need to debug this case. Thanks.
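
One rough way to check the out-of-memory theory is to look at how much memory is available while kubeflow is coming up, and to search the kernel log for OOM kills. The sketch below reads the MemAvailable field from /proc/meminfo on Linux. The mem_available_mib name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print available memory in MiB.
mem_available_mib() {
    # $1 = meminfo-style file to read (defaults to /proc/meminfo)
    # MemAvailable is reported in kB, so divide by 1024 for MiB.
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Example:
#   echo "Available: $(mem_available_mib) MiB"
#
# To look for OOM kills in the kernel log (may need sudo):
#   dmesg | grep -i "out of memory"
```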

ricpet commented on June 11, 2024

hi @ktsakalozos thanks for your help. The specs of my machine are:

  • Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
  • RAM 16 G

the output of microk8s.inspect is:

Inspecting services
  Service snap.microk8s.daemon-cluster-agent is running
  Service snap.microk8s.daemon-flanneld is running
  Service snap.microk8s.daemon-containerd is running
  Service snap.microk8s.daemon-apiserver is running
  Service snap.microk8s.daemon-apiserver-kicker is running
  Service snap.microk8s.daemon-proxy is running
  Service snap.microk8s.daemon-kubelet is running
  Service snap.microk8s.daemon-scheduler is running
  Service snap.microk8s.daemon-controller-manager is running
  Service snap.microk8s.daemon-etcd is running
  Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
  Copy processes list to the final report tarball
  Copy snap list to the final report tarball
  Copy VM name (or none) to the final report tarball
  Copy disk usage information to the final report tarball
  Copy memory usage information to the final report tarball
  Copy server uptime to the final report tarball
  Copy current linux distribution to the final report tarball
  Copy openSSL information to the final report tarball
  Copy network configuration to the final report tarball
Inspecting kubernetes cluster
  Inspect kubernetes cluster

 WARNING:  IPtables FORWARD policy is DROP. Consider enabling traffic forwarding with: sudo iptables -P FORWARD ACCEPT
The change can be made persistent with: sudo apt-get install iptables-persistent
WARNING:  Docker is installed.
File "/etc/docker/daemon.json" does not exist.
You should create it and add the following lines:
{
    "insecure-registries" : ["localhost:32000"]
}
and then restart docker with: sudo systemctl restart docker
Building the report tarball
  Report tarball is at /var/snap/microk8s/1173/inspection-report-20200226_104633.tar.gz

knkski commented on June 11, 2024

@ricpet: Can you attach the tarball that microk8s.inspect generated? It looks like it put it at /var/snap/microk8s/1173/inspection-report-20200226_104633.tar.gz

ricpet commented on June 11, 2024

@knkski here you go https://www.dropbox.com/s/haby9tzpudx4l8m/inspection-report-20200226_104633.tar.gz?dl=0

mikejmills commented on June 11, 2024

Same issue here

Ubuntu 18

Revoked:false Label:admin Invalid:false InvalidReason:}]}
13:52:10 INFO juju.util.exec exec.go:209 run result: exit status 1
ERROR microk8s:
running: false

knkski commented on June 11, 2024

@ricpet, @mikejmills: If you switch to the edge version of microk8s, it includes a fix for this error:

# If you don't have it installed
sudo snap install microk8s --classic --edge

# If you have it installed
sudo snap switch microk8s --channel edge
sudo snap refresh microk8s

Note that with the edge version, you'll have to use the edge kubeflow bundle, so you'll need to enable kubeflow like this:

KUBEFLOW_CHANNEL=edge microk8s.enable kubeflow

That requirement should disappear when microk8s 1.18 hits stable, which is targeted for this Thursday (March 26th).
