kubeflow-book's People

Contributors

dependabot[bot], kooock, mojokb

kubeflow-book's Issues

Problem with "Envoy proxy is NOT ready"

Hello,
First, thank you for writing a book about Kubeflow.

With a few exceptions, I followed the installation guide in the book.
I installed only a master node, on a single server.
However, the Kubeflow dashboard does not load, so I would like to ask for help.

These are the steps I changed or added while following the guide:

  • nvidia-docker appears to have installed correctly. (I use CUDA 11.1, so I ran $ sudo docker run --runtime=nvidia --rm nvidia/cuda:11.1-base nvidia-smi and got the expected output.)
  • After that, 'sudo kubeadm init --pod-network-cidr=192.168.0.0/16' failed with a swap error, so I ran sudo swapoff -a. The rest of that step completed without problems.
  • NFS was not installed, so I ran sudo apt install nfs-common before continuing. I ran helm install --name my-release --set nfs.server=127.0.0.1 --set nfs.path=/exported/path stable/nfs-client-provisioner exactly as written in the book.
  • In the /etc/hosts step, I added the entry 127.0.0.1 kubeflow-registry.default.svc.cluster.local.

After installing this way, I confirmed that the cluster was set up.
[image]

I also confirmed that port 31380 is in the LISTEN state.
[image]

However, connecting to http://127.0.0.1:31380 fails with a connection refused error. (The firewall is open.)
While looking for the cause, I found a crashing pod in k9s,
[image]
and its log keeps repeating this error: Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[image]

I tried to track down and fix the cause myself, but I have not been able to resolve it. Any help would be appreciated.
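
For readers reproducing the symptom above, here is a minimal sketch of a TCP reachability check in Python. It only shows whether a connection to the port can be established. A port listed as LISTEN on the node may still refuse connections if the pod behind it is not ready, which would be consistent with the Envoy readiness error in the log, though that is an assumption rather than a confirmed diagnosis. The function name is hypothetical.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling port_accepts_connections("127.0.0.1", 31380) on the affected server would return False while the connection refused error persists.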

kubeflow pipeline rbac access denied

Hello, I am a reader learning Kubeflow by following the book "쿠브플로우!" (Kubeflow!).
I am trying to run the pipeline example, but I get the following error:
kfp_server_api.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Length': '19', 'Content-Type': 'text/plain', 'Date': 'Wed, 21 Jul 2021 01:20:00 GMT', 'Server': 'istio-envoy', 'X-Envoy-Decorator-Operation': 'ml-pipeline.kubeflow.svc.cluster.local:8888/*'})
HTTP response body: RBAC: access denied

I am getting this RBAC error.
It still occurs even after I apply kubeflow-pipeline-rolebinding. What should I do?
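
The response above has the header Server: istio-envoy and the body RBAC: access denied. This suggests the request was rejected by the Istio proxy in front of ml-pipeline rather than by Kubernetes role-based access control, which may be why a role binding alone does not change the result. The sketch below is a small heuristic for recognizing that pattern in a response. The function name is hypothetical and is not part of the kfp client.

```python
from typing import Mapping


def looks_like_istio_rbac_denial(status: int, headers: Mapping[str, str], body: str) -> bool:
    """Heuristic: True when a 403 appears to come from an Istio/Envoy proxy."""
    server = ""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "server":
            server = value.lower()
    return (
        status == 403
        and "envoy" in server
        and "rbac: access denied" in body.lower()
    )
```

Applied to the response in this report (status 403, Server istio-envoy, body RBAC: access denied), the check returns True.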

Question about an error when running the fashion-mnist-fairing example

Running the example notebook fashion-mnist-fairing.ipynb produces the error below.

[I 210105 19:06:39 config:134] Using preprocessor: <kubeflow.fairing.preprocessors.converted_notebook.ConvertNotebookPreprocessor object at 0x7f752045bbe0>
[I 210105 19:06:39 config:136] Using builder: <kubeflow.fairing.builders.append.append.AppendBuilder object at 0x7f74800fa198>
[I 210105 19:06:39 config:138] Using deployer: <kubeflow.fairing.deployers.job.job.Job object at 0x7f75203ffb70>
[W 210105 19:06:39 append:50] Building image using Append builder...
[I 210105 19:06:39 base:107] Creating docker context: /tmp/fairing_context_v1cjs1yb

TypeError Traceback (most recent call last)
in
43
44 )
---> 45 fairing.config.run()
46 else:
47 remote_train = MyFashionMnist()

~/.local/lib/python3.6/site-packages/kubeflow/fairing/config.py in run(self)
138 logging.info("Using deployer: %s", deployer)
139
--> 140 builder.build()
141 pod_spec = builder.generate_pod_spec()
142 deployer.deploy(pod_spec)

~/.local/lib/python3.6/site-packages/kubeflow/fairing/builders/append/append.py in build(self)
50 logger.warning("Building image using Append builder...")
51 start = timer()
---> 52 new_img = self._build(transport, src)
53 end = timer()
54 logger.warning("Image successfully built in {}s.".format(end-start))

~/.local/lib/python3.6/site-packages/kubeflow/fairing/builders/append/append.py in _build(self, transport, src)
64
65 def _build(self, transport, src):
---> 66 file, hash = self.preprocessor.context_tar_gz() # pylint:disable=redefined-builtin
67 self.context_file, self.context_hash = file, hash
68 self.image_tag = self.full_image_name(self.context_hash)

~/.local/lib/python3.6/site-packages/kubeflow/fairing/preprocessors/base.py in context_tar_gz(self, output_file)
106 _, output_file = tempfile.mkstemp(prefix="/tmp/fairing_context_")
107 logging.info("Creating docker context: %s", output_file)
--> 108 self.input_files = self.preprocess()
109 with tarfile.open(output_file, "w:gz", dereference=True) as tar:
110 for dst, src in self.context_map().items():

~/.local/lib/python3.6/site-packages/kubeflow/fairing/preprocessors/converted_notebook.py in preprocess(self)
119 exporter = nbconvert.PythonExporter()
120 exporter.register_preprocessor(self.notebook_preprocessor, enabled=True)
--> 121 contents, _ = exporter.from_filename(self.notebook_file)
122 converted_notebook = Path(self.notebook_file).with_suffix('.py')
123 if converted_notebook.exists() and not self.overwrite:

/usr/local/lib/python3.6/dist-packages/nbconvert/exporters/exporter.py in from_filename(self, filename, resources, **kw)
165 if not 'metadata' in resources or resources['metadata'] == '':
166 resources['metadata'] = ResourcesDict()
--> 167 path, basename = os.path.split(filename)
168 notebook_name = os.path.splitext(basename)[0]
169 resources['metadata']['name'] = notebook_name

/usr/lib/python3.6/posixpath.py in split(p)
105 """Split a pathname. Returns tuple "(head, tail)" where "tail" is
106 everything after the final slash. Either part may be empty."""
--> 107 p = os.fspath(p)
108 sep = _get_sep(p)
109 i = p.rfind(sep) + 1

TypeError: expected str, bytes or os.PathLike object, not NoneType

I would appreciate any guidance on this error.
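
The last frames of the traceback show os.path.split receiving None, which suggests that self.notebook_file was None when the preprocessor ran (for example, if the notebook name could not be detected; this is an assumption, not something the log confirms). The sketch below reproduces the final error with the standard library only and shows a hypothetical guard that fails earlier with a clearer message. split_notebook_path is an illustrative name, not part of kubeflow.fairing.

```python
import os
from typing import Optional, Tuple


def split_notebook_path(notebook_file: Optional[str]) -> Tuple[str, str]:
    """Hypothetical guard: reject a missing path before it reaches os.path."""
    if notebook_file is None:
        raise ValueError("notebook_file is None; pass the notebook path explicitly")
    return os.path.split(notebook_file)


# Reproduce the final frame of the traceback: os.path.split(None).
try:
    os.path.split(None)
except TypeError as exc:
    message = str(exc)  # mentions NoneType, as in the report
```

If the cause is indeed a missing notebook path, supplying the path explicitly to the preprocessor would be the natural thing to try.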

Typo fix

Line 26 of "kubernetes_install/nvidia_docker_install.txt":
$ sudo systemctl resatrt docker
should read:
$ sudo systemctl restart docker
