Comments (11)
How did you set up MPS?
from k8s-device-plugin.
I haven't set MPS in the YAML; I just requested GPU resources as in time-slicing mode. How should I set it up? Thank you!
The settings to enable CUDA MPS are as follows:
version: v1
flags:
  migStrategy: "none"
  failOnInitError: true
  nvidiaDriverRoot: "/"
  plugin:
    passDeviceSpecs: false
    deviceListStrategy: "envvar"
    deviceIDStrategy: "uuid"
  gfd:
    oneshot: false
    noTimestamp: false
    outputFile: /etc/kubernetes/node-feature-discovery/features.d/gfd
    sleepInterval: 60s
sharing:
  mps:
    resources:
    - name: nvidia.com/gpu
      replicas: 10
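With replicas: 10, each physical GPU is advertised as ten nvidia.com/gpu resources. A minimal pod spec to exercise one shared replica might look like this (the pod name and image tag are illustrative, not from this thread; the vectoradd CUDA sample is just a convenient test workload):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mps-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda-vectoradd
    # Any CUDA workload image works; this sample image is a common smoke test.
    image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1
    resources:
      limits:
        nvidia.com/gpu: 1   # one of the 10 MPS replicas per physical GPU
```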
@ysz-github do you have an example application / podspec that you're using to confirm this?
Could you also please confirm your driver version? We are investigating an issue where setting the device memory limits by UUID is not having the desired effect.
I have the same issue using MPS with a CUDA process in Docker. The driver is 535.129.03 and the nvidia-device-plugin version is 0.15.0-rc1.
There is a known issue with 0.15.0-rc.1 where memory limits were not correctly applied. This will be addressed in v0.15.0-rc.2 which we will release soon.
OK, got it. Thanks for your reply!
@aphrodite1028 @ysz-github we have just released https://github.com/NVIDIA/k8s-device-plugin/releases/tag/v0.15.0-rc.2 which should address this issue. Please let us know if you're still experiencing problems.
I found https://github.com/NVIDIA/k8s-device-plugin/blob/main/cmd/mps-control-daemon/mps/daemon.go#L77-L85 here.
If I do not set the CUDA_VISIBLE_DEVICES env var and start nvidia-cuda-mps-control -d manually, setting the device memory limit fails and nvidia-cuda-mps-server is not found in the container. If I set it again, ignoring the mps-control-daemon DaemonSet config, it succeeds on the host machine but segfaults in the container.
How do I set the device memory limit for a client in a container?
driver version is 535.129.03
GPU is RTX A6000
Also, when I deploy via helm in k8s, I get an error like "linux mounts: path /run/nvidia/mps is mounted on /run but it is not a shared mount" when mountPropagation is set:
volumeMounts:
- mountPath: /mps
  mountPropagation: Bidirectional
  name: mps-root
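One common workaround for that error, assuming the root cause is that /run on the host is a private mount (mountPropagation: Bidirectional requires the parent mount of the hostPath to be shared), is to make it shared on the host before installing the chart. This is a sketch; adjust the path to wherever the mps-root hostPath actually lives on your nodes:

```shell
# On the host node: make the parent mount of /run/nvidia/mps shared.
sudo mount --make-shared /run
# Verify: the PROPAGATION column for /run should now show "shared".
findmnt -o TARGET,PROPAGATION /run
```

Note this does not persist across reboots unless your distro or a systemd unit reapplies it.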
@aphrodite1028: You shouldn't need to do anything special in your user container. The system starts the MPS server for all GPUs on the machine and your client will be forced to make use of it.
These lines set the upper limit on the pinned device memory and thread percentage consumable by the client.
https://github.com/NVIDIA/k8s-device-plugin/blob/main/cmd/mps-control-daemon/mps/daemon.go#L111-L122
You can manually adjust the pinned memory limit and thread percentage to something smaller than this using the env vars when you start your container (but you can't set them to something larger).
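For example, a client container could lower its own limits below what the control daemon granted by setting the standard CUDA MPS client env vars in its pod spec (the values here are illustrative, and must not exceed the daemon-side limits):

```yaml
env:
- name: CUDA_MPS_PINNED_DEVICE_MEM_LIMIT
  value: "0=2G"   # device index 0: cap this client's pinned memory at 2 GiB
- name: CUDA_MPS_ACTIVE_THREAD_PERCENTAGE
  value: "10"     # this client may use at most 10% of the SMs
```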
Thanks for your reply.
Does the MPS pinned device memory limit have a driver version requirement? Looking at man nvidia-cuda-mps-control on driver 470, the set_default_device_pinned_mem_limit command is not listed.
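If I understand correctly, that command was introduced in a newer driver branch than R470 (around the CUDA 11.5-era drivers), so its absence there is expected; a quick way to check what your node actually has is:

```shell
# Report the installed driver version on this node.
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# See whether your control daemon's man page documents the command.
man nvidia-cuda-mps-control | grep -A2 set_default_device_pinned_mem_limit
```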
Related Issues (20)
- Using CUDA MPS to enable GPU sharing in K8S, error:error checking MPS daemon health HOT 2
- K3s in Docker (K3D) - `nvml error: insufficient permissions`
- Fix e2e tests HOT 1
- WSL2 - No devices found. Waiting indefinitely. HOT 3
- MPS use error: Failed to allocate device vector A (error code all CUDA-capable devices are busy or unavailable)! HOT 30
- Back-off restarting failed container nvidia-device-plugin-ctr HOT 3
- Error in nvidia-device-plugin pod. HOT 2
- Go Package: github.com/opencontainers/runc 1.0.0-rc93 < 1.1.12 - Local Sandbox Bypass Vulnerability HOT 1
- When use MPS, add a initContainers to default set compute model
- update nodelabel for config-manger k8s-device-plugin continuing printing error msg, not stop HOT 1
- allPossibleMigStrategiesAreNone is false when using default values HOT 4
- Fix mode detection on Tegra-based platforms that support NVML HOT 1
- Workloads keep in hang state except cuda-sample:vectoradd under MPS mode HOT 9
- mps server error Failed to start : invalid argument
- nvidia-device-plugin.hasConfigMap returns a string HOT 9
- helm: can't upgrade to 0.15.0 in place due to daemonset label selector change HOT 3
- Addressing several security vulnerabilities in the version v0.15.0
- Failed when deploy via helm HOT 1
- The plugin has already support nvlink? HOT 1
- K3S - Failed to start plugin: error waiting for MPS daemon HOT 6