Run the MPS server from mps.yaml, then the job from nbody.yaml.
Change the value of CUDA_MPS_ACTIVE_THREAD_PERCENTAGE in nbody.yaml and rerun the job to see how the performance changes.
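The steps above can be sketched as a small shell helper. This is only an illustration: the function name set_mps_pct is hypothetical, and it assumes that nbody.yaml sets the variable as a `name:` line immediately followed by a `value: "NN"` line. It also restricts the value to an integer from 1 to 100 for simplicity. Adjust it to match the actual manifest.

```shell
# Hypothetical helper: rewrite the CUDA_MPS_ACTIVE_THREAD_PERCENTAGE value
# in a manifest. Assumes the "value:" line directly follows the "name:" line.
set_mps_pct() {
    pct="$1"
    file="$2"
    # Accept only plain integers in the range 1..100.
    case "$pct" in
        ''|*[!0-9]*) echo "percentage must be an integer" >&2; return 1 ;;
    esac
    if [ "$pct" -lt 1 ] || [ "$pct" -gt 100 ]; then
        echo "percentage must be between 1 and 100" >&2
        return 1
    fi
    tmp="$(mktemp)" || return 1
    # On the line after the variable name, replace the whole value entry.
    sed '/CUDA_MPS_ACTIVE_THREAD_PERCENTAGE/{
        n
        s/value:.*/value: "'"$pct"'"/
    }' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (requires kubectl configured for the cluster):
#   kubectl apply -f mps.yaml
#   kubectl apply -f nbody.yaml
#   set_mps_pct 50 nbody.yaml
#   kubectl delete -f nbody.yaml --ignore-not-found
#   kubectl apply -f nbody.yaml
```

The job is deleted and recreated rather than patched because environment variables of an already created pod cannot be changed in place.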
Tested on a server with an NVIDIA RTX A4000 and the following software:
- Kubernetes 1.29 (installed via kubeadm)
- Docker 25.0.1
- cri-dockerd 0.3.9
- nvidia-container-toolkit 1.14.4
- k8s-device-plugin 0.14.4
- flannel 0.24.2
Contents of /etc/docker/daemon.json:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "100m"
    }
}
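To confirm that Docker picked up this configuration, a check along the following lines may help. It is a sketch, not part of the original setup: the function name is hypothetical, and it assumes Docker has been restarted after daemon.json was edited.

```shell
# Hypothetical check: succeed only if Docker's default runtime is "nvidia".
docker_default_runtime_is_nvidia() {
    runtime="$(docker info --format '{{.DefaultRuntime}}')" || return 1
    echo "default runtime: $runtime"
    [ "$runtime" = "nvidia" ]
}

# Usage, after editing /etc/docker/daemon.json:
#   sudo systemctl restart docker
#   docker_default_runtime_is_nvidia
```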
The GPU compute mode must be set to EXCLUSIVE_PROCESS with:
nvidia-smi -c EXCLUSIVE_PROCESS
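The compute mode can be read back to check that the setting took effect. The sketch below assumes GPU index 0 by default and uses a hypothetical function name. The mode is not persistent across reboots, so the check may be worth repeating after a restart.

```shell
# Hypothetical check: succeed only if the GPU reports Exclusive_Process.
# $1 is the GPU index (defaults to 0).
gpu_is_exclusive_process() {
    gpu="${1:-0}"
    mode="$(nvidia-smi -i "$gpu" --query-gpu=compute_mode --format=csv,noheader)" || return 1
    echo "GPU $gpu compute mode: $mode"
    # Compare case-insensitively in case the reported spelling varies.
    case "$mode" in
        *[Ee][Xx][Cc][Ll][Uu][Ss][Ii][Vv][Ee]_[Pp][Rr][Oo][Cc][Ee][Ss][Ss]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (setting the mode typically needs root):
#   sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS
#   gpu_is_exclusive_process 0
```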