Comments (12)
I got it to work. I followed the following guide on OMV 6 to install nvidia-drivers and nvidia-docker2.
It indicates that ldconfig should be set to /sbin/ldconfig.real in /etc/nvidia-container-runtime/config.toml. Leaving this set to @/sbin/ldconfig (the default after I installed) works for both the Plex container and netdata.
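Since several replies in this thread hinge on which ldconfig line is actually active, a small helper can print only the uncommented setting. This is a minimal sketch: the function name active_ldconfig is made up for illustration, and the default path is simply the one quoted in this thread.

```shell
# Print the active (uncommented) ldconfig setting from the NVIDIA container
# runtime config. "active_ldconfig" is a hypothetical helper name; the default
# path is the one mentioned in this thread.
active_ldconfig() {
    config="${1:-/etc/nvidia-container-runtime/config.toml}"
    # Match lines that begin with "ldconfig =", which skips "#ldconfig = ...".
    grep -E '^[[:space:]]*ldconfig[[:space:]]*=' "$config"
}
```

Running active_ldconfig with no argument on the host reads the default path; pass another path to inspect a backup copy of the file.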
from netdata-glibc.
@cryptoDevTrader you may also want to give the dev image & instructions a try. We'll be moving to that with the next netdata release.
image: d34dc3n73r/netdata-glibc:dev
instructions: https://github.com/D34DC3N73R/netdata-glibc/tree/dev
When the official release happens you'll have to change the image to :stable or :latest depending on your preference.
from netdata-glibc.
I haven't tested or run openmediavault before, but this sounds similar to issue #3.
Does it work if you run the following command?
docker exec netdata bash -c 'LDCONFIG=$(find /usr/lib64/ -name libnvidia-ml.so.*) nvidia-smi'
from netdata-glibc.
Here's the output:
~# docker exec netdata bash -c 'LDCONFIG=$(find /usr/lib64/ -name libnvidia-ml.so.*) nvidia-smi'
NVIDIA-SMI couldn't find libnvidia-ml.so library in your system. Please make sure that the NVIDIA Display Driver is properly installed and present in your system.
My libnvidia on the host machine is in:
/usr/lib/x86_64-linux-gnu/
Not sure if that's the reason it's not working. But my other containers are working fine with it. Right now I've resorted to grafana.
from netdata-glibc.
/usr/lib/x86_64-linux-gnu/ is where libnvidia is on my host system as well (ubuntu 20.04). But in the container, it should be in /usr/lib64/. What steps did you take to install nvidia container toolkit as well as the nvidia drivers?
Edit: I also found this in regards to OMV + Nvidia
https://forum.openmediavault.org/index.php?thread/40883-nvidia-working-with-omv-6/
Also see this if you're running OMV 5
https://forum.openmediavault.org/index.php?thread/39413-nvidia-smi-couldn-t-find-libnvidia-ml-so-library-in-your-system-please-make-sure/
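To compare where libnvidia-ml actually lives on the host and inside the container, something like the following could help. It is only a sketch: find_nvml is a hypothetical helper name, and the directories to search are whatever you pass in.

```shell
# List every libnvidia-ml shared object under the given directories.
# "find_nvml" is a hypothetical helper name. Directories that do not exist
# are skipped so a missing /usr/lib64 does not produce an error.
find_nvml() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -name 'libnvidia-ml.so*' 2>/dev/null
        fi
    done
    return 0
}
```

On the host, find_nvml /usr/lib/x86_64-linux-gnu /usr/lib64 shows which of the two paths from this thread holds the library. For the container side, the same find expression can be run through docker exec netdata, assuming the container is named netdata as in the command above.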
from netdata-glibc.
I had actually used this guide to set everything up, the drivers as well as installing the nvidia tool kit.
https://forum.openmediavault.org/index.php?thread/38013-howto-nvidia-hardware-transcoding-on-omv-5-in-a-plex-docker-container/
I removed and reinstalled drivers, but did not manually remove /usr/lib/x86_64-linux-gnu/ or anything in that directory. Maybe I should give that a try.
Just strange that everything else works with the GPU, just not the official netdata image, or yours.
Edit: Maybe it's an issue with /etc/nvidia-container-runtime/config.toml, as mine is:
#ldconfig = "@/sbin/ldconfig"
#ldconfig = "/sbin/ldconfig"
ldconfig = "/sbin/ldconfig.real"
Edit: But plex and other containers error when setting ldconfig to anything other than ldconfig.real.
from netdata-glibc.
config.toml is the default
$ cat /etc/nvidia-container-runtime/config.toml
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false
[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
#user = "root:video"
ldconfig = "@/sbin/ldconfig.real"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
Did reinstalling help at all?
from netdata-glibc.
Tried reinstalling, didn't help. Changed my config.toml to ldconfig = "@/sbin/ldconfig" and I'm getting this error when deploying the container:
OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: Running hook #1:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: ldcache error: open failed: /sbin/ldconfig.real: no such file or directory: unknown
No error when using ldconfig = "/sbin/ldconfig.real", but I still get the python.d error.
I resorted to using prometheus, nvidia smi exporter and grafana which works. But still cannot get it to work with netdata.
from netdata-glibc.
Any update/progress on this? I'm having the same exact issue on OMV 6.
from netdata-glibc.
Note that I also downgraded nvidia packages as per this post. Using up to date nvidia packages causes the plex container to not work with the configuration noted above. The netdata-glibc container does work.
https://forums.developer.nvidia.com/t/issue-with-setting-up-triton-on-jetson-nano/248485/2
from netdata-glibc.
@cryptoDevTrader you may also want to give the dev image & instructions a try. We'll be moving to that with the next netdata release.
image: d34dc3n73r/netdata-glibc:dev
instructions: https://github.com/D34DC3N73R/netdata-glibc/tree/dev
When the official release happens you'll have to change the image to :stable or :latest depending on your preference.
This was hugely helpful!
I am running both netdata-glibc and plex via docker-compose. netdata-glibc was already working properly with the previous config using the NVIDIA_VISIBLE_DEVICES env and nvidia runtime. Plex, however, was not working with the same configuration and the latest version of nvidia packages (older versions worked fine). Upgrading the nvidia packages to the latest versions and using the deploy method described in the dev branch worked for both deployments.
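For anyone unsure what a deploy-based GPU request looks like, here is a rough sketch using the device reservation syntax from the Compose specification. The dev branch's actual instructions are not reproduced in this thread and may differ; the file name and service name below are assumptions, and other options the container needs (ports, volumes and so on) are left out.

```shell
# Write a hypothetical Compose file that requests GPUs through `deploy`
# instead of `runtime: nvidia` plus NVIDIA_VISIBLE_DEVICES.
# File name and service name are assumptions, not taken from the dev branch.
cat > docker-compose.gpu-example.yml <<'EOF'
services:
  netdata:
    image: d34dc3n73r/netdata-glibc:dev
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
```

The same deploy block can be added to the plex service; check the generated file against the dev branch instructions before using it.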
from netdata-glibc.
closing this, but feel free to reopen if it can be reproduced with the newest updates.
from netdata-glibc.