nvidia / ansible-role-nvidia-driver
License: BSD 3-Clause "New" or "Revised" License
Is it possible to specify the version to install? The documentation does not suggest that this is possible.
Controlling the version is important for reproducibility. For instance, as of today, if I install the latest CUDA driver I will get CUDA 11.0, but I would need 10.2 for compatibility with the latest binary installation of pytorch (1.5.1 targets CUDA 10.2).
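For Ubuntu hosts, one way this might be approached is sketched below. It assumes the role exposes the `nvidia_driver_ubuntu_branch` variable mentioned in another report on this page, and that the installed packages follow the `nvidia-headless-<branch>-server` / `nvidia-utils-<branch>-server` naming seen there. The host group name and the branch value are placeholders, not recommendations.

```yaml
# Sketch only: pin the Ubuntu driver branch, then hold the packages.
# "gpu_nodes" is a hypothetical inventory group; "440" is an example value.
- name: Install a pinned NVIDIA driver branch
  hosts: gpu_nodes
  become: true
  vars:
    nvidia_driver_ubuntu_branch: "440"
  roles:
    - nvidia.nvidia_driver
  post_tasks:
    # Holding the packages keeps routine apt upgrades from moving the branch.
    - name: Hold the driver packages at the installed version
      ansible.builtin.dpkg_selections:
        name: "{{ item }}"
        selection: hold
      loop:
        - "nvidia-headless-{{ nvidia_driver_ubuntu_branch }}-server"
        - "nvidia-utils-{{ nvidia_driver_ubuntu_branch }}-server"
```

Whether a given branch is actually available depends on the apt repositories configured on the host, so it is worth checking with `apt-cache policy` first.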
I am trying to set up a headless render node for DaVinci Resolve running on CentOS 7. The Ansible playbook reports that everything ran successfully, but after a reboot I get the following errors from nvidia-persistenced:
Failed to start NVIDIA Persistence Daemon.
Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 0 has read and write permissions for those files.
v1.0.0-2-g7f66955
CentOS-7-x86_64-Minimal-1810.iso
uname -r
3.10.0-957.27.2.el7.x86_64
lspci | grep -i --color 'vga|3d|2d'
01:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX 980 Ti] (rev a1)
Packages fully up to date as of 2019-08-27:
sudo yum upgrade
sudo yum install lshw pciutils (for debugging)
What logs would be useful to identify what's going wrong?
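As a starting point, the following sketch gathers the output that usually helps with a missing `/dev/nvidia*` problem: loaded kernel modules, the device files themselves, kernel messages, and the journal for the persistence daemon. The host group name is hypothetical, and the choice of commands is a suggestion rather than anything the role prescribes.

```yaml
# Sketch only: read-only diagnostics; nothing here changes the host.
# "render_nodes" is a hypothetical inventory group.
- name: Collect NVIDIA driver diagnostics
  hosts: render_nodes
  become: true
  tasks:
    - name: Run diagnostic commands
      ansible.builtin.shell: "{{ item }}"
      loop:
        # Is the nvidia module loaded, or is nouveau still bound to the GPU?
        - "lsmod | grep -E 'nvidia|nouveau' || true"
        # Do the device files exist, and with what permissions?
        - "ls -l /dev/nvidia* 2>&1 || true"
        # Kernel messages from the driver (NVRM is the driver's log prefix)
        - "dmesg | grep -iE 'nvidia|nouveau|NVRM' || true"
        # Journal for the failing unit, current boot only
        - "journalctl -b -u nvidia-persistenced --no-pager || true"
      register: nvidia_diag
      changed_when: false

    - name: Show collected output
      ansible.builtin.debug:
        msg: "{{ item.stdout_lines }}"
      loop: "{{ nvidia_diag.results }}"
      loop_control:
        label: "{{ item.item }}"
```

If `lsmod` shows `nouveau` but not `nvidia`, the kernel module was probably never loaded after the reboot, which would explain the missing device files.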
Please check the base URL of the repository for arm64 servers:
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/
There should be a patch like:
diff --git a/vars/main.yml b/vars/main.yml
index c994f5b..c01187d 100644
--- a/vars/main.yml
+++ b/vars/main.yml
@@ -1,2 +1,2 @@
-_ubuntu_repo_dir: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.', '') }}/{{ ansible_architecture }}"
+_ubuntu_repo_dir: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.', '') }}/{{ ansible_architecture | replace('aarch64', 'sbsa') }}"
_rhel_repo_dir: "rhel{{ ansible_distribution_major_version }}/{{ ansible_architecture }}"
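To preview what the patched expression evaluates to on a given host without touching the role, a throwaway play like this sketch could be used. It only relies on standard gathered facts and the built-in `replace` filter; the play itself is illustrative and not part of the role.

```yaml
# Sketch only: print the repo directory the patched expression would produce.
- name: Preview the CUDA repo directory
  hosts: all
  gather_facts: true
  tasks:
    - name: Show the computed repo directory
      ansible.builtin.debug:
        msg: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.', '') }}/{{ ansible_architecture | replace('aarch64', 'sbsa') }}"
```

On an Ubuntu 20.04 host whose `ansible_architecture` fact is `aarch64`, this should yield `ubuntu2004/sbsa`, matching the URL above; on `x86_64` hosts the architecture passes through unchanged.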
Is there any reason why this role doesn't include a task like this?
- name: Blacklist the nouveau driver module
  community.general.kernel_blacklist:
    name: nouveau
    state: present
Hi,
If I set nvidia_driver_ubuntu_branch: 455
I get: No package matching 'nvidia-headless-455-server' is available
(Should the role be trying to install something that doesn't exist?)
If I also try setting:
nvidia_driver_ubuntu_packages:
  - nvidia-headless-450-server
  - nvidia-utils-450-server
then 450 is installed, not 455. I'm running this on k3s on Docker and using ansible-role-nvidia-docker to set that up. I need NVIDIA's Triton Inference Server 20.11, which needs CUDA 11.1 and thus at least the 455 drivers. I can't use 460 because the NVIDIA k8s-device-plugin breaks.
So I'm stuck between a rock and a hard place. If I could get the 455 drivers working here then I'd be OK (for now). Is this possible with this role?
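Before overriding package names, it may help to see which 455 packages apt can actually resolve on the host. The sketch below queries both the `-server` name from the error message and a non-server variant; the non-server name `nvidia-headless-455` is an assumption about what the configured repositories might carry, and the host group is hypothetical.

```yaml
# Sketch only: list what apt knows about candidate 455 packages.
# "nvidia-headless-455" (without "-server") is an assumed name; it may not exist.
- name: Check availability of 455 driver packages
  hosts: gpu_nodes
  become: true
  tasks:
    - name: Query apt for candidate package names
      ansible.builtin.command: apt-cache policy nvidia-headless-455 nvidia-headless-455-server
      register: driver_policy
      changed_when: false
      failed_when: false

    - name: Show what apt reported
      ansible.builtin.debug:
        msg: "{{ driver_policy.stdout_lines + driver_policy.stderr_lines }}"
```

If only one of the names shows an installation candidate, listing that name (and its matching utils package) in `nvidia_driver_ubuntu_packages` would be the obvious next thing to try.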
Hello,
While installing this role on a fresh GPU server, I'm getting the following error:
FAILED! => {"changed": false, "msg": "Unable to enable service nvidia-persistenced: Failed to enable unit: Unit file /etc/systemd/system/nvidia-persistenced.service is masked.\n"}
How do I resolve this?
Thanks!
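A masked unit cannot be enabled until it is unmasked, so one possible workaround is to unmask it in a `pre_tasks` step before the role runs. This is a sketch under the assumption that nothing else on the host is deliberately masking the service; the host group name is hypothetical.

```yaml
# Sketch only: unmask nvidia-persistenced, then run the role.
# "gpu_nodes" is a hypothetical inventory group.
- name: Install the NVIDIA driver with the persistence daemon unmasked
  hosts: gpu_nodes
  become: true
  pre_tasks:
    - name: Unmask the nvidia-persistenced unit
      ansible.builtin.systemd:
        name: nvidia-persistenced
        masked: false
        daemon_reload: true
  roles:
    - nvidia.nvidia_driver
```

If the unit is masked again after the next run, some other package or configuration step is probably re-creating the mask, and that would need to be tracked down separately.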
Hello, I get the following error when trying to provision an EC2 instance running Ubuntu 22.04:
TASK [nvidia.nvidia_driver : install driver packages] **********************************************************************
failed: [13.37.164.47] (item=nvidia-headless-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-headless-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-headless-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_5.15.0-58.64_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/t/tiff/libtiff5_4.3.0-6ubuntu0.3_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch 
http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_5.15.0-58.64_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/t/tiff/libtiff5_4.3.0-6ubuntu0.3_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch 
http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_5.15.0-58.64_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/t/tiff/libtiff5_4.3.0-6ubuntu0.3_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch 
http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following additional packages will be installed:\n build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot\n fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12\n libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl\n libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0\n libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl\n libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1\n libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2\n libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make\n manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server\n nvidia-headless-no-dkms-515-server nvidia-kernel-common-515-server\n nvidia-kernel-source-515-server 
rpcsvc-proto\nSuggested packages:\n bzip2-doc cpp-doc gcc-11-locales gcc-12-locales debtags menu debian-keyring\n g++-multilib g++-11-multilib gcc-11-doc gcc-multilib autoconf automake\n libtool flex bison gdb gcc-doc gcc-11-multilib gcc-12-multilib gcc-12-doc\n glibc-doc bzr libgd-tools libstdc++-11-doc make-doc\nThe following NEW packages will be installed:\n build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot\n fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12\n libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl\n libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0\n libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl\n libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1\n libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2\n libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make\n manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server\n nvidia-headless-515-server nvidia-headless-no-dkms-515-server\n nvidia-kernel-common-515-server nvidia-kernel-source-515-server rpcsvc-proto\n0 upgraded, 68 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 108 MB/315 MB of archives.\nAfter this operation, 956 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3\nErr:2 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:3 
http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:7 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:7 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:8 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:8 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:9 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted 
amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:9 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:10 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:10 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following additional packages will be installed:", " build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot", " fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12", " libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl", " libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0", " libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl", " libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1", " libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2", " libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make", " manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server", " nvidia-headless-no-dkms-515-server nvidia-kernel-common-515-server", " nvidia-kernel-source-515-server rpcsvc-proto", "Suggested packages:", " bzip2-doc cpp-doc gcc-11-locales gcc-12-locales debtags menu debian-keyring", " g++-multilib g++-11-multilib gcc-11-doc gcc-multilib autoconf automake", " libtool flex bison gdb gcc-doc gcc-11-multilib gcc-12-multilib gcc-12-doc", " glibc-doc bzr libgd-tools libstdc++-11-doc make-doc", 
"The following NEW packages will be installed:", " build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot", " fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12", " libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl", " libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0", " libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl", " libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1", " libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2", " libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make", " manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server", " nvidia-headless-515-server nvidia-headless-no-dkms-515-server", " nvidia-kernel-common-515-server nvidia-kernel-source-515-server rpcsvc-proto", "0 upgraded, 68 newly installed, 0 to remove and 0 not upgraded.", "Need to get 108 MB/315 MB of archives.", "After this operation, 956 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3", "Err:2 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:3 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 
404 Not Found [IP: 54.229.116.227 80]", "Ign:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:7 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:7 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:8 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:8 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:9 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:9 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 
54.229.116.227 80]", "Ign:10 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:10 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]"]}
failed: [13.37.164.47] (item=nvidia-utils-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-utils-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-utils-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency 
tree...\nReading state information...\nThe following additional packages will be installed:\n libnvidia-compute-515-server\nSuggested packages:\n nvidia-driver-515-server\nThe following NEW packages will be installed:\n libnvidia-compute-515-server nvidia-utils-515-server\n0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 50.3 MB of archives.\nAfter this operation, 194 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.225.193 80]\nIgn:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.225.193 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following additional packages will be installed:", " libnvidia-compute-515-server", "Suggested packages:", " nvidia-driver-515-server", "The following NEW packages will be installed:", " libnvidia-compute-515-server nvidia-utils-515-server", "0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.", "Need to get 50.3 MB of archives.", "After this operation, 194 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.225.193 80]", "Ign:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu 
jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.225.193 80]"]}
failed: [13.37.164.47] (item=nvidia-headless-no-dkms-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-headless-no-dkms-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-headless-no-dkms-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 
80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 
404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following additional packages will be installed:\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n nvidia-compute-utils-515-server nvidia-kernel-common-515-server\n nvidia-kernel-source-515-server\nThe following NEW packages will be installed:\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n nvidia-compute-utils-515-server nvidia-headless-no-dkms-515-server\n nvidia-kernel-common-515-server nvidia-kernel-source-515-server\n0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 107 MB/107 MB of archives.\nAfter this operation, 283 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:3 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 
nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following additional packages will be installed:", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " nvidia-compute-utils-515-server nvidia-kernel-common-515-server", " nvidia-kernel-source-515-server", "The following NEW packages will be installed:", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " nvidia-compute-utils-515-server nvidia-headless-no-dkms-515-server", " nvidia-kernel-common-515-server nvidia-kernel-source-515-server", "0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.", "Need to get 107 MB/107 MB of archives.", "After this 
operation, 283 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:3 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 
515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]"]}
failed: [13.37.164.47] (item=nvidia-kernel-source-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-kernel-source-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-kernel-source-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 34.253.189.82 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 34.253.189.82 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 34.253.189.82 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following NEW packages will be installed:\n nvidia-kernel-source-515-server\n0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 31.3 MB of archives.\nAfter this operation, 55.3 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 
34.253.189.82 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following NEW packages will be installed:", " nvidia-kernel-source-515-server", "0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.", "Need to get 31.3 MB of archives.", "After this operation, 55.3 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 34.253.189.82 80]"]}
PLAY RECAP *****************************************************************************************************************
13.37.164.47 : ok=4 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Here is the playbook:
The Ansible role fails with:
TASK [nvidia.nvidia_driver : add key]
ok: [gpu1]
TASK [nvidia.nvidia_driver : add repo]
fatal: [gpu1]: FAILED! => {
"changed": false, "msg": "Failed to update apt cache:
E:Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/by-hash/SHA256/751939d95516afc289908a19e447f0acc1506367f72ed356431a2b1a469cc8ca 404 Not Found [IP: 152.199.20.126 443],
E:Some index files failed to download. They have been ignored, or old ones used instead."}
$ sudo apt-key list
/etc/apt/trusted.gpg
--------------------
pub rsa4096 2016-06-24 [SC]
AE09 FE4B BD22 3A84 B2CC FCE3 F60F 4B3D 7FA2 AF80
uid [ unknown] cudatools <[email protected]>
$ sudo apt-get update
[...]
Ign:5 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Release [697 B]
Get:8 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Release.gpg [836 B]
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Packages
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Packages
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Packages
Err:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Packages
404 Not Found [IP: 152.199.20.126 443]
Fetched 836 B in 2s (392 B/s)
Reading package lists... Done
E: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/by-hash/SHA256/751939d95516afc289908a19e447f0acc1506367f72ed356431a2b1a469cc8ca 404 Not Found [IP: 152.199.20.126 443]
E: Some index files failed to download. They have been ignored, or old ones used instead.
It seems that the whole by-hash directory is missing from the NVIDIA repos:
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/by-hash/
As a workaround, I changed the apt_repository task on my side, adding a by-hash=no option to avoid the error above:
$ cat /etc/apt/sources.list.d/developer_download_nvidia_com_compute_cuda_repos_ubuntu2004_x86_64.list
deb [by-hash=no] http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /
Is this a transient issue on the NVIDIA repo side, or should the by-hash=no fix be added permanently to the Ansible templates?
Thanks
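For anyone templating the workaround, here is a minimal Python sketch that renders the same sources.list entry. The helper name apt_source_line is made up for illustration; the by-hash option itself is described in sources.list(5). It only builds the string and does not touch the system.

```python
def apt_source_line(uri, suite="/", options=None):
    """Build a one-line sources.list entry such as 'deb [k=v] URI SUITE'."""
    opts = ""
    if options:
        # Options go in square brackets, space-separated, before the URI.
        pairs = " ".join(f"{key}={value}" for key, value in options.items())
        opts = f"[{pairs}] "
    return f"deb {opts}{uri} {suite}"


line = apt_source_line(
    "http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64",
    options={"by-hash": "no"},
)
print(line)
```

The printed line matches the entry shown in the .list file above.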
Is it possible to extend the role to Debian stretch or buster?
Thanks.
The logic in line:
hasn't been working for me.
I've specified:
vars:
- nvidia_driver_package_version: '465.19.01-1'
- nvidia_driver_branch: '465'
and yet I see this error:
No package matching 'nvidia-driver-latest-dkms-465.19.01-1' found available, installed or updated
Hello,
The latest 430 drivers are not published to the NVIDIA developer repository. The Ubuntu repository used by this role, http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/
only contains up to driver version 410.
When setting nvidia_driver_ubuntu_install_from_cuda_repo: yes
I get:
TASK [nvidia.nvidia_driver : install driver packages] ****************************************************
fatal: [nodename]: FAILED! => {"changed": false, "msg": "No package matching 'cuda-drivers-510' is available"}
The package doesn't exist:
$ apt-cache search ^cuda-drivers
cuda-drivers-fabricmanager-450 - Meta-package for FM and Driver
cuda-drivers-fabricmanager-460 - Transitional package for cuda-drivers-fabricmanager-510
cuda-drivers-fabricmanager-470 - Meta-package for FM and Driver
cuda-drivers-fabricmanager-510 - Meta-package for FM and Driver
There are no issues when setting nvidia_driver_ubuntu_install_from_cuda_repo: no, but I would like to install from the CUDA repository.
After running the playbook I ended up with a non-working driver and this:
Installed Packages
kernel.x86_64 3.10.0-957.el7 @anaconda
kernel-debug-devel.x86_64 3.10.0-1127.18.2.el7 @updates
kernel-devel.x86_64 3.10.0-1127.18.2.el7 @updates
kernel-headers.x86_64 3.10.0-957.el7 @anaconda
kernel-tools.x86_64 3.10.0-957.el7 @anaconda
kernel-tools-libs.x86_64 3.10.0-957.el7 @anaconda
It might be good to install the kernel headers/devel packages for the running kernel on systems that require specific kernel versions.
Note: updating to the newer kernel did fix the issue, as I assume downgrading headers/devel would have as well.
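The mismatch in the package list can be checked mechanically. Below is a minimal Python sketch; the function name devel_matches_running is hypothetical, and the running kernel is assumed to be the 957 build, which the report does not state explicitly.

```python
def devel_matches_running(running_release, devel_versions, arch="x86_64"):
    """Return True if some installed kernel-devel matches the running kernel.

    running_release: output of `uname -r`, e.g. '3.10.0-957.el7.x86_64'
    devel_versions:  version-release strings of installed kernel-devel packages
    """
    return any(f"{version}.{arch}" == running_release for version in devel_versions)


# Assumed running kernel (not stated in the report) and the kernel-devel
# version taken from the package list.
running = "3.10.0-957.el7.x86_64"
installed_devel = ["3.10.0-1127.18.2.el7"]
print(devel_matches_running(running, installed_devel))
```

Under that assumption no kernel-devel package matches the running kernel, which would leave the module with nothing to build against until either side is brought in line.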
First, thanks for this very useful role.
I'm having an issue when trying to install the drivers on an Ubuntu 18.04 VM with PCIe passthrough (on plain KVM).
I ran the playbook like this:
ansible-playbook -i vm-inventory.yml -e host=test -e nvidia_driver_ubuntu_install_from_cuda_repo=yes -e nvidia_driver_persistence_mode_on=no nvidia-driver.yml
and everything seems fine, except that it still runs the "configure persistenced service to turn on persistence mode" task, which then fails:
TASK [nvidia.nvidia_driver : configure persistenced service to turn on persistence mode] ***************************************************************************************************************************************************************************************
fatal: [test]: FAILED! => {"changed": false, "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866", "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936608.0951219-393830-90866238334705/source not found"}
Below the verbose output:
TASK [nvidia.nvidia_driver : configure persistenced service to turn on persistence mode] ***************************************************************************************************************************************************************************************
task path: /home/ato/.ansible/roles/nvidia.nvidia_driver/tasks/main.yml:26
<test> CONNECT TO qemu+ssh://ato@majinbu/system
<test> FIND DOMAIN test
<test> ESTABLISH community.libvirt.libvirt_qemu CONNECTION
<test> EXEC /bin/sh -c 'echo ~ && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "echo ~ && sleep 0"]}}
<test> GA return: {'return': {'pid': 31557}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31557}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'fgo=', 'exited': True}}
<test> GA stdout: ~
<test> GA stderr:
<test> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo ~/.ansible/tmp `"&& mkdir "` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `" && echo ansible-tmp-1661936792.938617-393878-3538192051430="` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `" ) && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "( umask 77 && mkdir -p \"` echo ~/.ansible/tmp `\"&& mkdir \"` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `\" && echo ansible-tmp-1661936792.938617-393878-3538192051430=\"` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `\" ) && sleep 0"]}}
<test> GA return: {'return': {'pid': 31559}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31559}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'YW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzA9fi8uYW5zaWJsZS90bXAvYW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzAK', 'exited': True}}
<test> GA stdout: ansible-tmp-1661936792.938617-393878-3538192051430=~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430
<test> GA stderr:
Using module file /home/ato/.local/lib/python3.8/site-packages/ansible/modules/stat.py
<test> PUT /home/ato/.ansible/tmp/ansible-local-3938357gq_ahvb/tmpx3vmg1rs TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py", "mode": "wb+"}}
<test> GA return: {'return': 1088}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1088}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31567}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31567}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
<test> EXEC /bin/sh -c '/usr/bin/python3.6 '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "/usr/bin/python3.6 '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31570}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31570}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'CnsiY2hhbmdlZCI6IGZhbHNlLCAic3RhdCI6IHsiZXhpc3RzIjogZmFsc2V9LCAiaW52b2NhdGlvbiI6IHsibW9kdWxlX2FyZ3MiOiB7InBhdGgiOiAiL2V0Yy9zeXN0ZW1kL3N5c3RlbS9udmlkaWEtcGVyc2lzdGVuY2VkLnNlcnZpY2UuZC9vdmVycmlkZS5jb25mIiwgImZvbGxvdyI6IGZhbHNlLCAiZ2V0X2NoZWNrc3VtIjogdHJ1ZSwgImNoZWNrc3VtX2FsZ29yaXRobSI6ICJzaGExIiwgImdldF9tZDUiOiBmYWxzZSwgImdldF9taW1lIjogdHJ1ZSwgImdldF9hdHRyaWJ1dGVzIjogdHJ1ZX19fQo=', 'exited': True}}
<test> GA stdout:
{"changed": false, "stat": {"exists": false}, "invocation": {"module_args": {"path": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf", "follow": false, "get_checksum": true, "checksum_algorithm": "sha1", "get_md5": false, "get_mime": true, "get_attributes": true}}}
<test> GA stderr:
<test> PUT /home/ato/.ansible/roles/nvidia.nvidia_driver/files/nvidia-persistenced-override.conf TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source", "mode": "wb+"}}
<test> GA return: {'return': 1089}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1089}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31574}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31574}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
Using module file /home/ato/.local/lib/python3.8/site-packages/ansible/modules/copy.py
<test> PUT /home/ato/.ansible/tmp/ansible-local-3938357gq_ahvb/tmp9ikfkhqz TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py", "mode": "wb+"}}
<test> GA return: {'return': 1090}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1090}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31577}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31577}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
<test> EXEC /bin/sh -c '/usr/bin/python3.6 '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "/usr/bin/python3.6 '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31580}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31580}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 1, 'out-data': 'CnsiZmFpbGVkIjogdHJ1ZSwgIm1zZyI6ICJTb3VyY2UgL3Jvb3QvLmFuc2libGUvdG1wL2Fuc2libGUtdG1wLTE2NjE5MzY3OTIuOTM4NjE3LTM5Mzg3OC0zNTM4MTkyMDUxNDMwL3NvdXJjZSBub3QgZm91bmQiLCAiaW52b2NhdGlvbiI6IHsibW9kdWxlX2FyZ3MiOiB7InNyYyI6ICIvcm9vdC8uYW5zaWJsZS90bXAvYW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzAvc291cmNlIiwgImRlc3QiOiAiL2V0Yy9zeXN0ZW1kL3N5c3RlbS9udmlkaWEtcGVyc2lzdGVuY2VkLnNlcnZpY2UuZC9vdmVycmlkZS5jb25mIiwgIl9vcmlnaW5hbF9iYXNlbmFtZSI6ICJudmlkaWEtcGVyc2lzdGVuY2VkLW92ZXJyaWRlLmNvbmYiLCAiZm9sbG93IjogZmFsc2UsICJjaGVja3N1bSI6ICJkOWRiNGI3M2Q4M2VjYTk4YWMyMTc5YTAyODQ3ZjFhYjBhMmNiODY2IiwgImJhY2t1cCI6IGZhbHNlLCAiZm9yY2UiOiB0cnVlLCAidW5zYWZlX3dyaXRlcyI6IGZhbHNlLCAiY29udGVudCI6IG51bGwsICJ2YWxpZGF0ZSI6IG51bGwsICJkaXJlY3RvcnlfbW9kZSI6IG51bGwsICJyZW1vdGVfc3JjIjogbnVsbCwgImxvY2FsX2ZvbGxvdyI6IG51bGwsICJtb2RlIjogbnVsbCwgIm93bmVyIjogbnVsbCwgImdyb3VwIjogbnVsbCwgInNldXNlciI6IG51bGwsICJzZXJvbGUiOiBudWxsLCAic2VsZXZlbCI6IG51bGwsICJzZXR5cGUiOiBudWxsLCAiYXR0cmlidXRlcyI6IG51bGx9fX0K', 'exited': True}}
<test> GA stdout:
{"failed": true, "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source not found", "invocation": {"module_args": {"src": "/root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source", "dest": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf", "_original_basename": "nvidia-persistenced-override.conf", "follow": false, "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866", "backup": false, "force": true, "unsafe_writes": false, "content": null, "validate": null, "directory_mode": null, "remote_src": null, "local_follow": null, "mode": null, "owner": null, "group": null, "seuser": null, "serole": null, "selevel": null, "setype": null, "attributes": null}}}
<test> GA stderr:
<test> EXEC /bin/sh -c 'rm -f -r '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' > /dev/null 2>&1 && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "rm -f -r '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' > /dev/null 2>&1 && sleep 0"]}}
<test> GA return: {'return': {'pid': 31583}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31583}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
fatal: [test]: FAILED! => {
"changed": false,
"checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866",
"diff": [],
"invocation": {
"module_args": {
"_original_basename": "nvidia-persistenced-override.conf",
"attributes": null,
"backup": false,
"checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866",
"content": null,
"dest": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf",
"directory_mode": null,
"follow": false,
"force": true,
"group": null,
"local_follow": null,
"mode": null,
"owner": null,
"remote_src": null,
"selevel": null,
"serole": null,
"setype": null,
"seuser": null,
"src": "/root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source",
"unsafe_writes": false,
"validate": null
}
},
"msg": "Source /root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source not found"
}
Any suggestion?
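One way to read the verbose log: the guest agent returns command output base64-encoded in out-data, and decoding the reply to `echo ~` gives a literal ~. That may mean HOME is not set for the guest-agent shell, so files PUT to ~/.ansible/tmp/... could land in a literal ~ directory while the copy module looks under /root/.ansible/tmp/.... This is a guess from the log, not a confirmed diagnosis. A minimal Python sketch for decoding such replies (the helper name decode_ga_output is made up for illustration):

```python
import base64


def decode_ga_output(reply):
    """Decode the base64 'out-data' field of a guest-exec-status reply."""
    data = reply["return"].get("out-data", "")
    return base64.b64decode(data).decode("utf-8")


# Reply to `echo ~` copied from the log.
home_reply = {"return": {"exitcode": 0, "out-data": "fgo=", "exited": True}}
print(repr(decode_ga_output(home_reply)))  # prints '~\n'
```

The same helper can be pointed at the longer out-data values in the log to read the module results without relying on the "GA stdout" lines.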
I would like to specify a driver version. How is it possible to do this?
What is exactly needed to "officially" support 20.04?
I ran the role on a 20.04 machine and superficially it seems fine.
I tried installing the NVIDIA driver on an AlmaLinux 9 system, but unfortunately the role lacks support for it.
Expected: the NVIDIA driver role supports installation on the Red Hat family system AlmaLinux.
Actual: the NVIDIA driver role does not support AlmaLinux.
The cause of this behavior is in main.yml:
- name: redhat family install tasks
include_tasks: install-redhat.yml
when: ansible_os_family == 'RedHat'
Output of ansible [node] -m setup
"ansible_os_family": "AlmaLinux",
"ansible_distribution_major_version": "9",
Please add support for AlmaLinux
Note: I'm happy to create a pull-request to add this support
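The condition in main.yml only matches when the family fact is exactly 'RedHat', while the reported fact is "AlmaLinux". As a rough illustration of a wider check, here is a Python sketch; the REDHAT_LIKE set and the is_redhat_like helper are hypothetical, and in the role itself this would be a `when:` expression. Newer Ansible releases may already report RedHat as the family for AlmaLinux, in which case upgrading Ansible could be enough.

```python
# Hypothetical list; extend it with whichever rebuilds you need to cover.
REDHAT_LIKE = {"RedHat", "CentOS", "Rocky", "AlmaLinux"}


def is_redhat_like(facts):
    """True if the gathered facts describe a Red Hat family system."""
    return (
        facts.get("ansible_os_family") in REDHAT_LIKE
        or facts.get("ansible_distribution") in REDHAT_LIKE
    )


# Facts as reported in the setup output.
facts = {"ansible_os_family": "AlmaLinux", "ansible_distribution_major_version": "9"}
print(is_redhat_like(facts))
```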
Hi,
There seems to be an issue when trying to install on RHEL9.
It tries to dnf install using the module name: nvidia-driver:525.125.06
While the actual module name for version 525.125.06 is: nvidia-driver:525-dkms:20230804160227:f6b002c604:x86_64
Are you planning to update the role to support RHEL9?
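For context, dnf module specs follow the pattern NAME:STREAM:VERSION:CONTEXT:ARCH, with trailing fields optional. Splitting the two specs quoted above shows that the second field is treated as the stream, so a full driver version there asks for a stream that does not exist. A minimal Python sketch (ModuleSpec and parse_module_spec are illustrative names; the optional /PROFILE suffix is ignored):

```python
from typing import NamedTuple, Optional


class ModuleSpec(NamedTuple):
    name: str
    stream: Optional[str] = None
    version: Optional[str] = None
    context: Optional[str] = None
    arch: Optional[str] = None


def parse_module_spec(spec):
    """Split 'NAME:STREAM:VERSION:CONTEXT:ARCH' (trailing parts optional)."""
    parts = spec.split(":")
    if len(parts) > 5:
        raise ValueError(f"too many fields in module spec: {spec!r}")
    return ModuleSpec(*parts)


available = parse_module_spec("nvidia-driver:525-dkms:20230804160227:f6b002c604:x86_64")
requested = parse_module_spec("nvidia-driver:525.125.06")
print(available.stream, requested.stream)
```

Here the available stream is 525-dkms while the requested one is 525.125.06, which would explain why the install cannot find a match.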
I'm currently running into this longstanding issue on Ubuntu 20.04 with the following configuration:
- role: nvidia.nvidia_driver
nvidia_driver_ubuntu_branch: 340
nvidia_driver_ubuntu_packages: [ 'nvidia-340' ]
I don't have this issue on machines using the default NVIDIA 450 packages, but I can't install those on all my systems due to hardware compatibility. Some recent versions may also be affected in combination with certain kernels (e.g., in the Launchpad thread above, a user reports issues with version 465).
There seems to be a known error message and a known workaround, so it might be useful to add it to this role.
I have a problem installing a specific NVIDIA driver version on RHEL 8.
After PR #53 I was able to install the branch version of the driver.
But I need to install an older version rather than the most recent version of the branch.
Therefore, I specified nvidia_driver_package_version as follows:
nvidia_driver_package_state: present
nvidia_driver_package_version: '470.82.01-1.el8'
nvidia_driver_persistence_mode_on: yes
nvidia_driver_skip_reboot: no
nvidia_driver_module_file: /etc/modprobe.d/nvidia.conf
nvidia_driver_module_params: ''
nvidia_driver_add_repos: yes
nvidia_driver_branch: "470"
But it says the package cannot be found.
fatal: [GPU-SVR01]: FAILED! => changed=false
msg: No group nvidia-driver:470.82.01-1.el8 available.
results: []
Is the nvidia_driver_package_version format wrong?
Is support for Ubuntu 20.04 planned? The role currently cannot be installed on the latest Ubuntu.
Hi there,
I wanted to thank you for the nice Ansible role. Unfortunately Debian does not seem to be officially supported, but I managed with a bit of variable overriding.
I would be happy if Debian were officially supported. Until then, maybe this will help someone on Debian use this role anyway.
# my playbook
.....
roles:
- role: unix-basics
tags: unix-basics
- role: xanmanning.k3s
tags: k3s
- role: nvidia.nvidia_driver # should run after cluster install
vars:
# See https://github.com/NVIDIA/ansible-role-nvidia-driver#role-variables
nvidia_driver_ubuntu_cuda_repo_baseurl: 'https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64' # enforced 'debian11'
nvidia_driver_ubuntu_install_from_cuda_repo: yes
nvidia_driver_persistence_mode_on: yes
ansible_distribution: Ubuntu # forcing in to the ubuntu part of the role
when: ansible_hostname == 'k3s-worker1' # we only have ONE node with NVIDIA
tags:
- nvidia
....
Hi,
I'm currently trying to deploy the nvidia drivers as well as the nvidia CUDA toolkit using this Ansible playbook on Ubuntu 22.04 machines.
First, I'm not quite sure about the impact of the nvidia_driver_ubuntu_install_from_cuda_repo flag. Aside from using different repositories to install the package, what would be the use case for choosing one option or the other?
I managed to install the drivers with nvidia_driver_branch: '525'. Previously, we were using nvidia_driver_branch: '515'. How do I know which values are available?
Finally, what is the difference with the nvidia_driver_package_version variable? Again, how do I know which values are available?
Thanks in advance for your help.
Emmanuel
I've downloaded this role through Galaxy and am attempting to get Rapids.AI installed.
I've created a file nvidia-driver.yml
---
- hosts: localhost
roles:
- nvidia.nvidia_driver
and execute it with ansible-playbook nvidia-driver.yml. When it runs, the last command (a reboot instruction) fails with
"Running reboot with local connection would reboot the control node."
Rebooting the system, and running the playbook again, returns success.
Now my questions:
thanks
Does this role install both the drivers and CUDA?
Hello,
I have noticed that the role does not work when two or more GPUs are installed.
If I install driver 515.105.01 via the .run file, it works with dual GPUs, but running the playbook does not.
Rocky Linux 8.7, RTX 2080 Ti (dual).
Please do not hesitate to ask if you need any more info.
I'd like to use this role to provision drivers on a VM with Secure Boot enabled. What should I do to request signed drivers?
The new LTS release of Ubuntu was released a couple of months ago. It would be great if this release could also be supported.
Hi,
I have a problem installing the NVIDIA driver on CentOS 8.
On the first install, the latest version, 495.29.05, was installed because no version was specified.
I then specified the driver version in defaults/main.yml as shown below.
nvidia_driver_package_state: present
nvidia_driver_package_version: '470.57.02-1'
nvidia_driver_persistence_mode_on: yes
After that I ran the playbook, but the installation failed as follows:
TASK [nvidia.nvidia_driver : install driver packages RHEL/CentOS 8 and newer] *********************************************************************************
fatal: [gdp-glm-gpu001]: FAILED! => changed=false
msg: No group nvidia-driver:470.57.02-1 available.
results: []
Even when I try other version strings, such as those below, it fails the same way.
450.142.00
450.156.00-1
460.91.03-1
460.106.00-1
470.42.01-1
470.57.02-1
Setting nvidia_driver_ubuntu_install_from_cuda_repo: yes and nvidia_driver_ubuntu_branch: "455" installs 465.19.01 (the latest, I assume?).
When using the CUDA repo, how does one pin the driver version?
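Independent of the role, apt itself can hold packages at a version through a preferences stanza, as described in apt_preferences(5). A minimal Python sketch that renders such a stanza follows; the helper name, the package name and the version glob are hypothetical, and whether a 455 build is actually published in the CUDA repo would need checking with `apt-cache policy`.

```python
def apt_pin_stanza(package_glob, version_glob, priority=1001):
    """Render one apt_preferences(5) stanza pinning packages to a version.

    A priority of 1000 or more lets apt pick the pinned version even when
    that means a downgrade.
    """
    return (
        f"Package: {package_glob}\n"
        f"Pin: version {version_glob}\n"
        f"Pin-Priority: {priority}\n"
    )


# Hypothetical package name and version glob, for illustration only.
print(apt_pin_stanza("nvidia-driver-455", "455.*"), end="")
```

The output would typically be saved as a file under /etc/apt/preferences.d/ before running the install.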
Is there a plan to update the role for RHEL/CentOS/Rocky 8?
Maybe related to #4.
Here (https://github.com/NVIDIA/ansible-role-nvidia-driver/blob/master/tasks/install-redhat.yml#L21) the role installs the cuda-drivers meta-package, which points to the non-DKMS packages. In the repo (for example https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/) there are -dkms versions of the nvidia and cuda latest drivers. Installing non-DKMS versions is brittle, because after any kernel update you must re-run this playbook or end up with broken drivers, which is non-obvious.
Either using DKMS drivers by default, or providing an option to do just that, would be useful. We can fix it by installing those -dkms individual packages by hand, but an nvidia-maintained playbook or DKMS metapackage would prevent that solution from breaking in the future.
Installing a specific package version fails: the package is not found.
I was able to install it by changing this line:
There is a typo in the constructed package name: a '=' instead of a '-'
'nvidia-driver-latest-dkms='+nvidia_driver_package_version
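The separator matters because the package managers differ: apt expects name=version, while yum and dnf expect name-version. A minimal Python sketch of the distinction (package_spec is an illustrative helper, not part of the role):

```python
def package_spec(name, version, manager):
    """Join a package name and version the way each package manager expects."""
    if manager == "apt":
        return f"{name}={version}"  # apt-get install name=version
    if manager in ("yum", "dnf"):
        return f"{name}-{version}"  # yum/dnf install name-version
    raise ValueError(f"unsupported package manager: {manager}")


print(package_spec("nvidia-driver-latest-dkms", "465.19.01-1", "dnf"))
print(package_spec("nvidia-kernel-source-515-server", "515.86.01-0ubuntu0.22.04.1", "apt"))
```

The second call reproduces the form seen in the apt-get command from the Ubuntu 22.04 log earlier on this page.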