ansible-role-nvidia-driver's Issues

Controlling version of CUDA drivers installed

Is it possible to specify the version to install? The documentation does not suggest that this is possible.
Controlling the version is important for reproducibility. For instance, as of today, installing the latest CUDA driver gives me CUDA 11.0, but I need 10.2 for compatibility with the latest binary release of PyTorch (1.5.1 targets CUDA 10.2).

Problems on a fresh CentOS 7 install

I am trying to set up a headless render node for DaVinci Resolve running on CentOS 7. The Ansible playbook reports that everything ran successfully, but after a reboot I get errors from nvidia-persistenced:

  • Failed to start NVIDIA Persistence Daemon.
  • Failed to query NVIDIA devices. Please ensure that the NVIDIA device files (/dev/nvidia*) exist, and that user 0 has read and write permissions for those files.

Environment

v1.0.0-2-g7f66955

CentOS-7-x86_64-Minimal-1810.iso

uname -r
3.10.0-957.27.2.el7.x86_64

lspci | grep -iE --color 'vga|3d|2d'
01:00.0 VGA compatible controller: NVIDIA Corporation GM200 [GeForce GTX 980 Ti] (rev a1)

All packages fully up to date as of 2019-08-27

Steps

  1. Installed CentOS with the minimal configuration
  2. sudo yum upgrade
  3. sudo yum install lshw pciutils (for debugging)
  4. Added ssh key for Ansible controller
  5. Ran the playbook

What logs would be useful to identify what's going wrong?
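
Not a maintainer answer, but as a starting point: the error says nothing created `/dev/nvidia*`, which often means the nvidia kernel module is not loaded. Below is a minimal sketch of the output worth attaching, assuming a root shell on the render node. The function name `collect_nvidia_diagnostics` is made up for illustration, and each step tolerates a missing tool.

```shell
#!/bin/sh
# Sketch: gather the information that usually explains a missing /dev/nvidia*.
# Assumption: run as root on the affected host. The function name is hypothetical.
collect_nvidia_diagnostics() {
    echo "== device nodes =="
    ls -l /dev/nvidia* 2>&1 || true

    echo "== loaded modules (nvidia vs. nouveau) =="
    lsmod 2>/dev/null | grep -E 'nvidia|nouveau' || echo "neither module is loaded"

    echo "== kernel messages =="
    dmesg 2>/dev/null | grep -iE 'nvidia|nouveau|nvrm' | tail -n 50 || true

    echo "== persistence daemon log for this boot =="
    journalctl -b -u nvidia-persistenced --no-pager 2>&1 | tail -n 50 || true

    echo "== DKMS build state =="
    dkms status 2>&1 || true
}
```

Calling `collect_nvidia_diagnostics > nvidia-diag.txt 2>&1` after the failing reboot would put everything into one file that can be attached to the issue.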

Base URL for ARM64 SBSA uses `sbsa` instead of `aarch64`

Please check the base URL of the ARM64 server repository:
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/
A patch along these lines should fix it:

diff --git a/vars/main.yml b/vars/main.yml
index c994f5b..c01187d 100644
--- a/vars/main.yml
+++ b/vars/main.yml
@@ -1,2 +1,2 @@
-_ubuntu_repo_dir: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.', '') }}/{{ ansible_architecture }}"
+_ubuntu_repo_dir: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.', '') }}/{{ ansible_architecture | replace('aarch64', 'sbsa') }}"
 _rhel_repo_dir: "rhel{{ ansible_distribution_major_version }}/{{ ansible_architecture }}"
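
To make the effect of the patched expression concrete, here is a small shell sketch of the same string transformation. It only illustrates the mapping and is not the role's code; the function name `ubuntu_repo_dir` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper mirroring the patched _ubuntu_repo_dir expression:
# lowercase the distribution, drop the dots from the version, and map the
# aarch64 architecture name to the "sbsa" repo directory.
ubuntu_repo_dir() {
    distribution=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    version=$(printf '%s' "$2" | tr -d '.')
    case "$3" in
        aarch64) arch=sbsa ;;
        *)       arch=$3 ;;
    esac
    printf '%s%s/%s\n' "$distribution" "$version" "$arch"
}

ubuntu_repo_dir Ubuntu 20.04 aarch64   # prints ubuntu2004/sbsa
ubuntu_repo_dir Ubuntu 20.04 x86_64    # prints ubuntu2004/x86_64
```

Only the `aarch64` case is rewritten, so other architectures keep the directory name they have today.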

Please blacklist the nouveau driver

Is there any reason why this role doesn't include a task like this?

- name: Blacklist the nouveau driver module
  community.general.kernel_blacklist:
    name: nouveau
    state: present

How do I install 455 drivers for CUDA 11.1

Hi,
If I set `nvidia_driver_ubuntu_branch: 455` I get `No package matching 'nvidia-headless-455-server' is available`. (Should the role be trying to install something that doesn't exist?)

If I also try setting:

nvidia_driver_ubuntu_packages:
- nvidia-headless-450-server
- nvidia-utils-450-server

then 450 is installed, not 455. I'm running this on k3s on Docker and using ansible-role-nvidia-docker to set that up. I need NVIDIA's Triton Inference Server 20.11, which needs CUDA 11.1 and thus at least the 455 drivers. I can't use 460 because the NVIDIA k8s-device-plugin breaks.

So I'm stuck between a rock and a hard place. If I could get the 455 drivers working here then I'd be OK (for now). Is this possible with this role?
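
One way to see which headless driver branches the configured apt sources actually offer, and whether a `-server` variant exists for a given branch, is to filter the `apt-cache search` output. This is a rough sketch, not part of the role. The function name `headless_branches` is hypothetical, and it assumes the usual `name - description` output format of `apt-cache search`.

```shell
#!/bin/sh
# Sketch: reduce "apt-cache search" lines to the available headless branches,
# e.g. "450" or "450-server". Reads from stdin so it can be tried on canned input.
# The function name is hypothetical.
headless_branches() {
    sed -n -E 's/^nvidia-headless-([0-9]+)(-server)? - .*/\1\2/p' | sort -u
}

# On the target host (not run here):
#   apt-cache search --names-only '^nvidia-headless-[0-9]+' | headless_branches
```

If the list shows a branch without the `-server` suffix only, the package name the role builds from `nvidia_driver_ubuntu_branch` would not exist, which would match the error above.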

Unable to enable service nvidia-persistenced: Failed to enable unit

Hello,
While installing this role on a fresh GPU server, I'm getting the following error:
FAILED! => {"changed": false, "msg": "Unable to enable service nvidia-persistenced: Failed to enable unit: Unit file /etc/systemd/system/nvidia-persistenced.service is masked.\n"}

How do I resolve this?

Thanks!
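
For reference, systemd refuses to enable a masked unit until it is unmasked. Below is a minimal sketch of checking and clearing the mask before re-running the playbook, assuming a root shell and that the mask was not set deliberately on this image. The function name `unmask_if_masked` is hypothetical.

```shell
#!/bin/sh
# Sketch: unmask a unit only when systemd reports it as masked.
# Assumption: run as root. The function name is hypothetical.
unmask_if_masked() {
    unit=$1
    # is-enabled exits non-zero for masked units, so ignore its exit status.
    state=$(systemctl is-enabled "$unit" 2>/dev/null) || true
    if [ "$state" = "masked" ]; then
        # A masked unit is a symlink to /dev/null; unmask removes that link.
        systemctl unmask "$unit"
    else
        echo "$unit is not masked (state: ${state:-unknown})"
    fi
}
```

On the affected server this would be called as `unmask_if_masked nvidia-persistenced`. Whether the role then enables the service cleanly depends on a real unit file being installed by the driver packages.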

Role doesn't work correctly on 22.04

Hello, I get the following error when trying to provision an EC2 instance running Ubuntu 22.04:

TASK [nvidia.nvidia_driver : install driver packages] **********************************************************************
failed: [13.37.164.47] (item=nvidia-headless-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-headless-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-headless-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_5.15.0-58.64_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/t/tiff/libtiff5_4.3.0-6ubuntu0.3_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch 
http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_5.15.0-58.64_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/t/tiff/libtiff5_4.3.0-6ubuntu0.3_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch 
http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/l/linux/linux-libc-dev_5.15.0-58.64_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/main/t/tiff/libtiff5_4.3.0-6ubuntu0.3_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch 
http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following additional packages will be installed:\n build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot\n fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12\n libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl\n libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0\n libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl\n libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1\n libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2\n libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make\n manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server\n nvidia-headless-no-dkms-515-server nvidia-kernel-common-515-server\n nvidia-kernel-source-515-server 
rpcsvc-proto\nSuggested packages:\n bzip2-doc cpp-doc gcc-11-locales gcc-12-locales debtags menu debian-keyring\n g++-multilib g++-11-multilib gcc-11-doc gcc-multilib autoconf automake\n libtool flex bison gdb gcc-doc gcc-11-multilib gcc-12-multilib gcc-12-doc\n glibc-doc bzr libgd-tools libstdc++-11-doc make-doc\nThe following NEW packages will be installed:\n build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot\n fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12\n libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl\n libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0\n libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl\n libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1\n libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2\n libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make\n manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server\n nvidia-headless-515-server nvidia-headless-no-dkms-515-server\n nvidia-kernel-common-515-server nvidia-kernel-source-515-server rpcsvc-proto\n0 upgraded, 68 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 108 MB/315 MB of archives.\nAfter this operation, 956 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3\nErr:2 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:3 
http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:7 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:7 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:8 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:8 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:9 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted 
amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:9 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nIgn:10 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:10 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following additional packages will be installed:", " build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot", " fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12", " libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl", " libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0", " libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl", " libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1", " libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2", " libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make", " manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server", " nvidia-headless-no-dkms-515-server nvidia-kernel-common-515-server", " nvidia-kernel-source-515-server rpcsvc-proto", "Suggested packages:", " bzip2-doc cpp-doc gcc-11-locales gcc-12-locales debtags menu debian-keyring", " g++-multilib g++-11-multilib gcc-11-doc gcc-multilib autoconf automake", " libtool flex bison gdb gcc-doc gcc-11-multilib gcc-12-multilib gcc-12-doc", " glibc-doc bzr libgd-tools libstdc++-11-doc make-doc", 
"The following NEW packages will be installed:", " build-essential bzip2 cpp cpp-11 cpp-12 dctrl-tools dkms dpkg-dev fakeroot", " fontconfig-config fonts-dejavu-core g++ g++-11 gcc gcc-11 gcc-11-base gcc-12", " libalgorithm-diff-perl libalgorithm-diff-xs-perl libalgorithm-merge-perl", " libasan6 libasan8 libatomic1 libc-dev-bin libc-devtools libc6-dev libcc1-0", " libcrypt-dev libdeflate0 libdpkg-perl libfakeroot libfile-fcntllock-perl", " libfontconfig1 libgcc-11-dev libgcc-12-dev libgd3 libgomp1 libisl23 libitm1", " libjbig0 libjpeg-turbo8 libjpeg8 liblsan0 libmpc3 libnsl-dev", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " libquadmath0 libstdc++-11-dev libtiff5 libtirpc-dev libtsan0 libtsan2", " libubsan1 libwebp7 libxpm4 linux-libc-dev lto-disabled-list make", " manpages-dev nvidia-compute-utils-515-server nvidia-dkms-515-server", " nvidia-headless-515-server nvidia-headless-no-dkms-515-server", " nvidia-kernel-common-515-server nvidia-kernel-source-515-server rpcsvc-proto", "0 upgraded, 68 newly installed, 0 to remove and 0 not upgraded.", "Need to get 108 MB/315 MB of archives.", "After this operation, 956 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 linux-libc-dev amd64 5.15.0-58.64", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3", "Err:2 http://security.ubuntu.com/ubuntu jammy-updates/main amd64 libtiff5 amd64 4.3.0-6ubuntu0.3", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:3 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 
404 Not Found [IP: 54.229.116.227 80]", "Ign:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:7 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:7 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:8 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:8 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Ign:9 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:9 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 
54.229.116.227 80]", "Ign:10 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:10 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]"]}
failed: [13.37.164.47] (item=nvidia-utils-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-utils-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-utils-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.225.193 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency 
tree...\nReading state information...\nThe following additional packages will be installed:\n libnvidia-compute-515-server\nSuggested packages:\n nvidia-driver-515-server\nThe following NEW packages will be installed:\n libnvidia-compute-515-server nvidia-utils-515-server\n0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 50.3 MB of archives.\nAfter this operation, 194 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.225.193 80]\nIgn:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.225.193 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following additional packages will be installed:", " libnvidia-compute-515-server", "Suggested packages:", " nvidia-driver-515-server", "The following NEW packages will be installed:", " libnvidia-compute-515-server nvidia-utils-515-server", "0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.", "Need to get 50.3 MB of archives.", "After this operation, 194 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.225.193 80]", "Ign:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu 
jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.225.193 80]"]}
failed: [13.37.164.47] (item=nvidia-headless-no-dkms-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-headless-no-dkms-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-headless-no-dkms-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 
80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-cfg1-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/libnvidia-compute-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-compute-utils-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-common-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 
404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-headless-no-dkms-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 54.229.116.227 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following additional packages will be installed:\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n nvidia-compute-utils-515-server nvidia-kernel-common-515-server\n nvidia-kernel-source-515-server\nThe following NEW packages will be installed:\n libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0\n nvidia-compute-utils-515-server nvidia-headless-no-dkms-515-server\n nvidia-kernel-common-515-server nvidia-kernel-source-515-server\n0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 107 MB/107 MB of archives.\nAfter this operation, 283 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:3 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 
nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\nIgn:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\nErr:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 54.229.116.227 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following additional packages will be installed:", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " nvidia-compute-utils-515-server nvidia-kernel-common-515-server", " nvidia-kernel-source-515-server", "The following NEW packages will be installed:", " libnvidia-cfg1-515-server libnvidia-compute-515-server libpciaccess0", " nvidia-compute-utils-515-server nvidia-headless-no-dkms-515-server", " nvidia-kernel-common-515-server nvidia-kernel-source-515-server", "0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.", "Need to get 107 MB/107 MB of archives.", "After this 
operation, 283 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:2 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:3 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:4 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:5 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Ign:6 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-cfg1-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:2 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 libnvidia-compute-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:3 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-compute-utils-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:4 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-common-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:5 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]", "Err:6 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-headless-no-dkms-515-server amd64 
515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 54.229.116.227 80]"]}
failed: [13.37.164.47] (item=nvidia-kernel-source-515-server) => {"ansible_loop_var": "item", "cache_update_time": 1675822253, "cache_updated": false, "changed": false, "item": "nvidia-kernel-source-515-server", "msg": "'/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'nvidia-kernel-source-515-server=515.86.01-0ubuntu0.22.04.1'' failed: E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 34.253.189.82 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "rc": 100, "stderr": "E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 34.253.189.82 80]\nE: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?\n", "stderr_lines": ["E: Failed to fetch http://security.ubuntu.com/ubuntu/pool/restricted/n/nvidia-graphics-drivers-515-server/nvidia-kernel-source-515-server_515.86.01-0ubuntu0.22.04.1_amd64.deb 404 Not Found [IP: 34.253.189.82 80]", "E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?"], "stdout": "Reading package lists...\nBuilding dependency tree...\nReading state information...\nThe following NEW packages will be installed:\n nvidia-kernel-source-515-server\n0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.\nNeed to get 31.3 MB of archives.\nAfter this operation, 55.3 MB of additional disk space will be used.\nIgn:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\nErr:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1\n 404 Not Found [IP: 
34.253.189.82 80]\n", "stdout_lines": ["Reading package lists...", "Building dependency tree...", "Reading state information...", "The following NEW packages will be installed:", " nvidia-kernel-source-515-server", "0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.", "Need to get 31.3 MB of archives.", "After this operation, 55.3 MB of additional disk space will be used.", "Ign:1 http://eu-west-3.ec2.archive.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", "Err:1 http://security.ubuntu.com/ubuntu jammy-updates/restricted amd64 nvidia-kernel-source-515-server amd64 515.86.01-0ubuntu0.22.04.1", " 404 Not Found [IP: 34.253.189.82 80]"]}

PLAY RECAP *****************************************************************************************************************
13.37.164.47 : ok=4 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

Here is the playbook:

- name: Provisioning
  hosts: webserver
  roles:
    - role: nvidia.nvidia_driver
    - role: nvidia.nvidia_docker

Impossible to install drivers on fresh Ubuntu focal

The Ansible role fails with:

TASK [nvidia.nvidia_driver : add key]
ok: [gpu1]

TASK [nvidia.nvidia_driver : add repo]
fatal: [gpu1]: FAILED! => {
"changed": false, "msg": "Failed to update apt cache:
E:Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/by-hash/SHA256/751939d95516afc289908a19e447f0acc1506367f72ed356431a2b1a469cc8ca 404 Not Found [IP: 152.199.20.126 443],
E:Some index files failed to download. They have been ignored, or old ones used instead."}

$ sudo apt-key list
/etc/apt/trusted.gpg
--------------------
pub   rsa4096 2016-06-24 [SC]
      AE09 FE4B BD22 3A84 B2CC  FCE3 F60F 4B3D 7FA2 AF80
uid           [ unknown] cudatools <[email protected]>
$ sudo apt-get update
[...]
Ign:5 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease
Get:7 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release [697 B]
Get:8 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release.gpg [836 B]
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages
Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages
Err:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages
  404  Not Found [IP: 152.199.20.126 443]
Fetched 836 B in 2s (392 B/s)
Reading package lists... Done
E: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/by-hash/SHA256/751939d95516afc289908a19e447f0acc1506367f72ed356431a2b1a469cc8ca  404  Not Found [IP: 152.199.20.126 443]
E: Some index files failed to download. They have been ignored, or old ones used instead.

It seems that the whole by-hash directory is missing from the NVIDIA repos:
https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/by-hash/

As a workaround, I had to change the apt_repository task on my side, adding a by-hash=no option to avoid the error above:

$ cat /etc/apt/sources.list.d/developer_download_nvidia_com_compute_cuda_repos_ubuntu2004_x86_64.list 
deb [by-hash=no] http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /
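
A minimal sketch of what such a task could look like, assuming the repository is added with the apt_repository module. The task name is illustrative and not copied from the role; only the repo line comes from the output above.

```yaml
# Sketch only: the task name is hypothetical, not the role's actual task.
- name: add CUDA repo with by-hash lookups disabled
  ansible.builtin.apt_repository:
    # Same source line as in the .list file shown above.
    repo: "deb [by-hash=no] http://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 /"
    state: present
    update_cache: yes
```

With by-hash=no, apt fetches the index files by name instead of by hash, so the missing by-hash directory is not requested.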

Is this a transient issue on the NVIDIA repo side, or should we permanently add this by-hash=no fix to the Ansible templates?

Thanks

for debian?

Is it possible to extend the role for Debian stretch or buster?
Thanks.

Unable to specify older branch on RHEL 7.x

The logic in line:

name: "{{ nvidia_driver_package_version | ternary('nvidia-driver-latest-dkms-'+nvidia_driver_package_version, 'nvidia-driver-branch-'+nvidia_driver_rhel_branch) }}"

hasn't been working for me.

I've specified:

  vars:
    - nvidia_driver_package_version: '465.19.01-1'
    - nvidia_driver_branch: '465'

and yet I see this error:

No package matching 'nvidia-driver-latest-dkms-465.19.01-1' found available, installed or updated

New drivers are not published to repository

Hello,

The latest 430 drivers are not published to the NVIDIA developer repository. The Ubuntu repository used by this role, http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ only contains up to driver version 410.

install from cuda repo fails

When setting nvidia_driver_ubuntu_install_from_cuda_repo: yes I get:

TASK [nvidia.nvidia_driver : install driver packages] ****************************************************
fatal: [nodename]: FAILED! => {"changed": false, "msg": "No package matching 'cuda-drivers-510' is available"}

The package doesn't exist:

$ apt-cache search ^cuda-drivers
cuda-drivers-fabricmanager-450 - Meta-package for FM and Driver
cuda-drivers-fabricmanager-460 - Transitional package for cuda-drivers-fabricmanager-510
cuda-drivers-fabricmanager-470 - Meta-package for FM and Driver
cuda-drivers-fabricmanager-510 - Meta-package for FM and Driver

There are no issues when setting nvidia_driver_ubuntu_install_from_cuda_repo: no, but I would like to install from the CUDA repository.

Kernel kernel-devel mismatch on non updated systems breaks dkms and module load

After running the playbook I ended up with a non-working driver and this package list:

Installed Packages
kernel.x86_64 3.10.0-957.el7 @anaconda
kernel-debug-devel.x86_64 3.10.0-1127.18.2.el7 @updates
kernel-devel.x86_64 3.10.0-1127.18.2.el7 @updates
kernel-headers.x86_64 3.10.0-957.el7 @anaconda
kernel-tools.x86_64 3.10.0-957.el7 @anaconda
kernel-tools-libs.x86_64 3.10.0-957.el7 @anaconda

It might be good to install the kernel headers/devel packages for the running kernel on systems that require specific kernel versions.
Note: updating to the newer kernel fixed the issue, and I assume downgrading the headers/devel packages would have as well.
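
A sketch of the suggestion above, assuming a yum-based system. The task is hypothetical and not part of the role; it relies on the ansible_kernel fact, which holds the version string of the running kernel.

```yaml
# Sketch only: hypothetical task, not taken from the role.
- name: install kernel-devel and kernel-headers matching the running kernel
  ansible.builtin.yum:
    name:
      # ansible_kernel is the running kernel version, e.g. the output of `uname -r`.
      - "kernel-devel-{{ ansible_kernel }}"
      - "kernel-headers-{{ ansible_kernel }}"
    state: present
```

This only works if the enabled repositories still carry packages for the running kernel version; on a system that has not been updated for a while, they may no longer be available.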

configure persistenced service to turn on persistence mode fails in Ubuntu 18.04

First, thanks for this very useful role.

I'm having an issue when trying to install the drivers on an Ubuntu 18.04 VM with PCIe passthrough (on plain KVM).

I ran the playbook like this:

ansible-playbook -i vm-inventory.yml -e host=test -e nvidia_driver_ubuntu_install_from_cuda_repo=yes -e nvidia_driver_persistence_mode_on=no nvidia-driver.yml

and everything seems fine, except that it still runs the "configure persistenced service to turn on persistence mode" task, which then fails:

TASK [nvidia.nvidia_driver : configure persistenced service to turn on persistence mode] ***************************************************************************************************************************************************************************************
fatal: [test]: FAILED! => {"changed": false, "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866", "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936608.0951219-393830-90866238334705/source not found"}

Below is the verbose output:


TASK [nvidia.nvidia_driver : configure persistenced service to turn on persistence mode] ***************************************************************************************************************************************************************************************
task path: /home/ato/.ansible/roles/nvidia.nvidia_driver/tasks/main.yml:26
<test> CONNECT TO qemu+ssh://ato@majinbu/system
<test> FIND DOMAIN test
<test> ESTABLISH community.libvirt.libvirt_qemu CONNECTION
<test> EXEC /bin/sh -c 'echo ~ && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "echo ~ && sleep 0"]}}
<test> GA return: {'return': {'pid': 31557}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31557}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'fgo=', 'exited': True}}
<test> GA stdout: ~
<test> GA stderr:
<test> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo ~/.ansible/tmp `"&& mkdir "` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `" && echo ansible-tmp-1661936792.938617-393878-3538192051430="` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `" ) && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "( umask 77 && mkdir -p \"` echo ~/.ansible/tmp `\"&& mkdir \"` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `\" && echo ansible-tmp-1661936792.938617-393878-3538192051430=\"` echo ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430 `\" ) && sleep 0"]}}
<test> GA return: {'return': {'pid': 31559}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31559}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'YW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzA9fi8uYW5zaWJsZS90bXAvYW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzAK', 'exited': True}}
<test> GA stdout: ansible-tmp-1661936792.938617-393878-3538192051430=~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430
<test> GA stderr:
Using module file /home/ato/.local/lib/python3.8/site-packages/ansible/modules/stat.py
<test> PUT /home/ato/.ansible/tmp/ansible-local-3938357gq_ahvb/tmpx3vmg1rs TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py", "mode": "wb+"}}
<test> GA return: {'return': 1088}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1088}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31567}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31567}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
<test> EXEC /bin/sh -c '/usr/bin/python3.6 '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "/usr/bin/python3.6 '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_stat.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31570}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31570}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'out-data': 'CnsiY2hhbmdlZCI6IGZhbHNlLCAic3RhdCI6IHsiZXhpc3RzIjogZmFsc2V9LCAiaW52b2NhdGlvbiI6IHsibW9kdWxlX2FyZ3MiOiB7InBhdGgiOiAiL2V0Yy9zeXN0ZW1kL3N5c3RlbS9udmlkaWEtcGVyc2lzdGVuY2VkLnNlcnZpY2UuZC9vdmVycmlkZS5jb25mIiwgImZvbGxvdyI6IGZhbHNlLCAiZ2V0X2NoZWNrc3VtIjogdHJ1ZSwgImNoZWNrc3VtX2FsZ29yaXRobSI6ICJzaGExIiwgImdldF9tZDUiOiBmYWxzZSwgImdldF9taW1lIjogdHJ1ZSwgImdldF9hdHRyaWJ1dGVzIjogdHJ1ZX19fQo=', 'exited': True}}
<test> GA stdout:
{"changed": false, "stat": {"exists": false}, "invocation": {"module_args": {"path": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf", "follow": false, "get_checksum": true, "checksum_algorithm": "sha1", "get_md5": false, "get_mime": true, "get_attributes": true}}}
<test> GA stderr:
<test> PUT /home/ato/.ansible/roles/nvidia.nvidia_driver/files/nvidia-persistenced-override.conf TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source", "mode": "wb+"}}
<test> GA return: {'return': 1089}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1089}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31574}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31574}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
Using module file /home/ato/.local/lib/python3.8/site-packages/ansible/modules/copy.py
<test> PUT /home/ato/.ansible/tmp/ansible-local-3938357gq_ahvb/tmp9ikfkhqz TO ~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py
<test> GA send: {"execute": "guest-file-open", "arguments": {"path": "~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py", "mode": "wb+"}}
<test> GA return: {'return': 1090}
<test> GA send: {"execute": "guest-file-close", "arguments": {"handle": 1090}}
<test> GA return: {'return': {}}
<test> EXEC /bin/sh -c 'chmod u+x '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "chmod u+x '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31577}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31577}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
<test> EXEC /bin/sh -c '/usr/bin/python3.6 '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py'"'"' && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "/usr/bin/python3.6 '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/AnsiballZ_copy.py' && sleep 0"]}}
<test> GA return: {'return': {'pid': 31580}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31580}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 1, 'out-data': 'CnsiZmFpbGVkIjogdHJ1ZSwgIm1zZyI6ICJTb3VyY2UgL3Jvb3QvLmFuc2libGUvdG1wL2Fuc2libGUtdG1wLTE2NjE5MzY3OTIuOTM4NjE3LTM5Mzg3OC0zNTM4MTkyMDUxNDMwL3NvdXJjZSBub3QgZm91bmQiLCAiaW52b2NhdGlvbiI6IHsibW9kdWxlX2FyZ3MiOiB7InNyYyI6ICIvcm9vdC8uYW5zaWJsZS90bXAvYW5zaWJsZS10bXAtMTY2MTkzNjc5Mi45Mzg2MTctMzkzODc4LTM1MzgxOTIwNTE0MzAvc291cmNlIiwgImRlc3QiOiAiL2V0Yy9zeXN0ZW1kL3N5c3RlbS9udmlkaWEtcGVyc2lzdGVuY2VkLnNlcnZpY2UuZC9vdmVycmlkZS5jb25mIiwgIl9vcmlnaW5hbF9iYXNlbmFtZSI6ICJudmlkaWEtcGVyc2lzdGVuY2VkLW92ZXJyaWRlLmNvbmYiLCAiZm9sbG93IjogZmFsc2UsICJjaGVja3N1bSI6ICJkOWRiNGI3M2Q4M2VjYTk4YWMyMTc5YTAyODQ3ZjFhYjBhMmNiODY2IiwgImJhY2t1cCI6IGZhbHNlLCAiZm9yY2UiOiB0cnVlLCAidW5zYWZlX3dyaXRlcyI6IGZhbHNlLCAiY29udGVudCI6IG51bGwsICJ2YWxpZGF0ZSI6IG51bGwsICJkaXJlY3RvcnlfbW9kZSI6IG51bGwsICJyZW1vdGVfc3JjIjogbnVsbCwgImxvY2FsX2ZvbGxvdyI6IG51bGwsICJtb2RlIjogbnVsbCwgIm93bmVyIjogbnVsbCwgImdyb3VwIjogbnVsbCwgInNldXNlciI6IG51bGwsICJzZXJvbGUiOiBudWxsLCAic2VsZXZlbCI6IG51bGwsICJzZXR5cGUiOiBudWxsLCAiYXR0cmlidXRlcyI6IG51bGx9fX0K', 'exited': True}}
<test> GA stdout:
{"failed": true, "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source not found", "invocation": {"module_args": {"src": "/root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source", "dest": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf", "_original_basename": "nvidia-persistenced-override.conf", "follow": false, "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866", "backup": false, "force": true, "unsafe_writes": false, "content": null, "validate": null, "directory_mode": null, "remote_src": null, "local_follow": null, "mode": null, "owner": null, "group": null, "seuser": null, "serole": null, "selevel": null, "setype": null, "attributes": null}}}
<test> GA stderr:
<test> EXEC /bin/sh -c 'rm -f -r '"'"'~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/'"'"' > /dev/null 2>&1 && sleep 0'
<test> GA send: {"execute": "guest-exec", "arguments": {"path": "/bin/sh", "capture-output": true, "arg": ["-c", "rm -f -r '~/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/' > /dev/null 2>&1 && sleep 0"]}}
<test> GA return: {'return': {'pid': 31583}}
<test> GA send: {"execute": "guest-exec-status", "arguments": {"pid": 31583}}
<test> GA return: {'return': {'exited': False}}
<test> GA return: {'return': {'exitcode': 0, 'exited': True}}
<test> GA stdout:
<test> GA stderr:
fatal: [test]: FAILED! => {
    "changed": false,
    "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866",
    "diff": [],
    "invocation": {
        "module_args": {
            "_original_basename": "nvidia-persistenced-override.conf",
            "attributes": null,
            "backup": false,
            "checksum": "d9db4b73d83eca98ac2179a02847f1ab0a2cb866",
            "content": null,
            "dest": "/etc/systemd/system/nvidia-persistenced.service.d/override.conf",
            "directory_mode": null,
            "follow": false,
            "force": true,
            "group": null,
            "local_follow": null,
            "mode": null,
            "owner": null,
            "remote_src": null,
            "selevel": null,
            "serole": null,
            "setype": null,
            "seuser": null,
            "src": "/root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source",
            "unsafe_writes": false,
            "validate": null
        }
    },
    "msg": "Source /root/.ansible/tmp/ansible-tmp-1661936792.938617-393878-3538192051430/source not found"
}

Any suggestions?

ubuntu 20.04 support

What exactly is needed to "officially" support 20.04?

I ran the role on a 20.04 machine and superficially it seems fine.

Support for AlmaLinux

I tried installing the NVIDIA driver on an AlmaLinux 9 system, but unfortunately the role lacks support for it.

Expected behavior

The NVIDIA driver role supports installation on AlmaLinux, a Red Hat family system.

Actual behavior

The NVIDIA driver role does not support AlmaLinux.

The cause of this behavior is in main.yml:

- name: redhat family install tasks
  include_tasks: install-redhat.yml
  when: ansible_os_family == 'RedHat'

Output of ansible [node] -m setup:

"ansible_os_family": "AlmaLinux",
"ansible_distribution_major_version": "9",

Please add support for AlmaLinux.
Note: I'm happy to create a pull request to add this support.
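
One possible shape for such a change, as a sketch rather than the maintainers' fix: extend the condition quoted above so that it also matches hosts where the distribution is reported as AlmaLinux. This assumes ansible_distribution is 'AlmaLinux' on the affected host, which the setup output above does not show.

```yaml
# Sketch only: assumes ansible_distribution reports 'AlmaLinux' on the affected host.
- name: redhat family install tasks
  include_tasks: install-redhat.yml
  when: ansible_os_family == 'RedHat' or ansible_distribution == 'AlmaLinux'
```

Whether install-redhat.yml then works unchanged on AlmaLinux 9 is a separate question that this condition does not answer.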

Role doesn't work on RHEL9

Hi,

There seems to be an issue when trying to install on RHEL9.

It tries to dnf install using the module name nvidia-driver:525.125.06,
while the actual module name for version 525.125.06 is nvidia-driver:525-dkms:20230804160227:f6b002c604:x86_64.

Are you planning to update the role to support RHEL9?

Add workaround for Ubuntu postinstall script bug?

I'm currently running into this longstanding issue on Ubuntu 20.04 with the following configuration:

  - role: nvidia.nvidia_driver
    nvidia_driver_ubuntu_branch: 340
    nvidia_driver_ubuntu_packages: [ 'nvidia-340' ]

I don't have this issue on machines using the default NVIDIA 450 packages, but I can't install those on all my systems due to hardware compatibility. It is also possible that some recent versions are affected in combination with some kernels (e.g., in the Launchpad thread above, a user reports having issues with version 465).

There seems to be a known error text and a known workaround, so it might be useful to add it to this role.

Specify the version of driver package

I have a problem installing a specific NVIDIA driver version on RHEL 8.

After PR #53, I was able to install the branch version of the driver.
But I need to install an older version rather than the most recent version of the branch.

Therefore, I specified nvidia_driver_package_version as follows:

nvidia_driver_package_state: present
nvidia_driver_package_version: '470.82.01-1.el8'
nvidia_driver_persistence_mode_on: yes
nvidia_driver_skip_reboot: no
nvidia_driver_module_file: /etc/modprobe.d/nvidia.conf
nvidia_driver_module_params: ''
nvidia_driver_add_repos: yes
nvidia_driver_branch: "470"

But it says the package cannot be found:

fatal: [GPU-SVR01]: FAILED! => changed=false
msg: No group nvidia-driver:470.82.01-1.el8 available.
results: [] 

Is the nvidia_driver_package_version format wrong?

Support for Ubuntu 20.04

Is support for Ubuntu 20.04 planned? The role currently cannot be installed on the latest Ubuntu.

Support for Debian

Hi there,

I wanted to thank you for the nice Ansible role. Unfortunately, Debian does not seem to be officially supported, but I got it working with a bit of variable overriding.
I would be happy if Debian were officially supported. Until then, maybe this will help someone who uses Debian to use this role anyway.

# my playbook
.....
  roles:
    - role: unix-basics
      tags: unix-basics
    - role: xanmanning.k3s
      tags: k3s
    - role: nvidia.nvidia_driver  # should run after cluster install
      vars:
        # See https://github.com/NVIDIA/ansible-role-nvidia-driver#role-variables
        nvidia_driver_ubuntu_cuda_repo_baseurl: 'https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64'  # enforced 'debian11'
        nvidia_driver_ubuntu_install_from_cuda_repo: yes
        nvidia_driver_persistence_mode_on: yes
        ansible_distribution: Ubuntu  # forcing in to the ubuntu part of the role
      when: ansible_hostname == 'k3s-worker1'  # we only have ONE node with NVIDIA
      tags:
        - nvidia
....

How to know the correct versions for nvidia_driver_branch and nvidia_driver_package_version

Hi,

I'm currently trying to deploy the NVIDIA drivers as well as the NVIDIA CUDA toolkit using this Ansible role on Ubuntu 22.04 machines.

First, I'm not quite sure about the impact of the nvidia_driver_ubuntu_install_from_cuda_repo flag. Aside from using different repositories to install the package, what would be the use case for choosing one option or the other?

I managed to install the drivers with nvidia_driver_branch: '525'. Previously, we were using nvidia_driver_branch: '515'. How do I know which values are available?

Finally, how does nvidia_driver_package_version differ from nvidia_driver_branch? Again, how do I know which values are available?

Thanks in advance for your help.

Emmanuel

"Running reboot with local connection would reboot the control node."

I've downloaded this role through Galaxy and am attempting to get Rapids.AI installed.

I've created a file nvidia-driver.yml:

---
- hosts: localhost
  roles: 
  - nvidia.nvidia_driver

and executed it with ansible-playbook nvidia-driver.yml. When it runs, the last task (a reboot instruction) fails with

"Running reboot with local connection would reboot the control node."

Rebooting the system, and running the playbook again, returns success.
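
A possible way to avoid the failing reboot task when targeting localhost, sketched under an assumption: that the nvidia_driver_skip_reboot variable, which appears in the variable lists of other issues on this page, controls whether the role reboots the host. The reboot then has to be done by hand, as described above.

```yaml
---
# Sketch only: assumes nvidia_driver_skip_reboot gates the role's reboot task.
- hosts: localhost
  roles:
    - role: nvidia.nvidia_driver
      vars:
        nvidia_driver_skip_reboot: yes
```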

Now my questions:

  • Can the README be extended to say something about use of
    this module? I am just familiar enough with Ansible to be dangerous,
    but Galaxy was new to me, and I ended up referring to this simple
    tutorial at https://www.jeffgeerling.com/blog/using-ansible-galaxy to
    make forward progress.
  • Is there a straightforward way to determine whether the install
    was successful? A verification step would be helpful.
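
A minimal sketch of a verification step that could run after the role and the reboot. These tasks are hypothetical and not part of the role; they assume nvidia-smi is on the PATH once the driver is installed.

```yaml
# Sketch only: hypothetical post-install check, not part of the role.
- name: check that the driver responds
  ansible.builtin.command: nvidia-smi
  register: nvidia_smi_result
  changed_when: false  # read-only query, never reports a change

- name: show nvidia-smi output
  ansible.builtin.debug:
    var: nvidia_smi_result.stdout_lines
```

If the kernel module is not loaded, nvidia-smi exits with a non-zero status and the first task fails, which makes the problem visible in the play output.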

Thanks

Driver does not Install or Configure SLI correctly

Hello,

I have found that the role does not work when two or more GPUs are installed.

If I install driver 515.105.01 via the .run installer, it works with dual GPUs, but running the playbook does not.

Rocky Linux 8.7, RTX 2080 Ti (dual).

Please do not hesitate to ask if you need any more info.

Signed drivers for Secure Boot

I'd like to use this role to provision drivers on a VM with Secure Boot enabled. What should I do to request signed drivers?

Support ubuntu 22.04

The new LTS release of Ubuntu was released a couple of months ago. It would be great if this release could also be supported.

Problem to install the NVIDIA drivers on CentOS 8

Hi,

I have a problem installing the NVIDIA driver on CentOS 8.

When installing for the first time, the latest version, 495.29.05, was installed because no version was specified.
I then specified the driver version in default/main.yml as shown below:

nvidia_driver_package_state: present
nvidia_driver_package_version: '470.57.02-1'
nvidia_driver_persistence_mode_on: yes

After that I ran the playbook, but the installation failed as follows:

TASK [nvidia.nvidia_driver : install driver packages RHEL/CentOS 8 and newer] *********************************************************************************
fatal: [gdp-glm-gpu001]: FAILED! => changed=false
msg: No group nvidia-driver:470.57.02-1 available.
results: []

It fails the same way when I change it to any of the versions below:

450.142.00
450.156.00-1
460.91.03-1
460.106.00-1
470.42.01-1
470.57.02-1

Use DKMS drivers as default, or provide an option to do that.

Maybe related to #4.

At https://github.com/NVIDIA/ansible-role-nvidia-driver/blob/master/tasks/install-redhat.yml#L21 the role installs the cuda-drivers meta-package, which points to the non-DKMS packages. In the repo (for example https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/) there are -dkms versions of the latest nvidia and cuda drivers. Installing the non-DKMS versions is brittle: after any kernel update you must either re-run this playbook or end up with broken drivers, and that is not obvious.

Either using DKMS drivers by default, or providing an option to do just that, would be useful. We can work around it by installing the individual -dkms packages by hand, but an NVIDIA-maintained playbook or DKMS metapackage would keep that workaround from breaking in the future.
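
The requested option could look roughly like the sketch below. It is an illustration only: nvidia_driver_use_dkms is a hypothetical variable that the role does not currently define, and the DKMS package name is taken from the other reports on this page, not checked against every repository.

```yaml
# Sketch only: nvidia_driver_use_dkms is a hypothetical variable, not an
# existing option of the role.
- name: Install the NVIDIA driver, preferring DKMS packages when requested
  ansible.builtin.package:
    name: "{{ 'nvidia-driver-latest-dkms' if nvidia_driver_use_dkms | default(false) else 'cuda-drivers' }}"
    state: present
```

Defaulting the variable to false would keep the current behaviour for existing users, while hosts that receive regular kernel updates could opt in.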

Installing a specific package version fails

Installing a specific package version fails: the package is not found.
I was able to get it to download by changing this line:

name: "{{ nvidia_driver_package_version | ternary('nvidia-driver-latest-dkms='+nvidia_driver_package_version, 'nvidia-driver-latest-dkms') }}"

There is a typo in the constructed package name: a '=' instead of a '-':
'nvidia-driver-latest-dkms='+nvidia_driver_package_version
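
With the separator changed, the line might read as in the sketch below. This assumes a yum/dnf-based host, where a pinned package is written name-version; an apt-based task would legitimately keep '='. The sketch also uses Jinja's ~ operator for string concatenation, which is a stylistic choice and not part of the reported fix.

```yaml
# Sketch only: assumes a yum/dnf-based host, where a pinned package is
# written name-version (apt would use name=version instead).
name: "{{ nvidia_driver_package_version | ternary('nvidia-driver-latest-dkms-' ~ nvidia_driver_package_version, 'nvidia-driver-latest-dkms') }}"
```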
