Comments (4)
This bug is caused by breaking changes in the NVIDIA Driver R535.xx series (affected versions are >= 535.43, < 535.98).
TL;DR:
- Avoid NVIDIA Drivers between 535.43 and 535.86. These are broken. If you must use one of these driver versions, use `pip install nvidia-ml-py==12.535.77` as a workaround.
- If you use NVIDIA Drivers 535.104.05+ and pynvml 12.535.108+, process information will be OK.
NVIDIA Driver Changes:
- 535.43.xx (NVIDIA/nvidia-settings@39c3e28) added a field `usedGpuCcProtectedMemory` to `nvmlProcessInfo_st`, which breaks the process information API. The only compatible `pynvml` version is 12.535.77.
- 535.54.xx (still affected)
- 535.86.xx (still affected)
- 535.98.xx (NVIDIA/nvidia-settings@0cb3bef) reverts the change, removing the field `usedGpuCcProtectedMemory` from `nvmlProcessInfo_st` (v2 API).
- 535.104.05 (NVIDIA/nvidia-settings@74cae7f): everything is now fixed. Adds `nvmlProcessInfo_v2_st` again without `usedGpuCcProtectedMemory` (which is correct). Needs `pynvml` >= 12.535.108.
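To illustrate why one extra struct field is enough to break the process list, here is a rough `ctypes` sketch. The field order and types below are my assumption for illustration (only `usedGpuCcProtectedMemory` and the struct names come from the changes above), and the class names are hypothetical, not the real pynvml definitions. It simulates a driver that writes entries with the extra field and a binding that reads them without it.

```python
import ctypes

# Assumed base layout, for illustration only (not copied from the NVML headers).
BASE_FIELDS = [
    ("pid", ctypes.c_uint),
    ("usedGpuMemory", ctypes.c_ulonglong),
    ("gpuInstanceId", ctypes.c_uint),
    ("computeInstanceId", ctypes.c_uint),
]


class ProcessInfoWithoutField(ctypes.Structure):
    """Layout the Python binding expects (hypothetical name)."""
    _fields_ = BASE_FIELDS


class ProcessInfoWithField(ctypes.Structure):
    """Layout an affected driver writes: the same fields plus one more."""
    _fields_ = BASE_FIELDS + [("usedGpuCcProtectedMemory", ctypes.c_ulonglong)]


# The "driver" fills an array of two entries using the larger layout.
driver_array = (ProcessInfoWithField * 2)()
driver_array[0].pid = 1111
driver_array[0].usedGpuCcProtectedMemory = 0
driver_array[1].pid = 2222

# The "binding" reinterprets the same bytes using the smaller layout.
misread = (ProcessInfoWithoutField * 2).from_buffer_copy(bytes(driver_array))

print("entry size without field:", ctypes.sizeof(ProcessInfoWithoutField))
print("entry size with field:   ", ctypes.sizeof(ProcessInfoWithField))
print("pids as read by binding: ", [entry.pid for entry in misread])
```

The first entry still reads correctly because both layouts start at the same offset, but every later entry is read from the wrong position, so the second PID no longer matches what was written.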
Cross-ref: XuehaiPan/nvitop#88 (comment)
from gpustat.
We won't be adding monkey-patching because managing all the combinations is extremely complex. The buggy NVIDIA driver versions (between 535.43 and 535.86) and nvidia-ml-py 12.535.77 should be avoided, but there is a working workaround. I've added a warning message that is shown when such incompatible driver/pynvml versions are detected.
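For anyone who wants a similar check in their own scripts, here is a minimal sketch of the version pairing described in this thread. It is not gpustat's actual implementation; `parse_version` and `incompatibility_message` are hypothetical helpers, and the rule that 12.535.77 mismatches drivers outside the affected range is my reading of the comments above.

```python
import warnings

# Boundaries taken from the thread: drivers >= 535.43 and < 535.98 are affected.
AFFECTED_MIN = (535, 43)
AFFECTED_MAX = (535, 98)
# The only nvidia-ml-py release reported to work with the affected drivers.
WORKAROUND_PYNVML = (12, 535, 77)


def parse_version(text):
    """Turn a dotted version such as '535.104.05' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def incompatibility_message(driver_version, pynvml_version):
    """Return a warning string for a known-bad pairing, or None otherwise."""
    driver = parse_version(driver_version)
    pynvml = parse_version(pynvml_version)
    driver_affected = AFFECTED_MIN <= driver < AFFECTED_MAX

    if driver_affected and pynvml != WORKAROUND_PYNVML:
        return (
            f"Driver {driver_version} is in the affected range; "
            "process information may be wrong unless nvidia-ml-py==12.535.77 is used."
        )
    if not driver_affected and pynvml == WORKAROUND_PYNVML:
        return (
            "nvidia-ml-py 12.535.77 only matches the affected drivers "
            f"(>= 535.43, < 535.98), not {driver_version}."
        )
    return None


message = incompatibility_message("535.86.05", "12.535.108")
if message is not None:
    warnings.warn(message)
```

The version strings are passed in explicitly here so the sketch runs without a GPU; in a real tool they would come from the driver and from the installed package metadata.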
I see, thanks for the report. I will make an update to support the new nvidia driver. Probably the same issue as #157.
Hi, I'll be using nvidia-ml-py 12.535.77 for now. Many thanks for the help.