I tried this for 418.81 on Windows 10 64 bits and is not working. Our software use

Does this actually override CUDA or NVENC sessions? about nvidia-patch HOT 36 CLOSED

keylase commented on August 22, 2024

Does this actually override CUDA or NVENC sessions?

from nvidia-patch.

Comments (36)

jantenhove commented on August 22, 2024 4

I have created a 'session bump' program which bumps the sessions for Direct3D by creating a configurable number of CUDA encoding sessions. Code + binary can be found here: https://github.com/jantenhove/NvencSessionLimitBump

Anyone willing to test/comment?


jantenhove commented on August 22, 2024 3

The PR is merged. I commented on the PR report. It's nice to see it working for others!


matthew1972 commented on August 22, 2024 3

The D3D bump is working for me. Great work, guys!


imayo commented on August 22, 2024 2

Yeah, it has no problems with it, it can encode 3 files concurrently.
I have seen debug logs and it is using

It seems that it is using CUDA functions to use a CUDA device as input for NVENC. NVENC can be initialized with CUDA or DirectX, and we use a DirectX device, so maybe this patch currently only unlocks encoding sessions initialized with NV_ENC_DEVICE_TYPE_CUDA.

EDIT: We can now encode more than 2 concurrent streams after running the ffmpeg code sample.


Snawoot commented on August 22, 2024 2

@jantenhove Thank you for your code!


jantenhove commented on August 22, 2024 2

@Snawoot Thank you for analyzing the problem so quickly.
If there is demand I can make a dedicated, open source program to unlock a configurable number of D3D11 sessions.


jantenhove commented on August 22, 2024 2

New release is uploaded: https://github.com/jantenhove/NvencSessionLimitBump/releases


imayo commented on August 22, 2024 1

I can confirm. I was able to encode more than 2 instances with our software. If I reboot Windows after that, I can't encode more than 2 again, but if I run the ffmpeg code again, we can encode more than 2 instances again.
My guess: maybe the code you are modifying in nvcuvid, when initialized, enables some flag that NVIDIA uses to allow this Windows session to encode more than 2 streams.


jantenhove commented on August 22, 2024 1

@Snawoot Sorry for uploading the wrong binary. I debugged the x64 version, but created an x86 release build. Anyway, here is an x64 build: https://www.filehosting.org/file/details/784619/nvenc-patch-test.exe


Snawoot commented on August 22, 2024 1

Here are my results. A journey through about ten levels of call stack leads to D3D and then to nvwgf2umx.dll. This patch was applied in the memory of a process debugged with x64dbg. With this patch the D3D encoding session opens successfully, and the bumped limit persists even if the process is restarted with an unmodified nvwgf2umx.dll.

This discovery may also help Plex users on Windows, since Plex currently uses dxva2 and MF.

But there are two problems:

1. I didn't test real encoding because the test binary only opens encoding sessions. But I think it most likely works.
2. nvwgf2umx.dll is a driver component and is covered by system integrity protection or digital signatures. Normally I can't modify this file even with Administrator privileges. Of course, I can modify the file on disk from an external OS. I actually did that by mounting the qcow2 image of the Windows system on my host Linux system. It appears the system won't load the modified library.

It seems to me that it is more practical to bump D3D sessions by bumping CUDA sessions with some sort of minimal binary that opens several sessions, because it is simpler to add a single one-shot executable to autostart than to fight system protection every time. Also, maintaining one binary patch takes less effort than maintaining two.

If someone feels up to implementing such a bumping binary as a standalone application with source code, feel free to make a Pull Request. A parameterized script for FFmpeg will also do. I wonder if ffmpeg has some sort of dummy input which fits best here.


Snawoot commented on August 22, 2024 1

And here is a minimal ffmpeg script which bumps 10 sessions with nullsrc input and null output: https://gist.github.com/Snawoot/243c53bb52044297f5ceb6125d59dc93 (don't forget to set the actual ffmpeg path in the script).

I'll add this with a proper description to the win readme.md, thereby closing this issue.
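
As a rough illustration of the idea (not the contents of the linked gist), the sketch below builds an ffmpeg command line with one synthetic nullsrc input and one h264_nvenc output per session, each discarded by the null muxer. The function name, the single-frame limit, and the exact option layout are assumptions made for this example.

```python
from typing import List


def build_bump_command(ffmpeg_path: str, sessions: int = 10) -> List[str]:
    """Build an ffmpeg argv that opens `sessions` NVENC encoders in one run.

    Hypothetical helper: the option layout is an assumption, not the gist's script.
    """
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    # One synthetic input: the lavfi nullsrc source needs no input file.
    argv = [ffmpeg_path, "-hide_banner", "-f", "lavfi", "-i", "nullsrc"]
    for _ in range(sessions):
        # Each output gets its own h264_nvenc encoder, encodes a single frame,
        # and is discarded by the null muxer.
        argv += ["-c:v", "h264_nvenc", "-frames:v", "1", "-f", "null", "-"]
    return argv
```

The resulting list could be passed to `subprocess.run(build_bump_command(path_to_ffmpeg, 10), check=True)`, where `path_to_ffmpeg` is whatever the actual ffmpeg location is on the machine.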


Snawoot commented on August 22, 2024 1

@jantenhove

Your app works just fine with VC++ redist installed.

I asked some colleagues to see if this code can be statically linked against the VC++ runtime in order to simplify things for users and make the app standalone. @svjukov modified the project to link the runtime statically, leaving the link to nvcuda dynamic. I tested the app on a clean system without the VC2017 Redist and it works without a hitch.

Sergey prepared a PR that is awaiting your review. I think it is a very useful change and hope it gets merged.


jantenhove commented on August 22, 2024 1

I will do that tomorrow. I'm currently on mobile. Thanks for all your work!


Snawoot commented on August 22, 2024 1

Thank you!


niXta1 commented on August 22, 2024 1

Good job guys! ⭐⭐⭐⭐⭐


niXta1 commented on August 22, 2024

Do you use ffmpeg?


imayo commented on August 22, 2024

To clarify things :)
The software we are developing is Dixper (www.dixper.gg). We use the NVIDIA Video Codec SDK directly and also support the GRID SDK. I tried applying the patch and running our server to see if we could connect more than 2 peers with the NVENC encoder, but after patching it behaved just as it did without the patch: the 3rd NVENC instance fails with, I think, NV_ENC_ERR_OUT_OF_MEMORY. After that I checked whether nvcuvid.dll was loaded in the process, and it is not, so maybe we are using something quite different from ffmpeg/Plex.


imayo commented on August 22, 2024

> Do you use ffmpeg?

No, we don't. We use the NVIDIA Video Codec SDK and GRID SDK directly.


Snawoot commented on August 22, 2024

@imayo

Nice to meet you. Let's say I'm the guy with the debugger and disassembler here.

This patch is intended to patch NVENC (and only NVENC). It should work with any NVENC-enabled software, but the test criterion is still ffmpeg, since a lot of software is derived from it.

About your concern over the name of the patched library: as I mentioned before, nvcuvid.dll is loaded dynamically by NvEncodeAPI.dll.

Your error NV_ENC_ERR_OUT_OF_MEMORY suggests that you really are using NVENC, but something goes wrong. For some reason the library is not patched, or was rolled back by System File Protection after patching, or a 32-bit library is used somehow.

In order to sort things out, please perform a test with 64-bit ffmpeg. You can run 3 simultaneous transcodes by issuing a command like this:

ffmpeg -i input.avi -s 1280x720 -v:c h264_nvenc output1.mp4 -s 640x480 -v:c h264_nvenc output2.mp4 -s 320x240 -v:c h264_nvenc output3.mp4

If it simply fails with the same error, we will know the patch is not applied. If it works, we shall look for the problem somewhere else.


imayo commented on August 22, 2024

> @imayo
>
> Nice to meet you. Let's say I'm the guy with the debugger and disassembler here.
>
> This patch is intended to patch NVENC (and only NVENC). It should work with any NVENC-enabled software, but the test criterion is still ffmpeg, since a lot of software is derived from it.
>
> About your concern over the name of the patched library: as I mentioned before, nvcuvid.dll is loaded dynamically by NvEncodeAPI.dll.
>
> Your error NV_ENC_ERR_OUT_OF_MEMORY suggests that you really are using NVENC, but something goes wrong. For some reason the library is not patched, or was rolled back by System File Protection after patching, or a 32-bit library is used somehow.
>
> In order to sort things out, please perform a test with 64-bit ffmpeg. You can run 3 simultaneous transcodes by issuing a command like this:
>
> ffmpeg -i input.avi -s 1280x720 -v:c h264_nvenc output1.mp4 -s 640x480 -v:c h264_nvenc output2.mp4 -s 320x240 -v:c h264_nvenc output3.mp4
>
> If it simply fails with the same error, we will know the patch is not applied. If it works, we shall look for the problem somewhere else.

I am getting:

Invalid loglevel "h264_nvenc". Possible levels are numbers or:
"quiet"
"panic"
"fatal"
"error"
"warning"
"info"
"verbose"
"debug"
"trace"

Maybe the binaries I downloaded are compiled without NVENC support?


Snawoot commented on August 22, 2024

@imayo No, it's unlikely. The binaries on the FFmpeg site refer to this site, and they are built with NVENC.

It's probably a typo in your command line. You may post it here and we'll take a look.


imayo commented on August 22, 2024

> @imayo No, it's unlikely. The binaries on the FFmpeg site refer to this site, and they are built with NVENC.
>
> It's probably a typo in your command line. You may post it here and we'll take a look.

Yeah, got it: -c:v instead of -v:c.
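
For reference, here is a small sketch that rebuilds the three-output test command with `-c:v` in place of `-v:c`. ffmpeg appears to have read `-v:c` as its `-v` (loglevel) option, which would explain the "Invalid loglevel" error quoted above. The helper name is hypothetical; the file names and sizes are taken from the original command.

```python
from typing import List, Sequence, Tuple


def build_multi_transcode(src: str, outputs: Sequence[Tuple[str, str]]) -> List[str]:
    """Build one ffmpeg argv with a separate h264_nvenc output per (size, path) pair."""
    argv = ["ffmpeg", "-i", src]
    for size, path in outputs:
        # -c:v selects the video codec for the output file that follows.
        argv += ["-s", size, "-c:v", "h264_nvenc", path]
    return argv


OUTPUTS = [
    ("1280x720", "output1.mp4"),
    ("640x480", "output2.mp4"),
    ("320x240", "output3.mp4"),
]

# Prints the corrected command line as a single string.
print(" ".join(build_multi_transcode("input.avi", OUTPUTS)))
```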


Snawoot commented on August 22, 2024

It looks weird. Is it possible that multiple versions of nvcuvid.dll co-exist in your dev environment, perhaps installed with some additional SDK package?


Snawoot commented on August 22, 2024

No, it removes the conditional jump leading to the failure return, which is taken when one of the subroutines indicates that the number of active sessions is above the limit.

You should probably use x64dbg and see which libraries are getting loaded. This debugger has a useful feature that sets a breakpoint on each DLL load, including programmatically initiated dynamic loads. I bet different sets of libraries co-exist in the system.
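
To make the "remove a conditional jump" idea concrete, here is a toy sketch that overwrites a short jump with NOP bytes inside a byte string. The byte sequence below is invented for illustration (a `cmp eax, 2` followed by a `jae`) and is not taken from any NVIDIA library; the real patch locations and bytes are not stated in this thread.

```python
NOP = 0x90  # x86 single-byte no-op


def nop_out(code: bytes, pattern: bytes, offset: int, length: int) -> bytes:
    """Find `pattern` exactly once in `code` and replace `length` bytes,
    starting `offset` bytes into the match, with NOPs."""
    if offset < 0 or length < 0 or offset + length > len(pattern):
        raise ValueError("patched range must lie inside the pattern")
    first = code.find(pattern)
    if first < 0:
        raise ValueError("pattern not found")
    if code.find(pattern, first + 1) >= 0:
        raise ValueError("pattern is ambiguous")
    start = first + offset
    patched = bytearray(code)
    patched[start:start + length] = bytes([NOP]) * length
    return bytes(patched)


# Invented example: cmp eax, 2 / jae +5 / xor eax, eax / ret
TOY_CODE = bytes.fromhex("83f802" "7305" "31c0" "c3")
# Match the compare plus the jump, then blank out only the 2-byte jump.
PATCHED = nop_out(TOY_CODE, bytes.fromhex("83f8027305"), offset=3, length=2)
```

Requiring a unique match is a deliberate choice in this sketch: patching the wrong occurrence of a short byte pattern would silently corrupt unrelated code.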


imayo commented on August 22, 2024

> No, it removes the conditional jump leading to the failure return, which is taken when one of the subroutines indicates that the number of active sessions is above the limit.
>
> You should probably use x64dbg and see which libraries are getting loaded. This debugger has a useful feature that sets a breakpoint on each DLL load, including programmatically initiated dynamic loads. I bet different sets of libraries co-exist in the system.

I will try, although this is beyond my skills :)
And why do you think that, after executing ffmpeg (which calls the patched nvcuvid), our software can initialize more NVENC instances?


jantenhove commented on August 22, 2024

I have exactly the same problem. We use NVENC directly with Direct3D. After patching we still get NV_ENC_ERR_OUT_OF_MEMORY for the third session. When analyzing the libraries that are loaded by the executable, we see 'nvEncodeAPI64.dll' getting loaded. Nvcuvid.dll is not loaded by our binary.

I can also confirm that running the ffmpeg command above enables 1 extra session for our software. The fourth session still fails with the out of memory error. When I change the ffmpeg command to create 6 outputs, I can use 6 NVENC sessions in our software.


Snawoot commented on August 22, 2024

@jantenhove Hello,

Is there some way I can reproduce it on a clean Windows machine? Some minimal executable would probably be ideal.


jantenhove commented on August 22, 2024

> @jantenhove Hello,
>
> Is there some way I can reproduce it on a clean Windows machine? Some minimal executable would probably be ideal.

Thanks for reopening. I will create a simple test program based on the Direct3D sample from the SDK.


jantenhove commented on August 22, 2024

@Snawoot
I've created a simple test program: https://www.filehosting.org/file/details/784491/nvenc-patch-test.exe
It tries to create 3 encoding sessions on each graphics card and shows whether it succeeds or fails. You probably need the VS 2017 redistributable installed.

When it fails after creating 2 encoding sessions, you can run the ffmpeg command from #53 (comment) (with c:v instead of v:c). After that, you should be able to create more than 2 encoding sessions until you restart the computer.
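
The probing logic described here can be simulated without a GPU. The sketch below is not the test program's source: `FakeDriver` is a made-up stand-in for the driver's session limit, and `probe_sessions` only illustrates the "open sessions one by one, keep them open, stop at the first failure" loop.

```python
from typing import Callable, List


class FakeDriver:
    """Hypothetical stand-in for a driver that allows `limit` concurrent sessions."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.open_count = 0

    def open_session(self) -> int:
        if self.open_count >= self.limit:
            # Mirrors the error name reported in this thread.
            raise RuntimeError("NV_ENC_ERR_OUT_OF_MEMORY")
        self.open_count += 1
        return self.open_count


def probe_sessions(open_session: Callable[[], object], attempts: int = 3) -> int:
    """Open up to `attempts` sessions, holding each one, and return how many succeeded."""
    held: List[object] = []
    for _ in range(attempts):
        try:
            held.append(open_session())
        except RuntimeError:
            break
    return len(held)
```

With a simulated limit of 2, `probe_sessions(FakeDriver(2).open_session)` stops after two sessions; with a higher simulated limit all three attempts succeed.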


Snawoot commented on August 22, 2024

@jantenhove Thank you! I'm going to start looking at it.


Snawoot commented on August 22, 2024

@jantenhove This is a 32-bit binary which uses libraries from %WINDIR%\SysWOW64. These are the 32-bit versions of the libraries, and they are not patched. As for 32-bit apps, I will not support them, because a 32-bit patch requires almost the same effort as the 64-bit one even though it is a legacy platform.

I can also confirm that nvcuvid.dll isn't loaded at all in this app.

Could you please provide an x64 build of your test app? Maybe it is possible to derive a solution which fits both D3D and CUDA encoding sessions.
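
A check like the one described above can be sketched as a small helper that takes a list of module paths (for example, copied from a debugger's module view) and reports which ones come from SysWOW64 and whether nvcuvid.dll is among them. The function name and the sample paths are illustrative assumptions, not output from the actual test app.

```python
import ntpath
from typing import Dict, Iterable, List, Union


def summarize_modules(paths: Iterable[str]) -> Dict[str, Union[bool, List[str]]]:
    """Report SysWOW64-hosted modules and whether nvcuvid.dll appears in `paths`."""
    from_syswow64: List[str] = []
    names = set()
    for path in paths:
        # ntpath handles Windows-style paths on any OS; normcase lowercases them.
        norm = ntpath.normcase(path)
        names.add(ntpath.basename(norm))
        if "\\syswow64\\" in norm:
            from_syswow64.append(path)
    return {
        "nvcuvid_loaded": "nvcuvid.dll" in names,
        "from_syswow64": from_syswow64,
    }


# Illustrative module list, not captured from a real process.
SAMPLE = [
    r"C:\Windows\SysWOW64\nvEncodeAPI.dll",
    r"C:\Windows\SysWOW64\d3d11.dll",
]
REPORT = summarize_modules(SAMPLE)
```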


Snawoot commented on August 22, 2024

I just made an important discovery.

The 32-bit ffmpeg build exhibits exactly the same behavior. It fails to open 3 sessions on a patched system, but after a successful run of the 64-bit version of ffmpeg it becomes capable of opening 3 sessions.

@jantenhove your x64 test binary will be very helpful for finding the root of the problem and distinguishing between CUDA vs D3D mode and 32-bit vs 64-bit.


Snawoot commented on August 22, 2024

@jantenhove Yes, this will be much better than the current trick with ffmpeg, so we'd appreciate such a contribution.


Snawoot commented on August 22, 2024

@jantenhove I'm going to set up a clean VM with a Windows 10 installation within a couple of days. I'm planning to check which dependencies are required (if any) and do the whole walkthrough manually.


jantenhove commented on August 22, 2024

@Snawoot In theory you'll only need the Visual Studio 2017 Redistributable (x64) when using the binary. When compiling yourself, you need the NVIDIA Video Codec SDK + CUDA SDK installed. I've created a small readme: https://github.com/jantenhove/NvencSessionLimitBump/blob/master/readme.md


Snawoot commented on August 22, 2024

Thank you! I'll have to update the docs for this patch to add a reference to the new workaround. Could you please issue a new release with a static binary, or add a static binary to the current latest release?
