atelier-sync-fix's Introduction

What this does

Improves GPU utilization in D3D11-based Atelier games and can dramatically improve performance as a result.

A quick test on my 6900XT in early areas of Sophie 2 shows the following improvements (on Proton, using DXVK 1.10.1 and Mesa 22.0):

Location                      Before   After
Menu (1440p)                  104.7    144.0
Menu (4k)                     63.4     82.3
Commercial District (1440p)   83.3     144.0
Commercial District (4k)      51.0     72.9

The issue

The engine of these games has serious issues with GPU under-utilization ever since they switched to D3D11 with Firis, due to the way data is exchanged between the CPU and GPU. It's likely that they were trying to emulate D3D9 resource management in a really bad way and never bothered to fix it, and with each new game it gets worse. Sophie 2 sets a sad new record with roughly 20 GPU sync points in the main menu.

Basically, what happens is as follows (in pseudo-code):

ID3D11Buffer* stagingBuffer;
D3D11_MAPPED_SUBRESOURCE mapped;
D3D11_BUFFER_DESC desc;
// ..
desc.Usage = D3D11_USAGE_STAGING;
desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ | D3D11_CPU_ACCESS_WRITE;
d3d11Device->CreateBuffer(&desc, nullptr, &stagingBuffer);
d3d11Context->CopyResource(stagingBuffer, gpuResource);  // <- this is executed on the GPU
d3d11Context->Map(stagingBuffer, 0, D3D11_MAP_READ_WRITE, 0, &mapped); // <- this waits for CopyResource to complete
// ... some CPU work here to write to the mapped buffer
d3d11Context->Unmap(stagingBuffer, 0);
d3d11Context->CopySubresourceRegion(gpuResource, 0, ..., stagingBuffer, 0, ...); // <- this is done on the GPU again

Not only does this force full CPU-GPU synchronization at the start of each frame, it also happens multiple times per frame, back-to-back, and with all kinds of resources (vertex buffers, a render target, you name it). They somehow even manage to have other buffers with D3D11_USAGE_DYNAMIC or D3D11_USAGE_STAGING in that mess which they could map directly, but no, Gust prefers GPU synchronization.

The fact that there are multiple sync points back-to-back makes this especially problematic, since submitting those individual copy commands that happen between calls to Map is fairly costly on the driver side.

The solution

It's actually quite simple: Instead of doing all those nasty CopyResource calls on the GPU, we just do them on the CPU.

However, we can't just map the GPU resources directly for the most part, so for each GPU resource that's being copied into a staging buffer, we create another staging buffer - but unlike the game, we keep it around, and update it each time the GPU resource itself gets updated. By the time the game calls CopyResource, the GPU may not be done using all those shadow resources yet, so we will still synchronize, but at worst we'll now synchronize with one single copy command from the previous frame, not with dozens of copy commands in the current frame.
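The shadow-copy idea can be sketched without any D3D11 dependency. The model below is only an illustration under stated assumptions, not the project's actual code: `GpuResource` and `ShadowCache` are hypothetical names, plain byte vectors stand in for D3D11 resources, and the real fix has to hook the D3D11 entry points and still wait for the GPU where the shadow is in use. It shows the bookkeeping only: every write to a resource is mirrored into a persistent shadow, and the game's copy-to-staging step is then served from the shadow on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a GPU-only resource (not a D3D11 type).
struct GpuResource {
    std::vector<std::uint8_t> data; // contents as the GPU would see them
};

// Hypothetical cache that keeps one persistent CPU-readable shadow per resource.
class ShadowCache {
public:
    // Models an update of the GPU resource: the write is applied to the
    // resource and mirrored into its shadow so the shadow stays current.
    void update(GpuResource& res, std::size_t offset,
                const void* src, std::size_t size) {
        assert(offset <= res.data.size() && size <= res.data.size() - offset);
        if (size == 0) return;
        const GpuResource* key = &res;
        auto it = shadows_.find(key);
        if (it == shadows_.end()) {
            // First sighting: seed the shadow from the current contents
            // (this models the one-time copy that creates the shadow).
            it = shadows_.emplace(key, res.data).first;
        }
        std::memcpy(res.data.data() + offset, src, size);
        std::memcpy(it->second.data() + offset, src, size);
    }

    // Models the replacement for the game's copy into a staging buffer:
    // a plain CPU copy from the shadow. Returns false when no shadow exists,
    // in which case a caller would fall back to the original GPU copy.
    bool copyToStaging(const GpuResource& res,
                       std::vector<std::uint8_t>& staging) const {
        auto it = shadows_.find(&res);
        if (it == shadows_.end()) return false;
        staging = it->second;
        return true;
    }

private:
    std::unordered_map<const GpuResource*, std::vector<std::uint8_t>> shadows_;
};
```

The trade-off matches the caveats listed for the fix: each tracked resource costs an extra copy in memory, and every update does extra CPU work to keep the shadow in step.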

Caveats

  • Memory usage as well as CPU utilization are increased.
  • Not all GPU sync points are caught. There are some genuine data dependencies that can't easily be worked around this way, so there will still be situations where GPU load is low, or where the game will stutter briefly.
  • I haven't tested this on Windows at all yet, or in any sort of long gameplay sessions. There may be stability issues.

atelier-sync-fix's People

Contributors

doitsujin

atelier-sync-fix's Issues

White screen in Atelier Ryza (first game) on Steam Deck

I'm playing Atelier Ryza (first game) on my Steam Deck running the stock image of SteamOS.
Applying the DLL in the game's files does nothing. I have to set up an override with this launch command:
WINEDLLOVERRIDES="d3d11=n,b" %command%
As instructed here https://www.protondb.com/app/1121560
With the override, the game starts, but after the logos I see only a white screen. The game itself is running, since I can hear the music and sound effects.
I can't try other Atelier games because Ryza is the only one I own on Steam. I have a PC with an RX 6800 and Manjaro Linux that I could test on, but not right now, and I'd rather play this game on the Deck anyway.

I know the DLL gets loaded with the override command because atfix.log gets created.
This is the content of the log file:
Loading d3d11.dll successful, entry points are:
D3D11CreateDevice @ 0x7ffce7eaf640
D3D11CreateDeviceAndSwapChain @ 0x7ffce7eaf750
Hooking device 0x516480
ID3D11Device::CreateDeferredContext @ 0x7ffce7f14200 -> 0x7ffc9c2b58a0
Hooking context 0x517fd8
ID3D11DeviceContext::ClearRenderTargetView @ 0x7ffce7fb5b50 -> 0x7ffc9c2b2a60
ID3D11DeviceContext::ClearUnorderedAccessViewFloat @ 0x7ffce7fb7d30 -> 0x7ffc9c2b2b10
ID3D11DeviceContext::ClearUnorderedAccessViewUint @ 0x7ffce7fb8f00 -> 0x7ffc9c2b2bc0
ID3D11DeviceContext::CopyResource @ 0x7ffce7fbaf90 -> 0x7ffc9c2b3950
ID3D11DeviceContext::CopySubresourceRegion @ 0x7ffce7fbd930 -> 0x7ffc9c2b3bc0
ID3D11DeviceContext::CopyStructureCount @ 0x7ffce7fbbfb0 -> 0x7ffc9c2b1630
ID3D11DeviceContext::Dispatch @ 0x7ffce7fc3700 -> 0x7ffc9c2b2e50
ID3D11DeviceContext::DispatchIndirect @ 0x7ffce7fc2cd0 -> 0x7ffc9c2b2c70
ID3D11DeviceContext::OMSetRenderTargets @ 0x7ffce7fe22a0 -> 0x7ffc9c2b2d50
ID3D11DeviceContext::OMSetRenderTargetsAndUnorderedAccessViews @ 0x7ffce7fdcc10 -> 0x7ffc9c2b2f30
ID3D11DeviceContext::UpdateSubresource @ 0x7ffce8016bb0 -> 0x7ffc9c2b1490

Text Bug with Atelier Rorona

Using Version 0.5 of the sync fix on Atelier Rorona The Alchemist of Arland DX makes the text bug out. This happens on both the Japanese and English language versions.

[Screenshot: 20230218120237_1]

[Screenshot: 20230218122341_1]

The sync fix isn't required for this game, since performance is good without it, but I felt it was worth noting because this is another D3D11-based Atelier game on what seems to be the same engine as the rest of the series.

Unable to launch Atelier Ryza

Hello,

Maybe I am doing something wrong, but the game doesn't seem to launch with the .dll. The game window pops up for a second and then closes immediately.

This is running on an RTX 3070. Do you require any additional information?

atfix.log attached.
atfix.log

Tested the fix in other KT games and got improvements

Dynasty Warriors 9 - Big improvement (53 - 60 fps)
Arslan: The Warriors of Legend - Big improvement (58 - 60 fps)
BERSERK and the Band of the Hawk - Big improvement (55 - 60 fps)
SAMURAI WARRIORS: Spirit of Sanada - Small improvement

I will test other KT games and update this list.

Any chance of this fixing the problems on steam deck?

Hi, this fix makes the games run great on the Deck, but from Ryza 2 onwards the loading screen can sometimes freeze. Is this something that can be solved within this fix or looked at by the dev? Thanks so much for the fix!

Mapping shadow resources forces thread sync in GPU driver

Ryza really likes to update certain vertex buffers by copying to a staging buffer, mapping it READ_WRITE, unmapping, and copying back to the vertex buffer. While the shadow buffers prevent this from becoming a full GPU sync, the map still causes a sync with internal GPU driver threads on both DXVK and the Windows AMD driver. I attempted a horrible hack to avoid this (28b19cc), and it turns out that in addition to letting me finally run my synthesis UI at 144fps, it also clears up a fairly bad fps drop that you get in Ryza 2 every time you jump on a Steam Deck. (Sadly that hack is so hacky it doesn't even manage to work on all Atelier games, breaking Sophie DX.)

So if anyone has a less-hacky solution to this, I'd be very interested.

Atelier Ryza 3 3d model motion problem

In Atelier Ryza 3, during character movement, story progression, and fairy riding, the character and fairy actions speed up and appear jittery. Removing d3d11.dll restores normal behavior.

This issue occurs in all versions after the 1.1.0.0 update.

atfix.log attached.

atfix.log
