Comments (9)
First of all, this commit entirely broke Nintendo Switch support:
After LiStartConnection() is called, the streamed picture and sound appear, but input gets stuck immediately. After a few seconds audio and video also get stuck and the app becomes unresponsive.
If you put a Limelog() call before and after enet_socket_wait(), can you tell if it's hanging inside that function call? If it's not hanging, can you print the value of condition and the return value of enet_socket_wait() to see if it's failing or something?
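The logging suggested above can be sketched roughly as follows. This is only an illustration: LOG is a stand-in for Limelog(), and fake_socket_wait() is a hypothetical placeholder with a simplified signature (no socket argument), not the real enet_socket_wait().

```c
#include <stdio.h>

/* Stand-in for Limelog(); the real macro lives in moonlight-common-c. */
#define LOG(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical placeholder for enet_socket_wait(): takes an in/out
 * condition mask and a timeout, returns 0 on success, negative on failure. */
static int fake_socket_wait(unsigned int *condition, unsigned int timeout_ms) {
    (void)timeout_ms;
    *condition = 0; /* pretend the wait timed out with nothing ready */
    return 0;
}

typedef int (*wait_fn)(unsigned int *condition, unsigned int timeout_ms);

/* Log before and after the wait. If the "before" line is the last thing
 * in the log, the call never returned and the hang is inside it. */
static int logged_wait(wait_fn wait, unsigned int *condition,
                       unsigned int timeout_ms) {
    LOG("before wait: condition=%u timeout=%u\n", *condition, timeout_ms);
    int ret = wait(condition, timeout_ms);
    LOG("after wait: ret=%d condition=%u\n", ret, *condition);
    return ret;
}
```

Passing the wait function as a pointer keeps the sketch independent of ENet; in the real code the two log lines would simply go around the existing call.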
It looks like poll() is enabled by default in https://github.com/cgutman/enet/blob/8d69c5abe4b699e7077395e01927bd102b3ba597/unix.c#L107
If the Switch implementation of poll() is broken, you can try to comment out #define HAS_POLL 1 and see if the problem goes away. You could also try setting #define NO_MSGAPI 1 and see if that helps things.
First, one of them (audio, video, or input) can get stuck; after a few seconds it starts working fine, but something else gets stuck instead, and so on. Something is always stuck until the entire app freezes. If I try to close the connection when something has already started to freeze, the app crashes. These symptoms are similar to the behaviour without my fix, so maybe they are related.
It's suspicious that you get simultaneous audio, video, and input hangs. Audio is completely decoupled from input and video. It runs on completely separate threads with separate queues, locks, and other data structures. The only possible way I could see audio failing is if no thread can successfully call enet_host_service() to keep the ENet control connection alive (which doesn't directly break audio, but causes the host to terminate the connection). That could be the case if a thread hangs while holding enetMutex.
My suggestion would be to add calls to Limelog() in various places to determine at least what code is executing when it hangs.
from moonlight-common-c.
Thanks for your reply!
- No, it's not stuck in enet_socket_wait().
- condition == 0; sometimes it's == 2 when I send input data. The return value of enet_socket_wait() is always 0.
- Disabling #define HAS_POLL 1 didn't help. I double-checked it by printing some stuff when it's defined.
- Disabling #define NO_MSGAPI 1 also didn't help.
- I found that sometimes the app doesn't freeze entirely. Sometimes only video gets stuck (input and sound work), but then sound can freeze while video and input start to work, or video and audio work but input stops. The app still freezes entirely after about 30 seconds. I think that replacing enet_socket_wait with the old solution just delays the problem, and a bad Wi-Fi connection just forces it.
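For reading the condition values reported above, here is a small decoding sketch. The flag values are assumed to match the ENetSocketWait enum in upstream ENet's enet.h (none = 0, send = 1, receive = 2, interrupt = 4); the names in the code are local stand-ins, and the fork used here may differ. Under that assumption, 0 would mean the wait returned with nothing ready and 2 would mean data is ready to receive.

```c
#include <stdio.h>
#include <string.h>

/* Assumed to mirror ENetSocketWait in upstream ENet's enet.h. */
enum {
    WAIT_NONE      = 0,
    WAIT_SEND      = 1 << 0,
    WAIT_RECEIVE   = 1 << 1,
    WAIT_INTERRUPT = 1 << 2
};

/* Append a flag name to buf, separated by '|' from any earlier names. */
static void append_flag(char *buf, size_t len, const char *name) {
    if (buf[0] != '\0')
        strncat(buf, "|", len - strlen(buf) - 1);
    strncat(buf, name, len - strlen(buf) - 1);
}

/* Write a readable description of the condition bitmask into buf. */
static const char *describe_condition(unsigned int condition,
                                      char *buf, size_t len) {
    if (len == 0)
        return buf;
    buf[0] = '\0';
    if (condition == WAIT_NONE) {
        snprintf(buf, len, "none");
        return buf;
    }
    if (condition & WAIT_SEND)      append_flag(buf, len, "send");
    if (condition & WAIT_RECEIVE)   append_flag(buf, len, "receive");
    if (condition & WAIT_INTERRUPT) append_flag(buf, len, "interrupt");
    return buf;
}
```

A return value of 0 together with either of these condition values is consistent with the wait itself working, which would point the search elsewhere.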
Also, I've never seen my Switch app get disconnected from the host because of a bad connection like the PC client does. But if another client connects to the host, the Switch successfully disconnects itself from the host.
Update:
After freeze, any input starts to log "Input queue reached maximum size"
Here is video footage showing what I am talking about: Video
from moonlight-common-c.
Update:
After freeze, any input starts to log "Input queue reached maximum size"
OK, that definitely sounds like a hang inside the ENet code. The input send thread is responsible for pulling items from that queue and sending them through the ENet control stream connection to the host. Since only one thread can call into ENet at a time, we have a mutex to synchronize the threads that need to send requests on that connection.
If a thread hangs inside enet_host_service(), then enetMutex will never be released and several related threads will all hang (input send thread, control recv thread, loss stats thread, request IDR frame thread) while waiting on that mutex. Since the host doesn't receive any periodic control stream traffic from us for a while, it will terminate the connection. However, the hung threads can't exit either because they're waiting on that mutex, so you have to force terminate Moonlight to stop the stream.
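The effect of one thread hanging while it holds the mutex can be reproduced in a small pthreads model. This is a sketch under assumptions, not Moonlight's code: enet_mutex stands in for enetMutex, and the "hang" is simulated by waiting on a condition variable. It uses pthread_mutex_trylock() so the demo can report the blocked state instead of deadlocking itself.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t enet_mutex = PTHREAD_MUTEX_INITIALIZER; /* stand-in for enetMutex */
static pthread_mutex_t gate_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond  = PTHREAD_COND_INITIALIZER;
static bool holder_has_lock = false;
static bool release_holder  = false;

/* Models a thread stuck in a blocking call while holding the ENet mutex. */
static void *stuck_holder(void *arg) {
    (void)arg;
    pthread_mutex_lock(&enet_mutex);
    pthread_mutex_lock(&gate_mutex);
    holder_has_lock = true;
    pthread_cond_broadcast(&gate_cond);
    while (!release_holder) /* the simulated hang */
        pthread_cond_wait(&gate_cond, &gate_mutex);
    pthread_mutex_unlock(&gate_mutex);
    pthread_mutex_unlock(&enet_mutex);
    return NULL;
}

/* Returns true if a second thread cannot take the mutex while the holder is stuck. */
static bool other_thread_is_blocked(void) {
    pthread_t holder;
    holder_has_lock = false;
    release_holder = false;
    if (pthread_create(&holder, NULL, stuck_holder, NULL) != 0)
        return false;

    /* Wait until the holder really owns enet_mutex. */
    pthread_mutex_lock(&gate_mutex);
    while (!holder_has_lock)
        pthread_cond_wait(&gate_cond, &gate_mutex);
    pthread_mutex_unlock(&gate_mutex);

    /* A plain pthread_mutex_lock() here would never return. */
    int rc = pthread_mutex_trylock(&enet_mutex);
    bool blocked = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&enet_mutex);

    /* End the simulated hang so the demo can clean up. */
    pthread_mutex_lock(&gate_mutex);
    release_holder = true;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_mutex);
    pthread_join(holder, NULL);
    return blocked;
}
```

In the real hang there is nothing to release the holder, so every thread that needs the mutex stays blocked in the same way.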
To debug this, we'll need to figure out why and where it's hanging. By far the easiest way would be to use a debugger like GDB and dump the thread stacks. That will immediately show the precise location of the hang. If you have a debugger but are not sure about the syntax for the commands, I might be able to help.
If there's no debugger for the Switch, you can probably start by adding some debug prints before and after these calls here:
- moonlight-common-c/src/ControlStream.c, line 543 in d9ea208
- moonlight-common-c/src/ControlStream.c, line 657 in d9ea208
from moonlight-common-c.
I think the only debugger for the Switch is twili, but for some reason it works only on Linux, and my main machine runs macOS. Let's try to find the problem by logging first; if that doesn't help, I'll try to run twili.
And here are logs that app produced:
log.log
log1.log
log2.log
from moonlight-common-c.
OK, I think I've set up gdb, but I have no idea how to use it.
from moonlight-common-c.
Once Moonlight gets stuck in that hung state, break into gdb (Ctrl+C should do it), then run thread apply all bt in the debugger prompt. Hopefully that will show all the info we need, or at least enough to narrow down the investigation.
from moonlight-common-c.
Hi, I finally got gdb working. Here are some logs from it:
switch.log
switch2.log
switch3.log
from moonlight-common-c.
OK, I think I see a pattern in the deadlocks. They all seem to originate from sessionmgrAttachClient().
Thread 5 (Thread 707 (?)):
#0 0x0000002bffd4ceb4 in svcWaitProcessWideKeyAtomic ()
#1 0x0000002bffd4dea8 in condvarWaitTimeout ()
#2 0x0000002bffd5819c in sessionmgrAttachClient ()
#3 0x0000002bffd51bf0 in bsdRecvMMsg ()
#4 0x0000002bffd48a2c in recvmmsg ()
#5 0x0000002bffd48b10 in recvmsg ()
#6 0x0000002bff2e691c in enet_socket_receive ()
#7 0x0000002bff2e4e14 in enet_host_service ()
#8 0x0000002bff2d738c in sendMessageEnet ()
#9 0x0000002bff2d7fd8 in lossStatsThreadFunc ()
#10 0x0000002bff2dbb88 in ThreadProc ()
#11 0x0000002bffd571b8 in __thread_entry ()
#12 0x0000002bffd38b2c in _EntryWrap ()
#13 0x0000000000000000 in ?? ()
All of the deadlocked threads are waiting in libnx here - https://github.com/switchbrew/libnx/blob/c5a9a909a91657a9818a3b7e18c9b91ff0cbb6e3/nx/source/sf/sessionmgr.c#L47
From glancing at the code, it looks like only a certain number of "sessions" (concurrent BSD sockets calls) are allowed in libnx at a time. Moonlight has several of these calls in flight at any given time (generally at least 2 blocking recvfrom() calls and a poll()).
According to the default BSD socket session configuration, it allocates a maximum of 3 sessions. Moonlight can easily exceed this and it may be the cause of the deadlocks. This explains why 97216e1 made the problem worse, since it is putting yet another thread into a blocking BSD sockets call (poll()) rather than the old way which waited in a non-socket-related function usleep() that didn't count as a BSD sockets session.
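The session limit described above can be modeled with a simple slot counter. This is a simplified sketch, not libnx's session manager: SessionPool and its functions are hypothetical names, and where this model returns false a real caller would block until a slot is freed.

```c
#include <stdbool.h>

/* Hypothetical model of a fixed pool of BSD socket sessions. */
typedef struct {
    int in_use;
    int capacity;
} SessionPool;

/* Try to claim a slot for one socket call. Returns false when the pool
 * is full; a real implementation would wait here instead. */
static bool pool_try_attach(SessionPool *p) {
    if (p->in_use >= p->capacity)
        return false;
    p->in_use++;
    return true;
}

/* Release a slot when the socket call returns. */
static void pool_detach(SessionPool *p) {
    if (p->in_use > 0)
        p->in_use--;
}

/* Count how many of `calls` simultaneous blocking socket calls would
 * have to wait for a slot with the given capacity. */
static int count_waiting_calls(int capacity, int calls) {
    SessionPool p = { 0, capacity };
    int waiting = 0;
    for (int i = 0; i < calls; i++) {
        if (!pool_try_attach(&p))
            waiting++;
    }
    return waiting;
}
```

With a capacity of 3, two blocking recvfrom() calls plus a poll() already fill the pool, so a fourth call such as the recvmsg() in the stack trace has to wait until one of the others returns. If those are long-running blocking calls, the wait can last indefinitely.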
If I'm reading the code properly, it looks like you just need to switch from socketInitializeDefault() to manually initializing sockets with additional BSD sessions like:
SocketInitConfig cfg = *(socketGetDefaultInitConfig());
cfg.num_bsd_sessions = 8;
socketInitialize(&cfg);
EDIT: It looks like there may also be a bug in the wakeup logic of sessionmgrAttachClient() and sessionmgrDetachClient() that may be the root cause of the deadlocks when we reach the 3 session limit. I filed switchbrew/libnx#556
from moonlight-common-c.
Thanks a lot! While waiting for libnx to release a patched version, your workaround with num_bsd_sessions = 8 works like a charm! You are the best!
from moonlight-common-c.