STUN and TURN not working with Docker without "network_mode: host", as described in install instructions for "Port Range" docker-compose.yml (CLOSED)

screego avatar screego commented on May 16, 2024
STUN and TURN not working with Docker without "network_mode: host", as described in install instructions for "Port Range" docker-compose.yml

from server.

Comments (14)

jmattheis avatar jmattheis commented on May 16, 2024

Have you tested screego without docker to ensure the docker container is really at fault here?

Have you logged in and created a room with a user session or configured SCREEGO_AUTH_MODE?

However, the demo instance at app.screego.net didn't work at all as soon as at least one NAT was in the connection path. If the demo instance is hosted via Docker this might be caused by the same issue.

The demo instance is hosted without Docker. TURN is disabled there for unauthenticated users, so it is perfectly fine that the stream doesn't work there because TURN isn't used.

ThelloD avatar ThelloD commented on May 16, 2024

Have you tested screego without docker to ensure the docker container is really at fault here?

I didn't test it without Docker, but it's working fine with a docker-compose.yml that uses network_mode: host instead of explicit port mappings such as ports: - 50000-50100:50000-50100/udp.

I'm not sure what exactly causes this, but apparently there are frequent issues when hosting TURN using Docker, examples:

Further, afaik docker-compose (or Docker itself?) also seems to have troubles binding both TCP and UDP ports for the same port range.

I first posted this issue as a comment in #27 because @mbleichner mentioned "inspecting port 3478 with tcpdump, it seems like some (not all) UDP packets use an internal docker IP address". Unfortunately I'm not sure whether his solution also was using network_mode: host or if the issue actually was resolved for him using the ports: syntax. (Btw, the ports: section is ignored and has no effect if network_mode: host is set.)
So maybe the issue could also be caused by some UDP packets using the wrong IP, namely internal Docker networking IPs (private IPs)?
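
One way to test this hypothesis would be to classify the source addresses from a packet capture as RFC 1918 private or not. The following is only a minimal sketch: the helper name is_rfc1918 and the sample addresses are made up for illustration, and 172.17.0.1 is assumed to be the gateway of Docker's default bridge network.

```python
import ipaddress

# RFC 1918 private ranges. Docker's default bridge network (172.17.0.0/16)
# falls inside 172.16.0.0/12.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in net for net in RFC1918)


# Hypothetical source addresses as they might appear in a capture on port 3478.
seen = ["172.17.0.1", "198.51.100.23", "10.0.10.10"]
for addr in seen:
    print(addr, "private (RFC 1918)" if is_rfc1918(addr) else "not private")
```

If packets from remote clients show up with a private source address, that would support the idea that Docker's port mapping rewrites the source IP before Screego sees it.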

Tbh, I'm really not sure if the issue is caused by Screego or by the way Docker binds the ports.

If Docker is not supported without network_mode: host, then saying so would also be a valid solution. But to me it seems that both STUN and TURN just don't work properly without network_mode: host. Wherever the issue lies, either the install instructions should be updated accordingly (removing the "port range" section and adding a note that Docker is not supported without network_mode: host) or the issue should be fixed (in case it's actually caused by Screego and not by Docker).

Have you logged in and created a room with a user session or configured SCREEGO_AUTH_MODE?

I'm aware of the settings, so yes, everything done and running fine. However, I'm currently using network_mode: host so all ports are created on the host directly and not in a virtual network first which then is mapped from the host network. With my current setup, both STUN and TURN are working as expected, although personally I'd have preferred not having to use the network_mode: host mode.

The demo instance is hosted without Docker. TURN is disabled there for unauthenticated users, so it is perfectly fine that the stream doesn't work there because TURN isn't used.

I know that TURN is not enabled on the demo instance for unauthenticated users, but I disagree that "it is perfectly fine that the stream doesn't work there because TURN isn't used".

Using STUN is in some cases sufficient, even if both clients are behind a NAT. I've tested this with both Screego in STUN mode as well as another tool.
In two scenarios, using only STUN the stream is working between clients behind a firewall/NAT.

In my 3rd scenario, I used a mobile client using a cellular connection (LTE). In that case, the stream does not load when only STUN is used, I assume that's because of some complex LTE proxies and IPv4 via IPv6 etc, but that's fine.

When I switched to TURN, streams worked in all three scenarios, including via the cellular connection.

Therefore, my conclusions are:

  • TURN and STUN are working as expected on my instance
  • The stream works reliably using only STUN on my instance (tested with two scenarios); however, using STUN on app.screego.net, streams only load in a local network but fail in all scenarios where NATs are involved. Therefore, since the stream loads using STUN on my instance but fails on app.screego.net, my assumption is that something on the public instance is not working as intended.

By the way, I'd also recommend adding a brief explanation below the "create room" form that informs the user that TURN is supported and built in, but disabled (for unauthenticated users) on the demo server.

jmattheis avatar jmattheis commented on May 16, 2024

Thanks for the great analysis (:,

So maybe the issue could also be caused by some UDP packets using the wrong IP, namely internal Docker networking IPs (private IPs)?

This could be; I'll put this on my to-do list to check. It should be possible to use STUN via TCP only (which is enabled by default), so maybe we can work around weird UDP things in Docker with this.

I know that TURN is not enabled on the demo instance for unauthenticated users, but I disagree that "it is perfectly fine that the stream doesn't work there because TURN isn't used".

I've played with this thought a little already; this would require different UIs for public Screego and self-hosted Screego, or some kind of hidden config flag. I'd rather keep Screego simple (:.

Using STUN is in some cases sufficient, even if both clients are behind a NAT. I've tested this with both Screego in STUN mode as well as another tool.
In two scenarios, using only STUN the stream is working between clients behind a firewall/NAT.

In my 3rd scenario, I used a mobile client using a cellular connection (LTE). In that case, the stream does not load when only STUN is used, I assume that's because of some complex LTE proxies and IPv4 via IPv6 etc, but that's fine.

So there is no difference between laplace and screego, only between public screego and self-hosted screego?

however, using STUN on app.screego.net, streams only load in a local network but fail in all scenarios where NATs are involved.

I can say that I use app.screego.net without turn with some of my colleagues, and we both have a nat'ed router. On app.screego.net IPv6 is enabled, which probably is used for most of the traffic. Maybe the IPv6 traffic is routed differently and thus, you're experiencing different results on self-hosted / public-hosted.

Could you test on https://webrtc.github.io/samples/src/content/peerconnection/trickle-ice/ that stun:app.screego.net:3478 and your self-hosted server return the same addresses?

mbleichner avatar mbleichner commented on May 16, 2024

I first posted this issue as a comment in #27 because @mbleichner mentioned "inspecting port 3478 with tcpdump, it seems like some (not all) UDP packets use an internal docker IP address". Unfortunately I'm not sure whether his solution also was using network_mode: host or if the issue actually was resolved for him using the ports: syntax. (Btw, the ports: section is ignored and has no effect if network_mode: host is set.)

I switched to network_mode: host, so that was actually what solved it for me. And because it worked so well, I actually never tried to switch back to publishing ports explicitly.

ThelloD avatar ThelloD commented on May 16, 2024

So there is no difference between laplace and screego, only between public screego and self-hosted screego?

Screego (STUN) and Laplace (both on my instances) behave identically: 2 of 3 scenarios work
Screego (TURN) vs Laplace: Screego is better (3 of 3 scenarios work) than Laplace (2 of 3)
Screego on my instance vs Screego on app.screego.net: On my instance, 2 (STUN) and 3 (TURN) scenarios work, respectively, but none work on app.screego.net

(Please keep in mind that all three scenarios use NATs or firewalls. On a local network without NAT app.screego.net works well)

I can say that I use app.screego.net without turn with some of my colleagues, and we both have a nat'ed router.

Ok, that's strange. Maybe it's indeed IPv6 related? I've been lazy and didn't configure IPv6 on my server yet, so maybe that's the cause?

A friend told me about Screego after I showed him a Laplace demo instance. However, he mentioned that he also tried the public instance, and for him it also only worked in a local network and failed on NAT'ed connections.
(Therefore we weren't 100% sure if Screego would work for our scenarios at all, but now as it is running on my server I'm very happy with it :) )

Could you test on https://webrtc.github.io/samples/src/content/peerconnection/trickle-ice/ that stun:app.screego.net:3478 and your self-hosted server return the same addresses?

Returned addresses are the same for both STUN servers, and the returned address is IPv4. However, for my machine I receive the message:

Note: errors from onicecandidateerror above are not neccessarily fatal. For example an IPv6 DNS lookup may fail but relay candidates can still be gathered via IPv4.
The server stun:MYDOMAIN:3478 returned an error with code=701:
STUN host lookup received error

But I think that error is a result of missing IPv6 configuration and can be ignored

ThelloD avatar ThelloD commented on May 16, 2024

I switched to network_mode: host, so that was actually what solved it for me. And because it worked so well, I actually never tried to switch back to publishing ports explicitly.

Thank you for the information! :)
Then my initial assumption seems to be correct, that the network_mode: host setting actually solved the issue.

jmattheis avatar jmattheis commented on May 16, 2024

Could you try https://appv4.screego.net/? This DNS entry only has IPv4 settings.

ThelloD avatar ThelloD commented on May 16, 2024

Could you try https://appv4.screego.net/? This DNS entry only has IPv4 settings.

appv4.screego.net behaves the same as my instance (with STUN) and the two scenarios are working here as well. So apparently it's really an IPv4/IPv6 issue, interesting.

Maybe noteworthy is that my Internet connection supports both IPv4 and IPv6 but the connection of my colleague seems to be IPv4 only (=scenario number 2), so maybe this is also part of the problem?
On the other hand, I also tested Screego using the same ISP with two local networks (normal and guest network) that are separated via NAT/firewall (=scenario number 1) and it's not working when I use app.screego.net but working fine with appv4.screego.net. Since the Internet connection is the same here both machines should be using the same protocol(s) when connecting to the STUN server...

jmattheis avatar jmattheis commented on May 16, 2024

I think this is actually a bug in the implementation, dunno why I thought this would make sense. If you connect via IPv6, then you only receive the IPv6 STUN server and thus only your external IPv6 address, and vice versa for IPv4. If someone connects via IPv6 and another via IPv4 (without support for IPv6), then normal p2p connections cannot be established.
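
To illustrate the mismatch, here is a minimal sketch in Python. It is a simplified model, not Screego's actual code: the function names are hypothetical, and it only tracks which address families each peer can gather external addresses for.

```python
def families_current(connected_via: str, supported: set) -> set:
    """Described behaviour: a peer only learns its external address for the
    family it used to reach the server."""
    return {connected_via} & supported


def families_both(supported: set) -> set:
    """Possible fix: hand out STUN servers for both families, so the peer can
    gather an external address for every family it supports."""
    return set(supported)


def can_connect_p2p(a: set, b: set) -> bool:
    """Peers need at least one address family in common."""
    return bool(a & b)


dual_stack = {"ipv4", "ipv6"}
v4_only = {"ipv4"}

# Peer A is dual stack and reached the server via IPv6; peer B is IPv4 only.
a = families_current("ipv6", dual_stack)
b = families_current("ipv4", v4_only)
print(can_connect_p2p(a, b))  # False

# With both STUN servers handed out, the peers share IPv4.
print(can_connect_p2p(families_both(dual_stack), families_both(v4_only)))  # True
```

This matches the observation that appv4.screego.net works: there every peer connects via IPv4, so both sides end up with IPv4 addresses.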

andrewgdunn avatar andrewgdunn commented on May 16, 2024

Wanted to document a reproduction of what I think might be the same/similar issue that @ThelloD was initially seeing.

There is a bit more complication in my setup for a couple reasons:

  • I am doing what I think is full cone NAT upstream of the server (from the upstream public address to the server's RFC1918 address)
  • I am using nginx as a reverse proxy
  • I am using podman with the desire to deploy rootless

Upstream I have a public address, I'll say 200.100.100.100.
On the server I have a private address, I'll say 10.0.10.10.

I'm setting configuration parameters in a file ~/config/server.config, I'll copy excerpts here:

SCREEGO_EXTERNAL_IP=200.100.100.100
SCREEGO_SERVER_ADDRESS=0.0.0.0:5050
SCREEGO_TURN_ADDRESS=0.0.0.0:3478
SCREEGO_TURN_PORT_RANGE=50000:52000

This is the desired invocation:

podman run --detach --name=screego \
     --volume ~/config:/etc/screego:Z \
     --publish 0.0.0.0:5050:5050 \
     --publish 0.0.0.0:3478:3478/tcp \
     --publish 0.0.0.0:3478:3478/udp \
     --publish 0.0.0.0:50000-52000:50000-52000/tcp \
     screego/server:1.5.2

This doesn't appear to work. We can get the web server side to function properly, but streams do not work unless you're in the same LAN. Testing with this gives us immediate feedback... but I'm not sure how to parse/understand the results.
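
Since the published range has to line up with SCREEGO_TURN_PORT_RANGE, a quick sanity check can rule out simple mismatches before digging into the networking. This is only a rough sketch: the helper names are hypothetical, and it only understands the ip:host-range:container-range/proto form of --publish used above (not IPv6 host addresses).

```python
def parse_turn_range(value: str) -> tuple:
    """Parse a 'low:high' value as used for SCREEGO_TURN_PORT_RANGE."""
    low, high = (int(part) for part in value.split(":"))
    return low, high


def parse_port_range(text: str) -> tuple:
    """Parse 'low-high' or a single port into a (low, high) tuple."""
    low, _, high = text.partition("-")
    return int(low), int(high or low)


def check_publish(turn_range: str, publish: str) -> list:
    """Return a list of problems found; an empty list means the ranges agree."""
    ports, _, _proto = publish.partition("/")
    parts = ports.split(":")
    host = parse_port_range(parts[-2])
    container = parse_port_range(parts[-1])
    problems = []
    for name, (low, high) in (("host", host), ("container", container)):
        if not 1 <= low <= high <= 65535:
            problems.append(f"{name} range {low}-{high} is not a valid port range")
    if container != parse_turn_range(turn_range):
        problems.append("container range differs from the TURN port range")
    if host[1] - host[0] != container[1] - container[0]:
        problems.append("host and container ranges differ in size")
    return problems


print(check_publish("50000:52000", "0.0.0.0:50000-52000:50000-52000/udp"))
```

A check like this only catches typos and mismatched ranges; it says nothing about whether the container runtime rewrites addresses on the way in.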

Changing the invocation to --net=host (as stated by @mbleichner and @ThelloD) will work, but only if we specify the private address to bind to (specifically for SCREEGO_TURN_ADDRESS):

podman run --detach --name=screego \
     --volume ~/config:/etc/screego:Z \
     --net=host \
     screego/server:1.5.2

With:

SCREEGO_EXTERNAL_IP=200.100.100.100
SCREEGO_SERVER_ADDRESS=0.0.0.0:5050
SCREEGO_TURN_ADDRESS=10.0.10.10:3478
SCREEGO_TURN_PORT_RANGE=50000:52000

I'd like not to run with --net=host. However, I'm not sure what the issue is. Since, in my environment, everything works fine with the --net=host configuration, I'd like to assume that my NAT'ing is "fine" and that the problem lies in how the container reacts to the container runtime doing port redirection.

jmattheis avatar jmattheis commented on May 16, 2024

Yes, this is a similar problem: when using port ranges with Docker, the browsers cannot establish a connection with each other even when TURN is used.

andrewgdunn avatar andrewgdunn commented on May 16, 2024

Maybe I'm not grasping the state of the network in play, but I seem to be able to get things working without the turn ports being forwarded at my border firewall. Does the turn instance use a connection on 3478 to ratchet up to that port range?

My border firewall is allowing through 80, 443, and 3478. I'm guessing that once a connection is made on 443, then 3478, it ratchets up to the 50000 range and it does so by NAT punching?

jmattheis avatar jmattheis commented on May 16, 2024

How do you test that it works? TURN is only needed if peers cannot connect to each other directly, so it can work without exposing the TURN ports.

jmattheis avatar jmattheis commented on May 16, 2024

If Docker is used without network_mode: host, then the strict IP check fails. This can be disabled with SCREEGO_TURN_STRICT_AUTH=false. The setting was removed in v1.10.0 and is now always disabled (false).
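
Roughly, the idea can be sketched like this in Python. This is a simplified illustration and not the actual implementation: it assumes the check compares the IP of the user session with the source IP of the TURN packets, and the function name and addresses are made up.

```python
import ipaddress


def strict_ip_check(session_ip: str, turn_source_ip: str) -> bool:
    """Assumed behaviour: accept TURN traffic only if its source IP equals the
    IP the user session was created from."""
    return ipaddress.ip_address(session_ip) == ipaddress.ip_address(turn_source_ip)


client_ip = "198.51.100.23"   # hypothetical public address of the browser
docker_internal = "172.17.0.1"  # hypothetical internal address seen behind Docker's port mapping

# Host networking: the server sees the real client address on both paths.
print(strict_ip_check(client_ip, client_ip))        # True

# Published ports: TURN packets may arrive with an internal Docker address.
print(strict_ip_check(client_ip, docker_internal))  # False
```

With the check disabled, the differing source address no longer causes the TURN traffic to be rejected.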
