Coder Social home page Coder Social logo

Comments (15)

guruofquality avatar guruofquality commented on June 14, 2024

There were cases where we got locked up forever waiting for system tcp timeouts when hosts had been removed from the network suddenly. Its possible that these timeouts are too strict: 0ede4c4

Its probably the case that some calls can take a very long time. Can you try increasing this value above 3 seconds? https://github.com/pothosware/SoapyRemote/blob/master/common/SoapyRPCUnpacker.cpp#L23

The default timeout for TCP seems to be 1 or 2 minutes in which case the OS closes the connection. Obviously thats too high, we need to find something in the middle :-)

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024

I tried to improve the timeout check so we actually check if the server is down to allow for longer blocking calls: https://github.com/pothosware/SoapyRemote/tree/timeout_improved Give it try, thanks!

from soapyremote.

rkb68 avatar rkb68 commented on June 14, 2024

Thanks, Josh. I tried the new tree, and initially the behavior against the two servers (Raspian and openSUSE) was the same as before, with the same error messages.

I then upped the big time limit from 30000000 to 35000000, at which point the command
SoapySDRUtil --probe="remote=192.168.1.10"
completed successfully against the Raspian server, though a bunch of these messages popped up on the client:
[ERROR] ~SoapyRPCUnpacker: Unconsumed payload bytes -4
with these corresponding complaints on the server:
SoapyServerListener::handlerLoop() FAIL: SoapyRPCUnpacker::recv(header) FAIL:

With the new tree and modified time limit, the behavior against the openSUSE server did not change; same errors that I had listed originally.

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024
  • I can confirm that the timeout on the master branch was a problem for the LimeSDR when it was performing a calibration which took a long time.
  • Also the first common on the timeout_improved branch causes an error to print when the remote test connection is made. I dont think it is harmful though.
  • I just pushed another commit to the timeout_improved branch, it seems that test connection needs to be properly "hung up" on. This prevents the error message.

from soapyremote.

rkb68 avatar rkb68 commented on June 14, 2024

Thanks again. The "hang up" code does the trick- no error messages running "--probe" against the Raspian server. The error messages I got running against openSUSE seem to have a completely different origin.

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024

I merged that change btw. I'm not sure about the second set of errors on opensuse. I'm curious if you see the issue on the previous release 0.4.0 (https://github.com/pothosware/SoapyRemote/tree/soapy-remote-0.4.0). That would help pinpoint if there an additional issue with the timeout additions in 0.4.1.

from soapyremote.

rkb68 avatar rkb68 commented on June 14, 2024

Just tested 0.4.0- it works like a charm with the Raspian server, and gives same old same old with the openSUSE server (no surprises there).

cheers,
robert

from soapyremote.

coburnw avatar coburnw commented on June 14, 2024

I seem to be getting a related problem with SoapyLogAcceptor. When running a rate test with SoapySDRUtil, a single timeout occurs about 30 seconds into a run, once only, with the rate test continuing to run after the error with no apparent errors.

But when SoapySDRUtil is stopped with Ctrl-C, the remote server also closes and needs to be restarted.

If i put a sleep(40) immediately after make device in SoapySDRRateTest, the error occurs roughly 30seconds into the sleep, then continues clean until stopped with Ctrl-C resulting in the remote exiting also.

SoapySDRUtil --rate=39062.5 --channels=0 --direction=RX
'######################################################
'## Soapy SDR -- the SDR abstraction library
'######################################################

[INFO] SoapyRemote::setupRxStream(remoteFormat=CS32, localFormat=CS32, scaleFactor=5.53648e+07, mtu=1500, window=44040192)
[INFO] Client side stream bound to 192.168.2.131:43174
[INFO] Client side status bound to 192.168.2.131:50313
[INFO] Server side stream bound to [::ffff:192.168.2.101]:48581
[INFO] Server side stream connected to [::ffff:192.168.2.131]:43174
[INFO] Server side status connected to [::ffff:192.168.2.131]:50313
[INFO] Configured sender endpoint: dgram=1452 bytes, 178 elements @ 8 bytes, window=43008 KiB
[INFO] Client side stream connected to 192.168.2.101:48581
[INFO] Configured receiver endpoint: dgram=1452 bytes, 178 elements @ 8 bytes, window=43008 KiB
Stream format: CS32
Num channels: 1
Element size: 8 bytes
Begin RX rate test at 0.0390625 Msps
Starting stream loop, press Ctrl+C to exit...
[WARNING] Set thread priority 0.5 failed: Operation not permitted
0.0390631 Msps 0.312505 Bps
0.0390621 Msps 0.312497 Bps
0.039062 Msps 0.312496 Bps
0.0390618 Msps 0.312495 Bps
0.0390617 Msps 0.312494 Bps
0.0390615 Msps 0.312492 Bps
[ERROR] SoapyLogAcceptor::handlerLoop() FAIL: SoapyRPCUnpacker::recv() TIMEOUT:
0.0390615 Msps 0.312492 Bps
0.0390614 Msps 0.312491 Bps
0.0390626 Msps 0.312501 Bps
0.0390624 Msps 0.3125 Bps
0.0390623 Msps 0.312499 Bps
0.0390622 Msps 0.312498 Bps
0.0390622 Msps 0.312497 Bps
0.0390621 Msps 0.312497 Bps
Ctrl+C pressed here.
the remote 'SoapySDRServer --bind' dies.

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024

Thanks, that pointed me in the right direction. the log acceptor also needed to recv() forever without a timeout. The timeout was only meant for the control packets and not the listener loops which can be hung up on. I made a change on the timeout_improved branch. Didnt test it yet, but its straightforward.

from soapyremote.

coburnw avatar coburnw commented on June 14, 2024

No errors so far. Very Nice.
Thank you.

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024

Anyways sorry for the bugs, thanks for testing. I will try to get this tagged as well since its a pretty annoying bug to have in a release. @rkb68 let me know if the current master branch resolves your opensuse issues. Perhaps it was also the timeout bug in the logger.

If not I can make the suggestion that the connection timeout of 50 ms could be too strict in some cases. So I offer the following patch to simply increase it and see what happens:

diff --git a/common/SoapyRemoteDefs.hpp b/common/SoapyRemoteDefs.hpp
index d422b5b..368ae71 100644
--- a/common/SoapyRemoteDefs.hpp
+++ b/common/SoapyRemoteDefs.hpp
@@ -67,7 +67,7 @@
 #define SOAPY_REMOTE_DEFAULT_SERVICE "55132"
 
 //! Use this timeout for every socket poll loop
-#define SOAPY_REMOTE_SOCKET_TIMEOUT_US (50*1000) //50 ms
+#define SOAPY_REMOTE_SOCKET_TIMEOUT_US (500*1000) //500 ms
 
 //! Backlog count for the server socket listen
 #define SOAPY_REMOTE_LISTEN_BACKLOG 100

Thanks!

from soapyremote.

rkb68 avatar rkb68 commented on June 14, 2024

Thanks again; when I increased SOAPY_REMOTE_SOCKET_TIMEOUT_US to 500 ms, --probe still did not work, but --find did, though it would fail after the first run. However, after increasing the timeout to a full second, I can now run both --find and --probe with no errors. Perhaps my openSUSE installation has greater latency somewhere? Anyway, thanks for looking into this.

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024

@rkb68 1 second is an incredible amount of time for a socket to connect on another host on a local network. Thats a bit worry some all by itself.

Is the openSUSE in a remote location or can you think of some oddity with the network configuration otherwise? Howabout sshing into the openSUSE box, is there a noticeable lag? Sometimes there can be issues with the DNS that cause the client to reverse lookup an IP address and timeout on every lookup -- something else to double check anyway.

I'm thinking of making the timeout configurable with an input argument or environment variable just to be safe and future proof things. But there is probably something else going on here.

from soapyremote.

rkb68 avatar rkb68 commented on June 14, 2024

I was able to back off the timeout to 700 ms, but yes, that is still a very long time. I installed openSUSE on the RPi3 using their card image, did not tweak any settings, merely enabled wifi with DHCP.

And yes, there is often (not always) a lag when ssh'ing in from elsewhere on the network (which does not happen with the Raspian image).

from soapyremote.

guruofquality avatar guruofquality commented on June 14, 2024

I added a remote timeout option but I forgot to mention it. Remote timeout it should hit all of the timeout values on the client that were previously using the default. I think this should not be needed usually, but in case its useful: https://github.com/pothosware/SoapyRemote/wiki#the-remote-timeout-key

from soapyremote.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.