Comments (33)
The new behavior was implemented to make the results more consistent and predictable. In particular, it was considered problematic that if you sent 3 pings in count mode, you would possibly get a loss only on the third ping, because the timeout would only affect the last one.
The recommended way to measure is to set a long-enough period, so that you don't run into this problem.
from fping.
Right, I can see how having the last few measurements disappear when latency increases can be odd. But the right way to fix that is doing something like what @peteheist is suggesting, and adjusting the wait interval to the actual observed RTT, instead of just arbitrarily discarding measurement points as you are currently doing.
Latency variations are quite common in the internet, actually, and can be an order of magnitude above the base RTT.
But yeah, since this behaviour is in a release version of fping, I am going to have to work around it anyway...
from fping.
I've tested the 3.16-rc1
I cannot confirm that it has changed described behavior in any way.
# time ./fping -c3 -t50 -p100 zabbix.jp
zabbix.jp : [0], 84 bytes, 238 ms (238 avg, 66% loss)
zabbix.jp : xmt/rcv/%loss = 3/1/66%, min/avg/max = 238/238/238
real 0m0.659s
user 0m0.000s
sys 0m0.000s
# ./fping -v
./fping: Version 3.16-rc1
./fping: comments to [email protected]
Don't want to be annoying, but this "issue" in NOT fixed.
from fping.
So, you mean that responses received after the timeout specified in -t should be discarded?
from fping.
I can understand the argument, but I wonder if the current behavior is possibly also preferred by others and possibly relied upon.
from fping.
Yes, it's an arguable thing - should be discarded or not.
My own opinion - the responses should be discarded.
I.e. options -t (timeout) and -c (count of packets) should not "depend each other" regarding received result.
I belong to a Zabbix team (NMS application), where fping (with "not default" mode, i.e. with -C option) is used to perform icmp ping checks.
Most of fping's options may be controlled in Zabbix, so for example someone wants to change parameters to use not default number of packets (-c) while leave the same default timeout (-t) - and in such case final result may get different.
https://www.zabbix.com/documentation/3.2/manual/config/items/itemtypes/simple_checks#icmp_pings
As for "backward compatibility" - I'm not sure how to resolve that, but current behavior is illogical for me, so I consider it as a bug, which is "to be fixed" instead of keep things "as is" because someone maybe relies it.
from fping.
Here is an alternative proposal:
- forbid usage of -t in conjunction with -c/-C/-l
- set "timeout" automatically to the the value of -p (period), so also for the last ping we wait for the correct amount of time.
This is probably easier to understand for the users, instead of having a separate timeout option.
What do you think?
from fping.
@schweikert, thanks for considering this, @zalexua - thanks for reporting :)
after thinking this over, the overriding of the timeout by period seems rather unintuitive to me.
it is also inconsistent - normally -t sets the timeout, then suddenly it's -p...
it seems to me that obeying -t always normally and not tying it to -c in slightly obscure ways would still be the best.
also, if i would like to do 5 pings with 5 seconds between them, and one second timeout, the -p proposal would make that impossible (without a bit messy wrapper script).
from fping.
@schweikert after thinking and discussing your suggestion with one more Zabbix folk here, we do not consider it as an optimal solution. Saying more specifically - it would partially break fping usage in Zabbix.
As @richlv said above - we also will lose flexibility, which currently is available for end users.
Today you have updated issue title and l want to say that I like it - it cleanly reflects what should be changed. I suppose it should not be any hard technically.
from fping.
The problem with that is that in loop/count mode, the default timeout of 500ms is really low. In non-loop/count mode you have retries and the backoff factor, but not in loop/count mode. So this would mean starting to discard perfectly fine ping replies, and it would surprise fping users.
In which way it would break Zabbix?
Here is another proposal:
- Implement the discarding of ping replies that come after the timeout specified as -t
- Change the default timeout from 1 second to the value of -p, if -t is not specified (which would mostly match the current behavior of fping, except for the last ping).
from fping.
You said that default value for -t is 1 second, but "man fping" says it's 500 ms. I see the same in sources (#define DEFAULT_TIMEOUT 500). So I'll use 500 ms further, fix me please if I'm wrong.
Thinking on - what people use fping with -C option ... I don't think these people want to measure ICMP pings which are longer than 500 ms, by default. If they use count of packets (-C) more than 1 (which is logical), so they should understand that there is some timeout and it has some affect. If I would expect replies with more than 500 ms, I'd need to explicitly allow that by big value for -t option, but not juggling by value for -C option.
Replies which are longer than timeout -t (default one or customized) are not fine, IMO :)
I said it partially would break fping usage in Zabbix. In Zabbix it's possible to specify custom values for -t and -p options separately. If they would be "merged" on fping side, then current zabbix functionality for icmp ping checks would be partially broken.
If you will forbid usage of -t in conjunction with -c/-C/-l - it will break fping usage completely for ~60-90% Zabbix users, because I'd say that the ~60-90% users do customize the "timeout" in icmp checks.
Your latest proposal sounds good to me. Let me collect more opinions here, will get back to you soon.
from fping.
the proposal in #32 (comment) seems reasonable to me
from fping.
You are right: it's 500ms by default, not 1000ms. I have corrected it in my comment.
Also, I agree that "forbidding" the usage of -t together with -c/-C/-l would be bad and indeed break many installations, so that's certainly a no go. My main concern is not to break existing installations (and I argue that people might not expect pings > 500ms to be discarded when in loop mode).
from fping.
The decision to drop packet if rtt is over timeout specified with -t is right IMO. However, I don't see an idea to connect -t and -p options in any way as good one.
from fping.
@dimir, thanks for your feedback. Note that -t and -p supposed to be "connected" only if -t is not customized. It sounds not so bad.
And it still will be much better than current behavior.
We still need to find out some compromise to have a chance for requested change.
@schweikert, I think I've collected all opinions I wanted to get. Hope to see some further movements :)
from fping.
Sure, I understand that this is only when -t is not specified. And I agree, the proposed solution is not so bad, but not perfect either, IMO. Also I agree that it will be better than current behavior.
from fping.
If you want to test, I have released fping-4.0-rc2 with these changes:
http://fping.org/dist/beta/fping-4.0-rc2.tar.gz
Testing would be much appreciated!
from fping.
I'll do it, definitely.
from fping.
Similarly to "man", maybe in --help it worth to add ", up to 2000 ms" after "where it is the -p period" ?
To be like this: "where it is the -p period, up to 2000 ms"
Duplicated word "for for" in ChangeLog.
from fping.
"for for" is still in the Changelog
from fping.
@zalexua: indeed, sorry. fixed now
from fping.
I extensively tested the changes yesterday and today. Yesterday, on beginning (before I ran "ping zabbix.jp"), I had a weird result of loss replies, which I could not explain. After a few unexpected ping loss, I ran wireshak and I saw replies (I guess, unfortunately I didn't pay attention on packets itself, but only for timing) , i.e. with -c2 I had 4 packets appeared in wireshark and their timing was fine!
Raw results are here, just in case on http://pastebin.com/sji0vBNK
Interval between ran fping/ping commands was 10-30 seconds.
After "ping zabbix.jp" I also ran tcpdump on shell, but all further attempts were successful and/or correct.
Today's tests - also everything was correct.
Not sure what it was yesterday on beginning, whom I have to blame - my eyes or my mind ... so feel free to ignore my concerns.
TESTED !
from fping.
Thank you!
from fping.
So just chiming in with the inevitable "you broke my use case" comment: I am using fping in Flent to measure bufferbloat, among other things. In this use case, the RTT can vary wildly within a single test, and we want to capture this. So discarding late pings is absolutely the wrong thing to do.
If I'm reading the code right, I can get the old behaviour by just passing a really large value to -t, right? Will that affect anything on old versions (when running in -c mode)?
from fping.
Actually, playing with v4.0 a bit more, I'd argue that the current handling is wrong. If you're going to honour -t in counter mode (which is a good idea!), it should only apply as a timeout after the last packet has been sent. I.e., if the test runs long enough for earlier packets to be received, don't ignore them but report the result. That would make the new behaviour useful in my case as well :)
from fping.
Yeah, I did read the discussion; I'm just disagreeing that this is more "consistent and predictable" :)
If you are ignoring late replies you are basically reporting a host as down even though it isn't. And for my measurements I specifically want a high sampling interval, but still be able to tell when latency increases above that interval (which it quite frequently does).
If you're only running a few pings, I can see the point of the new behaviour, but if you're running a long test (I usually run with -p 200 -c 300 for a full minute test), it doesn't make sense...
On the other hand, having the timeout parameter relate only to the end of the test, would make it possible to enforce a maximum duration for the whole test (which I currently have to do by means of a watchdog timer). Alternative, a time-based parameter instead of -c (i.e., "keep sending pings at this interval for x seconds") would be useful...
from fping.
I second @tohojo 's comment. The current behavior of using -t for a per-packet timeout isn't desirable in most cases. RTT can jump suddenly for many reasons, and that information shouldn't be discarded by default. Perhaps there is a use case for a per-packet timeout, but it could be a separate flag.
Useful timeout values depend on the environment under test. I think a generally effective default for the final timeout is to wait after the last request has been sent for some multiple of the max RTT (say 3x max RTT), falling back to a fixed timeout if there were no replies, and possibly placing a limit on the timeout if the max RTT is something unreasonably high.
from fping.
@tohojo: you are free to disagree of course... For me having a different measurement for the last ping was a convincing enough argument to change that behavior. Note that you can still set a longer -t value, if you want to. Just note that it might not produce the results that you expect.
from fping.
It seems to me that adjusting timeout based on observed RTT will be extremely confusing. It could be some additional feature (market it as "adaptive AI"...), but the normal flags should be easily understandable and testable by average user.
Sorry if I missed this in the comments, but what is the issue with specifying as large timeout as you want to handle?
from fping.
I want to support @schweikert 's words that "The new behavior was implemented to make the results more consistent and predictable". That's the key point.
Previous logic was very unclear.
from fping.
The argument for using a factor of max RTT is that you minimize both the total test time and the likelihood that you discard the last packet erroneously, which I think was the original problem reported. I think most users would be happy with this behavior without having to understand it.
The issue with specifying a large timeout is that in case the last reply from the server is lost, there could be an unnecessarily long wait time. You might choose 15 seconds as a time that you think you'll "never" see, but why wait 15 seconds at the end of the test when the maximum round-trip time you saw was 250 milliseconds?
But I think the choice of wait strategy at the end of the test is less consequential than the choice to time out individual round-trips. Knowing when and how often there are high round-trip times is often exactly what you're interested in, as it characterizes any bufferbloat, WiFi interference, hardware problems, etc. If those results are clipped, you can't differentiate between high RTT and packet loss.
Lastly, I empathize. Anyone who thinks that writing/maintaining a ping-like utility is "easy" either hasn't done it or hasn't been dragged through various corner cases. So, thanks for fping! :)
from fping.
I hate to ask for clarification on a closed issue, but does this change mean that if my period is 20ms, my rtt is 40ms, and my count is 100, fping will return 100% packet loss, rather than permitting multiple requests to be in flight simultaneously?
Shouldn't the sequence number solve this problem?
from fping.
You are right, in your case 100% packets will be considered as without reply - packet lost.
from fping.
Related Issues (20)
- setting SO_TIMESTAMPNS option: Protocol not available (on debian running in docker container) HOT 17
- -u does not work when -c is used HOT 3
- Duplicate reply packets when pinging multiple targets HOT 6
- -g option not working properly HOT 3
- fping ipv6 address -g not working
- Warning when using autoconf from version 2.62 and 2.70 HOT 2
- Architectures support
- fping -n -g x.x.x.x/24 hangs with systemd-resolved HOT 3
- not enough sequence numbers available! (expire_timeout=10000000000, host_nr=0, ping_count=0, seqmap_next_id=0) HOT 9
- Combinations "-c -a" and "-c -u" give same output as combination "-c -q" instead of just "-c"
- Inconsistent limits for address generation via `-g`, `--generator` using either range or CIDR HOT 2
- No error message when `getprotobyname()` fails HOT 1
- Time of execution when using retry -r increases in a nonlinear growth HOT 2
- errors on fping version 5.1 HOT 4
- -D, --timestamp with custom time string eq: %Y-%m-%d %H:%M:%S HOT 9
- fping6 issue with multiple addresses HOT 1
- fping crashes with __darwin_check_fd_set_overflow. Please help to resolve the issue. HOT 3
- Is it possible to provide official portable build ?
- Fix code scanning alert - Incorrect return-value check for a 'scanf'-like function HOT 1
- packet loss after 8-10 targets HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from fping.