
Comments (19)

emmericp commented on August 23, 2024

It shouldn't lose packets in that scenario.

I can't reproduce this with the l3-load-latency example which still works at full line rate as expected.

Command line: ./build/MoonGen examples/l3-load-latency.lua 14 15 0 1 60
(with commit 6d6cc3b which fixes this script when used with small packets, otherwise use the default packet size)

Can you post the script you are using?

BTW: what you are trying to do is probably not a good idea. Measuring latency at full line rate is often problematic: every small pause causes buffers to fill up, and those buffers can never drain because packets keep arriving at exactly the rate at which you can send them back out.

After a short time, the latency is therefore just a function of the buffer size.
Unless, of course, you are testing a hardware device, which usually doesn't have this problem.
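As a rough illustration (numbers assumed for the example, not from any measurement): with, say, a 512 KB buffer somewhere in the path that is kept full by traffic arriving at the 10 Gbit/s line rate, the reported latency settles at roughly 512 KB × 8 bit/B ÷ 10 Gbit/s ≈ 0.4 ms, regardless of the device's actual forwarding latency.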


tristan-ovh commented on August 23, 2024

I do reproduce the bug with the example script: MoonGen /opt/moongen/examples/l3-load-latency.lua 0 1 0 1 60

[Device: id=0] Sent 14878234 packets, current rate 14.88 Mpps, 7617.60 MBit/s, 9998.10 MBit/s wire rate.
[Device: id=1] Received 14878826 packets, current rate 14.88 Mpps, 7617.63 MBit/s, 9998.14 MBit/s wire rate.
[Device: id=0] Sent 29758621 packets, current rate 14.88 Mpps, 7618.75 MBit/s, 9999.61 MBit/s wire rate.
[Device: id=1] Received 29759202 packets, current rate 14.88 Mpps, 7618.75 MBit/s, 9999.61 MBit/s wire rate.
^C[Device: id=0] Sent 42092496 packets with 2693919744 bytes payload (including CRC).
[Device: id=0] Sent 14.880372 (StdDev nan) Mpps, 7618.750699 (StdDev nan) MBit/s, 9999.610292 (StdDev nan) MBit/s wire rate on average.
[Device: id=1] Received 42092496 packets with 2693919744 bytes payload (including CRC).
[Device: id=1] Received 14.880374 (StdDev nan) Mpps, 7618.751910 (StdDev nan) MBit/s, 9999.611721 (StdDev nan) MBit/s wire rate on average.
Samples: 0, Average: nan ns, StdDev: 0.0 ns, Quartiles: nan/nan/nan ns
Saving histogram to 'histogram.csv'


emmericp commented on August 23, 2024

Did you update to 6d6cc3b or later?
Timestamping was broken with small packets in this specific script, independent of the rate.

Try to update or use the default packet size (124).


tristan-ovh commented on August 23, 2024

Thank you for your answers, by the way :)
I am at the most up-to-date version, and it works fine at lower rates: MoonGen /opt/moongen/examples/l3-load-latency.lua 0 1

[Device: id=0] Sent 1953096 packets, current rate 1.95 Mpps, 1999.92 MBit/s, 2312.41 MBit/s wire rate.
[Device: id=1] Received 1953183 packets, current rate 1.95 Mpps, 1999.92 MBit/s, 2312.41 MBit/s wire rate.
[Device: id=0] Sent 3907147 packets, current rate 1.95 Mpps, 2000.94 MBit/s, 2313.59 MBit/s wire rate.
[Device: id=1] Received 3907269 packets, current rate 1.95 Mpps, 2000.94 MBit/s, 2313.59 MBit/s wire rate.
[Device: id=0] Sent 5861205 packets, current rate 1.95 Mpps, 2000.95 MBit/s, 2313.60 MBit/s wire rate.
[Device: id=1] Received 5861327 packets, current rate 1.95 Mpps, 2000.95 MBit/s, 2313.60 MBit/s wire rate.
^C[Device: id=0] Sent 6496110 packets with 831501952 bytes payload (including CRC).
[Device: id=0] Sent 1.954047 (StdDev 0.000006) Mpps, 2000.944480 (StdDev 0.004315) MBit/s, 2313.592055 (StdDev 0.005216) MBit/s wire rate on average.
[Device: id=1] Received 6496110 packets with 831501952 bytes payload (including CRC).
[Device: id=1] Received 1.954047 (StdDev 0.000006) Mpps, 2000.944186 (StdDev 0.006130) MBit/s, 2313.591715 (StdDev 0.007087) MBit/s wire rate on average.
Samples: 2320, Average: 254.4 ns, StdDev: 18.4 ns, Quartiles: 243.2/256.0/268.8 ns
Saving histogram to 'histogram.csv'


tristan-ovh commented on August 23, 2024

Whoops, no, I was not at the latest version. It seems to work better now!


tristan-ovh commented on August 23, 2024

I still have problems with my own script but not with the example. I will see what you have changed. Thank you.


emmericp commented on August 23, 2024

My recent changes shouldn't affect your code.


tristan-ovh commented on August 23, 2024

Indeed they do not. I see that you use UDP packets instead of PTP packets for your example. Is there a reason to prefer one over the other?

While using measureLatency, I have also noticed that you do not set pktLength in the UDP or PTP packets, resulting in malformed packets that work anyway, but can be misinterpreted by some equipment.


emmericp commented on August 23, 2024

It doesn't really matter here since we don't modify any of the PTP fields. The timestamper uses a PTP packet internally.
The packet type from :get*Packet() is just a fancy way to generate a C struct with some getters and setters.

You are right, though: the timestamper could set the size itself, which would avoid problems when someone sets a wrong size here (note: my script does that in fillPacket()).
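For what it's worth, here is a minimal sketch of setting the length explicitly when filling the packet, in the style of the MoonGen examples (PKT_SIZE, the addresses, and txQueue are placeholders for your own values and a configured TX queue):

```lua
local memory = require "memory"

-- 124 is the script's default size; smaller sizes triggered the
-- timestamping bug before commit 6d6cc3b
local PKT_SIZE = 124

local mem = memory.createMemPool(function(buf)
	-- fill{} writes all headers; giving it pktLength also sets the
	-- IP and UDP length fields, so the packet is well-formed on the wire
	buf:getUdpPacket():fill{
		pktLength = PKT_SIZE,
		ethSrc = txQueue, -- resolved to the queue's MAC address
		ethDst = "10:11:12:13:14:15",
		ip4Dst = "10.0.0.1",
		udpSrc = 1234,
		udpDst = 5678,
	}
end)
```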


tristan-ovh commented on August 23, 2024

I am trying to find the differences between your script and mine, as mine loses all timestamp packets. I see that you declare 3 TX and 3 RX queues on both devices. Is there a reason why you need more than two TX queues (load and timestamp sending) and one RX queue (timestamp reception)?


emmericp commented on August 23, 2024

One of the queues is used for ARP rx/tx.
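For reference, a rough sketch of that queue layout (based on the MoonGen example API of the time; the device.config and arp.startArpTask signatures may differ between versions, and txPort, rxPort, and the IPs are placeholders):

```lua
local device = require "device"
local arp    = require "proto.arp"

-- three queue pairs per port: 0 = load traffic, 1 = timestamping, 2 = ARP
local txDev = device.config{port = txPort, rxQueues = 3, txQueues = 3}
local rxDev = device.config{port = rxPort, rxQueues = 3, txQueues = 3}
device.waitForLinks()

-- the ARP task answers requests for the given IPs on its own queue pair,
-- so it never interferes with the load or timestamping queues
arp.startArpTask{
	{ txQueue = txDev:getTxQueue(2), rxQueue = txDev:getRxQueue(2), ips = "10.0.0.10" },
	{ txQueue = rxDev:getTxQueue(2), rxQueue = rxDev:getRxQueue(2), ips = "10.0.0.1" },
}
```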


tristan-ovh commented on August 23, 2024

The example l3-load-latency.lua does not report lost timestamps (the number of times measureLatency returns nil). When I add that reporting, I see that it is not zero (about 5%), even though the counts of sent and received packets are exactly identical (the same problem as with my script).
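For reference, this is roughly the kind of counting I mean (a sketch using the timestamper object API, ts:newTimestamper / measureLatency; txQueue and rxQueue are assumed to be already configured queues, and the variable names are mine):

```lua
local ts   = require "timestamping"
local hist = require "histogram"

local timestamper = ts:newTimestamper(txQueue, rxQueue)
local latencyHist = hist:new()
local sent, failed = 0, 0

for i = 1, 100000 do
	-- measureLatency() sends one timestamped probe and waits for it;
	-- it returns nil when no valid hardware timestamp was obtained
	local lat = timestamper:measureLatency()
	sent = sent + 1
	if lat then
		latencyHist:update(lat)
	else
		failed = failed + 1
	end
end

print(string.format("%d probes sent, %d without a valid timestamp (%.1f%%)",
	sent, failed, failed / sent * 100))
latencyHist:save("histogram.csv")
```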


emmericp commented on August 23, 2024

Okay, that means that timestamping fails for some reason: the packet is received successfully, but it isn't timestamped by the NIC for whatever reason.

I guess losing 5% of the timestamping information at full line rate is okay, since latency measurements at full line rate are usually pointless anyway (see my previous comment).
I would be really worried if you were actually losing packets in a cable ;)

I'll keep this issue open and have a look at the timestamping logic, which uses sequence numbers to determine whether the timestamping was successful.


tristan-ovh commented on August 23, 2024

Indeed, it is very acceptable.
My script works now; the problem was the packet size.

But I do not understand how the filtering works in your example. I believe that packets matching a filter will be sent to the chosen queue, but other packets may also be sent to that queue.
So for my script, I had to add a rule with a lower priority that sends all packets to another queue.

In your example, you have no such rule, which should mean that you receive all packets on this core. Moreover, the filter you use (filterTimestamps) seems to match only PTP packets, not PTP/UDP packets, so it should have no effect in your case.

Am I misunderstanding something?


emmericp commented on August 23, 2024

All packets go to queue 0 by default. Only filters and RSS can redirect packets.

RSS is disabled by default; it has to be enabled explicitly when configuring the device. And you would probably use a different set of queues for RSS.

Regarding the timestamp filter: this can actually be improved, yes. It currently just checks the PTP version at a specific offset in IP packets (mask.only_ip_flow) and ignores the L4 protocol.
This should be changed; also checking whether the packet is UDP is quite important for use cases like your TCP example, in which the current filter would match on some specific sequence numbers.
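For context, a sketch of how the example steers timestamped traffic (the exact matching behaviour of filterTimestamps is what is being discussed here, so treat this as illustrative):

```lua
-- dedicate RX queue 1 of the receiving device to timestamped packets;
-- filterTimestamps() installs a flow director rule for the PTP pattern
-- (at this point it only checked the PTP version field in IP packets
-- and ignored the L4 protocol)
local tsQueue = rxDev:getRxQueue(1)
rxDev:filterTimestamps(tsQueue)

-- everything that does not match the filter still arrives on queue 0,
-- so the normal receive loop keeps reading from rxDev:getRxQueue(0)
```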


emmericp commented on August 23, 2024

Commit 7e68758 changes the timestamp filter to check the L4 protocol.


emmericp commented on August 23, 2024

I cannot reproduce the packet loss at full line rate.
I added a check for failed timestamps locally and I get all timestamps at full line rate.

What NIC are you using? I tested this with an Intel X540.


tristan-ovh commented on August 23, 2024

I use an Intel 82599ES. I ran the test again and still see lost timestamps.
But it isn't really a problem for me.


emmericp commented on August 23, 2024

I cannot reproduce the packet loss at full line rate.
I added a check for failed timestamps locally and I get all timestamps at full line rate.

But I'm using an Intel X540, which is basically the same NIC (the datasheets are almost identical, same driver), just with 10GBase-T instead of SFP+ and a lot of bug fixes.

I would not be surprised if this is just a hardware problem in the 82599 NIC. I've seen some strange problems with that NIC that just don't happen on X540 NICs.


