Comments (15)

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

Confirmed on Ubuntu 14.04 on VirtualBox with a NAT adapter. A build dynamically linked against the system libpcap sends about twice as many packets and takes four times as long.

from nmap.

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

Regarding the 2x as many packets: A dropped probe results in an increased number of retries. Since most ports are closed, this means most ports will get an extra retry. I'm seeing anywhere between 1 and 3 extra retries for the system-libpcap, which if the default is 1 retry, means between 1.5 and 2.5 times as many packets sent. Still no clue as to why it's happening, though.
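The arithmetic above can be checked with a short C sketch. It assumes each closed port normally costs one initial probe plus the default single retry; probe_ratio and its parameter names are illustrative and not part of Nmap.

```c
#include <assert.h>

/* Hypothetical helper, not part of Nmap: ratio of probes sent per port when
 * `extra` additional retries occur, relative to a baseline of one initial
 * probe plus `base_retries` retries. */
double probe_ratio(int base_retries, int extra)
{
    assert(base_retries >= 0 && extra >= 0);
    int baseline = 1 + base_retries;   /* probes per port without the extra retries */
    return (double)(baseline + extra) / baseline;
}
```

With a baseline of one retry, probe_ratio(1, 1) gives 1.5 and probe_ratio(1, 3) gives 2.5, which matches the range quoted above.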

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

Ok, I'm not sure why packets are being dropped: maybe that's a bug on our end. But I do see a solid reproducible difference between the two libpcaps in terms of time taken to get one packet. Command to reproduce: nmap -n -Pn -p 80 -d --packet-trace scanme.nmap.org. This sends exactly 1 SYN packet, receives exactly 1 SYNACK, and then the OS sends a RST (assuming no real network problems).

Within readip_pcap, we call pcap_select (from libnetutil) to wait for a pcap fd to be available for reading. With our included libpcap, select returns almost immediately, since a packet is available on the pcap fd. With the system libpcap, it waits the entire timeout (1 second for the first packet at least) before returning. The receive time on the packet is the same, so there's no difference in how fast the packet is processed; the select call is the only thing delaying things.
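A minimal sketch of the waiting pattern described here, assuming pcap_select behaves roughly like a plain select() on the capture descriptor. wait_readable is a hypothetical stand-in, not the libnetutil function.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for a pcap_select-style wait: block for up to
 * to_usec microseconds until fd is readable. Returns >0 if readable,
 * 0 if the timeout expired, -1 on error (the same convention as select()). */
int wait_readable(int fd, long to_usec)
{
    fd_set rfds;
    struct timeval tv;

    assert(to_usec >= 0);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = to_usec / 1000000;
    tv.tv_usec = to_usec % 1000000;
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}
```

If the descriptor is never reported readable even though a packet has been captured, this call returns 0 only after the full timeout, which is the delay observed with the system libpcap.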

I don't know if this is related, but just above the pcap_select call in readip_pcap, we try to do a non-blocking read by doing pcap_setnonblock followed by pcap_next. But the man page for pcap_setnonblock says that pcap_next "will not work in non-blocking mode." It looks like we should be able to eliminate that whole code block, but I don't think it will affect this issue.

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

Ok, I've got good news and bad news, but I'm not sure which is which.

First, the difference between our included libpcap 1.5.3 and the system one is the result of a bug in our version. We introduced a configure option to turn off packet ring capture support for some 2.6 kernels that can't produce a good 32-bit binary. This is implemented as a preprocessor macro, PCAP_SUPPORT_PACKET_RING, but we forgot to put it into libpcap/configure.h.in so effectively this support is always turned off with the included libpcap. If I "fix" this problem, then we get the packet loss events with both included and system libpcaps.

Second, I believe the problem is related to this libpcap issue: the-tcpdump-group/libpcap#335. Note: related, but not identical. The "fix" for that issue is already in libpcap 1.5.3, so we have it. There are a couple of related changes to the Linux kernel this year that address the deficiency (one linked from this one: torvalds/linux@41a50d621).

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

A few final notes on this issue:

  • The most relevant bug report on libpcap for this issue is the-tcpdump-group/libpcap#380. The suggested workaround there is using a very short timeout on select(). I can confirm that this works to avoid packet loss if we add a short-timeout select before the primary one in readip_pcap, but I can't figure out how to make this a general solution that doesn't alter the timing too much or just become a busy wait. Example patch:
--- a/tcpip.cc
+++ b/tcpip.cc
@@ -1710,6 +1710,9 @@ char *readip_pcap(pcap_t *pd, unsigned int *len, long to_usec,

     if (p == NULL) {
       /* Nonblocking pcap_next didn't get anything. */
+      if (to_usec < 200000 && pcap_select(pd, to_usec) > 0)
+        p = (char *) pcap_next(pd, &head);
+      else
       if (pcap_select(pd, to_usec) == 0)
         timedout = 1;
       else
  • As noted in that libpcap issue, the underlying problem is a bug in Linux TPACKET_V3 mmapped packet capture. This bug was fixed in Linux 3.19, and I can confirm that Nmap has no further issues on Linux 4.0. Please file bug reports with your distros to backport the patch if possible: torvalds/linux@da413ee
  • As a workaround, you can configure Nmap with --disable-packet-ring --with-libpcap=included (--disable-packet-ring is passed along to libpcap's configure script). This disables the packet ring mmapped packet capture, which could slow down Nmap in very-high-packet-rate cases, but will be far less troublesome than this particular bug. Alternatively, you could try to muck around in libpcap/pcap-linux.c to downgrade to TPACKET_V2, which does not have this problem, but I have not tried that.

Given that we have these workarounds, and the bug is demonstrated to be in the Linux kernel code, not in Nmap or libpcap, I am removing the release milestone. I will leave the bug open until either several major distros backport the kernel fix or we find a suitable workaround.

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

Filed a bug report with Ubuntu to backport the Linux patch.

nnposter avatar nnposter commented on May 18, 2024

FWIW I have experimented with downgrading to TPACKET_V2, instead of disabling the packet ring. The downgrade does rectify the issue but in my light testing I have not noticed any material performance advantages.

Here is the corresponding patch if anybody cares for it.

--- a/libpcap/pcap-linux.c
+++ b/libpcap/pcap-linux.c
@@ -188,6 +188,8 @@
 # endif /* PACKET_HOST */


+# undef TPACKET3_HDRLEN
+
  /* check for memory mapped access avaibility. We assume every needed
   * struct is defined if the macro TPACKET_HDRLEN is defined, because it
   * uses many ring related structs and macros */

pr0letariat avatar pr0letariat commented on May 18, 2024

Is this fixed in Nmap 7?

mpontillo avatar mpontillo commented on May 18, 2024

Cross-posting here from the related Launchpad issue. Inspired by the flow-disruptor workaround, I did a proof-of-concept nmap workaround as follows:

$ svn diff libnetutil/netutil.cc
Index: libnetutil/netutil.cc
===================================================================
--- libnetutil/netutil.cc   (revision 36280)
+++ libnetutil/netutil.cc   (working copy)
@@ -4073,7 +4073,8 @@
   Strncpy(pcapdev, device, sizeof(pcapdev));
 #endif
   do {
-    pt = pcap_open_live(pcapdev, snaplen, promisc, to_ms, err0r);
+    //pt = pcap_open_live(pcapdev, snaplen, promisc, to_ms, err0r);
+    pt = pcap_create(pcapdev, err0r);
     if (!pt) {
       failed++;
       if (failed >= 3) {
@@ -4084,6 +4085,11 @@
       sleep( compute_sleep_time(failed) );
     }
   } while (!pt);
+  pcap_set_promisc(pt, promisc);
+  pcap_set_timeout(pt, to_ms);
+  pcap_set_snaplen(pt, snaplen);
+  pcap_set_immediate_mode(pt, 1);
+  pcap_activate(pt);

 #ifdef WIN32
   if (wait == WAIT_ABANDONED || wait == WAIT_OBJECT_0) {

Obviously, this is nowhere near production-ready code, but I wanted to convince myself that the pcap_set_immediate_mode() workaround could work in nmap as well.

This caused the scan of a single host (using an Ubuntu 16.04 "Xenial" host running kernel 4.4.0) to go from taking ~45 seconds to ~5 seconds.

For the record, the test case I used was: sudo time ./nmap -sS -vv <host>.

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

@mpontillo Thanks for notifying us! I'd like to play around with immediate mode a bit more to see how it could best work for us, but for now I'd settle for reproducing the original bug on a >3.19 kernel or at least fully describing and isolating it. Can you provide the output of nmap --version and uname -a for the setup that causes 45-second scans? Thanks!

mpontillo avatar mpontillo commented on May 18, 2024

Sure; below are some additional details.

First, here is my uname -a:

Linux xenial 4.4.0-36-generic #55-Ubuntu SMP Thu Aug 11 18:01:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I tested with both the current version of nmap in Ubuntu 16.04, and a version I built from source (using the nmap-7.10 branch in Subversion).

The version packaged with Ubuntu (which, after running apt-get source nmap and checking debian/rules, you can tell is compiled with --with-liblua --with-liblinear --enable-ipv6) is:

Nmap version 7.01 ( https://nmap.org )
Platform: x86_64-pc-linux-gnu
Compiled with: liblua-5.2.4 openssl-1.0.2g libpcre-8.38 libpcap-1.7.4 nmap-libdnet-1.12 ipv6
Compiled without:
Available nsock engines: epoll poll select

With that version, I first saw the symptom: long scan times, and messages like the following printed to the console:

Increasing send delay for 192.168.0.9 from 0 to 5 due to 92 out of 306 dropped probes since last increase.
...
Nmap done: 1 IP address (1 host up) scanned in 45.53 seconds
           Raw packets sent: 2754 (121.160KB) | Rcvd: 1266 (50.652KB)
0.08user 0.08system 0:45.55elapsed 0%CPU (0avgtext+0avgdata 14852maxresident)k
1336inputs+0outputs (0major+1982minor)pagefaults 0swaps

I saw the same issue in the nmap-7.10 branch when I compiled from source. Then I compiled with the workaround I posted (with no ./configure arguments, so it would use the shared-library version of libpcap). That build reports:

Nmap version 7.12 ( https://nmap.org )
Platform: x86_64-unknown-linux-gnu
Compiled with: nmap-liblua-5.2.4 openssl-1.0.2g libpcre-8.38 libpcap-1.7.4 nmap-libdnet-1.12 ipv6
Compiled without:
Available nsock engines: epoll poll select

The output from this version was normal, such as:

Nmap done: 1 IP address (1 host up) scanned in 5.99 seconds
           Raw packets sent: 1339 (58.900KB) | Rcvd: 1001 (40.052KB)
0.05user 0.02system 0:06.02elapsed 1%CPU (0avgtext+0avgdata 14472maxresident)k
2696inputs+0outputs (1major+1951minor)pagefaults 0swaps

dmiller-nmap avatar dmiller-nmap commented on May 18, 2024

I am in the process of updating the included libpcap to 1.8.1. I am going to optimistically remove our "disable TPACKET_V3" workaround and see how things go. I think it's important to record here the comment that @guyharris left on the-tcpdump-group/libpcap#380:

Linux kernel bug, about which there's not much we can do. If you're not using non-blocking mode and select()/poll()/epoll(), we work around it, as best we can, with a short poll() timeout inside libpcap. If you are using non-blocking mode and select()/poll()/epoll(), you have to work around it by making sure you have a short timeout in the select()/poll()/epoll(), at least on kernels with the bug, and, if the timer expires, call pcap_dispatch(). (With a high packet arrival rate, when blocking mode is used, TPACKET_V3 drops fewer packets than TPACKET_V2, so "don't use TPACKET_V3 on pre-3.19 kernels" isn't the right answer.)
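The non-blocking advice in that comment can be sketched as a wait loop that never blocks in select() for longer than a short slice and attempts a read after every return, including timeouts. This is a sketch under stated assumptions, not Nmap's or libpcap's code: wait_sliced, drain_fn, fake_source, and fake_drain are hypothetical names. In a real program the descriptor would come from the capture handle and the drain callback would wrap a non-blocking pcap_dispatch() call.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical callback: one non-blocking read attempt (for example a wrapper
 * around pcap_dispatch()). Returns packets handled, 0 if none, <0 on error. */
typedef int (*drain_fn)(void *ctx);

/* Wait up to total_usec for packets without ever sleeping in select() for
 * more than slice_usec. drain() runs after every select() return, even on
 * timeout, because on affected kernels the fd may not be reported readable.
 * Returns >0 (packets handled), 0 on overall timeout, <0 on error.
 * Elapsed time is approximated by charging a full slice per iteration. */
int wait_sliced(int fd, long total_usec, long slice_usec, drain_fn drain, void *ctx)
{
    long remaining = total_usec;

    assert(drain != NULL);
    if (total_usec < 0 || slice_usec <= 0)
        return -1;

    for (;;) {
        long step = remaining < slice_usec ? remaining : slice_usec;
        fd_set rfds;
        struct timeval tv;
        int n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = step / 1000000;
        tv.tv_usec = step % 1000000;
        if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
            return -1;              /* includes EINTR; a caller may retry */

        n = drain(ctx);             /* try to read whether or not fd fired */
        if (n != 0)
            return n;

        remaining -= step;
        if (remaining <= 0)
            return 0;
    }
}

/* Stand-in packet source for demonstration: reports a packet on call N. */
struct fake_source { int calls; int ready_on_call; };

int fake_drain(void *ctx)
{
    struct fake_source *s = ctx;
    s->calls++;
    return s->calls >= s->ready_on_call ? 1 : 0;
}
```

The slice length trades latency against wakeups: a very small slice approaches the busy wait mentioned earlier in the thread, while a large one reintroduces the delay.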

djcater avatar djcater commented on May 18, 2024

> I am going to optimistically remove our "disable TPACKET_V3" workaround and see how things go

@dmiller-nmap I tested Nmap 7.70SVN from a few days ago, which included the upgraded libpcap (built with --with-libpcap=included), and unfortunately I can still reproduce the false packet drops.

I have some further analysis which I can add later, but essentially on Ubuntu 18.04 with kernel 4.15, Nmap thinks there are packet drops (whereas Wireshark confirms that there are not). E.g. in Wireshark I can see the RST response to a SYN port probe come back almost immediately, but Nmap doesn't see it according to the debug output, and so it sends another probe. In the Nmap debug output for one closed port it looks something like:

-> S
(Wait)
-> S
<- RA
Ultrascan DROPPED probe packet to ... detected
Increased max_successful tryno for ... to 1 (packet drop)
<- RA

But in Wireshark it shows up as:

-> S
<- RA
(Wait)
-> S
<- RA

I confirmed that the # undef TPACKET3_HDRLEN fix in Nmap's libpcap fixed it.

I also confirmed that building with --disable-packet-ring fixed it.

I also confirmed that the pcap_set_immediate_mode patch fixed it.

So out of the 3 possible fixes, I've stuck with the 3rd one for now, because that fixes the issue regardless of whether the system libpcap is used or if Nmap's libpcap is used. And as Nmap relies on precise timing information for its scanning, it seems logical to me to use immediate mode. I think that's important because at the moment I can reproduce the issue with the Nmap 7.60 shipped with Ubuntu 18.04, as they are building with the system libpcap (1.8.1) to avoid having multiple copies of the same library. Kali appears to build with Nmap's libpcap included, so doesn't have this problem due to the in-tree # undef patch. So without fixing it outside of libpcap (e.g. in libnetutil), people using the default Nmap from Ubuntu (and perhaps other distributions) are always going to have this problem, slowing down their scans for no reason.

I can reproduce this reliably, so please let me know if there's anything you want me to test.

TL;DR: You still need to make some form of change to avoid this problem, even with libpcap 1.8.1 and a modern kernel such as 4.15.

guyharris avatar guyharris commented on May 18, 2024

> And as Nmap relies on precise timing information for its scanning, it seems logical to me to use immediate mode.

If you're developing an application that's passively sniffing for packets, and do not care whether you see packets as soon as they arrive, but are willing to wait in order to get multiple packets per wakeup (causing fewer kernel <-> user transitions), then your application shouldn't use immediate mode.

If, however, you're developing an application that does want to see packets as soon as they arrive, and are willing to put up with more wakeups, then your application should use immediate mode.

So, if nmap is in the latter category, it should use immediate mode if it's available:

  • if libpcap has pcap_set_immediate_mode(), then it also has pcap_create() and pcap_activate(), so it should call pcap_create(), call pcap_set_immediate_mode() on the resulting pcap_t, set whatever other attributes are appropriate (note that it will not need to set the timeout, as that doesn't apply in immediate mode), and then call pcap_activate();

  • otherwise, set the timeout to a very low non-zero value, and, on platforms with BPF where BIOCIMMEDIATE is defined by <net/bpf.h>, do a BIOCIMMEDIATE ioctl on the result of pcap_fileno() on the pcap_t.

djcater avatar djcater commented on May 18, 2024

Thanks for the detailed information @guyharris. I've converted @mpontillo's comment into a pull request (#1291) just to make it easier to visualise.

The pull request doesn't check for errors from pcap_activate, and it doesn't check whether immediate mode is supported.

It looks like you added immediate-mode support to libpcap in 2013, in version 1.5.0-PRE-GIT: the-tcpdump-group/libpcap@48bc6c3
