Coder Social home page Coder Social logo

Comments (10)

ioquatix avatar ioquatix commented on June 11, 2024 1

This is an interesting error. Async provides asynchronous address resolution, but this shouldn't be used as we are just resolving the IP address of the inbound connection, an operation that shouldn't involve any DNS lookup at all.

In the first instance, we could consider rescuing any lookup errors and just bypassing the socket IP address. However, this isn't strictly a good idea.

Perhaps a better option would be to try and collect/log the circumstances which cause this error to occur so we can investigate further.

Let me see if I can reproduce the issue.

from falcon.

travisbell avatar travisbell commented on June 11, 2024 1

@travisbell are you able to show me the contents of /etc/nsswitch.conf?

cat /etc/nsswitch.conf

# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         files
group:          files
shadow:         files
gshadow:        files

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

Seems fairly stock, I think?

Thanks for the incredible debugging! FYI--I had to roll back Ruby 3.3, which means I also rolled back Falcon on this service for now (it's just how our builds are setup). Just an FYI, I didn't roll it back because of this issue, I was hit by this memory leak so I'm waiting for the back port and 3.3.1 release to give this all another go.

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

I have not tried to reproduce this yet, but I've checked the implementation.

This is the code invoked by Addrinfo#ip_address:

/*
 * call-seq:
 *   addrinfo.ip_address => string
 *
 * Returns the IP address as a string.
 *
 *   Addrinfo.tcp("127.0.0.1", 80).ip_address    #=> "127.0.0.1"
 *   Addrinfo.tcp("::1", 80).ip_address          #=> "::1"
 */
static VALUE
addrinfo_ip_address(VALUE self)
{
    rb_addrinfo_t *rai = get_addrinfo(self);
    int family = ai_get_afamily(rai);
    VALUE vflags;
    VALUE ret;

    if (!IS_IP_FAMILY(family))
        rb_raise(rb_eSocket, "need IPv4 or IPv6 address");

    vflags = INT2NUM(NI_NUMERICHOST|NI_NUMERICSERV);
    ret = addrinfo_getnameinfo(1, &vflags, self);
    return rb_ary_entry(ret, 0);
}

It calls addrinfo_getnameinfo with NI_NUMERICHOST|NI_NUMERICSERV flags which means it should not perform any actual DNS lookup.

/*
 * call-seq:
 *   addrinfo.getnameinfo        => [nodename, service]
 *   addrinfo.getnameinfo(flags) => [nodename, service]
 *
 * returns nodename and service as a pair of strings.
 * This converts struct sockaddr in addrinfo to textual representation.
 *
 * flags should be bitwise OR of Socket::NI_??? constants.
 *
 *   Addrinfo.tcp("127.0.0.1", 80).getnameinfo #=> ["localhost", "www"]
 *
 *   Addrinfo.tcp("127.0.0.1", 80).getnameinfo(Socket::NI_NUMERICSERV)
 *   #=> ["localhost", "80"]
 */
static VALUE
addrinfo_getnameinfo(int argc, VALUE *argv, VALUE self)
{
    rb_addrinfo_t *rai = get_addrinfo(self);
    VALUE vflags;
    char hbuf[1024], pbuf[1024];
    int flags, error;

    rb_scan_args(argc, argv, "01", &vflags);

    flags = NIL_P(vflags) ? 0 : NUM2INT(vflags);

    if (rai->socktype == SOCK_DGRAM)
        flags |= NI_DGRAM;

    error = rb_getnameinfo(&rai->addr.addr, rai->sockaddr_len,
                           hbuf, (socklen_t)sizeof(hbuf), pbuf, (socklen_t)sizeof(pbuf),
                           flags);
    if (error) {
        rsock_raise_resolution_error("getnameinfo", error);
    }

    return rb_assoc_new(rb_str_new2(hbuf), rb_str_new2(pbuf));
}

This code invokes rb_getnameinfo:

static void *
nogvl_getnameinfo(void *arg)
{
    struct getnameinfo_arg *ptr = arg;
    return (void *)(VALUE)getnameinfo(ptr->sa, ptr->salen,
                                      ptr->host, (socklen_t)ptr->hostlen,
                                      ptr->serv, (socklen_t)ptr->servlen,
                                      ptr->flags);
}
int
rb_getnameinfo(const struct sockaddr *sa, socklen_t salen,
               char *host, size_t hostlen,
               char *serv, size_t servlen, int flags)
{
    struct getnameinfo_arg arg;
    int ret;
    arg.sa = sa;
    arg.salen = salen;
    arg.host = host;
    arg.hostlen = hostlen;
    arg.serv = serv;
    arg.servlen = servlen;
    arg.flags = flags;
    ret = (int)(VALUE)rb_thread_call_without_gvl(nogvl_getnameinfo, &arg, RUBY_UBF_IO, 0);
    return ret;
}

At first glance, this code looks a bit inefficient to me - numeric host/service should not incur any blocking overhead, but we still call it while releasing the GVL which doesn't make sense to me.

In addition, this is a simple operation which should directly read from the struct sockaddr data, so EAI_AGAIN (temporary failure in name resolution) is very odd. I need to find out under what circumstances that's occurring.

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

@travisbell are you able to show me the contents of /etc/nsswitch.conf?

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

There is some useful discussion here: https://stackoverflow.com/questions/42041606/under-what-circumstances-can-getnameinfo-return-eai-again

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

Hmm, it looks really normal. I wonder if the memory leak could potentially cause the name resolution to fail.

Thanks for trying it all out and giving feedback, that can be a lot of work, so it's really appreciated.

I'll investigate more on my end.

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

Do you mind giving some details about the host environment? Are you running on some kind of instance? Can you give details of the hardware?

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

I believe this is a bug in Ruby v3.3.0: https://bugs.ruby-lang.org/issues/20172

As such, there is nothing we can do but wait for Ruby v3.3.1 to be released with the fix.

from falcon.

travisbell avatar travisbell commented on June 11, 2024

No kidding, another 3.3.0 bug? Ok, well that makes it easy then!

Thanks again @ioquatix for all the debugging you did to figure this one out. 🙌🏼

from falcon.

ioquatix avatar ioquatix commented on June 11, 2024

Maybe you can report back if you try 3.3.1 so we can confirm it's sorted, but I guess in this case no news is good news :)

from falcon.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.