1. 39
    1. 26

      getaddrinfo is such a weird function. It’s carefully designed to be completely agnostic to protocol. I’ve written code with it on a machine with no IPv6 connectivity and then had it work seamlessly on IPv6 (and, apparently, on IPX on a machine that had the relevant bits of name resolution and protocol stacks configured). At the same time, it’s very clearly at the wrong layer. It’s a libc function, but it performs a blocking operation. If DNS times out, it can block for tens of seconds. Because libc (in traditional UNIX systems) does not depend on the threading library, there’s no way for it to be asynchronous.
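
      To make the protocol-agnostic part concrete, here is a minimal sketch, assuming a POSIX system. The helper name first_family is made up for illustration. The caller never names IPv4 or IPv6; the address family comes back from getaddrinfo.

      ```c
      #define _POSIX_C_SOURCE 200809L
      #include <netdb.h>
      #include <stddef.h>
      #include <string.h>
      #include <sys/socket.h>
      #include <sys/types.h>

      /* Hypothetical helper: resolve host/service without naming an address
       * family and return the family of the first result (AF_INET, AF_INET6,
       * ...), or -1 on failure. `flags` is passed through as ai_flags. */
      int first_family(const char *host, const char *service, int flags)
      {
          struct addrinfo hints;
          struct addrinfo *res = NULL;

          memset(&hints, 0, sizeof hints);
          hints.ai_family = AF_UNSPEC;      /* let the system pick the protocol */
          hints.ai_socktype = SOCK_STREAM;
          hints.ai_flags = flags;

          /* Blocking call: with a real host name this may wait on DNS. */
          if (getaddrinfo(host, service, &hints, &res) != 0)
              return -1;

          int family = res->ai_family;
          freeaddrinfo(res);
          return family;
      }
      ```

      Passing AI_NUMERICHOST, as in first_family("::1", "80", AI_NUMERICHOST), skips name resolution entirely, so the call cannot block on the network. Without that flag the same call may sit in DNS for as long as the resolver is configured to wait.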

      Accessing an environment variable makes thread safety hard but the real problem there is not related to getaddrinfo. There’s a get out of jail free card in POSIX:

      The setenv() function need not be thread-safe.

      This means that, if you call setenv from a multithreaded program, it is your fault. Friends don’t let friends call setenv. The correct time to set environment variables is in the parent process, before you pass them to execve. Don’t modify the environment after process creation. It will end badly.
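
      A rough sketch of what ‘set it in the parent’ can look like, assuming posix_spawn and /bin/sh are available. The helper name run_with_extra_env is hypothetical. It builds a fresh envp array for the child and never calls setenv in the parent.

      ```c
      #define _POSIX_C_SOURCE 200809L
      #include <spawn.h>
      #include <stdlib.h>
      #include <string.h>
      #include <sys/types.h>
      #include <sys/wait.h>

      extern char **environ;

      /* Hypothetical helper: run `sh -c script` with the parent's environment
       * plus one "NAME=value" entry. Returns the child's exit status, or -1
       * on error. The parent's own environment is only read, never modified. */
      int run_with_extra_env(const char *script, const char *name_eq_value)
      {
          size_t n = 0;
          while (environ[n] != NULL)
              n++;

          /* Copy pointers only; the strings themselves are left untouched. */
          char **envp = malloc((n + 2) * sizeof *envp);
          if (envp == NULL)
              return -1;

          size_t namelen = strcspn(name_eq_value, "=");
          size_t out = 0;
          for (size_t i = 0; i < n; i++) {
              /* Drop any existing definition of the same name. */
              if (strncmp(environ[i], name_eq_value, namelen) == 0 &&
                  environ[i][namelen] == '=')
                  continue;
              envp[out++] = environ[i];
          }
          envp[out++] = (char *)name_eq_value;
          envp[out] = NULL;

          char *argv[] = { "sh", "-c", (char *)script, NULL };
          pid_t pid;
          int rc = posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, envp);
          free(envp);
          if (rc != 0)
              return -1;

          int status;
          if (waitpid(pid, &status, 0) < 0)
              return -1;
          return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
      }
      ```

      This still reads environ, so it is only safe under the same rule as everything else here: nothing in the process modifies the environment after startup.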

      [ Edit ]

      The reason that setenv is such a terrible idea is that there is no sensible way of making it thread safe. POSIX also allows accessing the environment via the environ symbol. Any modification may race a read via that symbol. This comes with the wonderful caveat:

      Conforming multi-threaded applications shall not use the environ variable to access or modify any environment variable while any other thread is concurrently modifying any environment variable. A call to any function dependent on any environment variable shall be considered a use of the environ variable to access that environment variable.

      So, in a multithreaded application, you are responsible for providing synchronisation for a global that is provided by the C runtime and may be accessed by things that aren’t aware of the synchronisation that you’ve added. Have fun and good luck!
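
      One way to sidestep part of this in your own code is to copy what you need out of the environment once, before any thread exists, and never look at it again. A minimal sketch, with the hypothetical helper name env_snapshot (it does nothing for libc functions that consult the environment internally):

      ```c
      #include <stdlib.h>
      #include <string.h>

      /* Hypothetical helper: copy the value of `name` (or `fallback`, which
       * must not be NULL, if the variable is unset) into a caller-owned
       * buffer. Intended to be called from main() before any thread is
       * started. Returns 0 on success, -1 if the value does not fit. */
      int env_snapshot(const char *name, const char *fallback,
                       char *buf, size_t buflen)
      {
          const char *value = getenv(name);
          if (value == NULL)
              value = fallback;

          size_t len = strlen(value);
          if (len >= buflen)
              return -1;

          /* After this copy the caller no longer depends on environ. */
          memcpy(buf, value, len + 1);
          return 0;
      }
      ```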

      1. 9

        At the same time, it’s very clearly at the wrong layer.

        Yes. This. Please dear Eris, this. Shout this from the rooftops. Engrave it into your forehead. The amount of stuff that breaks in the most arcane of ways because of this is abjectly absurd. It’s as though they were trying to make as many “works on my machine” bugs as possible with a single function call. And as far as I can tell there’s no particularly good alternative?

        In reality there are at least four different things you might want getaddrinfo to do that have wildly different performance characteristics and failure modes:

        • Look up local-only names (/etc/hosts)
        • Look up local network names (things your domain controller can tell you, dynamic DNS from DHCP, that sort of thing), kinda special case but sometimes it’s the special case you need
        • Look up global network names via the actual DNS protocol and nothing else
        • The actual behavior of it, where you say “idk where to find this name but here it is, please try to figure it out”

        The whole NSS thing was a huge misdesign and it’d be really nice to be able to pretend it never happened.

        1. 17

          The whole NSS thing was a huge misdesign and it’d be really nice to be able to pretend it never happened.

          I think there’s a good case to be made for ‘look up in whatever name resolution mechanisms are configured for this system’. For example, if the user provides a .local domain, you (usually) want it to be looked up using whatever mDNS responder you have on the system (on non-Apple *NIX, this is usually Avahi, which is a bit of a shame because Apple’s mDNSResponder has been more reliable, in my experience, and was open sourced). If you’re running on a primarily Windows network, you might want to have WINS consulted as well as DNS (Samba supports this). If a user wants to override things in /etc/hosts, they are free to do so.

          In particular, it’s nice that an application written before something like mDNS came along can take advantage of it. And that an application doesn’t have to care about the fact that it exists.

          I’m not sure how you can do that except by having a name resolution service (now, ideally, that should be a daemon that you connect to, rather than having the name resolution logic embedded in libraries loaded by libc, but that’s a somewhat separate issue: you can at least implement it on top of what NSS provides).

          There are a few big problems with getaddrinfo aside from the blocking behaviour. Most notably, it doesn’t tell you how it found the name. If a user provided it with a hosts file, I should probably trust it (it might not be trustworthy, but the user / sysadmin thinks it is, and who am I, a poor application, to say otherwise?). If the system DNS resolver found it and validated DNSSEC records, I should trust it. If the system DNS resolver found it and it was missing DNSSEC records… maybe I should trust it? At least I would like some indication of how much trust I should give it. I’d also like to know if this name was found with some local (or link-local) resolution mechanism or if it’s a global name, because I’ll potentially treat those differently.

          1. 4

            I think the proper solution for the NSS system is to have a local DNS resolver handle that complexity; that way the only relevant files become /etc/hosts, /etc/resolv.conf and /etc/hostname.

            If the user wants mDNS, it’s way better if the local resolver can handle .local being mDNS and configuring all that NSS stuff than having the application do it.

            Same goes for the other NSS domains (users, groups, etc.); there should be a local daemon the application can talk to that tells it the correct answers.

            Heck, by using Unix sockets for DNS resolution, you get a few goodies on top; the DNS resolver can know which user requested what. Etc.

            1. 11

              I think the proper solution for the NSS system is to have a local DNS resolver handle that complexity; that way the only relevant files become /etc/hosts, /etc/resolv.conf and /etc/hostname.

              A DNS resolver is definitely the right thing to handle DNS. It’s maybe the right thing to handle mDNS. It might be the right thing to handle /etc/hosts, given that it’s a one-off special case. I have difficulty believing that it’s the right thing to handle WINS. Having it act as a proxy that converts non-DNS name resolution into DNS records seems like the wrong abstraction. If you’d done this prior to DNSSEC, you’d end up needing to teach everything to handle DNSSEC results.

              I think the right thing to do is have an RPC interface for querying a set of name services. This is what Windows does. I believe it’s also what systemd does. With something like D-BUS, you can provide an interface that’s extensible and allows name servers to respond with more information than a DNS result, which consumers can then choose to ignore if they don’t understand it, or handle if they do. Unfortunately, NSS predates any attempt to standardise this kind of mechanism on *NIX systems. Running a new daemon that advertises name resolution services feels like a more UNIX solution than adding a new shared library that makes every program behave differently. It would also let you have per-user name resolution services, if you wanted your own network namespace. Plan 9 does this.

              1. 2

                I don’t think you need to teach anything DNSSEC, the local resolver can handle it. If you bend what DNS means a bit, and do the thing I mentioned with using a unix socket, you could probably get away with adapting WINS over this interface with little issue.

                Thanks to NSS, you can just adopt a completely new solution by simply writing an NSS library and setting nsswitch.conf to use only that for everything.

                I’m not sure if D-BUS is the right solution here, it has to work in containers, which rarely run that mechanism to begin with (or even a proper init process).

                1. 2

                  True, if by “local resolver” you mean the stub resolver daemon that is replacing the libc resolver. DNSSEC should be validated after every network hop. This is one of the things systemd-resolved gets right.

            2. 3

              I’m not sure which came first, NSS or nscd, but they were both there by the time I started herding Suns in the 1990s. I gather nscd was supposed to help for things like NIS lookups, though I mostly avoided that side of things. For DNS it was hopeless: as far as I could tell, nscd was single-threaded, so if a machine had a non-negligible network load everything would end up waiting for nscd to do DNS lookups one at a time. Good idea, shame about the implementation.

              (There were some remnants of NIS in the Hermes mail system for user account management, but again that had suffered serious performance and even data loss problems under load - truncated password files can drive a sysadmin to drink. By the time I got involved the amount of NIS remaining was negligible so I don’t know if nscd helped or hindered in that context.)

              Another failure in this area was the BIND9 lightweight resolver daemon. It was “lightweight” in that all the resolver functionality was removed from libc (or libresolv) which became an IPC wrapper talking to lwresd. Sadly, the IPC cut was made in the wrong place so it didn’t work well with NSS, and it didn’t do anything to improve the API, and it came with a significant integration cost. So (as far as I know) no systems adopted it.

              1. 2

                I have never (knowingly) used nscd, but it looks as if at least the FreeBSD version is multithreaded (there’s a command-line flag to force it into single-threaded operation).

          2. 6

            getaddrinfo, on the other hand, is required to be thread-safe: https://pubs.opengroup.org/onlinepubs/9699919799/functions/getaddrinfo.html

            The freeaddrinfo() and getaddrinfo() functions shall be thread-safe.

            So it’s a clear violation of POSIX by glibc.

            Here’s a Python issue report from 2014, https://bugs.python.org/issue21216 , and more: https://bugs.python.org/issue1288833 , https://bugs.python.org/issue25924 , https://bugs.python.org/issue26406

            1. 3

              The first link is a 2014 bug in eglibc which was fixed back then.

              The next three are literally the opposite. Historically, Python called getaddrinfo with the GIL held. When releasing the GIL was investigated back in 2003, uncertainty around the behaviour of most platforms led to a getaddrinfo lock being added for safety. The issues you link are about lifting this lock for platforms confirmed to have a thread-safe getaddrinfo.

              Ironically, the original issue (to unlock getaddrinfo) was at the behest of linux users.

              So it’s a clear violation of POSIX by glibc.

              The problem is not getaddrinfo, it’s setenv. Once setenv has corrupted the system all bets are off. getaddrinfo is but one of numerous library functions impacted. Your car might have lane-assist, that won’t work if you’ve lost the wheels.

              1. 2

                I’m inclined to blame POSIX, for two reasons:

                • the getaddrinfo() spec ought to allow for the longstanding tradition of libresolv being configurable with environment variables - tho the RFC 3493 version of the spec also has this problem
                • POSIX is inconsistent about its treatment of environment variables used by libc

                There aren’t many libc environment variables in POSIX. The ones I could find are,

                • LANG and LC_xxx, which only affect setlocale(), so it’s relatively easy to avoid gotchas, and setlocale() is documented as not thread safe. (To preserve my sanity I have not looked into the interaction of things like ctype.h with threads and locales.)

                • PATH and execvp() - execvp() is thread-safe but it’s also a special case that does not compare well to getaddrinfo()

                • TZ and localtime_r() - murky, because while localtime() is explicitly documented to call tzset(), that isn’t the case for localtime_r(); however there is some discussion of localtime_r() and the various tzset() side-effects on global variables - which is amusing because localtime_r() is both thread-safe and allowed to mutate global state. https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime_r.html

                TZ is a great example, because tzset() is thread-safe, but it presumably uses getenv() which might not be. Tho I suppose it is permitted to use environ in a thread safe manner.

                There’s plenty of blame to share. I think more systems should implement setenv() and getenv() like Solaris’s thread-safe version.

                1. 2

                  To preserve my sanity I have not looked into the interaction of things like ctype.h with threads and locales.

                  That’s for the best. If you call setlocale with anything other than the C locale, a bunch of the multibyte conversion functions become stateful. These can be called by most of the locale-aware libc calls.

                  The only sensible thing to do with locales in multithreaded POSIX programs is make sure that nothing ever calls setlocale. If you must change the ambient locale, call uselocale, which only affects the calling thread, but ideally leave all global locale state set to C and use the _l-suffixed calls for anything where you’re preparing data for the user.
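
                  For anyone who hasn’t used the _l-suffixed calls, here is a minimal sketch, assuming a POSIX.1-2008 libc (on some systems these declarations live in a different header). The helper name upcase_in_locale is made up. The locale is an explicit object passed to toupper_l, and setlocale is never called.

                  ```c
                  #define _POSIX_C_SOURCE 200809L
                  #include <ctype.h>
                  #include <locale.h>
                  #include <stddef.h>

                  /* Hypothetical helper: upper-case `s` in place using an explicit
                   * locale object, leaving the global locale alone. Returns 0 on
                   * success, -1 if the named locale cannot be created. */
                  int upcase_in_locale(char *s, const char *locale_name)
                  {
                      locale_t loc = newlocale(LC_ALL_MASK, locale_name, (locale_t)0);
                      if (loc == (locale_t)0)
                          return -1;

                      for (; *s != '\0'; s++)
                          *s = (char)toupper_l((unsigned char)*s, loc);

                      freelocale(loc);
                      return 0;
                  }
                  ```

                  A real program would create the locale_t once and reuse it rather than calling newlocale per string.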

                  1. 2

                    The closest I got to these horrors was when working on ASCII case-insensitivity in BIND, during which we discussed if it was worth reimplementing tolower() for working with individual bytes [BIND always runs in the C locale], and the answer was yes, because we can avoid the extra indirection that tolower() needs in case the program calls setlocale().
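
                    For illustration only (this is not BIND’s actual code, and the names ascii_tolower and ascii_caseeq are made up), a locale-free byte-wise version can be as small as this:

                    ```c
                    #include <stddef.h>

                    /* Lower-case one byte using ASCII rules only. Bytes outside
                     * 'A'..'Z' are returned unchanged, whatever the locale is. */
                    static inline unsigned char ascii_tolower(unsigned char c)
                    {
                        return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
                    }

                    /* Compare two byte strings of length n, ignoring ASCII case.
                     * Returns 1 if they match, 0 otherwise. */
                    int ascii_caseeq(const unsigned char *a, const unsigned char *b, size_t n)
                    {
                        for (size_t i = 0; i < n; i++) {
                            if (ascii_tolower(a[i]) != ascii_tolower(b[i]))
                                return 0;
                        }
                        return 1;
                    }
                    ```

                    Because nothing here consults locale state, the compiler is free to inline the whole comparison.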

                    1. 7

                      Solaris libc is nice in this regard. They have some preprocessor macros that you can set to say ‘this program does not want to support locales’. If you do this, most of the ctype functions are replaced with static inlines that do simple bit masking.

                      I would quite like that to be the default and for programs written in C/C++ to have to explicitly opt in to locales. Most of the time there’s a fairly straight choice between ‘C/C++ is the right tool for this job’ and ‘this program needs to deal with locales’.

                      1. 2

                        That Solaris feature sounds really nice!

                        The correct response to “this program needs to deal with locales” is, “fuck no! it needs ICU”

                        1. 1

                          The correct response to “this program needs to deal with locales” is, “fuck no! it needs ICU”

                          I think my response is ‘don’t write it in a systems language then, we have application languages with nice frameworks for that’.

              2. 4

                Or if you have to call it, you have to do so before starting any threads, which means don’t call any external libraries at all because you don’t know who might create a thread pool for some reason.

                Yeah, execve() or posix_spawn() arguments are the only safe places to change env vars.

                1. 1

                  So instead of an environment we’d need some sort of… registry ;)

                  1. 1

                    No, we’d get an environment, but it’d be immutable after process creation. Each new child process could take a modified or unmodified version of the parent’s environment, it just wouldn’t modify it during execution. If you need a variable that has a value that changes across the lifetime of a process, we have a way of doing that, and they can even be private to a compilation unit or DSO.

                  2. 1

                    At the same time, it’s very clearly at the wrong layer

                    I’m unclear what the right layer is. There should be a standard way for a C program to ask for a connection to a name without caring about low-level details. (There should be a standard way for that connection to be secured by whatever version of TLS the host system considers acceptable, too, but I guess that might be another argument). It’s very annoying that getaddrinfo isn’t asynchronous, but my biggest problem with it is that it doesn’t go further (by interpreting SRV records for example). I appreciate that it’s annoying that a broken system configuration can break applications, but I’m pretty sure the fake DNS server I have to run for some program that uses c-ares and doesn’t check /etc/hosts can also break applications.

                    1. 1

                      The process environment is a data structure. I don’t find it surprising that concurrency needs to be taken into account when reading or writing it. Even not knowing that getaddrinfo calls getenv specifically, I have always viewed modifying a process’s environment in-place as bad practice and a code smell. It’s nice to see “discoveries” like this that validate my position.

                      1. 1

                        Theoretically, who would we need to gather to make a POSIX 2.0 that fixes all these glaring issues? Would be nice if by 2050, people had it a bit better than we do now.

                          1. 1

                            Interesting. Do these ever change and fix anything meaningful?

                            1. 2

                              Dunno, tbh! The Open Group’s processes are very closed compared to standards orgs like the IETF or W3C so it takes a lot of effort to find out if it is even worth the effort if you aren’t a libc or shell utils maintainer.

                              1. 1

                                Only some very small parts. The only changes shipped in the 2018 edition compared to the 2008 edition were the ratification of two TCs, Technical Corrigendum 1 and 2. I can’t say exactly what is in those because they’re not freely available to the public.

                                In general, do not expect POSIX to make any big sweeping changes. Definitely don’t expect it to make changes without several existing OSs making them first. POSIX is very much a descriptive standard, it doesn’t lead the evolution of Unix-like OSs, it lags behind - usually by decades.

                          2. 1

                            How portable is this? The article mentions Linux and then even glibc, which isn’t used on all Linux and the tag is Unix.

                            While I get this is a general Unix/POSIX topic, I wonder whether there are differences across systems and how big they are.