1. 31
  1. 20

    The logic in this article appears to be: gethostbyname is a blocking call; therefore all serious applications should use a dedicated DNS library; therefore gethostbyname is unreliable. These points do not automatically follow, and no point is raised indicating why a blocking call must be intrinsically less reliable than a nonblocking one. The existence of the article implies glibc must have a more reliable implementation, suggesting that the function signature isn’t the problem.

    1. 17

      That isn’t exactly what I got out of it.

      It is assumed that serious applications don’t use gethostbyname() because gethostbyname() is unreliable. Exactly why the author assumes this isn’t given, but I assume it too, so I didn’t have a problem with this, and I’m happy to speak to why. My reasons are simple:

      1. gethostbyname() cannot return multiple address types, and,
      2. gethostbyname() cannot be trusted to return more than one address (this is what the author is referring to by talking about Alpine and ancient BSD)

      “Serious applications” do complicated stuff to make for a good user-experience, and there’s no way to do that with gethostbyname(). I think this may have been the author alluding to the address types issue, but (more to the point) I can believe if you don’t know about this (or how to do this) that you might think gethostbyname() is fine – it isn’t, and to that end yes the function signature is absolutely the problem. It’s not as bad as gets() but everyone on the Internet really should be using getaddrinfo() instead.
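
      To illustrate the address-types point, here is a minimal sketch using Python’s socket.getaddrinfo(), a thin wrapper over the C call. The helper name resolve_all is my own invention, and what comes back for a real hostname depends entirely on the local resolver configuration:

      ```python
      import socket

      def resolve_all(host, port, flags=0):
          """Return (family_name, address) pairs for every result getaddrinfo reports.

          Unlike gethostbyname(), one call can yield both AF_INET and AF_INET6
          entries, and the caller sees the whole list rather than a single address.
          """
          infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)
          results = []
          for family, _socktype, _proto, _canonname, sockaddr in infos:
              # sockaddr[0] is the textual address for both IPv4 and IPv6 results.
              results.append((socket.AddressFamily(family).name, sockaddr[0]))
          return results
      ```

      Calling resolve_all("example.org", 443) on a dual-stack machine would typically return a mix of AF_INET and AF_INET6 entries, which the application can then try in order.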

      Something else I think is important requires backing up for a minute. I think what kubernetes/coredns does is stupid too: It knows all the services and already knows how to update files dynamically, so it can in some systems substantially reduce delays and network traffic by just making hosts files; e.g.

      (echo '127.0.0.1 localhost.localdomain localhost';
      kubectl get svc -A -ogo-template --template="{{range .items}}{{.spec.clusterIP}} {{.metadata.name}}.{{.metadata.namespace}}.svc.cluster.local
      {{end}}") | kubectl create cm hosts --from-file=hosts=/dev/stdin

      that is to say, there’s no reason for this incompatibility to be a problem: Push is always better than polling. And think about all that code people could delete.

      But maybe this isn’t obvious (after all, someone thought they should be disabusing DNS).

      1. 2

        gethostbyname isn’t threadsafe. It’s defined to be not thread-safe in POSIX and documented as such in the Linux man pages.

        Eventually even macOS added the thread-safe GLIBCism gethostbyname_r.

        1. 5

          There’s no good reason for adding gethostbyname_r. getaddrinfo is defined to be thread safe, and if you’re modifying the source code then you should update to the API that doesn’t hard-code so many assumptions.

          1. 2

            gethostbyname_r predates getaddrinfo by 5 years. New code should be using getaddrinfo, but there’s a lot of code out there…

      2. 20

        If you use a custom library for DNS lookups, will it support multicast DNS, i.e. the .local domain? Will it obey your wishes to e.g. use different DNS servers in different network environments, or to only use DNSSEC? Will applications using different custom DNS libraries share a cache to avoid redundant queries?

        DNS lookup seems to me to be a core shared system service (as it is on Apple platforms), not something you outsource to a smorgasbord of different libraries. And the fact that some ancient Unix API has awful semantics shouldn’t prevent the OS from offering a better API in that core service, relegating gethostbyname/getaddrinfo to legacy status.

        1. 6

          This. It really sucks that nobody tried to extend NSS with a well-designed async API.

          I guess what prevents this kind of thing is the lack of coordination. POSIX is mostly treated as “something that comes from above and we kinda should follow it”, there doesn’t seem to be an active W3C style participatory process for unix APIs :(

          1. 6

            Yeah, I want name resolution, not DNS lookup, and that’s what gethostbyname offers. Using NSS on Linux and netinfo (or whatever replaced it) on Mac, users & administrators & distributors can establish how that should be done.

          2. 13

            “using APIs that were designed to simply parse /etc/hosts and had DNS support shoehorned into them will always deliver unreliable results.”

            I really don’t like that conclusion. It seems to imply we can never improve or extend behaviour beyond what was originally planned. And it comes from someone writing on the military experiment ARPANET, likely serving the content from a toy reimagining of Minix. That position is: It was not reliable in the past so we shouldn’t ever try to improve things, even though almost every single application in the wild assumes reliability.

            Let’s not ignore reality. Let’s improve things where we can.

            1. 6

              That’s a pretty broad takeaway from a fairly narrow statement though, isn’t it? It specifically says “APIs”, which, in this case, can’t really be improved on or extended. The examples you give are more akin to creating a new, improved API based on the original, which is closer to what the author is arguing.

            2. 5

              tl;dr avoid musl?

              1. 3

                how did you come to that conclusion after reading the article?

                1. 7

                  The article seems to take a myopic view of name resolution, only considering DNS resolution, even though the whole point of gethostbyname is to be a unified interface to host name resolution, regardless of the source of those names. In modern systems, names come from many places; relying on DNS resolution alone isn’t going to work well.

              2. 5

                So the problems with gethostbyname (and associated APIs), according to this post, are:

                • It doesn’t do DNS specifically, but rather name resolution… which is exactly what you want: if the sysadmin disables DNS in the NSS configuration, they presumably have a very good reason to do so and, as a good citizen of the platform, my application should respect their wishes; and:
                • The APIs are blocking… which is unfortunate, but can be worked around by the application by having a dedicated name resolution thread or process which exposes an async API. It can also be worked around to some extent with a local name resolution cache on the machine.

                And as a solution to these “problems”, it’s suggested that you instead break your application’s name resolution by hard-coding the use of DNS where the administrator might rely on some other system.

                Most of Ariadne’s Alpine/Musl stuff is good, but this is a miss IMHO.
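
                For what it’s worth, the dedicated-resolver-thread workaround mentioned above can be sketched in a few lines. This is only an illustration in Python, assuming a single worker thread is enough; the names _resolver and resolve are hypothetical, and the lookup still goes through the system resolver rather than a hard-coded DNS client:

                ```python
                import asyncio
                import socket
                from concurrent.futures import ThreadPoolExecutor

                # One worker thread plays the role of the dedicated name resolution thread.
                _resolver = ThreadPoolExecutor(max_workers=1, thread_name_prefix="resolver")

                async def resolve(host, port):
                    """Resolve host via the blocking system call without blocking the event loop."""
                    loop = asyncio.get_running_loop()
                    infos = await loop.run_in_executor(
                        _resolver,
                        # The blocking getaddrinfo() call runs on the resolver thread.
                        lambda: socket.getaddrinfo(host, port, type=socket.SOCK_STREAM),
                    )
                    # Keep only the textual address from each result's sockaddr tuple.
                    return [sockaddr[0] for *_rest, sockaddr in infos]
                ```

                An application would await resolve("example.org", 443) from its event loop; whatever the administrator configured for name resolution still applies, because the blocking call underneath is unchanged.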