1. 3
  1.  

    1. 7

      I’d like to know how many DNS memes come from a lack of knowledge about DNS rather than from actual flaws in DNS.

      I’m aware DNS is one of those old (1983) Internet protocols that have evolved organically for ages, and that it can be unintuitive. Maybe an alternative implementation could be better. Maybe there’s a point to the argument that “well, there’s stuff that isn’t DNS and isn’t as problematic”.

      But DNS memes always remind me of the circle of blame: the devs blame the network, the network admins blame the OS, the system admins blame the software. DNS is often the one that nobody defends :(

      1. 3

        There are some legitimately difficult things about the DNS.

        There’s a lot of hidden state, primarily in caches. You can control it to some extent by adjusting your TTLs, but it’s never going to react to changes very fast. (There are sometimes problems with long-running programs ignoring TTLs; less of a problem now than it was, but it does still occur and you need to be aware of the possibility especially for non-browser clients.) In most cases you don’t have access to the caches so you can’t inspect the state; instead you have to infer it from the behaviour that results (eg, connections to IP addresses that should be out of service). This is a consequence of being a globally scalable loosely-coherent system, and I don’t think the DNS could have avoided it.
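
        A typical mitigation in long-running clients is to resolve on every connection attempt instead of holding on to an IP address for the lifetime of the process, so the only caching in play is the resolver’s, bounded by the TTL. A minimal Python sketch (the host name is hypothetical):

        ```python
        import socket

        def connect(host: str, port: int) -> socket.socket:
            # create_connection() calls getaddrinfo() on every attempt, so this
            # code never pins an address beyond what the resolver's cache
            # (and therefore the record's TTL) allows.
            return socket.create_connection((host, port), timeout=5)

        conn = connect("app.example.com", 443)   # hypothetical service name
        conn.close()
        ```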

        The delegation hierarchy is complicated. One of DNS’s warts is that NS records have two distinct functions: in the parent zone, they say where to find the child zone, and in the child zone they suggest what the NS records in the parent should be. They are configured in different places so the fact that they have the same name is really confusing. The indirection between NS names and addresses makes it even harder. Weird things can happen if they are out of sync, made worse because the DNS can mostly work despite this kind of misconfiguration, so it’s hard to diagnose. It takes a while to fix because NS TTLs tend to be long.
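
        One way to see the two roles side by side is to ask a parent-zone nameserver for its delegation (non-recursively) and compare it with the child zone’s own NS RRset. A rough sketch, assuming the third-party dnspython library and a hypothetical example.com zone:

        ```python
        import dns.flags, dns.message, dns.query, dns.rdatatype, dns.resolver

        child, parent = "example.com.", "com."   # hypothetical zone and its parent

        # Ask one of the parent zone's nameservers, without recursion, what it
        # delegates for the child; the delegation NS records come back in the
        # AUTHORITY section of the referral.
        parent_ns = str(dns.resolver.resolve(parent, "NS")[0].target)
        parent_ip = dns.resolver.resolve(parent_ns, "A")[0].address
        query = dns.message.make_query(child, dns.rdatatype.NS)
        query.flags &= ~dns.flags.RD
        referral = dns.query.udp(query, parent_ip, timeout=5)
        delegation = {str(rr.target) for rrset in referral.authority
                      if rrset.rdtype == dns.rdatatype.NS for rr in rrset}

        # The child zone's own opinion of its NS set, via a normal lookup.
        apex = {str(rr.target) for rr in dns.resolver.resolve(child, "NS")}

        print("in delegation but not in child zone:", delegation - apex)
        print("in child zone but not in delegation:", apex - delegation)
        ```

        If either of those sets is non-empty, you have the kind of out-of-sync delegation described above.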

        There’s a bunch of mysterious folklore covered by the vague term “propagation”. It covers things like parent zone update delay, parent zone transfer time, parent TTL, child zone transfer time, child TTL. Many of these are not under your control nor your DNS provider’s control. DNS providers often say things like “allow 48 hours for propagation” even when it should be effectively instant, which makes it hard for less knowledgeable users to know whether it doesn’t work because it hasn’t “propagated” or because it’s misconfigured.
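
        A quick way to separate “hasn’t propagated yet” from “misconfigured” is to compare the possibly-cached answer from your recursive resolver with what one of the zone’s authoritative servers is actually publishing. Another dnspython sketch, with hypothetical names:

        ```python
        import dns.resolver

        zone, name, rtype = "example.com", "www.example.com", "A"   # hypothetical

        # What the world may still see: the (possibly cached) answer from the
        # system's configured recursive resolver.
        cached = dns.resolver.resolve(name, rtype)

        # What has actually been published: the answer straight from one of the
        # zone's own nameservers, bypassing every cache in between.
        ns_name = str(dns.resolver.resolve(zone, "NS")[0].target)
        ns_ip = dns.resolver.resolve(ns_name, "A")[0].address
        direct = dns.resolver.Resolver(configure=False)
        direct.nameservers = [ns_ip]
        published = direct.resolve(name, rtype)

        print("cached answer:   ", sorted(r.address for r in cached))
        print("published answer:", sorted(r.address for r in published))
        ```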

        DNS experts know about diagnostic tools such as Zonemaster and DNSviz but DNS providers rarely help their customers find good debugging tools.

        Then there are the things people do with the DNS that make it more difficult.

        Split-horizon DNS is very common: it goes hand-in-hand with NAT. It adds a lot of difficulty because you need to be aware of where a query was made from and how the expected answers vary depending on that.

        It’s possible to make split-horizon DNS easier to debug if you use a distinct subdomain of a real public domain name for each private network. (And give each one its own IPv6 unique local address range: a random /48 under fd00::/8.) But nobody does that, and many systems fight you if you try. The result is that you end up with multiple private networks using the same IPv4 address space and the same domain name. Confusing!
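
        The address side of that is cheap to do properly: RFC 4193 unique local prefixes are fd00::/8 plus a random 40-bit Global ID, so each private network can have its own /48. A small sketch of generating one:

        ```python
        import secrets

        def random_ula_prefix() -> str:
            """Generate an RFC 4193 unique local /48: the fd00::/8 prefix
            followed by a randomly chosen 40-bit Global ID."""
            prefix = (0xFD << 120) | (secrets.randbits(40) << 80)
            groups = [(prefix >> (112 - 16 * i)) & 0xFFFF for i in range(3)]
            return ":".join(f"{g:x}" for g in groups) + "::/48"

        print(random_ula_prefix())   # e.g. fd5b:3e1a:9c47::/48
        ```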

        This article is (I think) an advert for a new service discovery protocol to replace DNS in Kubernetes clusters. (It’s a bit vague.) Kubernetes and the cloud add a bunch of failure modes to the DNS on top of the split-horizon problems. In many cases the failures are due to bugs or implementing the DNS protocol incorrectly. (Or sometimes the weirdly low AWS DNS rate limits.) This article mentions getting a SERVFAIL instead of NXDOMAIN, which is a fairly gross implementation error. They also vaguely describe some problems that are signs of musl libc’s broken stub resolver.

        In principle I think it might make sense for something like Kubernetes service discovery to hook a new protocol into the libc name service switch, but many programs don’t use a libc with NSS so they are stuck with DNS as the only universally supported option.
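
        For what it’s worth, the hook itself is just a line in /etc/nsswitch.conf; a hypothetical NSS module called “clusterdisco” (shipped as libnss_clusterdisco.so.2) would sit ahead of the DNS something like this:

        ```
        # /etc/nsswitch.conf - "clusterdisco" is a hypothetical NSS module
        hosts: files clusterdisco [NOTFOUND=return] dns
        ```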

        The reason that every problem ends up being a DNS problem is that every interaction over the network starts with a DNS lookup, so usually the first sign of a network problem is a DNS failure. You never get far enough to get an error message clearly indicating it’s a network problem! Partly this is the fault of trad resolver APIs not reporting network errors clearly; partly it’s because a DNS cache can’t report network errors clearly. (Maybe Extended DNS Errors could help.)
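
        The classic getaddrinfo() interface illustrates the problem: nearly everything collapses into a handful of EAI_* codes, and “temporary failure” is often the only hint that the real culprit is the network or an upstream resolver rather than the name. A small Python sketch of what little you can recover (the name is hypothetical):

        ```python
        import socket

        def classify_lookup_failure(host: str) -> str:
            try:
                socket.getaddrinfo(host, None)
                return "resolved fine"
            except socket.gaierror as e:
                if e.errno == socket.EAI_NONAME:
                    return "name does not exist (or so the resolver claims)"
                if e.errno == socket.EAI_AGAIN:
                    return "temporary failure: maybe the network, maybe an upstream resolver"
                return f"some other resolver error: {e}"

        print(classify_lookup_failure("db.internal.example.com"))   # hypothetical name
        ```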

        And cloudy systems are designed to turn every problem into a DNS problem, because the cluster configuration is exposed to the cluster’s components over DNS, so a cluster control plane failure becomes a DNS failure. Replacing DNS with a different service discovery protocol will turn every problem into a problem with your new service discovery protocol.

        1. 2

          Oh, I’m not saying that DNS doesn’t have warts; I’m just not very familiar with them (so thanks for elaborating, that is interesting). I’m just saying that maybe some DNS meme behavior is… misguided.

          BTW, it has always surprised me that when I was younger, DNS propagation was a real thing (in some scenarios we planned around two weeks!), but nowadays, when it’s not instant enough to go unnoticed, I don’t think I’ve ever seen anything take longer than… 30 minutes?

          I think one of the issues with split horizon is that you commonly hear you shouldn’t put records for private networks in public DNS. To this day I still fear doing so, even though I really don’t see it as a worse problem than the additional complexity of not doing so.

          In any case, I agree DNS is “improvable”. What I’m saying is that I think most of its terrible reputation is not deserved.

          1. 3

            WRT “propagation” there are a few TLDs that have infrequent zone updates, but happily they are getting rarer. The reverse DNS is often slow, because there’s an incredibly shonky mechanism for RIRs to copy zone fragments between each other. (eg, 131.111.0.0/16 is registered via RIPE but the reverse DNS parent zone 131.in-addr.arpa is managed by ARIN.)

            For split-horizon DNS you can have a domain like private.cam.ac.uk that is both a real subdomain and not published on the public Internet. (You don’t want to publish names with RFC 1918 IP addresses because that can expose your users to interception attacks.) That worked OK with IPv4 and NAT, though it was more awkward with IPv6 in the mix; I think that kind of real-but-private subdomain is a good model for cluster-internal DNS.

            1. 2

              Ah, OK. I actually put all my non-Internet-facing hosts under internal.example.com and that’s it. But really, are interception attacks such a huge deal?

              (Also, you’re reminding me that I should get off my lazy ass and at least enable IPv6 in this flat, where the ISP provides it…)

              1. 3

                If it’s a corporate network with an oldskool security model, hard and crunchy on the outside / soft and chewy on the inside, then internal services are likely to be poorly configured in a way that makes interception attacks easier - leaking authentication cookies, things like that.

                1. 1

                  Ahhh, OK, that matches my intuition.

                  For personal stuff: although I’ve explored using TLS internally, and most of it I would expose to the Internet directly (but I don’t, because I don’t want to spend money on IPv4 addresses :D), I still don’t secure things as well as I should because, hey, it’s not exposed ¬ ¬U.

                  So the rule is something like “don’t put the IP addresses of stuff you wouldn’t expose to the Internet into public DNS”, I guess.

            2. 2

              I hate to say it, but then you’ve just not seen enough broken things regarding DNS.

              My examples are all close to 10 years old by now (kinda glad the last few years haven’t had those problems…), but I don’t see how things have fundamentally changed, because the problems are inherent to the protocol.

              a) The ISP hosting our company’s main domain (zone) did something weird when we tried to transfer away: despite us lowering the TTL to something reasonable before the move, they kept the old records cached for about a day, while also locking us out of the controls (a different problem, but still). That was basically 24 hours of downtime for anyone who had resolved us before the move.

              b) We switched some subdomain and noticed that traffic from home networks still hadn’t died off a month after our TTL was up, so some home routers had apparently decided to just cache the old records indefinitely.

              1. 2

                But how would a DNS replacement avoid caching issues?

                Optimism about clean slates sometimes turns out to be effective, but… I’m just not seeing how you can do better. (I’d be very happy to be wrong, of course!) In the end, there are always going to be crappy implementations in the Wild West of the Internet.

                1. 1

                  > But how would a DNS replacement avoid caching issues?

                  With a new protocol you can do things like force updates/invalidation and handle failure modes more gracefully. There’s plenty of prior art out there.

                  In a datacenter environment it’s also feasible to centralize management. Maybe not all the way, but at least more than on the public internet.

                  1. 1

                    Oh, that is correct, but then it’s only a partial DNS replacement.

                    I am not very much into the “service mesh” ecosystem, but I guess people are already working on similar solutions.

                2. 2

                  That’s another non-DNS DNS problem I have seen a lot: failing to follow the make-before-break rule in a migration. Sometimes it’s hard to get something working in its new home before turning off the old version, but it’s usually worth the effort because it removes a lot of timing constraints and anxiety about whether the cutover will work.

                  There are DNS-related things you can trip over during a migration: you need to know how to change DNS records without deleting them and you need to be aware of negative TTLs as well as positive TTLs in case you delete the records by mistake.
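
                  The negative TTL is easy to overlook because it lives on the zone’s SOA record rather than on the records you deleted: per RFC 2308 it’s the smaller of the SOA’s own TTL and its MINIMUM field. A quick check, assuming the dnspython library and a hypothetical zone:

                  ```python
                  import dns.resolver

                  zone = "example.com"   # hypothetical zone
                  answer = dns.resolver.resolve(zone, "SOA")
                  negative_ttl = min(answer.rrset.ttl, answer[0].minimum)
                  print(f"a name deleted from {zone} by mistake can be cached as"
                        f" nonexistent for up to {negative_ttl} seconds")
                  ```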

              2. 1

                > This article is (I think) an advert for a new service discovery protocol to replace DNS in Kubernetes clusters.

                Author/founder here. Yep, we’re trying to do better for building software on internal networks. That could be Kubernetes, or ECS, or just a set of bare-metal boxes someone is managing. Most of the tooling you’re bringing up is focused on the public internet, which has DNS problems wholly unrelated to what often bites you in the ass when managing a datacenter.

                > In many cases the failures are due to bugs or implementing the DNS protocol incorrectly. (Or sometimes the weirdly low AWS DNS rate limits.) This article mentions getting a SERVFAIL instead of NXDOMAIN, which is a fairly gross implementation error. They also vaguely describe some problems that are signs of musl libc’s broken stub resolver.

                glibc, musl, every network hardware vendor, the JVM, and the list goes on.

                > In principle I think it might make sense for something like Kubernetes service discovery to hook a new protocol into the libc name service switch, but many programs don’t use a libc with NSS so they are stuck with DNS as the only universally supported option.

                NSS doesn’t expose a better API, but it does let you abstract away name resolution. That’s… okay. You still end up dealing with all of the world’s terrible DNS clients and client-side debugging tools even if you are solving the data distribution problem in a more sane way.

                > And cloudy systems are designed to turn every problem into a DNS problem, because the cluster configuration is exposed to the cluster’s components over DNS, so a cluster control plane failure becomes a DNS failure. Replacing DNS with a different service discovery protocol will turn every problem into a problem with your new service discovery protocol.

                We’re pretty sure this is a property of how systems are built and how they fail, not a tautological truth about relying on the network.

              3. 1

                I agree with you. It reminds me of the oft-repeated “I’ll use regular expressions… now you have two problems” meme. But I knew regular expressions, and I never liked the meme. It’s just a tool: know how to use your tools and you won’t shoot yourself in the foot.

              4. 1

                > And since DNS is so limited - it’s a function from a name to an IP address - we can mostly avoid needing to rewrite application code.

                How does this work transparently without client opt-in? I looked at ezbake; how is this different from Just Another Sidecar?