1. 15

After seeing headscale (an open-source implementation of the Tailscale controller) and checking out the feature parity list in the README ( https://github.com/juanfont/headscale ), I’ve been wondering how a technical implementation would even work for the “Magic DNS” feature.

How is it even possible? The only ways I can think of are intercepting all DNS requests or managing a section of /etc/hosts, and neither of these seems very elegant. What am I missing?

  2. 20

    Tailscale employee here. Basically, it works in one of two ways, depending on what DNS facilities the OS in question has.

    If you use Windows, {mac,i{,Pad}}OS or Linux with systemd and systemd-resolved, then tailscaled installs its DNS server (100.100.100.100) for your Magic DNS route ($domain.beta.tailscale.net). Your OS will then do the DNS routing based on the query name (for example, ontos.cetacean.org.github.beta.tailscale.net goes to 100.100.100.100 but google.com goes to the system’s default route). The real “magic” part of that is using DNS search domains, which basically program in an “if there is no . in the query, append this and see what happens” rule. That makes a bare ontos point to ontos.cetacean.org.github.beta.tailscale.net, which is then resolved normally in the stack.
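To make the search-domain rule above concrete, here is a minimal Python sketch of how a dotless name might be expanded. It implements only the simplified rule quoted in the comment; real stub resolvers have more nuance (for example the `ndots` option in resolv.conf), and the function name `candidate_names` is made up for illustration.

```python
def candidate_names(query: str, search_domains: list[str]) -> list[str]:
    """Return the fully qualified names to try, in order (simplified rule)."""
    if query.endswith("."):
        # A trailing dot marks the name as already fully qualified.
        return [query]
    if "." in query:
        # Names that already contain a dot are tried as written.
        return [query + "."]
    # Dotless names get each search domain appended in turn.
    return [f"{query}.{domain.rstrip('.')}." for domain in search_domains]


search = ["cetacean.org.github.beta.tailscale.net"]
print(candidate_names("ontos", search))
print(candidate_names("google.com", search))
```

With the search domain from the comment, `ontos` expands to the long tailnet name while `google.com` is left alone.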

    If you do not use an OS with advanced DNS routing capabilities, tailscaled falls back to one of a few options: resolvconf (over which it may have to fight other programs, because it’s a fancy wrapper around blowing away /etc/resolv.conf), or blowing away /etc/resolv.conf itself and then handling the DNS routing for you. Essentially, tailscaled has a DNS router built into it so that things Just Work :tm: even though the OS you are using doesn’t have the facilities needed to do the advanced DNS routing by itself. It is about as elegant as a square tire, but it works well enough that I’ve had trouble finding an OS where tailscaled’s DNS reprogramming fails.
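As a rough illustration of the "blow away /etc/resolv.conf" fallback, the sketch below renders a file that sends every query to a single local resolver and sets a search domain. This is not what tailscaled actually writes; `render_resolv_conf` is a hypothetical helper, and only the standard `nameserver` and `search` directives are assumed.

```python
def render_resolv_conf(nameserver: str, search_domains: list[str]) -> str:
    """Build resolv.conf text pointing all queries at one resolver."""
    lines = [f"nameserver {nameserver}"]
    if search_domains:
        # All search domains go on a single space-separated line.
        lines.append("search " + " ".join(search_domains))
    return "\n".join(lines) + "\n"


print(render_resolv_conf(
    "100.100.100.100",
    ["cetacean.org.github.beta.tailscale.net"],
), end="")
```

Because resolv.conf cannot route by domain, the resolver it names has to do the per-domain forwarding itself, which is the built-in DNS router described above.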

    Either way this will also let you set whatever DNS routes you want on all of your machines at once using the admin panel. This lets you point a random subdomain (or even an imaginary top level domain if that is what you want to do; I’m a shitposter, not a cop) to an internal facing DNS server even if it is behind Tailscale so that you can create whatever kind of complicated DNS routing policy you want. You can even completely override the normal DNS request flow if you want (for example if acronym compliance says you need to scan all DNS queries made by all employees for whatever dubious reason), but I’ve found that most people really prefer to just have Magic DNS augment the existing DNS setup as opposed to becoming the ruler of DNS for their network.

    If you don’t want to have Tailscale manage DNS on your machine, either disable the Magic DNS feature in the admin panel or run tailscale up --accept-dns=false. That will disable the DNS reprogramming engine.

    1. 4

      Something else to keep in mind: Android (which I have been told counts as a separate OS from Linux even though super pedantically Linux is not an OS) apparently does have advanced DNS routing abilities. However, due to facts and circumstances about how the Tailscale app for Android was designed, there’s not really a good way for us to use them (it’s an app mostly written in Go, including the UI, and the JNI bindings for Go are “special”, especially given the APIs that are used to manipulate the Android network stack to begin with). As a result, the Tailscale app on Android falls back to using Tailscale’s built-in DNS server as the DNS resolver (this is also why the “fallback DNS” server setting exists, as otherwise we have no way to detect what the default DNS route would have been).

      Either way DNS is a mess and it makes us cry sometimes; however as a result of all of this pain I think that we end up with a better product for everyone. I really wish there was someone that we could pay to let us know exactly how all of the weird enterprise options in Windows intersect in ways that mess with how it does DNS resolution.

    2. 6

      They’ve blogged about this a bit in the past and the answer isn’t all that straightforward. On the configuration side, they sync the configuration to the host from their control plane (source) so that their control plane servers don’t need to be involved in each request. The host side is much more complicated and is OS dependent. On Linux it depends heavily on how the individual distro configures DNS resolution. In the case of Windows, macOS, and Linux with systemd-resolved, the resolver supports a routing table for DNS queries and can send them to different servers based on a few different criteria.

      1. 3

        While it’s really good to see Tailscale investing in maintaining technical accuracy and readability in their blog, they’ve unfortunately fallen into the same trap as their predecessors.

        Of course, we think we’re more right than others, but the others think the same about themselves, and Debian resolvconf refuses to pick a winner.

        and later

        However, as Tailscale we actually want this behavior, so we use it to set DNS configuration when we can:

        No, you don’t want this behavior. There is no reason for Tailscale to be the authority of DNS on machines where Tailscale is deployed and it should not be handling the forwarding of non-Tailscale queries. A user of dnsmasq or systemd-resolved or similarly capable local DNS resolver should be able to specify which subdomains they want to resolve using Tailscale’s DNS. Should the UI for this tooling be improved? Absolutely. Should the Tailscale stack be where it happens? Certainly not. Multiple VPNs or other overlay networks could exist on the same machine and Tailscale shouldn’t be the one owning edge DNS routing.

        At this point, one wonders why none of the giants have tried to fix the real issue here:

        /etc/resolv.conf does not have support for routing DNS based on the domain name

        1. 3

          Even if you solved it you’d have to wait years for your code to end up in Debian or Red Hat and there’d be a good chance your fix was never accepted widely enough to rely on.

          1. 3

            Is resolv a good place for this though? If we can achieve subdomain DNS server routing with a single line in dnsmasq, is it worth trying to update an old, underspecified config file?
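For reference, the "single line in dnsmasq" is its `server=/domain/address` directive, which forwards queries for one domain to a specific upstream. The small Python helper below just formats such a line; `dnsmasq_server_line` is a made-up name, and the domain and address are reused from the Tailscale comment earlier in the thread purely as an example.

```python
def dnsmasq_server_line(domain: str, resolver_ip: str) -> str:
    """Format a dnsmasq directive that forwards one domain to one resolver."""
    # dnsmasq matches the domain and everything beneath it.
    return f"server=/{domain.strip('.')}/{resolver_ip}"


print(dnsmasq_server_line(
    "cetacean.org.github.beta.tailscale.net",
    "100.100.100.100",
))
```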

            1. 1

              Not sure. I think reasonable people could disagree on that, especially when it comes to embedded/containers. While I’m personally in the dnsmasq-everywhere camp, we could do better than both options.

              1. 6

                If someone makes a PR to glibc and other such libraries/OSes with such an improvement, along with a distribution-agnostic specification for it, you may be able to avoid the XKCD Standards problem. But I fear that at this point the pushback is going to be along the lines of “just use dnsmasq/systemd-resolved/libfoobang” or whatever. If you want to champion such a thing then I’d be more than happy to use it, but I wouldn’t want to do it myself. The current state of the world is kinda painful, yes, but at least it works well enough to bootstrap more elaborate mechanisms.