1. 55
  1. 22

    I do… kinda get the “don’t hog the ports” arguments, in that I could see some weird adversarial DoS issue where port 80 or 22 is contested on a multi-user system (somehow try to get the main SSH process to crash, while simultaneously trying to bind 22 yourself).

    I am very sympathetic to the security arguments, but there is also an argument that all port binding should be privileged! It’s a limited namespace, relied upon by other machines connecting to this machine! I don’t think “privileged” needs to mean “is root” (and it really shouldn’t be), but it’s definitely something that should be gated behind some explicit privilege.

    1. 1

      It’s a limited namespace, relied upon by other machines connecting to this machine

      Start by claiming 3306 and you take all the DB logins with you. (I haven’t seen a DB TLS cert in a long time.)

    2. 22

      Why not just let systemd do it?

      1. 20

        Please don’t encourage an init monoculture!

        1. 45

          shrug let $INIT do it

        2. 16

          Decades prior to systemd, this was normally done by inetd. Only inetd would listen on privileged ports, it would then accept the connection and start a new (typically non-root) process that received the open socket as its standard in/out file descriptors. This was largely abandoned when servers started handling multiple connections in a single process.

          I’m quite surprised that there are still vulnerabilities from this though - dropping privileges as soon as you’ve bound to a port was a pretty common thing when I learned socket programming 20+ years ago and things like Capsicum require this style of programming.

          As others have pointed out, you can also use various MAC frameworks to allow specific ports to be bound by specific programs, specific users, or specific {user, program} pairs. Personally, I’d rather that systems automated that in their packaging systems (e.g. the nginx binary running as the httpd user may bind to ports http and https, provided by a configuration file in the package-install script). Ideally, all ports would be default-not-allowed unless you configured them in this way.
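
          For illustration, a minimal sketch of that “bind while privileged, then drop” pattern in Python; it is only a sketch under assumptions (the “www-data” account name is illustrative, and a real server would add error handling and further sandboxing):

          ```python
          import os
          import pwd
          import socket

          # Bind the privileged port while we still have the privilege to do so.
          sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
          sock.bind(("0.0.0.0", 80))
          sock.listen(128)

          # Drop privileges before touching any untrusted input.
          # "www-data" is an illustrative unprivileged account, not a fixed convention.
          unpriv = pwd.getpwnam("www-data")
          os.setgroups([])
          os.setgid(unpriv.pw_gid)
          os.setuid(unpriv.pw_uid)

          # From here on, connections are served without elevated privileges.
          conn, addr = sock.accept()
          ```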

          1. 11

            process that received the open socket as its standard in/out file descriptors

            Yeah, that is the problem with inetd that systemd solved like a champ. systemd allows you to listen on as many sockets as you want, and these are then passed to the target process via FD passing. By default it does not use standard in/out, so you are left with those as-is (which is what you want, as in most cases you use them for logging).

            dropping privileges as soon as you’ve bound to a port was a pretty common thing when I learned socket programming 20+ years ago and things like Capsicum require this style of programming

            The problem is that such an approach is very imperative in style, and it is hard to do with VMs like, for example, the Erlang VM. If you want to drop the privileges, then which Erlang process should do it, and when? You have no way of knowing when to drop them, since you cannot guarantee that all the sockets you will need have already been opened.
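
            As a rough illustration of the consuming side (the systemd side would be a .socket unit), here is a sketch that follows the LISTEN_PID/LISTEN_FDS convention documented in sd_listen_fds(3); the helper name is my own:

            ```python
            import os
            import socket

            SD_LISTEN_FDS_START = 3  # first descriptor systemd hands over, per sd_listen_fds(3)

            def sockets_from_systemd():
                """Return the sockets passed by systemd socket activation, if any."""
                if os.environ.get("LISTEN_PID") != str(os.getpid()):
                    return []  # not socket-activated (e.g. started by hand)
                nfds = int(os.environ.get("LISTEN_FDS", "0"))
                return [socket.socket(fileno=SD_LISTEN_FDS_START + i) for i in range(nfds)]

            listeners = sockets_from_systemd()
            # FDs 0-2 (stdin/stdout/stderr) are untouched and remain available for logging.
            ```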

            1. 1

              This was largely abandoned when servers started handling multiple connections in a single process.

              I wonder if it’s possible to write a daemon along the lines of inetd that supports concurrency, multiple connections, and more “modern” stuff.
              Sounds like a cool project idea.

              1. 3

                It might be interesting if inetd could provide a UNIX domain socket over which it would send newly accepted sockets as FDs. This would work nicely with Capsicumised daemons. They could run as inetd subprocesses waiting for new FDs on the socket that it passes. There are two (probably solvable) problems:

                • Shutdown is tricky. At some point, the daemon should shut down and wait for inetd to restart it if there’s a new connection. If the daemon makes this decision, then it may race a new connection going into its socket and so you’d need an ack to make sure that inetd restarted the daemon for the socket that it’s just accepted. If inetd makes the decision then it needs to know when the daemon has finished handling all requests.
                • For very short-lived requests, the accept rate can be a bottleneck. If you require inetd to accept, then send the FD as an out-of-band message, then the receiver to receive it, then you have a lot more overhead. This might not matter, because anything that’s expecting a huge connection rate probably doesn’t want to sit behind inetd. You could also potentially have a mode where inetd passes the child process the bound socket and allows it to accept connections for a bit if the accept rate is high.
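
                The hand-off mechanism itself is just SCM_RIGHTS over a UNIX domain socket; a minimal sketch in Python (3.9+, Unix only), with the function names being mine and the shutdown/ack protocol from the list above left out:

                ```python
                import socket

                def hand_off(conn: socket.socket, worker: socket.socket) -> None:
                    """'inetd' side: pass an accepted TCP connection to a running worker."""
                    socket.send_fds(worker, [b"conn"], [conn.fileno()])
                    conn.close()  # the worker now holds its own copy of the descriptor

                def receive_one(worker: socket.socket) -> socket.socket:
                    """Worker side: wait for a descriptor instead of accepting directly."""
                    _msg, fds, _flags, _addr = socket.recv_fds(worker, 1024, 1)
                    return socket.socket(fileno=fds[0])
                ```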
            2. 5

              FTA: “For one thing, it doesn’t work in rootless containers [… which don’t] have systemd running sysctl.d so the configuration setting doesn’t get applied at boot.”

              1. 6

                I’m specifically referring to socket activation, where systemd binds to the port and passes the fd to the service. That way, the service never needs to be root or have CAP_NET_BIND_SERVICE. I consider the sysctl.d workaround a hack.

                1. 4

                  There’s nothing special about systemd in this regard. Any root permissioned process can run sysctl -p.

                2. 4

                  You could use systemd socket activation (systemd’s version of inetd) to accomplish this with slight modification to the underlying server you’re looking to run. And if you’re not running systemd, you probably have inetd available and can just use that.

                  1. 2

                    As the article notes, there are plenty of systems without systemd or any other scriptable init. It mentions rootless containers. WSL also lacks a good way to run init scripts.

                  2. 11

                    Dear Browsers, HTTP must support SRV records.

                    That would be a much more robust solution! Then you don’t need the hardcoded ports 80/443 at all!

                    Nothing should be using a hardcoded port number, they should be using SRV records (or other negotiation techniques). If nothing uses a hardcoded port number, then you don’t care about ports < 1024.
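
                    A sketch of what client-side SRV resolution could look like, using the third-party dnspython package; browsers don’t actually do this today, and the `_http._tcp` label here just mirrors how protocols like XMPP and SIP use SRV:

                    ```python
                    import dns.resolver  # third-party: pip install dnspython

                    def resolve_http_endpoints(domain: str):
                        """Return (host, port) candidates for a domain, falling back to port 80."""
                        try:
                            answers = dns.resolver.resolve(f"_http._tcp.{domain}", "SRV")
                        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
                            return [(domain, 80)]  # no SRV record: fall back to the hardcoded default
                        # Lower priority wins; weight spreads load within a priority level.
                        ordered = sorted(answers, key=lambda r: (r.priority, -r.weight))
                        return [(str(r.target).rstrip("."), r.port) for r in ordered]
                    ```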

                    1. 6

                      SRV records would be great in general.

                      It would be great to have priorities and “real” failover instead of load balancers, which are yet another thing that can go down. With two application servers behind one load balancer you might actually reduce availability: one thing must be up and one of two things must be up, compared to only one thing having to be up. Of course, in reality there would likely be multiple load balancers.

                      Still, it’s a bit crazy how for HTTP we switched from text to binary headers and from TCP to a newly invented protocol on top of UDP, yet still don’t have SRV records. It would be great if HTTP could finally use basic DNS records that have been solving these sorts of problems for quite a few decades.

                      1. 3

                        It’s not that surprising. All of the transitions that you mention are there to reduce latency by reducing the total number of network round trips required to get from the click to the first chunk of the page being loaded. Supporting SRV records requires one additional DNS lookup per remote server. Worse, SRV records are usually not present, and negative cache times for DNS are often fairly low, so your DNS cache may end up having to go and query the authoritative server a lot to double-check that an SRV record hasn’t been added recently. With DNSSEC, negative responses (NXDOMAIN) are signed, so there’s some additional overhead; and because DNSSEC responses are often too large for UDP, the query to the authoritative server may need to use TCP, so you have the full TCP SYN-ACK dance before you even send the query from the cache.

                        1. 1

                          Yes and no. Sure that adds overhead, just as the load balancer adds overhead.

                          Regarding caching. We already have that issue. See what Google does with A records and rotating them.

                          I would also argue that with all the layers of DNS caching a typical system already has (the provider’s resolver, the OS/local resolver, and the library/software (browser) cache), we don’t lose much. We have an initial DNS query (plus the multiple additional parallel ones we already make for external resources), and then we can keep the results for anywhere between minutes and days. There’s a TTL after all.

                          Low latency only really matters in specific cases. You usually don’t have an issue with regular web browsing. If you are a big search engine the protocol MIGHT help (though it is likely basically irrelevant, because it pales compared to the load times of a large share of websites). But if you are a big search engine, you’ll have your own strategy for DNS anyway. Where it’s often more relevant in today’s world is where HTTP is used as some kind of API. And there you’ll get the SRV records once and likely make a huge number of requests over your HTTP/2 or HTTP/3 connection/channel, which you’ll also likely keep open if you care about latency. And in a microservice environment, once that connection breaks there is likely some form of failover anyway, so you do DNS queries again.

                          That said, a lot of ways of handling the discovery of micro services DO use SRV. Consul for example.

                          So while there’s of course added latency, I’d argue both that it’s still worthwhile and that DNS isn’t what will diminish the gains in most situations. And where it does and it matters, I’d think it would be easy to mitigate. At least right now I can’t think of a real-life situation where it would really be a problem: it would largely be end users visiting a website, with the initial request being a most likely unnoticeable amount slower. And that’s only if the provider’s resolver didn’t have it cached; otherwise it would be even less noticeable. And all of that, like I said, is overshadowed by everything else involved in the page load.

                          1. 1

                            Yes and no. Sure that adds overhead, just as the load balancer adds overhead.

                            Not really. A load balancer typically adds some extra in-datacenter hops (significantly under 1ms), whereas additional DNS lookups require cross-Internet round trips (can be tens or hundreds of ms).

                            Regarding caching. We already have that issue. See what Google does with A records and rotating them.

                            This is very different for a couple of reasons. First, the A record lookup is unavoidable. If you could skip it, I’m sure Google would (there are some nice service-discovery protocols that let you do service discovery in the first packet, which connection migration in QUIC might let you scale up to the Internet at some point).

                            Second, the negative caching isn’t a problem with A records. If the A record isn’t there and it takes you a while to discover that (because you have to walk the DNS hierarchy to find the missing SOA or A record), this doesn’t matter because you’re going to show an error page anyway. In contrast, if an SRV record isn’t present then you fall back to using the default ports. That means that you can’t initiate the HTTP connection until you’ve got both the A and SRV responses.

                            Low latency only really matters in specific cases

                            Google has data showing that the bounce rate is significantly higher if the delay between clicking on a link and seeing the page is >200ms. Adding SRV record lookup here could easily add 20-50ms latency.

                            That said, a lot of ways of handling the discovery of micro services DO use SRV. Consul for example.

                            I’ve not seen Consul before, but it appears to be different from the browser case in two ways:

                            • Most of its traffic is within a datacenter, so doing a DNS lookup on a local resolver to get the SRV record will be cheap (especially given that you’ll be using hundreds to thousands of microservices and so they’ll all be cached pretty quickly).
                            • It can expect an SRV record and so doesn’t hit the slow path as a result of the lack of negative caching.
                            1. 1

                              EDIT and tl;dr: You are right. I just want a way to trade a little bit of latency for more uptime with fewer parts involved. SRV is great and we should use it for more things, even if it’s not optimal here.

                              Not really. A load balancer typically adds some extra in-datacenter hops (significantly under 1ms), whereas additional DNS lookups require cross-Internet round trips (can be tens or hundreds of ms).

                              Again, only for uncached results. I was speaking about in-datacenter situations here. In bigger microservice architectures your milliseconds do matter. Losing a whole second every couple of thousand requests can, depending on the setup, be significant; however, if DNS is the culprit, that usually means you are doing something wrong. But as mentioned, it would usually be cached - most likely on-instance or in-process. (Fun side story: I once tracked down an issue for a client where a bug in an application exhausted a local unbound instance’s resources and crashed it, making them think they had somehow been blocked off from the internet.)

                              For the rest of your argument, it seems like you are talking about the situation where the client is a browser, given what you said about Consul. I wrongly assumed you might be talking about it becoming a bottleneck at a high hit rate, which is why I brought up caching.

                              So, for browsers: it’s typically not the DNS lookups that cause load times relevant to bounce rates. It’s mostly the application (server, JavaScript, images, videos being loaded). Also, given that even big IT companies often enough waste time, for example on unnecessary waits, blocking, unnecessary cache validation, or avoidable CORS negotiation, DNS will not be the big factor.

                              Don’t get me wrong: given that I initially considered even the sub-millisecond requests an issue (which you seem fine with), yes, there’s overhead, and yes, depending on the case it will be as big as you say. But again, this is only true for non-cached situations, so it’s most pronounced for search-engine results pointing at small pages, ones small enough not to be cached on a closer/lower-latency DNS server.

                              In contrast, if an SRV record isn’t present then you fall back to using the default ports. That means that you can’t initiate the HTTP connection until you’ve got both the A and SRV responses.

                              That is correct; however, what about IPv6? We already have that problem there - that’s what “Happy Eyeballs” is for. One could run both lookups in parallel, and both would be required anyway; theoretically also a third, if the server named in the SRV record required another A/AAAA lookup. So it’s not quite the same, of course.

                              Honestly, I wished that applications would try to connect to multiple A (or AAAA) records when the first one isn’t available. That would still give you an easy way to stay up, independently of a third box (or service). I think that even used to be supported by browsers years ago, but seemingly was dropped again. To my knowledge it never was in any standard, so it’s certainly not reliable with other HTTP clients.
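
                              For what it’s worth, the “try the next A/AAAA record” behaviour is easy to get at the client-library level; a minimal sketch of it in Python (essentially what socket.create_connection() already does internally, minus Happy Eyeballs’ parallelism):

                              ```python
                              import socket

                              def connect_any(host: str, port: int, timeout: float = 2.0) -> socket.socket:
                                  """Try every address DNS returns for host before giving up."""
                                  last_err = None
                                  for family, stype, proto, _canon, addr in socket.getaddrinfo(
                                          host, port, type=socket.SOCK_STREAM):
                                      try:
                                          sock = socket.socket(family, stype, proto)
                                          sock.settimeout(timeout)
                                          sock.connect(addr)
                                          return sock  # first address that answers wins
                                      except OSError as err:
                                          last_err = err  # move on to the next A/AAAA record
                                  if last_err is None:
                                      raise OSError("no usable addresses returned")
                                  raise last_err
                              ```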

                              Google has data showing that the bounce rate is significantly higher if the delay between clicking on a link and seeing the page is >200ms.

                              Do you mean the study from 2017? If so, the numbers aren’t right: it said that a 3-second page load time results in a higher bounce rate compared to a 1-second one. Given that 3s is a lot, I’d also wonder about the overall quality of those websites, which might also have an effect here.

                              Amazon evidently conducted a similar study, claiming a 1% sales drop per 100 ms; however, to my understanding that would be in a scenario where DNS is already cached (after a search or an opened product list).

                              Don’t get me wrong, I certainly don’t dismiss that, but I’d argue that something being unreachable is worse. You are right that SRV records might not be the best option, but it would be nice to have a way to fail over when a request doesn’t even hit a LB, which after all is also a machine/service that can go down, be exhausted, or otherwise be unable to receive a request or respond to it. While it’s probably not the first thing you should worry about, a nice side effect would be that you could just set up two servers and have your failover. Sadly CARP isn’t an option on many networks/setups. But that’s a somewhat different topic.

                        2. 2

                          You might be wanting something like the proposed SVCB record, a variant of which (HTTPS record) is already used by browsers. Read more here: https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-svcb-https-10

                          1. 1

                            Or URI records

                          2. 2

                            Seems like a lot of extra traffic, latency and complexity for something that’s not really a problem :/

                            1. 3

                              It’s clearly a problem. Here’s just one way: This would fix the SNI information leakage that currently exists in TLS!

                              1. 1

                                Separate ports for each domain could fix the SNI leakage issue, but that’s hardly the only (or least disruptive) way.

                                1. 2

                                  Do you have a favorite way to fix the SNI leakage issue? I don’t know of any more elegant solution than this one.

                                  1. 1

                                    Encrypted Client Hello puts a key in DNS that browsers can use to secure the TLS metadata (including SNI) that is normally transmitted in the clear. It’s supported by browsers and DNS servers already, but HTTPS servers don’t use it yet.

                                    The DNS resource record is “HTTPS” and you can read more about it here: https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-svcb-https-10

                                    The HTTPS record solves the latency issue because conforming DNS servers include the A, AAAA, CNAME records with the HTTPS record response.
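
                                    A sketch of querying that record with the third-party dnspython package (real browsers do this resolution themselves); the domain is a placeholder and needs to be one that actually publishes an HTTPS record:

                                    ```python
                                    import dns.resolver  # third-party: pip install dnspython

                                    # "example.com" is a placeholder; substitute a domain that publishes an HTTPS record.
                                    answers = dns.resolver.resolve("example.com", "HTTPS")
                                    for rdata in answers:
                                        # Zone-file form looks roughly like: 1 . alpn="h2,h3" ech=... ipv4hint=...
                                        print(rdata.priority, rdata.target, rdata.to_text())
                                    ```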

                          3. 10

                            Surely there’s something like privileges(7) and you can just grant the fine-grained privilege (like PRIV_NET_PRIVADDR) required to open privileged ports to a regular user, without needing them to be root?

                            1. 24

                              You can, by setting CAP_NET_BIND_SERVICE on a binary
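
                              On Linux that’s typically `setcap 'cap_net_bind_service=+ep' /path/to/binary` (granting it to a script interpreter such as /usr/bin/python3 would grant it to every script it runs, so it’s usually applied to a dedicated binary). A quick sketch for checking whether the current process ended up with low-port bind rights; the function name is mine:

                              ```python
                              import errno
                              import socket

                              def can_bind(port: int = 80) -> bool:
                                  """True if this process may bind the given low port
                                  (via capability, root, or a lowered net.ipv4.ip_unprivileged_port_start)."""
                                  with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                                      try:
                                          s.bind(("0.0.0.0", port))
                                          return True
                                      except OSError as err:
                                          if err.errno in (errno.EACCES, errno.EPERM):
                                              return False
                                          raise

                              print(can_bind(80))
                              ```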

                            2. 4

                              The right question is: Do you have a capability to this port?

                              1. 4

                                I noticed that I could run Caddy on Mac OS without sudo, but I didn’t know why it worked until now. TIL, thanks.

                                1. 3

                                  Maybe I’m missing something here, but why would you want to expose your hobbyist web server directly to the internet, without fronting it with something like nginx?

                                  1. 22

                                    You might just be exposing it to a local network.

                                    You might prefer to expose a server in a managed language, rather than nginx (C). The days of needing to be protected by Apache/Nginx are over.

                                    You might want to run your Nginx as a non-privileged user (without the complexity of a separate process doing the listening).

                                    1. 6
                                      • Since I switched from Apache to Nginx, I lost the ability to properly manage the accept-language header.
                                      • Configuring Apache or Nginx is a significant chore.
                                      • I like to keep things simple, and the most popular web servers out there are not so simple.

                                      The better question would be “why would you want to front your hobbyist web server with anything?”

                                      1. 1

                                        All about that .venv/bin/flask run life

                                        1. 1

                                          Can you decipher that for me? I have absolutely no clue what you’re hinting at.

                                    2. 4

                                      I think there are two reasons to restrict listening on these ports:

                                      1. Malware could maybe crash sshd and then grab port 22 & do SSH MITM as a means of privilege escalation (or something similar to this)

                                      2. Multiple unprivileged users are using the same machine, if user A starts a process which listens on port 80, now users B, C, and D are locked out of listening on port 80 and they can’t stop user A’s process

                                      I think both of those concerns can be addressed:

                                      1. Then just make ports less than 80 require root instead of less than 1024
                                      2. “Multiple unprivileged users are using the same machine”? What ?? No one does this any more :P

                                      IMO whoever wants this <1024 behavior can opt into it; it should not be the default when it’s actively harmful to 90+% of use cases, for individuals and enterprises alike.

                                      1. 8

                                        Then just make ports less than 80 require root instead of less than 1024

                                        Now you have the same problem for ports >80, including e.g. 443, several SMTP ports, and both IMAP ports. What makes 80 special?

                                        “Multiple unprivileged users are using the same machine”? What ?? No one does this any more :P

                                        This is emphatically untrue. I can name three off the top of my head - University of Washington does this on their supercomputing cluster, my university (University of Rochester) runs a teaching cluster for the CS department, and Chris Siebenmann (the WanderingThoughts guy)’s university does as well I believe.

                                        1. 3

                                          Chris Siebenmann (the WanderingThoughts guy)’s university does as well I believe.

                                          He’s at UofT(oronto); I used to be a student there. I’m aware of 4 different sets of shared servers at UofT (delineating sets by who is allowed to access them), and I’m sure there are more I’m unaware of.

                                          But these are already systems with a lot of custom configuration; the ones I’m most familiar with don’t even have home directories under /home. I can’t imagine that having to modify the privileged set of ports away from the distro’s default would be a problem.

                                          1. 1

                                            What makes 80 special?

                                            It’s not 80 that’s special, it’s 22 that’s special. Really the cutoff just has to be > 22. But 80 is the first port that folks who don’t know what privileged ports are yet are going to need to listen on, so it was picked arbitrarily.

                                            Some malicious actor being able to listen on 80 just means that the “normal” HTTP server won’t start. On the other hand, hijacking SSH’s port almost certainly allows getting root eventually; hijacking HTTP probably won’t. Users are also more likely to notice when their app has been replaced by an attacker’s app than when their SSH daemon has been replaced by an attacker’s SSH daemon. The app is harder to impersonate.

                                            No one does this any more :P

                                            Sorry, maybe my “:P” signifying an intentionally exaggerated/silly statement wasn’t clear. What I mean is that the vast majority of Linux users and Linux servers don’t operate like this. Shared “mainframe” servers used to be the norm, whether for academic, commercial, or personal use. But that’s not the case any more. Now the norm is that individual users or organizational units rent VM(s) from a public cloud.

                                            1. 1

                                              Users are also more likely to notice when their app has been replaced by an attacker’s app than when their SSH daemon has been replaced by an attacker’s SSH daemon

                                              I wonder if that’s true. If the malware can read the ssh daemon’s private key, then it already has root and doesn’t need to do this. If it can’t, then a connection will show up with the wrong fingerprint, so a user will notice immediately. I suspect, unfortunately, that most users will just assume that it was intentional and move to using the new key.

                                        2. 2

                                          And for the three folks in Finland who administer multi-user Linux instances and rely on privileged ports for their mainframe-era security properties

                                          Why single out Finland there?

                                          1. 5

                                            Total guess: reference to Finland being the birthplace of IRC, which is just about the last remaining thing in the wild that uses ident, which sits on TCP port 113?

                                            1. 18

                                              Finland is the birthplace of Linus Torvalds.

                                              1. 3

                                                Total guess: reference to Finland being the birthplace of IRC, which is just about the last remaining thing in the wild that uses ident, which sits on TCP port 113?

                                                More fun facts: Almost no IRC servers use the RFC-defined port of 194 - it’s almost always 6667 or 6697.

                                                1. 2

                                                  I’ve never heard of an IRC server using 194, nor seen it mentioned as an example in any IRC daemon’s config file templates.

                                                  Edit to add: I feel like IRC has more… RFC documents which don’t bear much resemblance to reality written about it than most protocols.

                                              2. 3

                                                torilla tavataan (“see you at the market square”)

                                              3. 2

                                                and there’s no comparable concept on Windows either.

                                                This is not true. You need elevated privileges to bind a privileged port on Windows (at least before a certain build number I can’t find documentation for). My home PC apparently binds without issue now (latest Windows 10), but my work PC gives me a privilege error (Server 2016).

                                                1. 2

                                                  Or you know… design your programs with privilege separation, dropping privileges, and sandboxing in mind?

                                                  1. 14

                                                    sure, but we should attempt to design systems that are secure by default. remembering to drop privileges is another thing people forget (and in some languages/runtimes it’s not super easy to do even when you remember to do it)

                                                    1. 9

                                                      You don’t have to drop privileges that you didn’t get in the first place. It’s strictly more secure to not have those privileges from the beginning.