1. 32
    1. 21

      This post has a worrying tendency to refer to “messages”, as in

      If I send() a message, I have no guarantees that the other machine will recv() it

      which may indicate the author is making a really bad newbie error: thinking that TCP honors write boundaries. It doesn’t. TCP doesn’t transmit messages, it transmits byte streams. The fact that you sent a particular string of bytes in one call doesn’t mean the receiver will read that same string in one call.

      What makes this conceptual error nasty is that during development TCP will often appear to work this way, because it’s common for each send() or write() to result in one packet if the two peers are on the same machine or same LAN and thus have very high bandwidth and low latency. (And assuming of course that your writes are smaller than an Ethernet frame.) Then you try your program across a longer distance and it breaks horribly.

      Even if you know better, this can cause problems, because over a LAN the code that decodes frames often isn’t being completely exercised, so you only test the “easy” case where you read complete frames… Then later on you find that your code doesn’t reassemble partial frames correctly. I’ve been bitten by this before.
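
      A minimal sketch of the reassembly step, assuming a hypothetical framing of a 4-byte big-endian length followed by the payload (the thread doesn't specify any framing). The names `framebuf`, `framebuf_feed` and `framebuf_next` are made up for illustration. The point is that whatever recv() hands you gets appended to a buffer, and frames are only taken out once they are complete:

      ```c
      #include <assert.h>
      #include <stdint.h>
      #include <stdlib.h>
      #include <string.h>

      /* Assumed framing: 4-byte big-endian payload length, then the payload. */
      typedef struct {
          uint8_t *buf;  /* bytes received but not yet consumed */
          size_t   len;  /* number of valid bytes in buf */
          size_t   cap;  /* allocated size of buf */
      } framebuf;

      /* Append whatever recv() returned, however small.
         Returns 0 on success, -1 if memory could not be allocated. */
      static int framebuf_feed(framebuf *fb, const void *data, size_t n)
      {
          assert(fb != NULL);
          if (n == 0)
              return 0;
          if (fb->len + n > fb->cap) {
              size_t cap = fb->cap ? fb->cap : 64;
              while (cap < fb->len + n)
                  cap *= 2;
              uint8_t *p = realloc(fb->buf, cap);
              if (p == NULL)
                  return -1;
              fb->buf = p;
              fb->cap = cap;
          }
          memcpy(fb->buf + fb->len, data, n);
          fb->len += n;
          return 0;
      }

      /* If a complete frame is buffered, copy its payload into out, drop it
         from the buffer and return the payload length.
         Returns -1 if more bytes are needed, -2 if the payload exceeds outcap. */
      static long framebuf_next(framebuf *fb, uint8_t *out, size_t outcap)
      {
          assert(fb != NULL && out != NULL);
          if (fb->len < 4)
              return -1;                       /* header still incomplete */
          uint32_t plen = ((uint32_t)fb->buf[0] << 24) | ((uint32_t)fb->buf[1] << 16) |
                          ((uint32_t)fb->buf[2] << 8)  |  (uint32_t)fb->buf[3];
          if (plen > outcap)
              return -2;
          if (fb->len - 4 < plen)
              return -1;                       /* payload still incomplete */
          memcpy(out, fb->buf + 4, plen);
          memmove(fb->buf, fb->buf + 4 + plen, fb->len - 4 - plen);
          fb->len -= 4 + (size_t)plen;
          return (long)plen;
      }
      ```

      Feeding this one byte at a time in a unit test exercises the partial-frame paths that a LAN test tends to skip.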

      The more I work with networking code, the more I feel that no app developer should use the C socket APIs directly. They look straightforward, and you can build a trivial app without much fuss, but there are so many subtle behaviors and edge cases and platform variations that make it quite difficult to get solid shippable code.

      I totally made that mistake as recently as three years ago. I looked at libUV and said “ugh, this looks big and complex, I don’t need all that for my purposes”, and started rolling my own C++ code. I really wish I hadn’t. ☠️

      1. 8

        I worked with a warehouse automation vendor who had operated with this misconception for decades. We asked them to implement a JSON protocol that resulted in messages bigger than 1500 bytes, and the ensuing back and forth trying to get them to understand you have to call read() in a loop was one of the most frustrating technical conversations I’ve ever had! They thought TCP was message-oriented, but didn’t even have the vocabulary to say that.
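
        For anyone wondering what “call read() in a loop” looks like, here is a rough sketch. The helper name `read_full` is hypothetical, and it assumes a blocking file descriptor and that the caller already knows how many bytes to expect (from a length header, say):

        ```c
        #include <assert.h>
        #include <errno.h>
        #include <stddef.h>
        #include <sys/types.h>
        #include <unistd.h>

        /* Keep calling read() until n bytes have arrived, the peer closes the
           connection, or an error occurs. Returns the number of bytes read
           (less than n only at end of stream), or -1 on error. */
        static ssize_t read_full(int fd, void *buf, size_t n)
        {
            assert(buf != NULL || n == 0);
            size_t got = 0;
            while (got < n) {
                ssize_t r = read(fd, (char *)buf + got, n - got);
                if (r == 0)
                    break;                  /* end of stream: short result */
                if (r < 0) {
                    if (errno == EINTR)
                        continue;           /* interrupted by a signal, retry */
                    return -1;
                }
                got += (size_t)r;           /* a partial read is normal, keep going */
            }
            return (ssize_t)got;
        }
        ```

        A single read() returning fewer bytes than requested is not an error, it just means the rest hasn't arrived yet.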

      2. 4

        The more I work with networking code, the more I feel that no app developer should use the C socket APIs directly.

        I don’t mean to jump on this as I appreciate the sentiment. I think I’m just easily triggered by more and more areas coming under the “don’t roll your own crypto” banner. Of course: you really shouldn’t roll your own crypto. But do write your own implementation of well-defined systems so you can understand exactly how complicated it all is. And do write networking code using the C APIs, maybe even use raw sockets and see if you’re up to putting the various protocols’ headers together too. Don’t for goodness’ sake deploy it into production …

        We need to keep practising these arts in some safe space so that we can learn where the sharp edges are.

        1. 1

          Yeah, I’m implicitly assuming ‘the reader’ wants to build something serious. Of course you can and should play with whatever APIs you want!

          Personally, learning and using the C socket APIs is not something I’d do for fun. I did it because I was being paid to design and build a cross-platform product with a small footprint and the higher level libraries looked too big (and I think I was wrong there.)

          Ideally we get rid of these APIs someday in favor of better ones, but Unix APIs seem to be damn near immortal unfortunately.

      3. 2

        Thanks for your advice. I wish these man pages came with this sort of info as well. It’s really quite impossible to write decent code where you cover all your bases.

        1. 2

          Let me recommend “Unix Network Programming”, a book I wish I’d had when I started working on this stuff — partly because it probably would have convinced me not to DIY. It makes a great doorstop too.

          Oh, and FYI things get even more “fun” when you try to integrate TLS. Obviously you’d grab an existing library, probably OpenSSL, but it’s quite confusing to integrate into your own socket code. (Plus there are good reasons to use a platform’s integrated TLS lib instead, like Apple’s SecureTransport, because it’s tied into things like the device’s root-cert list and cert revocation. So that gives you multiple APIs to figure out…)

      4. 2

        I’ve seen a few systems that didn’t behave well when I synthetically flushed bytes to them one at a time.
        In practice they seem to work more reliably. If you disable Nagle’s algorithm, and your message fits within the TCP segment size, I think you can get relatively reliable message semantics.
        That said, I haven’t written an application myself that relies on such behavior. Systems with strange maximum segment size settings might break things.
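
        For reference, disabling Nagle’s algorithm is a single setsockopt() call. This is only a sketch (the wrapper name `disable_nagle` is made up), and it changes when the sender flushes small writes, not what the receiver’s recv() returns, so it is no guarantee of message boundaries:

        ```c
        #include <assert.h>
        #include <netinet/in.h>
        #include <netinet/tcp.h>
        #include <sys/socket.h>

        /* Turn off Nagle's algorithm on a TCP socket so small writes are sent
           without waiting to be coalesced. Returns 0 on success, -1 on error. */
        static int disable_nagle(int sock)
        {
            assert(sock >= 0);
            int one = 1;
            return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
        }
        ```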

      5. 1

        I hacked together a webserver in C recently by following beej’s networking guide, and the level of appreciation I have for libraries that handle all the open/recv/select stuff for me increased immensely.

    2. 6

      Also, you should avoid select() in applications which may have an unbounded number of file descriptors open. The fd_set macros operate on a structure that can hold a fixed number of file descriptors (usually a fixed-sized array). If you open file descriptors beyond the value of FD_SETSIZE, you will typically end up with a buffer overrun vulnerability. This is rather tactfully worded in the manual as

      The behavior of these macros is undefined if the fd argument is less than 0 or greater than or equal to FD_SETSIZE, or if fd is not a valid file descriptor, or if any of the arguments are expressions with side-effects.

      The value of FD_SETSIZE is of course platform-dependent, but it’s typically on the order of 1024. Since all file descriptors in a process share one number space and each new one gets the lowest free number, even non-socket files you open will affect your networking code.

      To avoid this, you’ll have to use the POSIX poll (or platform-specific epoll/kqueue whatever) syscall, which (of course) is not implemented by Windows.
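
      As a rough sketch of the poll() route (the function name `read_with_timeout` and the -2 timeout code are my own invention for illustration): poll() takes an array of descriptors rather than a fixed-size bitmap, so the numeric value of the fd doesn’t matter.

      ```c
      #include <assert.h>
      #include <poll.h>
      #include <sys/types.h>
      #include <unistd.h>

      /* Wait up to timeout_ms for fd to become readable, then read once.
         Returns the result of read() (0 means end of stream), -1 on error,
         or -2 if nothing became readable before the timeout. */
      static ssize_t read_with_timeout(int fd, void *buf, size_t n, int timeout_ms)
      {
          assert(fd >= 0 && buf != NULL);
          struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
          int r = poll(&p, 1, timeout_ms);
          if (r < 0)
              return -1;      /* includes EINTR; the caller may retry */
          if (r == 0)
              return -2;      /* timed out */
          return read(fd, buf, n);
      }
      ```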

      1. 5

        Hit this bug in Cargo! To restate this, what matters is not the number of fds you pass to select, but their numeric values. In Cargo, we used select on just two fds – stdout/err pipes of a child process. And that broke due to a bunch of file descriptors created elsewhere.
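
        If you are stuck with select(), one way to turn this into a detectable error instead of undefined behavior is to check the numeric value before touching the set. This is a hedged sketch, not what Cargo does, and `fd_set_checked` is a hypothetical helper:

        ```c
        #include <assert.h>
        #include <sys/select.h>

        /* Add fd to set only if its numeric value is one FD_SET can represent.
           Returns 0 on success, -1 if fd is negative or >= FD_SETSIZE. */
        static int fd_set_checked(int fd, fd_set *set)
        {
            assert(set != NULL);
            if (fd < 0 || fd >= FD_SETSIZE)
                return -1;
            FD_SET(fd, set);
            return 0;
        }
        ```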

        1. 4

          How do you guys deal with this on Windows? The linked PR doesn’t seem to add any special case for it. In CHICKEN, we had to punt on the issue and just use Winsock’s select() and hope for the best. Their docs only state that the fd_set macros are “completely different” from Berkeley sockets and use an “opaque representation”, but there’s no mention of whether it’s safe (or what else to use).

          1. 4

            There’s a completely separate impl for high-level “read both stderr and stdout” for Windows:


            That read2 function is then just copy-pasted across all projects that need it. IIRC, all of cargo, rust, and rust-analyzer have independent copies.

            1. 2

              Thanks! Sounds like a lot more work than simply trying to shim UNIX on top of Windows (like we do) though :S

      2. 4

        Windows uses a different signalling paradigm (in the general sense, not the signal() sense). On POSIX, it’s “Can I do the operation? Then do the operation”, and on Windows it’s “Do the operation. Is the operation finished?”

      3. 1

        This limitation is per-process, correct? It’s not as if you can have a well-designed program that fails due to some other process using up all the file descriptors.

        1. 2

          Yes … but if you’re writing a library, or a program that uses a 3rd party library that can open file descriptors, you’re no longer in control of all the file descriptors in the process.

          1. 2

            I remember this causing me pain in the 90s. Then poll() came along!

    3. 3

      In my first operating systems class (back when dinosaurs roamed the earth and it was still the ARPANet), when we started in on network protocols the professor said to imagine you’re communicating with someone on the other side of an 8-foot-high wall by writing messages on small water balloons. :) Actually even that isn’t quite good enough because you could at least see that you missed a packet.