1. 26

  2. 9

    I think the biggest mistake was too much undefined behaviour. UB undermines C’s (modern) role as “portable assembly” – a small machine-close language that lets you do everything, does that faithfully, and can be understood by understanding the machine.

    I don’t quite buy the optimization argument. Two reasons:

    1. Safety first, and voluntary optimization hints (like the restrict keyword) second, is always more forgivable.
    2. I can optimize those aspects of my C code myself, “thank you very much”. For most purposes, I prefer taking this responsibility over allowing the compiler to delete my code at the sight of UB. As an (extremist) example of this, I forbid signed integers in my own code.
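
    A minimal sketch of what the “no signed integers” rule in point 2 buys, assuming a hypothetical helper add_checked (the name is made up for illustration). Unsigned arithmetic wraps by definition, so the overflow test stays in the compiled code. The signed equivalent, a + b < a, depends on overflow that the standard leaves undefined, so a compiler may assume it never happens and simplify the check away.

```c
#include <stdbool.h>

/* Hypothetical helper: adds two unsigned ints and reports overflow.
 * Unsigned addition wraps modulo UINT_MAX + 1, so no UB is involved. */
bool add_checked(unsigned int a, unsigned int b, unsigned int *sum)
{
    unsigned int s = a + b;   /* wraps on overflow, well defined */
    if (s < a) {
        return false;         /* wrapped around: report overflow */
    }
    *sum = s;
    return true;
}
```

    The cost is that the check is written by hand at each call site that cares, which is the responsibility the comment says it prefers to take.
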
    1. 3

      Of note also are Salvatore Sanfilippo’s simple dynamic strings and Sean Barrett’s stretchy buffers.

      1. 3

        I agree, though for a slightly different reason. C has, largely because of UNIX, become the lingua franca for interoperability. The fact that C doesn’t differentiate between an output value, an in-out value, and an array (which may be used for input, output, or both) whose length is specified somewhere else or which is terminated with a 0 value is a headache when trying to use C as an interop target. As the article says, this gets even worse with strings. Is a char* parameter an output parameter of a char, a pointer to a NUL-terminated string, or a pointer to an array of bytes? Should I map it to a multiple-return-value where one of the returned values is a character, to a string, or to an array in my language?
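
        To make the ambiguity concrete, here is a small sketch with three hypothetical functions (the names are invented for illustration). All three take the same char * type, yet each expects something different from the caller, and nothing in the signature tells a binding generator which case applies.

```c
#include <stddef.h>

/* Out-parameter: p points at a single char that the callee writes. */
void get_separator(char *p)
{
    *p = '/';
}

/* NUL-terminated string: the callee reads until it meets a 0 byte. */
size_t string_length(char *p)
{
    size_t n = 0;
    while (p[n] != '\0') {
        n++;
    }
    return n;
}

/* Byte array: the length travels in a separate argument. */
void zero_bytes(char *p, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        p[i] = 0;
    }
}
```

        Only the comments (or external documentation) distinguish the three contracts, so an interop layer has to be told by hand how to map each parameter.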

        1. 2

          Is Robert C. Seacord here? Maybe he can comment on whether fat pointers are on the spec roadmap (also see https://news.ycombinator.com/item?id=22865357).

          1. 3

            He’s relatively active on Twitter; you can ask there.

            Unfortunately, however, I suspect that the answer is ‘no’.

            As it turns out, it’s quite easy to create your own fat pointer implementation and use it throughout your codebase. (My version is 50 lines; stb’s stretchy buffers are slightly smaller because they do less. Neither accounts for automatic arrays, but that would be trivial to add.) What’s needed is bounds checking for regular array accesses, which is to say operator overloading, which is to say ‘probably not’.

            However, if what you want is bounds checking, then AddressSanitizer has you covered.
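
            For readers who have not seen one, a hand-rolled fat pointer can be sketched roughly as follows. This is not my 50-line version or the stb code; the int_slice type, the INT_SLICE_OF macro, and the accessor names are all hypothetical, and it assumes C99 for the compound literal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fat pointer: start of the array plus its element count. */
typedef struct {
    int   *ptr;
    size_t len;
} int_slice;

/* Wraps a real array (automatic or static). sizeof only yields the
 * element count for an array, not for a pointer it has decayed to. */
#define INT_SLICE_OF(arr) \
    ((int_slice){ (arr), sizeof(arr) / sizeof((arr)[0]) })

/* Checked read: returns false instead of reading out of bounds. */
bool int_slice_get(int_slice s, size_t i, int *out)
{
    if (i >= s.len) {
        return false;
    }
    *out = s.ptr[i];
    return true;
}

/* Checked write: returns false instead of writing out of bounds. */
bool int_slice_set(int_slice s, size_t i, int value)
{
    if (i >= s.len) {
        return false;
    }
    s.ptr[i] = value;
    return true;
}
```

            Note that s.ptr[i] still compiles and is still unchecked. Without operator overloading the checks only happen when callers go through the accessor functions, which is the limitation described above.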

          2. 2

            I still think the biggest mistake was implicit fallthrough.
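
            A small sketch of the problem, using two hypothetical functions that differ by a single break. The first is meant to return 1 for 'a' and 2 for 'b', but control falls out of the first case into the second.

```c
/* Buggy: the break after case 'a' is missing, so 'a' ends up with 2. */
int score_buggy(char c)
{
    int score = 0;
    switch (c) {
    case 'a':
        score = 1;
        /* falls through */
    case 'b':
        score = 2;
        break;
    default:
        break;
    }
    return score;
}

/* Fixed: every case ends with an explicit break. */
int score_fixed(char c)
{
    int score = 0;
    switch (c) {
    case 'a':
        score = 1;
        break;
    case 'b':
        score = 2;
        break;
    default:
        break;
    }
    return score;
}
```

            Both versions compile without complaint by default, which is why the mistake is easy to miss.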

            1. 2

              meaning an array is passed as a so-called “fat pointer”, i.e. a pair consisting of a pointer to the start of the array, and a size_t of the array dimension

              This sounds a lot like Go’s slices in a sense, with syntax like foo[a:b]. It still doesn’t give you much of a guarantee about the underlying memory though.

              1. 1

                std::array has been pretty nice in C++. You can even do simple “dependent typing” (not really) kind of stuff, like making an append function that returns an array whose length is the sum of the lengths of the input arrays (and the append happens at compile time): https://godbolt.org/z/6b9nxr (haven’t tried this on huge arrays, it might blow up 😅 )

                1. 1

                  I don’t know if this is C’s biggest mistake, but I do consider Rust’s separation of fixed-sized arrays, dynamic resizable arrays (vec’s), and fat pointers to bounds-checked memory (slices) to be one of its biggest subtle wins.

                  Also it’s interesting to note that, as far as I can tell, systems languages with fat pointers didn’t really become common until after AMD64 became common, with 14ish general-purpose registers. I may be connecting dots that don’t exist here, but when you’re on x86 with 6ish registers, spending two of them on a slice pointer Feels Bad. There are of course other architectures developed in the 80’s and 90’s with lots of registers, but people generally didn’t write new programming languages for them, and my impression is that it was uncommon to have more than 8 registers before then.