Threads for cocoafleck

    1. 8

      I like this blog, the posts are very informative. I’d feel so bitter and exhausted if I wanted to push changes into the C standard, but some people are that patient, it’s admirable.

      Anyway, I’m not a C expert, but it seems that many C programmers want to have their cake and eat it too, depending on the topic at hand: they want compilers to optimize their code, and think C is the fastest language out there; but at the same time C is believed to be a low-level, portable language, even though standard C bars you from even checking for integer overflow. If Linux can’t use standard C, how can this myth stand?

      1. 4

        You can check for integer overflow, you just need to check before the operation that would overflow and skip that codepath when the check fails. Compilers are pretty good at recognising this kind of idiom and so often turn it into a branch-on-carry.
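
        A minimal sketch of that idiom, assuming a helper name (checked_add is not any standard function):

        ```c
        #include <assert.h>
        #include <limits.h>
        #include <stdbool.h>

        /* The range check runs before the addition, so the overflowing
         * add is never executed. Compilers commonly lower this pattern
         * to an add plus a branch on the carry/overflow flag. */
        static bool checked_add(int a, int b, int *out) {
            if ((b > 0 && a > INT_MAX - b) ||
                (b < 0 && a < INT_MIN - b))
                return false;           /* would overflow: skip the add */
            *out = a + b;
            return true;
        }
        ```

        Note that the check never evaluates a + b on the overflowing path, which is exactly why it stays valid under the standard's rules.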

        To give a concrete and more interesting example, consider integer division by zero. On x86 this will trap, and on *NIX the trap is (confusingly) reported as SIGFPE. If you write a/b in your C code and b is zero, then on weird architectures like x86, once you reach that line it will explode. You don’t check this by computing a/b and then trying to detect after the fact that the result came from a division by zero, you do it by writing c = b==0 ? oopsValue : a/b. Whatever the compiler does, your code will explode if you don’t do the check (on some architectures, in some modes), so correct code will never execute the division by zero, and a compiler is completely free to assume that on any path where a/b is reachable, b is non-zero.
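
        The division guard from above, as a compilable sketch (oopsValue is just a caller-chosen fallback):

        ```c
        /* The guard makes a/b unreachable when b == 0, so the trap can
         * never fire, and the compiler may assume b != 0 on the
         * division path. */
        static int safe_div(int a, int b, int oopsValue) {
            return b == 0 ? oopsValue : a / b;
        }
        ```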

        1. 2

          Another thing to consider is that C does not expose any way to check the overflow flag that many CPUs set after a calculation. It might be cognitively simpler to program that way, but we can’t really do it in C. Assembly is always an option.
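
          For what it’s worth, GCC and Clang do expose that flag through non-standard builtins (a compiler extension, not standard C):

          ```c
          #include <limits.h>
          #include <stdbool.h>

          /* GCC/Clang extension: performs the add, stores the wrapped
           * result, and reports whether it overflowed - typically
           * compiled to an add plus a jump on the CPU's overflow flag. */
          static bool add_wrapped(int a, int b, int *out) {
              return __builtin_add_overflow(a, b, out);
          }
          ```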

          1. 5

            No, you’re not understanding the problem. Integer overflow is well defined everywhere, but C and C++ have adopted an unsupported, nonsense stance that it is undefined - not implementation-defined or unspecified, but outright UB. It’s not a matter of “how do I detect that overflow happened”: the compiler writers made the insane decision that if a behavior is undefined, you can optimize any code by assuming the UB operation cannot happen. So a hypothetical “did this overflow” flag is irrelevant; overflow is UB, so the “did this overflow” flag is definitionally always false.

            For the flag (or similar) to make sense you would have to define overflow behavior. If you defined overflow behavior, these nonsense “optimizations” would go away: the only reason “a+b<a” isn’t a valid overflow check is that overflow is bogusly considered UB.
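
            For comparison, the same test is a perfectly good wraparound check once the arithmetic is done in unsigned, where overflow is defined - a sketch:

            ```c
            #include <stdbool.h>

            /* With int operands, `a + b < a` may be folded to `b < 0`
             * because the compiler assumes the signed add cannot wrap.
             * With unsigned operands the add wraps mod 2^N by
             * definition, so the comparison is meaningful and cannot
             * be optimized away. */
            static bool add_wraps(unsigned a, unsigned b) {
                return a + b < a;
            }
            ```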

            1. 3

              The problem is that signed overflow isn’t defined for every machine where a C compiler exists. There are architectures out there that will trap on signed overflow, much like the x86 will trap when trying to do an integer division by 0 [1]. And even if there isn’t a trap on overflow, what will the result of INT_MAX + 1 be? You get three different answers, depending on whether the machine is sign-magnitude (-0), 1s complement (-INT_MAX), or 2s complement (-INT_MAX-1). Of course, you’ll have to search far and wide these days for any computer that isn’t 2s complement [2], but they were still very popular throughout the 70s and into the 80s, when the C standard committee started its work, and every vendor wanted their C compiler to be labeled “standard” with the minimum amount of work - thus all the undefined behavior of C.

              Hindsight is 20/20.

              [1] On the VAX, a 2s complement machine, you can have the CPU trap or not on signed overflow. This can be done on a per-function basis. The MIPS, also a 2s complement machine, has instructions that will trap on overflow, and a second set of instructions that will not trap on overflow.

              [2] Every system I’ve ever used for the past 40 years has always been 2s complement, with defined overflow behavior.

              1. 5

                No, signed overflow is defined on every architecture you just listed. “Not the same across all platforms” is not “this behavior is undefined”.

                I did not say “this behavior must be defined by the specification”; I said this has no justification for being “undefined behaviour”. Implementation-defined and unspecified behavior exist specifically to handle behaviors that differ between targets, and that’s what these should be.

                You need to understand there is a very real and meaningful difference here: by stating overflow is UB, the compiler can incorrectly pretend overflow cannot happen, and that creates security vulnerabilities. The specification should actually say that overflow is implementation-defined or unspecified - exactly which probably lands in language-lawyer town, but for a developer it doesn’t matter; the end result is the same: the behavior of overflow becomes consistent throughout the compiler’s codegen, and the compiler can’t just pretend that overflow does not and cannot happen.

            2. 1

              the compiler writers decided to take an insane decision that if a behavior is not defined, you can optimize any code by saying the UB operation cannot happen

              This is not insane. Specifying something as UB means that the algorithms in a compiler can be written with the assumption that the input program does not do these things. When that is the case, the code will be compiled correctly. When that is not the case, there is no guarantee about the behavior of the output program. That is why it is called “undefined behavior”.

              1. 3

                The whole point of the article is that this definition of what UB means - that now the compiler got to choose what would happen, not the hardware it was running on - was introduced decades after the widespread adoption of C.

        2. 2

          It’s important to recall that when gcc introduced the definition of UB that said “I can optimize assuming the UB of overflow cannot happen”, there were no builtins to test for overflow, so a whole bunch of people had to go through a lot of code working out the correct set of pre-checks - often extremely expensive ones like division, which would also require its own UB check. This is all for the “performance” gained by pretending that overflow is not well-defined behaviour on every platform ever made.

        3. 1

          Sure, you want to check beforehand if it’s simpler. But why is C only now adding some operators for checked addition and the like? It’s tricky to check for overflow beforehand, especially if you have a longer chain of operations, so it’s another source of errors piling up. Having standard operators would also, I suppose, help optimizers?

          1. 1

            Part of the goal of C has always been portability. This means that every construct in the core language is expected to lower to something simple on every architecture. On some architectures it’s trivial to check a carry flag for overflow; on others it’s more complex (RISC-V and MIPS both lack a carry flag, though MIPSr6 finally fixed it for addition, with an encoding trick where writing the add with the lower-numbered source register first performs an add, while the opposite order returns the carry bit).

            Most compilers have had overflow-checked builtins, and before that a lot of projects had inline assembly helpers. It’s the kind of thing that, for C, probably belongs in the standard library rather than the core language.

      2. 2

        I like it too. It’s really well researched and thought out. But fundamentally I disagree with the conclusion.

        UB is UB. You get nothing. It has to be like this. The freedom of the standards can and will be exploited to its fullest. This isn’t because implementors are malicious (except in certain situations where pathological implementations are created to try to stress these things); it’s the nature of implementing semantics-preserving transformations. We simply can’t write a compiler that “understands” the intention of code and rewrites it without breaking it.

        The arguments about the difficulty of working with integers are valid though. And it is something we can and should address. I think concepts like ‘boring c’ are a realistic way to address the problem: we need the expectations to be codified into a standard before compilers can be rewritten to respect them.

        We can also look at more semantic analysis that tries to identify some common examples of UB but the responsibility is fundamentally on the programmer.

        1. 6

          It has to be like this.

          It doesn’t though, that’s the whole point. We got this way through a particular evolution of circumstances: the standard started off as a formalization of the common behaviors of what different compilers do on different systems. Compilers use the unportable parts of the standard to add optimizations that make them generate faster code, and computers are slow so the standard consciously doesn’t include things (like integer overflow being a runtime check) that will make compilers suffer. Computers grow more powerful and more uniform in functionality and compilers get better at exploiting edge cases and shit like this starts happening. Now if the standard tries to define those edge cases, the compiler implementors scream bloody murder… not because they’re malicious but because it will cause a performance regression and people judge compilers by the performance of their generated code. It’s not a technical problem, it’s a human problem.

          In a language as low-level as C you will always have some UB because there are no handrails. As you say, semantics-preserving transformations are hard, and some things have meaning that can only be known at runtime. But there is zero reason your compiler can’t have, say, checked integer math that errors safely on overflow. Indeed, by the rules of UB in the standard this is totally fine! But no compiler does this… because it would be slower.

          Compiler implementors value speed over UB sanity, because their users value speed over UB sanity, because it’s a lot easier to compare compiler speed than compiler UB sanity.

          1. 5

            Compilers use the unportable parts of the standard to add optimizations that make them generate faster code, and computers are slow so the standard consciously doesn’t include things (like integer overflow being a runtime check) that will make compilers suffer.

            No, it’s not about inserting runtime checks for overflow. Overflow happens whether you check or not. The reason compiler writers are so hellbent on signed overflow specifically being UB is that in the most idiomatic for loop, for (int j = 0; j < N; j++) { ... }, it lets them claim to have proved the loop terminates, and/or perform vectorizing optimizations when values computed in the loop are guaranteed never to decrease.
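
            A sketch of the termination argument, using <= to make the edge case visible (function name made up):

            ```c
            /* Because signed overflow is UB, the compiler may assume
             * i + 1 never wraps, "prove" the loop runs exactly n + 1
             * times, and vectorize accordingly. If n were INT_MAX, a
             * wrapping i would loop forever - but that case is UB, so
             * it is simply assumed away. */
            static long sum_to(int n) {
                long s = 0;
                for (int i = 0; i <= n; i++)
                    s += i;
                return s;
            }
            ```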

            In a language as low-level as C you will always have some UB because there are no handrails.

            Sure, you may have some, but even in C with “no handrails” it does not need as much as it has.

            There are incredibly few places where behaviour genuinely must be undefined, and the vast majority of UB in C has no reason to be UB. There is, for example, no actual justification for signed integer overflow being UB in C, as demonstrated by unsigned overflow not being UB - the only reason this is still up for debate is that there are compiler benchmarks that benefit from pretending overflow is not a thing.

            The places where you have unavoidable UB are, for example, using a pointer outside the range of the target object, use of a pointer after it has become dead, use of a garbage pointer, access to uninitialized memory*, and so on - in other words, places where you cannot define what will happen. I’ve *’d uninitialized memory because this can also just be fixed by saying all memory must be initialized.

            1. 1

              Glad we agree.

          2. 3

            But no compiler does this… because it would be slower.

            I find these comments extremely confusing as someone who feels very familiar with C and has read bits of the C standard. Are the sanitizers provided by gcc and clang insufficient for your needs? It feels to me that a lot of people want the standard to mandate, by default, what sanitizers already provide, when (in my opinion) they simply chose to compile without them. I suppose, if one is targeting an obscure compiler that doesn’t support equivalents to gcc’s and clang’s sanitizers, I would understand your frustrations.

            Could you help me out in understanding why sanitizers aren’t meeting your desires, or if there is some fallacy in my reasoning? I am hoping to clear up my confusion as I feel that I am missing something with how often I see people say they want to remove some undefined behavior from the standard.

            1. 3

              I’m not sure about the other implementations, but Clang’s UBSan has the same completeness problem as any dynamic analysis. For example, it converts signed addition into overflow-checked signed addition with a branch to an error report on overflow. If your test workload does not trigger signed overflow then the check will never fire, but that doesn’t mean the code is fine in production, where a malicious adversary may provide input data that does trigger the overflow.
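
              A tiny illustration of that coverage gap (names made up; -fsanitize=signed-integer-overflow is the real Clang/GCC flag):

              ```c
              /* bump overflows only when x == INT_MAX. A test suite that
               * never feeds that value passes cleanly under
               * -fsanitize=signed-integer-overflow, yet production input
               * can still reach the undefined addition. */
              static int bump(int x) {
                  return x + 1;
              }
              ```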

              1. 2

                but that doesn’t mean that it’s fine in production

                So ship with sanitizers enabled in production code? Am I missing something here?

                1. 2

                  If the overheads are low enough, you could ship them, though I’ve heard complaints that they’re not. Now you still have a denial-of-service vulnerability in your code. This is avoidable if a language either requires explicit checking of all potential overflows or provides integers with sub-word ranges so that overflow is statically avoidable. C has a bunch of things that make this much more of a problem than it should be. For example, the result of T* - T* is ptrdiff_t, which is a signed quantity, but array indexes are size_t, which is an unsigned quantity of the same width. This makes it easy to write code that won’t ever actually overflow but can’t be statically proven not to.
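
                  A sketch of that signedness mismatch (helper name made up):

                  ```c
                  #include <stddef.h>

                  /* Pointer subtraction yields ptrdiff_t (signed);
                   * sizeof and array indexing use size_t (unsigned,
                   * same width). Even when the difference is provably
                   * non-negative, the sign-changing cast is something
                   * the compiler cannot always prove harmless. */
                  static size_t span(const char *hi, const char *lo) {
                      ptrdiff_t d = hi - lo;  /* signed difference */
                      return (size_t)d;       /* caller guarantees hi >= lo */
                  }
                  ```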

                2. 1

                  The problem is not the overhead of checking for overflow, nor that the compiler fails to insert such checks.

                  The problem is that the compiler decides that the overflow cannot happen and then optimizes based on that. I am not, and I believe most involved people are not, advocating for the compiler generating overflow checks or anything. What we are saying is that the compiler should not use the clearly bogus classification of signed overflow as undefined to justify pretending it doesn’t happen.

                  When this exciting definition of what a compiler could do with UB was introduced, the first thing it did was create a large and unending swathe of security bugs - security bugs that took years to discover.

                  You could claim this response to UB is acceptable, but then the difference between something being classified as UB versus implementation-defined or unspecified behaviour in the spec becomes security-critical, so the old habit of lazily classifying edge cases as UB is no longer acceptable and the specification needs to be updated appropriately. But this definition of UB has been adopted solely to help compiler benchmarks - it’s why signed overflow is UB but unsigned overflow is not, despite the exact same conditions applying to both: compiler benchmarks just use for(int ...) instead of for(unsigned ...). Similarly, optimizing out null checks because a preceding dereference “would be UB” has also introduced security vulnerabilities. It’s telling that no optimizations come from the genuinely undefinable behaviors (OoB, UaF, uninitialized memory, etc.), while the definable cases have all resulted in “optimizations” that create security bugs.

            2. 1

              Are the sanitizers provided by gcc, and clang insufficient for your needs?

              If you can catch this behavior at compile-time, why isn’t it part of the compiler?

              If you have to insert checks at runtime, they should be opt-out instead of opt-in.

    2. 12

      Shame it is light on details, and the author didn’t investigate why the allocator is so slow.

      1. 13

        I don’t know about your experience investigating issues like this on Windows, but I find it very frustrating mainly because you pretty quickly hit the wall of undocumented internal APIs (NTAPI) and the lack of source code. In this particular case, while the MSVCRT source code is available, AFAICS, malloc is not part of that but rather comes from the PlatformSDK which, AFAIK, is closed-source.

        Also, while at it, I think the author’s own experience with LLVM provides an answer to the “Why doesn’t Microsoft replace its slow malloc with its own much faster mimalloc” question: because it will likely break a non-trivial number of applications, even if only by exposing some latent bugs in those applications themselves. And a few percent of all the crap ever written for Windows ceasing to work is not an acceptable outcome to Microsoft.

        1. 3

          Also, while at it, I think the author’s own experience with LLVM provides an answer to the “Why doesn’t Microsoft replace its slow malloc with its own much faster mimalloc” question: because it will likely break a non-trivial number of applications even if only by exposing some latent bugs in those applications themselves.

          But the author themselves notes this and points out that it could be opt-in.

    3. 2

      I’m currently looking for hosting for a large term frequency data file that is necessary for several of the search engine’s core functions. I really don’t have the bandwidth to serve it myself. It’s only a couple of hundred megabytes so it’ll probably be solvable somehow.

      Isn’t this a perfect use case for seeding it as a torrent with a bandwidth limit set on it?