1. 45
  1.  

  2. 10

    This looks like a sensible set of features:

    The constexpr thing resolves an annoyance: C actually has a very well specified notion of a constant expression. This requires, for example, clang to have a constant-expression evaluation engine in the front end that does a subset of what could be done later in the optimisation pipeline, so that it can report errors up front (C++ also requires it for constants in templates, because the front end does need to handle those, but it’s necessary for C even without templates). Until now, the only way of reusing a constant expression was to define a macro for it, at which point the compiler ends up evaluating it at every instantiation point. Now you can assign it to an identifier. This has been possible in C++ via template hackery since C++98 and cleanly since C++11.
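
    A minimal sketch of the difference, with made-up names (assuming a C23 compiler):

    ```c
    #include <stddef.h>

    /* Pre-C23: reuse means a macro, re-evaluated at every use site. */
    #define BUFFER_SIZE_MACRO (4 * 1024)

    /* C23: the value gets a type and an identifier and is still a genuine
     * constant expression, so it can size a static array. */
    constexpr size_t buffer_size = 4 * 1024;

    static char buffer[buffer_size];
    ```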

    The compound literals thing is a nice cleanup, I hope the C++ version is also adopted. Compound literals in C++ could do with a bit more work.

    Every use I’ve had for something like #embed has ended up being better served by something else, but I’ve seen other places where it’s valuable, so it’s good to see it in the standard.

    __VA_OPT__ is fine, I guess, but GCC has had a work-around for this limitation for over 20 years and I don’t really see myself changing from , ## to __VA_OPT__(,) any time soon.
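
    For reference, the two spellings side by side, as a hypothetical logging macro:

    ```c
    #include <stdio.h>

    /* Old GNU extension: the ## swallows the preceding comma when the
     * variadic arguments are empty. */
    #define LOG_GNU(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)

    /* Standard C23: the comma only appears when there are variadic arguments. */
    #define LOG_STD(fmt, ...) fprintf(stderr, fmt __VA_OPT__(,) __VA_ARGS__)
    ```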

    Allowing (...) is interesting. C originally passed all arguments on the stack and didn’t have prototypes, so every call was effectively variadic (in your implementation, you provided the set of types for your formal parameters, but you could add variadic ones by taking the address of the last one and subtracting its size to get at the next one). This was originally wrapped up in some horrible macros. With C89 and function prototypes, the standard version of variadics became very hard to implement in macros, and so most compilers added builtins for handling the variadics. More complex calling conventions meant that the compiler had to get involved. Once you drop the need to implement variadics purely in macros, you don’t need the first, named argument (which only exists because otherwise you can’t get the address of the start of the argument frame).
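
    A sketch of what the C23 form permits (the leading count argument is purely a convention invented for this example):

    ```c
    #include <stdarg.h>

    /* C23: a parameter list of just (...) is allowed, and va_start no longer
     * needs the name of the last named parameter. */
    int sum_ints(...) {
        va_list ap;
        va_start(ap);
        int count = va_arg(ap, int);  /* example-only convention */
        int total = 0;
        for (int i = 0; i < count; i++)
            total += va_arg(ap, int);
        va_end(ap);
        return total;
    }
    ```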

    All of that said, in C++ variadic functions are regarded as legacy crap. If you want a variadic interface then you define a variadic template as an inline function. You can then parcel the arguments up into some sensible type such as an array or a tuple and forward it to a function that does something with the arguments. This lets you capture type information and allows you to write type-safe variadics. C-style variadics are basically on the list of things that you should never use, because they are so easy to get wrong and introduce stack corruption. Anything that makes people more likely to write variadic functions in C makes me a bit sad.

    The every-enum-is-an-int thing has annoyed me since I first discovered it over 20 years ago. Nice to see a fix here. C++ requires that you explicitly declare the enum as a wider type. The auto-widening thing makes me a bit nervous because it makes it harder to guarantee stable ABIs that take enums. If I expose a function that takes an enum and the values all fit in an int, the function will take an int. If I add a new 33-bit value to that enum then the argument type for that function will change in the ABI and I’ve now broken any code that calls that function. In C++, at least, I get a warning and have to explicitly change the type to enum class Thing : uint64_t or similar and since I’ve explicitly changed the width of the type then I know that I’ve broken it. Oh, and because I have operator overloading and using I can have both the old and new definitions coexist in my library and ship the narrower version as a wrapper around the version that uses the wider one.

    Fortunately, from the next section, C has now also got support for explicit underlying types for enums, so I can probably get a compiler warning if I do the dangerous thing. It probably should be enabled by default for any enum declared in a header. It’s a shame when a language adds a feature that immediately makes me want a compiler warning in case I use it, though.
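
    For reference, the C23 spelling (hypothetical enum) mirrors the C++ one, minus the class:

    ```c
    #include <stdint.h>

    /* C23: the underlying type is part of the declaration, so adding a wide
     * enumerator is an explicit ABI decision rather than a silent widening. */
    enum Flags : uint64_t {
        FLAG_SMALL = 1,
        FLAG_HUGE  = UINT64_C(1) << 40,
    };
    ```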

    The qualifier-preserving standard functions make me unreasonably happy. One of the first things that I tried to do in CHERI C was enforce const. Unfortunately, this was complicated by the fact that CHERI permissions are monotonic: CHERI doesn’t give you a mechanism for adding permissions (by design), and so you can’t have the compiler implicitly cast something to const and then strip it off again, yet things like memchr in the standard library required exactly this: they took a const pointer and returned a non-const pointer derived from it. This broke. Hard. I ended up adding __input and __output qualifiers to mean ‘read-only / write-only, and I really mean it’. I should be able to implement these in _Generic macros now, so the __input qualifier is preserved nicely.
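
    For the standard functions, the mechanism presumably looks something like this hypothetical my_memchr (just the _Generic shape, not the standard’s exact wording): the qualifier on the argument determines the qualifier on the result.

    ```c
    #include <string.h>

    /* Hypothetical sketch of a qualifier-preserving memchr wrapper: a const
     * pointer in gives a const pointer out, a non-const pointer in gives a
     * non-const pointer out. */
    #define my_memchr(s, c, n) _Generic((s),                    \
        const void *: (const void *)memchr((s), (c), (n)),      \
        void *:       memchr((s), (c), (n)),                    \
        const char *: (const char *)memchr((s), (c), (n)),      \
        char *:       (char *)memchr((s), (c), (n)))
    ```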

    The explanation of nullptr is exactly the thing that’s bitten me and one of the reasons that I hate variadics. Good to see, shame it wasn’t in C11.
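
    The pitfall in question, with a hypothetical pointer-terminated variadic callee:

    ```c
    #include <stdarg.h>
    #include <stddef.h>

    /* Hypothetical callee: walks const char * arguments until a null pointer. */
    static size_t count_strings(const char *first, ...) {
        va_list ap;
        va_start(ap, first);
        size_t n = 0;
        for (const char *p = first; p != NULL; p = va_arg(ap, const char *))
            n++;
        va_end(ap);
        return n;
    }

    void demo(void) {
        count_strings("a", "b", NULL);     /* if NULL is plain 0, an int is passed where a pointer is read: UB */
        count_strings("a", "b", nullptr);  /* C23: always a null pointer (nullptr_t) */
    }
    ```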

    stdbit.h has been a long time coming, but it’s great that it’s there now. I have a non-portable header that implements wrappers around compiler intrinsics for these for clang / gcc / MSVC; I’m looking forward to throwing it away. On the stdc_ prefix, I have two comments:

    • I wish they had a consistent prefix on all of the standard functions.
    • I hope the C++ version drops the prefix and puts them all in the std namespace.
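
    For what it’s worth, the type-generic forms look like this (assuming a libc that ships the C23 header):

    ```c
    #include <stdbit.h>
    #include <stdio.h>

    int main(void) {
        unsigned x = 0x00F0u;
        /* Previously this meant __builtin_popcount / __builtin_clz on clang
         * and gcc, or _BitScanReverse and friends on MSVC. */
        printf("%u\n", (unsigned)stdc_count_ones(x));
        printf("%u\n", (unsigned)stdc_leading_zeros(x));
        printf("%u\n", (unsigned)stdc_trailing_zeros(x));
    }
    ```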

    auto in C++ is useful and I used GCC’s __auto_type a lot back before I gave up on C.

    I think memset_explicit came from OpenBSD. It’s surprisingly hard to implement without compiler support or inline assembly, so nice to see it in the standard.

    [u]intmax_t is one of those things that was obviously never a good idea, from the time that it was introduced. Given that 32-bit platforms implemented 64-bit integers via software paths, and some 16-bit architectures grew backwards-compatible extensions for 32- and 64-bit arithmetic, it was obvious that future versions of a platform would gain wider types than the current ones supported. I wish C23 had done the right thing here and deprecated these types. I have never seen any code that uses them and is correct, though I have seen them used quite a bit.

    All in all, it makes C a marginally less bad language, but there’s nothing in there that makes me want to start writing C again. It still feels like a crippled version of the worst bits of C++.

    1. 2

      I think memset_explicit came from OpenBSD. It’s surprisingly hard to implement without compiler support or inline assembly, so nice to see it in the standard.

      Why would it be hard to implement? It needs an optimisation barrier; translation unit boundaries usually act as such, unless you lto—and who statically links libc? Alternately, I’ve commonly seen it implemented as follows: void *(*volatile mmemset)(...) = memset; mmemset(...).
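
      Spelled out, that idiom is roughly the following (just the snippet above made compilable, not a recommendation):

      ```c
      #include <string.h>

      /* Route the call through a volatile function pointer so the compiler
       * cannot see which function is actually being called. */
      static void *(*volatile mmemset)(void *, int, size_t) = memset;

      void scrub(void *buf, size_t len) {
          mmemset(buf, 0, len);
      }
      ```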

      1. 5

        It needs an optimisation barrier; translation unit boundaries usually act as such, unless you lto—and who statically links libc?

        That is one of the problems, yes. Lots of folks enable LTO with an LTO-built libc, and suddenly the compiler can see what’s happening. As far as I know, glibc and macOS libc are the only mainstream libcs that don’t support static linking (and Apple does, I believe, use ThinLTO builds of their libc for App Store builds, so the compiler can look inside libc for analysis, even if it doesn’t inline much).

        But that isn’t the biggest problem. It’s commonly used before free and the compiler will helpfully realise that it’s safe to elide any stores to an object just before free because any subsequent load is UB and the stores are therefore dead.
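
        That is, in something like this (hypothetical names), the memset is a textbook dead store as far as the abstract machine is concerned:

        ```c
        #include <stdlib.h>
        #include <string.h>

        void destroy_secret(char *secret, size_t len) {
            memset(secret, 0, len);  /* no conforming program can observe these stores... */
            free(secret);            /* ...because any later read of *secret is UB */
        }
        ```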

        Alternately, I’ve commonly seen it implemented as follows: void *(*volatile mmemset)(…) = memset; mmemset(…).

        Did you look at the generated code? Testing that in clang, the IR that it generates is identical to a normal memset. It expands to a memset intrinsic without the volatile flag. If this appears before a free, the compiler is at liberty to elide the memset, because it knows the semantics of memset as defined by the C standard and it knows that it is UB for any reads afterwards. There is no happens-before edge established via an atomic operation and so the stores are also not visible to another thread.

        I’ve seen a lot of folks try to implement functions like this and find that at least one compiler that they’re using will completely elide their security properties. I wrote a paper about this a few years back.

        1. 1

          Did you look at the generated code? Testing that in clang, the IR that it generates is identical to a normal memset. It expands to a memset intrinsic without the volatile flag. If this appears before a free, the compiler is at liberty to elide the memset, because it knows the semantics of memset as defined by the C standard and it knows that it is UB for any reads afterwards. There is no happens-before edge established via an atomic operation and so the stores are also not visible to another thread.

          How did you test? I do not get those results. The volatile qualifier means that the compiler does not get to constant-propagate from the definition of mmemset to its use, so at the latter point, not knowing which function is pointed to, it must assume that that function might have arbitrary side effects.

          Here is a godbolt link demonstrating this.

          1. 1

            I tested it locally with a different version of clang, but I had the assignment in the global scope. Clang happily lowered it to a non-volatile LLVM memset intrinsic. It didn’t, in this case, elide the memset even without the volatile qualifier, but there’s no guarantee that it won’t. LLVM has changed its interpretation of what volatile means a few times. There’s still ongoing discussion of whether it’s safe for the compiler to elide volatile stores if it can prove that the memory region is non-side-effecting and that the stores are not observable within the LLVM (or C) abstract machines. In general, using volatile for anything other than MMIO is relying on the compiler’s interpretation of underspecified bits of C and is very dangerous if you’re doing it for security.

            1. 1

              I don’t quite follow. There was a situation in which mmemset was not volatile, but the call to it was not optimised away? I do not observe such behaviour. Was this with your ‘different version of clang’?

              There’s still ongoing discussion of whether it’s safe for the compiler to elide volatile stores if it can prove that the memory region is non-side-effecting and that the stores are not observable within the LLVM (or C) abstract machines

              Interesting. How is it determined whether a memory region is ‘side-effecting’? As far as I know, the C and LLVM abstract models make no explicit allowance for, for instance, char *text = (char*)0xb8000; but if I can’t write that and have it work ‘correctly’, I’ll sue.

              1. 1

                I don’t quite follow. There was a situation in which mmemset was not volatile, but the call to it was not optimised away? I do not observe such behaviour. Was this with your ‘different version of clang’?

                I ran this locally with Apple clang something or other (whatever was on the Mac I had in front of me last week) and looked at the generated IR. With and without the volatile cast, it generated an LLVM memset intrinsic call, with the volatile parameter for that call set to 0 (false).

                Interesting. How is it determined whether a memory region is ‘side-effecting’? As far as I know, the C and LLVM abstract models make no explicit allowance for, for instance, char *text = (char*)0xb8000; but if I can’t write that and have it work ‘correctly’, I’ll sue.

                Automatic storage locations are created by the compiler. In theory, you could put your stack in an MMIO region, but if an automatic storage variable’s address does not escape a function then the C standard does not provide any mechanism by which a store to that location is visible elsewhere within the C abstract machine. This means that even a volatile store to a variable with automatic storage duration may (the standard is not explicit either way) trigger an as-if rule that lets you elide it: you may not elide stores to volatile variables, but if you can prove that the store is not visible then the version of the program that elides it behaves as if it were the version that did not.
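
                Concretely, the case in question is something like this hypothetical fragment:

                ```c
                /* The address of key never escapes, so nothing in the C abstract
                 * machine can observe the wiping stores, volatile or not. */
                void use_key(void) {
                    volatile unsigned char key[32];
                    /* ... derive and use the key ... */
                    for (int i = 0; i < 32; i++)
                        key[i] = 0;
                }
                ```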

                This is really the root problem for trying to implement most of the things in this space. They’re intended to prevent vulnerabilities that exist outside of the C abstract machine, from within the C abstract machine itself. In the C abstract machine, once an automatic-storage variable goes out of scope, or once a heap allocation is passed to free, it is gone. Any access to it is undefined behaviour. Any store to it immediately prior to its deallocation is not observable within the abstract machine, unless it is accompanied by something that establishes a happens-before edge guaranteeing visibility in another thread, and a happens-before edge backwards guaranteeing that the load in the other thread completes before the deallocation. In contrast, it often is observable in a concrete lowering of the abstract machine to mainstream hardware and so can leak secrets. You need the compiler to be aware of things beyond the abstract machine during mid-level optimisations to be able to guarantee the correct semantics.

                My favourite thing from our paper was an idiom in OpenSSL that tried to do constant-time conditionals via xor. It turns out that multiple versions of GCC were clever enough to recognise this and convert it into a conditional in their mid-level IR and then, depending on the target and the optimisation level, either turn it into a conditional move (fine) or a branch (not fine). C doesn’t have any notion of time and so can’t express a notion of constant time and so the compiler didn’t feel any need to preserve a property that couldn’t be expressed in the source-language abstract machine at all.
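
                The idiom in question looks roughly like this (not OpenSSL’s actual code, just the shape of it):

                ```c
                /* mask is all-ones or all-zeros; the intent is to pick a or b without
                 * a data-dependent branch. A sufficiently clever compiler can spot
                 * that this is a select and lower it to a branch anyway. */
                unsigned ct_select(unsigned mask, unsigned a, unsigned b) {
                    return b ^ (mask & (a ^ b));
                }
                ```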

                1. 1

                  With and without the volatile cast, it generated an LLVM memset intrinsic call, with the volatile parameter for that call set to 0 (false)

                  I think you misread what I wrote. I did not cast anything. I created a volatile pointer to function, set it to point to memset, and then called the function. But the type of the function pointed to is exactly the type of memset. The goal was not to have a memset intrinsic call with a volatile argument; the goal was to generate a call to an unknown—and potentially side-effecting—function which just happens to be memset.

                  Automatic storage locations

                  Huh. I guess the idea is to optimise cases when variables are qualified ‘volatile’ only for the purpose of setjmp/longjmp? Seems somewhat marginal, but. (And presumably it only applies when the location is not aliased—so e.g. signals work fine?)

                  My favourite thing from our paper was an idiom in OpenSSL that tried to do constant-time conditionals via xor. It turns out that multiple versions of GCC were clever enough to recognise this and convert it into a conditional in their mid-level IR and then, depending on the target and the optimisation level, either turn it into a conditional move (fine) or a branch (not fine)

                  Fun. I once caught gcc/clang generating branches in a tight loop where there should have been conditional moves; rewrote it in assembly and got a nice speedup.

                  1. 1

                    Huh. I guess the idea is to optimise cases when variables are qualified ‘volatile’ only for the purpose of setjmp/longjmp? Seems somewhat marginal, but. (And presumably it only applies when the location is not aliased—so e.g. signals work fine?)

                    It’s also to handle cases where you have a C++ template that takes a volatile pointer so that it can be used with MMIO, but where you can also instantiate it with normal memory. I forget the codebase that this came up in (it was someone’s in-house thing) but apparently they got a big end-to-end win from the inlined versions of the template that were operating on stack memory being able to do non-volatile stores and allow the compiler to elide a load of them on a hot path. I’m not sure why they couldn’t pick up the volatile qualifier from the template parameter, but possibly it was in C++ generated from some other language and difficult to change. It was interesting to me because it’s an under-specified area of the language. The general consensus on volatile from WG14 in recent years has been ‘if you’re using it for anything other than MMIO, you’re probably using it wrong’ and so I’m deeply sceptical that any approach that involves using volatile on normal memory and expecting the compiler to interpret the standard in the same way as you will work.

      2. 1

        Responding to one specific point: memset_explicit() indeed appears to be inspired by OpenBSD’s explicit_bzero().

        The described changes are pretty nice, I think!