1. 51
  1. 9

    I’m working on the OCaml compiler. The runtime code in C has been similarly stuck with c89 for MSVC support, but we recently merged the “Multicore OCaml” project which relies on C11 atomics. For now the plan is to stick with C11-aware compilers, and let our MSVC users hope that Microsoft adds support for atomics quickly enough (I hear that it has been announced as a feature coming soon for some years now). I spend most of my own time outside the C runtime, but I still find it super relaxing to be able to declare variables wherever, and in particular for (int i = 0; ...). Yay!

    1. 4

      MSVC, GCC, and Clang all have pretty good support for C++20 and the gaps in C++17 support are so small that I’ve not encountered them in real-world code that has to build with all three. I believe XLC and ICC are in a similar state, though I’ve not used either. C11 atomics are a port of C++11 atomics to C, with some very ugly corner cases. I don’t know how much effort it would be to make the OCaml compiler build as C++ but you’d have at least 3 production-grade compilers to choose between.

      1. 2

C++ is a totally different standard, and code can be safe in C but unsafe in C++, and vice versa (IIRC there was a malloc difference such that C’s interpretation of the code is safer, and a zeroing-related bug too; it’s been about half a decade since I’ve seen them come up, and apparently my google-fu does not exist anymore). The languages are close enough lexically to trick you into thinking they are compatible, but they are not. Such a change is likely to unnoticeably break semantics, meaning OCaml might in later years be found to be less secure, safe, and stable.
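To make the flavor concrete, here is a minimal sketch of one well-known C/C++ divergence around malloc (an illustration, not necessarily the exact bug I’m remembering):

    #include <stdlib.h>

    int main(void)
    {
        /* Valid C: void * converts implicitly to any object pointer type.
           A C++ compiler rejects this line outright without a cast. */
        int *p = malloc(10 * sizeof *p);

        /* Compiles in both languages, but in pre-C99 C the cast could mask
           a missing #include <stdlib.h>, in which case malloc was implicitly
           declared as returning int -- arguably the cast-free C idiom above
           is the safer one. */
        int *q = (int *)malloc(10 * sizeof *q);

        free(p);
        free(q);
        return 0;
    }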

      2. 1

        The runtime code in C has been similarly stuck with c89 for MSVC support

MSVC has supported C17 for a few years now.

        1. 4

Actually the “declarations after statements” feature I mentioned was already available in Visual Studio 2013 (which still didn’t have all of C99, of course), but the problem is that when VS2013 was released, most Windows machines around did not have it installed. I don’t use Windows myself, but my understanding is that some OCaml users sell software that packages/extends the OCaml compiler, and they want to be able to compile their software on their clients’ machines, and those may only have old versions of MSVC / Visual Studio available.

          Long story short: for years Microsoft has been lagging a decade behind on C compiler support, and Windows users routinely use decades-old operating systems (even enterprise users, thanks to long-term-support from Microsoft), and the combination of the two creates incentives to stick with super-old versions of C.

(Again, now we need C11 atomics, and MSVC support for C11 atomics is… not yet released? Well, it’s only been 11 years now for the most important extension of C in the last two decades… So maybe Microsoft will support C11 atomics in a year or so, and then it’s only ten more years of Windows users complaining that the software doesn’t build on their own systems.)

      3. 6

        libgit2 is on c89, but that’s mostly because it has to support so many platforms and c89 is the easiest common language for all these compilers.

See what Linux can do by not supporting the Microsoft C compiler?

        1. 14

          Writing a kernel in any C standard below C11 is a terrible idea. C11 was the first C standard to introduce atomics. Prior to that, atomics were fudged with a mixture of inline assembly and volatile. The guarantee that you get from volatile is not sufficient for atomics (only that loads and stores will not be elided and that access to the same object will not be reordered with respect to each other). Since atomics have existed for over a decade and anyone who cares about parallel execution has had a long time to migrate their code, compilers are becoming increasingly willing to optimise volatile.
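A minimal sketch of that distinction, assuming a C11 toolchain:

    #include <stdatomic.h>

    volatile int vcount;   /* loads/stores won't be elided, but vcount++ is
                              still a non-atomic read-modify-write, and other
                              memory accesses may be reordered around it */
    atomic_int acount;

    void hit_volatile(void) { vcount++; }   /* racy under concurrency */

    void hit_atomic(void)
    {
        /* an indivisible read-modify-write with a well-defined place in
           the C11 memory model */
        atomic_fetch_add_explicit(&acount, 1, memory_order_relaxed);
    }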

          FreeBSD is in a much better position to make this switch because the kernel’s memory model is an acquire-release model that maps directly to C11 (and to atomic instructions in modern ISAs), whereas the Linux kernel memory model is a set of barriers based on the ones in the Alpha. I was able to compile the FreeBSD kernel in C11 mode with the kernel’s atomics.h reimplemented using C11 stdatomic.h functions a few years ago (it compiles in gnu99 mode by default today).
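To give a feel for what such a mapping looks like, here is a sketch of acquire/release primitives in the style of a BSD atomics.h expressed with C11 stdatomic.h (the names are illustrative, not FreeBSD’s actual definitions):

    #include <stdatomic.h>

    static inline unsigned int
    my_load_acq(_Atomic unsigned int *p)
    {
        /* maps directly to a load-acquire in modern ISAs */
        return atomic_load_explicit(p, memory_order_acquire);
    }

    static inline void
    my_store_rel(_Atomic unsigned int *p, unsigned int v)
    {
        /* maps directly to a store-release */
        atomic_store_explicit(p, v, memory_order_release);
    }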

The motivating example in this is another reminder to me of why I would not write a kernel in C today. You’re trying to write C++ iterators in C. If you did it in C++, you’d have type-safe code and the compiler already includes a load of lifetime warnings. If you did it in Rust, then the lifetime warnings are part of the type system. C++11 and later are a good fit for a modern kernel: they have well-defined atomics, a memory model that works well with modern hardware, smart pointers for tracking ownership, type-safe generic data structures, and so on, all of which C lacks. Rust needs you to use unsafe for a lot of the code, and the static analysis tooling for unsafe Rust is in a worse state than for unsafe C++, but outside of a core set of kernel routines you can probably stick to safe Rust and be in a better position.

          1. 8

            I’m no expert, but if I understand correctly the reason why Linux does not use C11 atomics is not just “legacy code”, but also the fact that C11 atomics are not expressive enough to cover all of Linux’ use-cases: keeping existing Linux ordering guarantees would require “too strong” C11 ordering that would noticeably hurt performance. (For most other projects I would say that this is crazy, it’s already hard enough to avoid UBs when reasoning with C11 atomics, going to something more fine-grained brings the certainty of getting things wrong. But then Linux has enough leverage to get hardware engineers looking at its synchronization code, so maybe this is reasonable?) Do other kernels have better luck mapping their own memory model to the C11 memory model after the fact?

            (In OCaml we have the interesting problem that the OCaml Multicore memory model does not coincide with any of the C11 memory orderings – it is more relaxed in some aspects, less in others, and it has less undefined behaviors. As a consequence, it’s not clear which C11 memory ordering to use when implementing part of the OCaml runtime system in C. The problem is not completely dissimilar to the Linux situation, as noticed by Guillaume Munch-Maccagnoni.)

            1. 1

              I’m no expert, but if I understand correctly the reason why Linux does not use C11 atomics is not just “legacy code”, but also the fact that C11 atomics are not expressive enough to cover all of Linux’ use-cases:

              This may be true, but I’d be very surprised if it were. A number of core idioms from the Linux kernel were part of the requirements for the atomics working group.

              1. 2

                They could be referencing “Data/Control dependency” barriers from https://www.kernel.org/doc/Documentation/memory-barriers.txt

The C11 memory orderings have no such thing at the moment. It should have been memory_order_consume, but that ordering can’t be upheld under certain optimizations, which break the control-flow and data-flow dependency chains at codegen. Instead, memory_order_acquire is required, which has unnecessary happens-before barrier costs on weakly ordered architectures.
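A sketch of the pattern in question, pointer publication, where only dependency ordering is actually needed (struct cfg is just an illustrative payload):

    #include <stdatomic.h>
    #include <stddef.h>

    struct cfg { int a, b; };
    _Atomic(struct cfg *) shared;

    void publish(struct cfg *c)
    {
        /* writer: release makes the fields visible before the pointer */
        atomic_store_explicit(&shared, c, memory_order_release);
    }

    int reader(void)
    {
        /* only loads that carry a data dependency on p need ordering;
           memory_order_consume was meant to express exactly that, but
           compilers today silently promote it to the stronger acquire */
        struct cfg *p = atomic_load_explicit(&shared, memory_order_consume);
        return p ? p->a : -1;
    }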

Linux also uses things like seqlocks, in which accessing the protected data soundly isn’t obvious under C11. This can be a bit restrictive, as it often breaks the ability to do things like racily memcpy. LLVM introduces the Unordered memory ordering, which is still atomic but doesn’t require a total ordering on the atomic variable like relaxed does, and which is easier to optimize.

It’s unfortunate, but I don’t think there are alternatives to these in the C11 memory model for achieving similar codegen without compiler-specific extensions (i.e. atomic intrinsics or inline assembly).
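For reference, a hedged sketch of the seqlock reader pattern being described; the racy copy of the payload is exactly the part C11 has no good name for:

    #include <stdatomic.h>
    #include <string.h>

    struct seqlock { atomic_uint seq; int data[4]; };

    void seqlock_read(struct seqlock *sl, int out[4])
    {
        unsigned s1, s2;
        do {
            s1 = atomic_load_explicit(&sl->seq, memory_order_acquire);
            /* under a strict C11 reading this memcpy races with the
               writer (UB); LLVM's Unordered would describe it, but C11
               offers nothing weaker than relaxed atomics per element */
            memcpy(out, sl->data, sizeof sl->data);
            atomic_thread_fence(memory_order_acquire);
            s2 = atomic_load_explicit(&sl->seq, memory_order_relaxed);
        } while (s1 != s2 || (s1 & 1));   /* odd seq = writer active */
    }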

                1. 1

                  Thanks!

                  • Indeed the compiler-side issues with memory_order_consume were among the things I vaguely remember from Linux-atomics discussions, so your more informed explanation is certainly in line with what I had in mind.

                  • I didn’t know about Unordered, and it looks useful to describe the behavior of non-atomic variables in the OCaml memory model. (The model is more defined than C11 non-atomics, as it gives a semantics to races, but it is more relaxed than “relaxed” C11 atomics.) This is probably no coincidence as I read that Unordered was introduced to represent parts of the Java memory model. Sounds useful. Thanks!

(But: I’m not familiar with LLVM, and I don’t see an easy way to access those intrinsics from a C program using Clang. Searching the web suggests that to use LLVM intrinsics in a language runtime written in C, we would need a separate file with LLVM IR.)

            2. 4

              wonder why it took 40 years to add atomics to a language designed for writing operating systems. did atomics become more important at some point?

              1. 10

                Before widespread use of multiple cores and multithreading I guess they were a lot less important.

                1. 1

It’s not confined to multiple cores and threading; it is also about reordering (out-of-order execution). You have primitives (like memory-mapped IO) with implicit or explicit ordering rules, where some load-store operations MUST happen in a certain sequence, and the language lacked the ability to model that. For example:

*a = 1;
*b = 2;
c = *a + *b;
                  

Am I allowed to actually set b before a? Can I reduce c to a constant 3? Within the language, “it depends” (without the side-effects verbiage in the standard). Older C versions offered volatile, but that only solves part of the equation: not ordering or atomicity, just that the operation has to happen. Before C11 you had to resort to barriers and other compiler extensions.
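For contrast, C11 finally lets you state this kind of ordering in the language itself. A sketch (real memory-mapped IO would still need volatile or platform accessors on top):

    #include <stdatomic.h>
    #include <stdbool.h>

    int payload;
    atomic_bool ready;

    void producer(void)
    {
        payload = 42;
        /* the release store cannot be reordered before the write above */
        atomic_store_explicit(&ready, true, memory_order_release);
    }

    int consumer(void)
    {
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                 /* spin until published */
        return payload;       /* guaranteed to observe 42 */
    }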

                  1. 4

                    it’s not confined to multiple cores and threading - it is reordering (out of order execution).

                    There are two kinds of reordering that matter, reordering by the CPU and reordering by the compiler. Reordering by the CPU matters only in multicore systems. On a single core, out-of-order execution is not observable by the running process. Interrupts result in things in store buffers being either committed or discarded and so if you context switch to another thread then it will see the same sequence of operations as appeared in the instruction stream.

Prior to C11, threads were outside the C abstract machine entirely, and so the compiler was free to do any reordering that was not observable to the (one) thread. In C11, the language requires valid source programs to exhibit ‘data-race freedom’, which means that it is able to reorder anything that does not involve atomics or explicit fences, and it is undefined behaviour if the program is able to observe this. Both permit defining c in your example as 3 (assuming the compiler can prove that a and b don’t alias; if they do alias it has to be 4, and if they might alias then the compiler is still free to assume that c must be either 3 or 4).

The thing that makes the C++11 memory model interesting is the set of things that it permits, rather than the set of things that it prevents. In GNU C89 (which the Linux kernel uses), you need to use a big hammer to prevent compiler reorderings: an inline assembly block with the "memory" clobber, which tells the compiler ‘memory changes in some unspecified way here, ensure that nothing moves across this point’. With the C++11 memory model, the compiler is allowed to move some operations (either loads or stores) in one direction (either forwards or backwards) across certain kinds of atomic. This exposes more optimisation opportunities to the implementation.

                    C++11 defines two kinds of fence: signal fences and thread fences. A signal fence defines synchronisation with respect to signals, which practically means that it restricts what a compiler can reorder. A thread fence defines synchronisation with respect to threads and so restricts what the combination of the compiler and CPU can reorder (and therefore requires the compiler to insert explicit barrier instructions on weakly ordered architectures). The Linux memory model has only the latter kind.
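Concretely (assuming a GCC/Clang-style toolchain for the inline assembly):

    #include <stdatomic.h>

    void barriers(void)
    {
        /* the GNU C89-era big hammer: nothing may move across this,
           in either direction */
        __asm__ __volatile__("" ::: "memory");

        /* signal fence: constrains only the compiler; emits no code */
        atomic_signal_fence(memory_order_seq_cst);

        /* thread fence: constrains compiler *and* CPU; emits a real
           barrier instruction on weakly ordered architectures */
        atomic_thread_fence(memory_order_seq_cst);
    }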

                    1. 1

                      did “reordering” become more important at some point due to advances in technology?

              2. 1

                I mean, I’m also part of the reason why libgit2 is strict with C89 😬

                1. 3

So what platforms hold it at that level? How much non-retro-enthusiast stuff would break if it became C11 now?

                  1. 1

Just a little thing called Windows I guess, could be wrong though.

              3. 3

                I didn’t really understand this at first, but this technique is based on the fact that a struct inside a struct is essentially flattened out in terms of memory representation. The technique is called an intrusive data structure and it lets you do things generically (with macros) with only one linked list struct. I am used to making a new linked list struct for every data type in C - so it’s a pretty clever hack!
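A minimal sketch of the idea (struct task is just an illustrative payload; the macro is the classic offsetof trick discussed in the replies):

    #include <stddef.h>

    struct list_node { struct list_node *next, *prev; };

    struct task {
        int id;
        struct list_node link;   /* the list lives *inside* the payload */
    };

    /* recover the enclosing struct from a pointer to its embedded node */
    #define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    /* O(1) unlink from a circular list, given only the node itself */
    static void list_remove(struct list_node *n)
    {
        n->prev->next = n->next;
        n->next->prev = n->prev;
    }

    static struct task *task_of(struct list_node *n)
    {
        return container_of(n, struct task, link);
    }

One list_node type and one set of list functions serve any payload type, which is exactly what the macros generalise.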

                1. 2

                  These macros originated with 4BSD, as far as I can tell. Modified versions are present in the *BSD, Linux, and the NT kernel. They’re slightly terrifying because they require pointer arithmetic that removes type information, including using offsetof to identify the location of the field and then subtracting it.

                  1. 1

They’re quite useful! Like you said, they let you write generic functions easily (you could write generic versions of a non-intrusive list, but with indirection and allocation overhead). The other thing that’s very nice is they allow for local manipulation: if you have a reference to the struct, you can e.g. remove it in O(1) from the containing list without traversing the list or even needing a reference to the list head. This can make cleanup code much simpler (a C++ destructor could remove the object from the lists it’s present in, for example), and it also makes cases where you have a single item that needs to be present in multiple lists much easier to manage. They tend to show up a decent amount in systems and games programming.

                    1. 1

                      When I first saw this I thought it was genius. The macro to convert a list pointer to its parent is a fun piece of pointer arithmetic to unpick. It involves casting the literal 0 to a pointer.

Another significant advantage is that normal use (outside of that controlled and tested macro) doesn’t need casts: with the traditional list-struct-on-the-outside, you end up having to use void pointers for the payload, and there are many more opportunities to make a type error. Also the list structure and parent structure can easily be contiguous in memory.
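For contrast, a sketch of the traditional non-intrusive shape being described, where the payload hangs off a void pointer:

    struct extlist_node {
        struct extlist_node *next;
        void *payload;   /* must be cast back to the real type by hand,
                            so the compiler can't catch a type error */
    };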

                    2. 2

                      the kernel has moved its minimum GCC requirement to version 5.1

Yeah, thanks for moving to a 5.x GCC and failing compilation on CentOS 7’s default GCC (which is 8 years old), while still

                      look to moving to the C99 standard — it is still over 20 years old

I am a little bit confused: you can always have the latest and greatest Linux kernel from upstream (less than a week old, and be confident about it), but be hesitant to trust a compiler or C standard that’s been there for 20 years, while asking developers to trust a compiler that’s 7 years old (GCC 5.1).

                      1. 1

Is anyone still updating CentOS 7? It’s legacy now, surely.

                        1. 2

CentOS 7 still gets package updates via upstream, unlike 8.

                          1. 1

But there are precious few of those. Only CVEs of severity Important or higher are fixed by default, with other updates at Red Hat’s discretion. Disclaimer: I work for Red Hat, but may still be misrepresenting their policy.