Threads for adfernandes

    1. 53

      BSDs also did not want to mix C and C++ for kernel programming for various reasons, sticking with one language for the kernel (assembly excluded).

      I’ve written C++ kernel modules for FreeBSD (and ported a chunk of libc++ to build for the kernel). The build system actually supported it with nothing custom. There are a couple of issues, some of which also apply to Linux:

      The header files may not be valid C++. FreeBSD has a bunch of macros that need undefining because they conflict with other things. Linux is so much worse. Its headers use string literals followed immediately by macro names, with no space in between, which C++ parses as user-defined literals, and they use class as an identifier all over the place.
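
      A minimal illustration of both problems, as a made-up header snippet in the style described (not from any real kernel header); it’s valid C but will not compile as C++:

        #include <stdio.h>

        #define SIZE_FMT "zu" /* printf format fragment, a common C idiom */

        void show(size_t len) {
            /* Valid C: adjacent string literals concatenate after macro
             * expansion. Ill-formed C++11: "%"SIZE_FMT is lexed as a
             * user-defined literal because the macro name touches the
             * closing quote. */
            printf("len: %"SIZE_FMT"\n", len);
        }

        struct device_attr {
            int class; /* fine in C; a syntax error in C++, where
                        * 'class' is a reserved keyword */
        };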

      On x86-64 (possibly elsewhere), FreeBSD kernel modules are not shared libraries, they’re basically .o files. The kernel’s loader is actually an in-memory static linker. Unfortunately, it doesn’t handle COMDATs or copy relocations. You can avoid them in a C++ kernel module, but I wanted a libkxx that provided a standard library, and this led to kernel modules that depended on it having relocations that couldn’t be resolved. As a result, every kernel module using C++ needs to carry definitions of everything, which causes a lot of code bloat. If LLD were better written, I’d write a tool using it to do some ahead-of-time processing, but it’s really not written to be modified.

      These problems don’t really apply in Rust. You can’t use the C headers at all, so you have a more scoped problem. I’m not sure if Rust now has tooling to translate header files into Rust definitions, but I think the Linux Rust project was manually exposing individual subsystems in a more structured way. I’m also not sure how Rust kernel modules handle generics.

      In userspace C projects, there’s a very simple incremental adoption path for C++. You can reuse the same headers (maybe with a few tweaks) and you can add new components in C++ without any disruption. This is far less true in the kernel, and so you have a similar adoption cost for C++ and Rust.

      There’s some history. Linus hates C++ for deeply irrational reasons, and a few more rational ones. A lot of kernels have been written in C++ now (and the code quality in the ones I’ve looked at has typically been much higher than Linux). Both NT[1] and XNU use a lot of C++. XNU inherited an Objective-C driver model, replaced it with C++, and it’s far nicer than the Linux equivalent. We wrote the CHERIoT RTOS in C++, but we will probably rewrite some bits in Rust once we have a working Rust compiler.

      The rational reasons come largely from timing. C++98 was not a great language and a lot of people used it to write bad Smalltalk code. Worse, gcc had a couple of big ABI changes in C++ that basically broke all userland C++ things for a couple of years. These happened about 20 years ago and the C++ ABI has been pretty stable since then. Apple suffered from this because they had baked the gcc 2.95 ABI into their kernel (they managed to ditch it with architecture switches).

      I don’t think popularity is particularly relevant. There are more C++ developers than either C or Rust developers these days.

      [1] There’s a lot that I hate about the NT kernel, but none of those things are due to use of C++. The use of C++ is typically a big win for readability in the bits I’ve read. The use of SEH to read userspace memory is an abomination.

      1. 25

        The header files may not be valid C++.

        This needs to be repeated more often. People often operate under the mistaken assumption that all valid C code is automatically also valid C++ code, but the two languages have various hard incompatibilities. And yes, the reserved keyword class is an absolute classic. Snippets like int8_t class; will blow up in C++, and you see them all over system headers.

        I think another reason why the Linux project opened up to Rust but not C++ is that the (memory-)safety that Rust provides (while still being very fast) is a novel innovation and a leap forward that far surpasses the benefits of using C++ over C.

        1. 18

          The safety (memory and type both) is by far the main reason, and exactly why it’s being targeted at writing kernel modules first, not actual kernel code: modules being safer and stricter is a huge boon to the kernel, as that’s where there is the most variability (and no oversight for third-party modules).

      2. 16

        I’m not sure if Rust now has tooling to translate header files into Rust definitions

        Yes, that exists: https://github.com/rust-lang/rust-bindgen

        The opposite direction also exists: https://github.com/mozilla/cbindgen

        And yeah, you don’t necessarily always want to use these. For certain problems, it’s better to hand-write bindings to specific things.

        1. 2

          Those tools still sidestep the “C is not always C++” problem, though.

      3. 7

        Another thing to consider is that large chunks of C++ are very inconvenient to write without the standard library, large chunks of which use exceptions and dynamic memory.

        Yes, you can write C++ without these things - see the Embedded Template Library, for example.

        But it’s a very different dialect of C++ than most programmers ever use or see, almost to the point of being a completely different language!
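
        For a flavour of that dialect, here’s a minimal sketch (my own illustration, not actual ETL code) of a fixed-capacity container that never touches the heap and never throws:

          #include <cstddef>
          #include <new>

          // Fixed-capacity vector: all storage is inline, and push_back
          // reports failure with a bool instead of throwing std::bad_alloc.
          template <typename T, std::size_t Capacity>
          class static_vector {
              alignas(T) unsigned char storage_[Capacity * sizeof(T)];
              std::size_t size_ = 0;

          public:
              bool push_back(const T &value) {
                  if (size_ == Capacity)
                      return false;                            // no exceptions
                  new (storage_ + size_ * sizeof(T)) T(value); // placement new only
                  ++size_;
                  return true;
              }
              T &operator[](std::size_t i) {
                  return *std::launder(reinterpret_cast<T *>(storage_ + i * sizeof(T)));
              }
              std::size_t size() const { return size_; }
              ~static_vector() {
                  for (std::size_t i = 0; i != size_; ++i)
                      (*this)[i].~T();                         // manual destructor calls
              }
          };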

      4. 3

        The header files may not be valid C++.

        Given that Rust uses libclang to parse and generate valid definitions for Rust to ingest, I wonder if/expect you could technically get away with this on the C++ side. extern "C" { ... } is, after all, a perfectly valid construct, and as long as you are working with std::bit_cast capable types you technically don’t need to worry about ABI issues at that point, beyond the relocation and COMDAT folding you’d already mentioned 🙂
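
        A sketch of what such generated, C++-safe declarations might look like (the names and types here are entirely made up for illustration):

          #include <bit>
          #include <cstdint>

          // Hypothetical output of a binding generator: the C ABI surface
          // re-declared in C++-safe form. Renaming the 'class' field does
          // not change the struct's layout, so the ABI is untouched.
          extern "C" {
              struct dev_info {
                  std::uint32_t class_;   // was 'class' in the C header
                  std::uint32_t flags;
              };
              int dev_query(struct dev_info *out);
          }

          // Trivially copyable, so std::bit_cast can move values between
          // representations without any ABI concerns.
          static_assert(sizeof(dev_info) == sizeof(std::uint64_t));
          std::uint64_t as_raw(dev_info d) {
              return std::bit_cast<std::uint64_t>(d);
          }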

        1. 4

          The problem is that C++ includes headers via copy and paste. They are parsed with the C++ parser, after being injected into the source file by the preprocessor.

          In theory, C++ modules ought to allow parsing a header in a different dialect (and possibly a different language) and then importing it. I believe this was one of Herb Sutter’s goals.

          1. 1

            What I mean is using libclang to parse the C headers, translate them into C++ safe declarations, and ignore the non-C++ parts, effectively generating C++ specific headers that can be ingested for creating modules. Unless various operating systems are naming kernel functions class and namespace I don’t think it would be too problematic. At least in theory :P

            1. 1

              That’s what modules should do, except without libclang. They will use clang in C mode to build a C AST and make declarations from it accessible in C++.

              1. 1

                The committee isn’t interested in language or behavior changes “on import”, and has pushed back on “epochs” a few times. So whatever modules should do is not what they will do. Herb has a bit of a modus operandi to make big promises, work on them for a bit, and then not be able to get them through the committee so he can at least say he tried to improve things. I expect cpp2 will end up in this same position.

                IMHO, it’s better to just run a one-time tool against a hostile C interface, make some ABI-compatible transforms however you can, and then generate a C++-capable interface (header or module), rather than try to work around using said headers. This is especially true for targets where the C that C++20 and later target is newer than the C used in a given environment such as kernel development (e.g., C++20 acknowledges C18, but not C11 or C99). At that point you’re in the realm of implementation-defined behavior and will end up throwing out most rules and depending upon observable behavior.

      5. 2

        [1] There’s a lot that I hate about the NT kernel, but none of those things are due to use of C++. The use of C++ is typically a big win for readability in the bits I’ve read. The use of SEH to read userspace memory is an abomination.

        What is SEH?

        1. 3

          Structured Exception Handling. The Windows exception ABI runs on the stack, calling funclets (functions that run on the top of the stack with access to a lower frame) to do cleanup. This means that it doesn’t require heap allocation and so can run in the kernel.

          In the NT kernel, rather than having copy-in and copy-out helpers that handle page faults when accessing userspace, drivers just dereference user memory directly and catch an exception if one is thrown.
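
          A sketch of the pattern (illustrative only, not actual NT code; __try/__except is the MSVC structured-exception-handling extension, and the NTSTATUS names come from the Windows DDK):

            NTSTATUS read_user_long(const long *user_ptr, long *out) {
                __try {
                    // Dereference the userspace pointer directly; if the
                    // page isn't mapped, this takes an access violation...
                    *out = *user_ptr;
                    return STATUS_SUCCESS;
                }
                __except (EXCEPTION_EXECUTE_HANDLER) {
                    // ...which SEH delivers here rather than crashing.
                    return STATUS_ACCESS_VIOLATION;
                }
            }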

        2. 2

          Structured Exception Handling

          Basically, the NT kernel and Windows API calling conventions have explicit support for a form of exceptions and stack unwinding baked in.

          (Which makes sense, given that NT and Win32 are a product of the mid 90s. So much of how modern WinAPI works traces back to reasonable-but-not-prescient attempts to future-proof NT and Win32 using what was believed to be good design in the 1990s.)

          1. 3

            SEH is actually not bad - it provides a reasonable cross-language exception ABI. Itanium C++ ABI exceptions are a little wonky with the allocation step. I think the part David finds objectionable is a specific usage of it in the kernel to do something gross.

            1. 3

              Yup, I really like how SEH works and I keep pondering doing a version for ELF. Itanium didn’t use it because Borland had some patents covering it (MS licensed them) and enforced them. They were filed in the early ‘90s, and so all expired a decade or so ago. The way that they’re used in the NT kernel should be discussed around a campfire with people who have grown jaded to ghost stories.

              1. 2

                The use of SEH to read userspace memory is an abomination. The way that they’re used in the NT kernel should be discussed around a campfire with people who have grown jaded to ghost stories.

                Wait wait wait… you can’t start down that road and not continue. ;) We want more details or a blog post of its own!

            2. 1

              *nod* I was more commenting on how I’d be surprised if NT hadn’t provided OS support for exceptions, given how heavily they factored in the common wisdom of the period.

              Exceptions, COM being an object-oriented API, platform-level support for fibers, UTF-16 strings in platform APIs, etc. …they’re all quintessential “what 90s computer science believes the future will be”. A microkernel is the only “big thing of the 90s” thing that comes to mind which NT didn’t wind up having.

    2. 3

      I’ve been using it on a number of personal Linux machines for a while now, and have had zero problems… and it’s saved my skin several times.

      No issues at all with memory or humungous cache files (I’m looking at you, Arq!)

      My only nitpick is the ghastly UX of the web tool. Yes, it works, but… ick. I use it just often enough that I can figure it out, but then forget it within a couple of days…

      Actual homepage is https://duplicacy.com/

      1. 1

        I also do my backups with it and it has seemed to work fine. I haven’t yet dared to try to restore anything, so it’s good to hear it has helped you.

        When it comes to backup tools, Duplicacy seemed like the only free (for personal use) tool that didn’t have a history of surprise data corruptions. So it was an easy choice.

    3. 9

      To anyone reading this thread: after years and years of experience with attempting ASN.1 interop, STAY AWAY from ASN.1!

      Run, hide, lie, do whatever is necessary to avoid this design-by-committee monstrosity.

      Yes, it looks just fine on paper, but in actual implementation you will enter a never-ending hell of “why isn’t that working?” followed by the most persnickety language-lawyering from everyone involved.

      Read Peter Gutmann’s tirade about X.509 with the mindset of “How would I interoperate with this?” and you might begin to understand.

      1. 5

        Confirmed. It’s one of the things I know I can look up when I suspect I need it.

        1. 2

          The problem is not knowing that there is even something that needs to be looked up.

          Even basic things, from “no, not all base-10 numbers have exact (finite) base-2 representations”, to “yes, signed zero is a thing, for very good reasons”, to “please don’t forget that Inf, -Inf, and NaN are valid (and useful) values”.
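
          All three fit in a few lines of C++ (assuming IEEE-754 doubles, which is what every mainstream platform provides):

            #include <cmath>
            #include <cstdio>

            int main() {
                // 0.1 and 0.2 have no finite base-2 representation, so the
                // rounded sum is not the same double as the rounded 0.3.
                std::printf("%d\n", 0.1 + 0.2 == 0.3);   // prints 0

                // Signed zero records the direction you approached from.
                std::printf("%g\n", 1.0 / -0.0);          // prints -inf

                // NaN is a valid value with well-defined semantics.
                double n = std::sqrt(-1.0);               // NaN
                std::printf("%d\n", n == n);              // prints 0
            }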

      2. 4

        One of the design aims of IEEE floating-point was to reduce the likelihood of developers who did not know much about floating-point producing incorrect results.

        The survey asks a load of detailed questions that only a few people who have studied the subject in detail will be able to answer. Surprise! The results were little better than random.

        I’m sure I could come up with a survey to find out whether developers understood integer arithmetic, and get essentially random results.
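
        For instance (answers per the C++ rules, assuming the usual 32-bit int):

          #include <climits>
          #include <cstdio>

          int main() {
              unsigned u = 0;
              std::printf("%u\n", u - 1);   // 4294967295: unsigned wraps, by definition
              std::printf("%d\n", -7 % 2);  // -1: the sign follows the dividend
              short s = SHRT_MAX;
              std::printf("%d\n", s + s);   // 65534: both promoted to int, no wrap
              // Signed overflow (INT_MAX + 1) is undefined behaviour outright.
          }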

      3. 2

        Darn. You beat me to posting exactly the same thing.

    4. 4

      Is it better to be slowly correct all the time, or quickly incorrect some of the time?

      Really, I would suggest not giving advice like “turn on -ffast-math and friends” unless you really understand what those flags are doing to your code.

      There is an awful lot of numerical analysis that depends on the surprisingly complex behavior of floating-point complex arithmetic. That complex-multiplication function is written that way for a reason. Many reasons, actually.

      Perhaps that warrants a little more investigation beyond “LOL the standard library authors are just a bunch of pedants”…?
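
      To make one of those reasons concrete: multiply an infinity by one. The textbook formula produces NaNs, while a conforming library multiply (per C’s Annex G) recovers the infinity; -ffast-math licenses the compiler to use the naive formula instead. A small sketch:

        #include <complex>
        #include <cstdio>
        #include <limits>

        int main() {
            const double inf = std::numeric_limits<double>::infinity();
            std::complex<double> a(inf, inf);   // an infinity
            std::complex<double> b(1.0, 0.0);   // one

            // Textbook formula: (ar*br - ai*bi) + (ar*bi + ai*br)i.
            // Here inf*0 is NaN, so both parts come out NaN.
            std::complex<double> naive(
                a.real() * b.real() - a.imag() * b.imag(),
                a.real() * b.imag() + a.imag() * b.real());

            // The library multiply detects the all-NaN result and recovers
            // the infinity (without -ffast-math, on Annex-G-style
            // implementations such as GCC's and Clang's runtimes).
            std::complex<double> careful = a * b;

            std::printf("naive:   (%g, %g)\n", naive.real(), naive.imag());
            std::printf("careful: (%g, %g)\n", careful.real(), careful.imag());
        }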