1. 24
  1. 8

    My favourite lesser-known C feature:

    The header <iso646.h> defines the following eleven macros
     (on the left) that expand to the corresponding tokens (on the right): 
    
    and         && 
    and_eq      &= 
    bitand      & 
    bitor       | 
    compl       ~ 
    not         ! 
    not_eq      != 
    or          || 
    or_eq       |= 
    xor         ^ 
    xor_eq      ^= 
    

    So you can write conditions Python-style!

    #include <iso646.h>
    
    if (not (cond0 or cond1)) {
        // ...
    }
    

    Also, since these are naive text substitutions, the particularly perverse among you might enjoy:

    int i = 0;
    int *i_p = bitand i;
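
    A minimal sketch of the above as compilable functions, in case anyone wants to try it. The names neither and read_through_pointer are made up for illustration; only the <iso646.h> macros themselves come from the standard.

    ```c
    #include <iso646.h>
    #include <stdbool.h>

    /* Hypothetical helper: true when neither condition holds. */
    bool neither(bool cond0, bool cond1) {
        return not (cond0 or cond1);
    }

    /* bitand expands to the token &, so it also works as address-of. */
    int read_through_pointer(void) {
        int i = 42;
        int *i_p = bitand i;
        return *i_p;
    }
    ```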
    
    1. 6

      If anyone’s curious exactly why the Vogon delegation proposed that thing and how it wound up in the C standard (it arrived with Amendment 1 to C90, often called C95), the C committee fortunately kept records: https://www.lysator.liu.se/c/na1.html .

      Also, as someone who still writes Lisp pretty often because of Emacs, this:

      if (not (cond0 or cond1))
      

      makes me extremely uncomfortable.

      1. 1

        this cobolification makes very little sense to me (fwiw)

      2. 1

        And in C++, these work out of the box, no need to use a header.

      3. 4

        The %n format specifier is nice because you can use it to overwrite arbitrary memory if you control the format string: https://formatstringexploiter.readthedocs.io/en/latest/examples/hacker_level.html
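
        For anyone who hasn’t met %n: it stores the number of characters written so far into the int that the matching argument points at. A small sketch of the benign case, with a literal format string. The helper name prefix_length is hypothetical, and some hardened libcs restrict or reject %n, so treat this as an illustration rather than something to rely on.

        ```c
        #include <stdio.h>

        /* Hypothetical helper: reports how many characters precede %n. */
        int prefix_length(void) {
            char buf[32];
            int written = -1;
            /* %n consumes an int* and stores the count of characters
               produced so far; it prints nothing itself. */
            snprintf(buf, sizeof buf, "hello%n world", &written);
            return written;
        }
        ```

        The exploit in the link is the same mechanism, except the attacker chooses both the format string and (indirectly) the pointer.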

        1. 3

          The comma operator is most useful in macros, where you want something to evaluate to an expression. GNU C has a nice extension for this where you can put a block in brackets and it evaluates to the value of the last statement. So ({ a; b; }) evaluates to b after executing a. Constant string concatenation is also incredibly useful in macros, especially combined with the ability to stringify identifiers.
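
          A rough sketch of those three macro tricks side by side. All macro names here are invented for the example, and the statement-expression one only exists when the compiler defines __GNUC__ (GCC, Clang).

          ```c
          /* STR turns an identifier into a string literal; adjacent literals
             are then concatenated at compile time into a single constant. */
          #define STR(x) #x
          #define FIELD_LABEL(name) "field:" STR(name)

          /* Comma operator: bump the counter for its side effect, then yield
             value, so the whole macro is still usable as an expression. */
          #define COUNT_AND_YIELD(counter, value) ((counter)++, (value))

          #if defined(__GNUC__)
          /* GNU C statement expression: the value is that of the last
             statement. Each argument is evaluated exactly once. */
          #define MAX_ONCE(a, b) __extension__ ({ \
              __typeof__(a) a_ = (a); \
              __typeof__(b) b_ = (b); \
              a_ > b_ ? a_ : b_; })
          #endif
          ```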

          The caveat in the multi-character constant thing is worth paying attention to. I used to use it, until I hit a load of compatibility problems. It has very exciting interactions with both the source charset and the target endianness.

          Bitfields also need to come with a big impdef warning. Their layout is not portable. I’ve had to debug some painful things where people used them in wire protocols, file formats, and so on, and then found that things broke on different architectures / platforms. This is one of the big reasons to keep a Solaris / SPARC machine in CI: if your code works there as well as more mainstream platforms, then it’s probably fine.
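
          One common way around this is to skip bitfields for anything that leaves the process and do the shifts and masks by hand. A minimal sketch, assuming a made-up 2-byte header (4-bit version, 12-bit length, big-endian); the layout and function names are purely illustrative.

          ```c
          #include <stdint.h>

          /* Hypothetical wire header: top 4 bits are the version, low 12 bits
             the length, stored big-endian regardless of the host. */
          void pack_header(uint8_t out[2], unsigned version, unsigned length) {
              uint16_t v = (uint16_t)(((version & 0xFu) << 12) | (length & 0xFFFu));
              out[0] = (uint8_t)(v >> 8);
              out[1] = (uint8_t)(v & 0xFFu);
          }

          unsigned header_version(const uint8_t in[2]) {
              return (unsigned)in[0] >> 4;
          }

          unsigned header_length(const uint8_t in[2]) {
              return (((unsigned)in[0] & 0xFu) << 8) | in[1];
          }
          ```

          The bytes come out the same on any compiler and architecture, because nothing depends on how the implementation chooses to lay out a struct.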

          This qualifier tells the compiler that a variable may be accessed by other means than the current code (e.g. by code run in another thread or by an MMIO device),

          No, no, no. It does not tell you anything about other threads. If you use volatile for anything other than MMIO then you are doing it wrong. The only guarantee that you get from volatile is that the compiler may not elide loads or stores or reorder them with respect to other loads and stores of the same object. This is insufficient for most multithreaded use. The compiler is free to reorder accesses to two different volatile objects (so if you use it for locks, you’d better make sure you have explicit barriers) and it does not expose anything for atomic read-modify-writes on volatiles, so you cannot use it in most multi-threaded scenarios.
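
          For contrast, a sketch of what the C11 <stdatomic.h> tools look like for the two things volatile doesn’t give you: ordering between different objects and atomic read-modify-write. The variable and function names are invented for the example, and no threads are started here; it only shows the operations you would use.

          ```c
          #include <stdatomic.h>
          #include <stdbool.h>

          static int payload;        /* plain data, published via the flag */
          static atomic_bool ready;  /* zero-initialized, i.e. false */
          static atomic_int hits;    /* shared counter */

          /* Writer side: the release store orders the payload write before
             the flag becomes visible. */
          void publish(int value) {
              payload = value;
              atomic_store_explicit(&ready, true, memory_order_release);
          }

          /* Reader side: the acquire load pairs with the release store. */
          bool try_consume(int *out) {
              if (!atomic_load_explicit(&ready, memory_order_acquire))
                  return false;
              *out = payload;
              return true;
          }

          /* Atomic read-modify-write; returns the new count. */
          int count_hit(void) {
              return atomic_fetch_add(&hits, 1) + 1;
          }
          ```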

          Nowadays register is usually meaningless as modern compilers place variables in a register if appropriate regardless of whether the hint is given.

          Not quite. register guarantees that the value is never address-taken. This doesn’t really help the optimiser, but it generates an error if you accidentally take the address.

          The compile-time checking with enums is interesting. Pre-C11, the canonical way of doing this was declaring an array with a potentially negative size, like this:

          static int CheckSomething[(condition) ? 1 : -1];
          

          If condition is false, this declares a -1-element array, which is invalid and causes an error. Since C11, _Static_assert exists and there’s no point in using either of these tricks.
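
          A small sketch of both forms next to each other, assuming a made-up struct (packet_header) and made-up check names. The sizes asserted here hold on mainstream ABIs, but they are exactly the kind of thing you’d want the compiler to confirm.

          ```c
          #include <stdint.h>

          /* Hypothetical struct whose size we want pinned down at compile time. */
          struct packet_header {
              uint8_t  tag;
              uint8_t  flags;
              uint16_t length;
          };

          /* C11 and later: readable message, no dummy declaration needed. */
          _Static_assert(sizeof(struct packet_header) == 4,
                         "packet_header must be exactly 4 bytes");

          /* Pre-C11 fallback: the array size is -1 (a compile error) when the
             condition is false. A typedef avoids creating an unused object. */
          typedef char check_int_has_16_bits[(sizeof(int) >= 2) ? 1 : -1];
          ```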

          The ad-hoc struct definition is much more useful in C++, where it can be used in combination with structured bindings.

          1. 4

            No, no, no. It does not tell you anything about other threads. If you use volatile for anything other than MMIO then you are doing it wrong.

            I have had to use it when using a variable accessed both in a signal handler and in the main function.

            static volatile sig_atomic_t done = 0;
            

            The SIGINT handler sets done to 1, and main has a while (!done) loop. Without volatile, the compiler optimized the loop out.
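
            Roughly what that looked like, reconstructed as a sketch rather than the actual code; the function names are invented here.

            ```c
            #include <signal.h>

            static volatile sig_atomic_t done = 0;

            /* The handler only assigns to a volatile sig_atomic_t, which is the
               one thing the C standard explicitly allows it to do with statics. */
            static void on_sigint(int signo) {
                (void)signo;
                done = 1;
            }

            /* Installs the handler; returns 0 on success, -1 on failure. */
            int install_handler(void) {
                return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
            }

            /* What the main loop polls: while (!is_done()) { ... } */
            int is_done(void) {
                return done != 0;
            }
            ```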

            1. 2

              There are probably a few legitimate use cases outside MMIO but, erm, you know those cute short films where someone adopts an old police dog, and they read their shopping list to the dog, and the list says “eggs, bacon, milk, orange juice, cocaine”, and when it gets to “cocaine” the dog jumps up and makes a really funny face because he’s been conditioned to associate “cocaine” with “trouble”? That’s kind of how some of us are when we hear volatile and “threads” together :-D.

              Signals are one such case, I think. Another common example is flags set by IRQ handlers, which works pretty much the same way. IRQ handlers are usually not called explicitly, so depending on how the IRQ table is populated and how the handler is written, some compilers on some platforms helpfully conclude that the code that sets the flag is never reached. They will then promptly label any loop that waits for the flag as never-ending and yank any code that follows it as unreachable, or “optimize” reads of the flag to a constant if it’s in a series of conditions. Both might be made obsolete by better mechanisms at some point, but there are definitely compilers for which this is the way, AFAIK.

              However, there was a lot of multithreading hoodoo shit going around the Internet a long time ago which probably started with both sound knowledge and good intentions but somehow ended up being widely misunderstood to mean that volatile variables are thread-safe and you don’t need to keep them behind a mutex, or that you can use them as semaphores.

              This was misunderstood often enough that Intel at some point published a pretty famous note on it (see here: https://web.archive.org/web/20120229214202/http://software.intel.com/en-us/blogs/2007/11/30/volatile-almost-useless-for-multi-threaded-programming/ ).

              So… I guess there might be legitimate uses left for it besides MMIO. But not in a context related to multithreading. Nowhere near threads. If I see volatile and anything with pthread in it on the screen at the same time I start to sweat and I check that we’re well-stocked with coffee.

              1. 3

                That’s kind of how some of us are when we hear volatile and “threads” together :-D.

                I strongly feel that this reaction is appropriate.

                It makes it funny that Java has a volatile keyword that does roughly the same thing as C/C++’s atomic (and the specification for the latter was based on the former ofc).

                1. 3

                  cute short films where someone adopts an old police dog, and they read their shopping list to the dog, and the list says “eggs, bacon, milk, orange juice, cocaine”, and when it gets to “cocaine” the dog jumps up and makes a really funny face because he’s been conditioned to associate “cocaine” with “trouble”

                  For those of us who had never seen this: https://www.facebook.com/sixties.timemachine/videos/when-you-have-a-police-dog-as-a-pet-%EF%B8%8F/3283138251789397/

              2. 1

                with respect to other loads and stores of the same object

                That semantics sounds too permissive for MMIO? Say I’m writing data to a soundcard, I have volatile uint16_t *sample_buffer; and volatile uint32_t *sample_buffer_indices;. I write some music into sample_buffer and then write to sample_buffer_indices to tell the hardware which part of sample_buffer it should read from. If the indices writes are reordered to before the buffer writes then the sound card will end up playing uninitialised garbage instead of music.

                My understanding is that volatile memory accesses are treated as side effects (like how system calls are), and side effects occur in program order, within a given thread. (This absolutely does not imply that any particular ordering will be visible to other processes or threads.)

                Separate question, can you use volatile for variables that might be updated from a signal handler (or an interrupt handler)? I think I’ve seen it claimed both ways.

                1. 6

                  That semantics sounds too permissive for MMIO? Say I’m writing data to a soundcard, I have volatile uint16_t *sample_buffer; and volatile uint32_t *sample_buffer_indices;. I write some music into sample_buffer and then write to sample_buffer_indices to tell the hardware which part of sample_buffer it should read from. If the indices writes are reordered to before the buffer writes then the sound card will end up playing uninitialised garbage instead of music.

                  This is one of the many gotchas in C++ volatile. The correct thing to do is have a pointer to a volatile struct that points to the MMIO region for the device. Then field accesses are to the same object and cannot be reordered with respect to each other.
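
                  A sketch of that shape, using the sound card from the question. The register layout and names are entirely made up, and on real hardware you may still need whatever barriers or mapping attributes the platform requires; this only shows the C side.

                  ```c
                  #include <stdint.h>

                  /* Hypothetical register block for the device. */
                  struct sound_regs {
                      uint32_t sample_index;  /* tells the device which slot to play */
                      uint16_t samples[8];    /* sample slots */
                  };

                  /* Both stores go through the same volatile object, so the
                     compiler must emit them, and in this order. */
                  void queue_sample(volatile struct sound_regs *dev,
                                    uint32_t slot, uint16_t sample) {
                      dev->samples[slot % 8] = sample;  /* data first */
                      dev->sample_index = slot % 8;     /* then the index */
                  }
                  ```

                  In a driver, dev would come from the bus or a fixed address; here any struct sound_regs in ordinary memory is enough to exercise the function.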

                  In practice, the compiler often won’t reorder volatile accesses because to do so it must prove that they point to disjoint objects and alias analysis is usually not good enough to do this. This is especially true when they come from some bus enumeration or casts from integers (impdef), because they’re treated as having escaped from the provenance model.

                  My understanding is that volatile memory accesses are treated as side effects (like how system calls are)

                  System calls are not part of the C abstract machine. A system call is just a function call where the compiler cannot see the body. As such, they are assumed to potentially modify any memory that is reachable from their arguments and from globals. As a result, the compiler may not elide them because it cannot prove that they don’t have side effects. Volatile loads and stores are not allowed to modify other globals: accesses to non-volatile globals can be reordered with respect to volatile accesses.

                  Separate question, can you use volatile for variables that might be updated from a signal handler (or an interrupt handler)? I think I’ve seen it claimed both ways.

                  Not automatically, but (for specific volatile types) you may get this as the result of other properties.

                  1. 1

                    The C Standard calls out longjmp() and sig_atomic_t (used by signal handlers) where the use of volatile is warranted. And those are pretty much the only instances where I use volatile.

                    1. 1

                      I think that’s somewhat stale text in the standard. C11 introduced signal fences (atomic_signal_fence), which provide an explicit tool to prevent a compiler from moving loads and stores out of loops and to communicate that arbitrary state may be modified by a signal handler. The sig_atomic_t type is just one that can be loaded or stored with a single instruction: on some targets, some primitive integers may need cracking into pairs of loads and stores, and for this type it is guaranteed that you will never see tearing if a signal arrives in the middle of a C abstract machine store.
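
                      A sketch of how the C11 tool (atomic_signal_fence from <stdatomic.h>) can be used between a handler and the code it interrupts. The names are invented, and it assumes atomic_int is lock-free on the target, which is what makes it usable from a handler.

                      ```c
                      #include <signal.h>
                      #include <stdatomic.h>

                      static atomic_int message;  /* data prepared by the main flow */
                      static atomic_int ready;    /* set once message is valid */
                      static atomic_int seen;     /* what the handler observed */

                      static void on_signal(int signo) {
                          (void)signo;
                          if (atomic_load_explicit(&ready, memory_order_relaxed)) {
                              /* Keep the message load after the ready check. */
                              atomic_signal_fence(memory_order_acquire);
                              atomic_store_explicit(&seen,
                                  atomic_load_explicit(&message, memory_order_relaxed),
                                  memory_order_relaxed);
                          }
                      }

                      /* Publishes value, raises SIGINT, returns what the handler saw
                         (or -1 if the handler could not be installed). */
                      int deliver(int value) {
                          if (signal(SIGINT, on_signal) == SIG_ERR)
                              return -1;
                          atomic_store_explicit(&message, value, memory_order_relaxed);
                          /* Keep the message store before the ready store. */
                          atomic_signal_fence(memory_order_release);
                          atomic_store_explicit(&ready, 1, memory_order_relaxed);
                          raise(SIGINT);
                          return atomic_load_explicit(&seen, memory_order_relaxed);
                      }
                      ```

                      The fence only constrains the compiler; it emits no hardware barrier, which is fine because the handler runs on the same thread it interrupted.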

              3. 2

                “Static array indices in function parameter declarations” looks really useful, but the page it links to says “C only” so I’m assuming a C++ compiler won’t like it. This makes me sad because I’ve worked on C API wrappers for C++ code, where this feature would be very useful, but the headers need to be parseable in both languages.
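
                One workaround I’ve seen for shared headers is to hide the keyword behind a macro that vanishes under C++. A sketch, with the macro name AT_LEAST and the function sum4 made up for the example:

                ```c
                /* Hypothetical portability macro: C++ has no [static n] syntax, so
                   the hint disappears there and the parameter is a plain pointer. */
                #ifdef __cplusplus
                #define AT_LEAST(n)
                #else
                #define AT_LEAST(n) static n
                #endif

                #ifdef __cplusplus
                extern "C" {
                #endif

                /* Caller promises that values points at no fewer than 4 ints. */
                int sum4(const int values[AT_LEAST(4)]);

                #ifdef __cplusplus
                }
                #endif

                int sum4(const int values[AT_LEAST(4)]) {
                    return values[0] + values[1] + values[2] + values[3];
                }
                ```

                You lose the hint on the C++ side, but C callers still get whatever diagnostics their compiler offers for it.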