Threads for mcf

  1. 4

    The fact that you have to pass -no-pie has nothing to do with ARM64 or Linux. I’m guessing your gcc was configured with --enable-default-pie, so it will try to link as PIE by default. QBE doesn’t generate position-independent code, so it can’t be linked as a PIE. This is why you have to pass -no-pie, and why gcc was complaining about the relocation types. The same thing will happen if you compile code with gcc -fno-PIE and then link it with gcc -pie.

    Regarding the “bug” described, if you define a function with one set of parameters and call it with a different set of parameters, of course you’ll get incorrect code. QBE has no notion of “function declarations” and doesn’t do type checking. The function call syntax gives QBE all the information it needs to generate code for the function call, but you have to give it the correct information. It is up to the frontend to output correctly typed function calls.

    Ignoring the env feature entirely, this also doesn’t work:

    data $str = { b "%f", b 0 }
    
    function $print_f64(d %x) {
    @start
        call $printf(l $str, ..., d %x)
        ret
    }
    export function w $main() {
    @start
        call $print_f64(l 12)
        ret 0
    }
    

    This is because print_f64 is defined to take a d parameter, so the code generated for it expects it in a floating point register, but it is called with an l parameter, so the code generated for the call puts the argument in an integer register.

    1. 1

      The fact that you have to pass -no-pie has nothing to do with ARM64 or Linux…

      It has something to do with ARM64 or Linux, because on my AMD64 system with the same Linux distro I don’t have to do it. QBE appears to generate code that is position independent on AMD64:

      > ./qbe thing.ssa > thing.s
      > gcc thing.s
      > file ./a.out
      ./a.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, ...
      

      On the other hand, the default cc command-line args used in the unit test script all appear to include -no-pie by default. Soooooo I can only conclude that in this case it generates x86 asm code that is just coincidentally position-independent, and that a different/more complex case would not be.

      QBE has no notion of “function declarations” and doesn’t do type checking.

      …I could have sworn this was incorrect, and remember even seeing the code for arg type checking, but upon testing it appears that you are correct. I am quite confused, and annoyed. Thank you for the correction though. Maybe the code I was looking at was for checking the arg types of instructions rather than function calls?

    1. 6

      It’s pretty cool that Brian found it easy to add an additional stage to the pipeline in cproc. I think it’s better to implement this sort of optimization in QBE itself, but it’s a neat exercise nonetheless.

      At one point in time, the author of cproc sent me a diff for QBE that did make the optimization, but I guess it was never applied to the QBE repository.

      This is actually the last commit in my qbe branch compared to upstream (Quentin has done some great work recently merging various patches and finishing up riscv64 support).

      The patch is available here, and is based on one proposed by Érico Nogueira: https://github.com/oasislinux/oasis/blob/master/pkg/qbe/patch/0001-amd64-optimize-loading-0-into-registers.patch

      Motivated by this post, I just sent the updated patch to the mailing list.

      1. 1

        Couldn’t the last example be reduced to:

        char *one = "one";
        char *end;
        errno = 0; // remember errno?
        long i = strtol(one, &end, 10);
        if (errno) {
            perror("Error parsing integer from string"); // perror appends ": " itself
        } else if (*end) {
            fprintf(stderr, "Error: invalid input: %s\n", one);
        }
        

        For a similar expected behavior?

        The one == end is kind of redundant with *end != '\0'.

        Might not be perfect, but API-wise this is not as smelly as it is presented, in my opinion. Actually, with C’s limitations it kind of makes sense. Any parsing error due to a number that cannot be represented in a long is reported through errno, which is a common way to represent errors in the stdlib. Using the end pointer seems like a reasonable and very flexible way to let the caller deal with what should be considered an invalid string. In some cases, the caller might not care about trailing characters (and so could use end == buf instead of *end).

        1. 2

          The one == end is kind of redundant with *end != '\0'.

          Not necessarily. If the string is "", then we would have one == end, but not *end != '\0'.

          1. 1

            Good point, I hadn’t thought about this case!

          2. 1

            Not quite; you have to check the return value before you can rely on errno.

          1. 4

            The “42b” error checking isn’t even quite right. The article says

            This will return 0, will not trip errno (remember errno?), and the pointer has moved forward by two bytes.

            But actually, it will return 42, not 0, so the condition i == 0 && *end != '\0' will not trigger.

            Also, the i == 0 isn’t necessary for the end == one check, since if end == one, we know strtol returned 0.

            You could combine all into a single condition as

            if (errno || end == one || *end)
            	fprintf(stderr, "invalid input\n");
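
            Putting those pieces together, a minimal runnable sketch of the whole check (reusing the one/end names from the snippet above) might be:

            #include <errno.h>
            #include <stdio.h>
            #include <stdlib.h>

            int main(void)
            {
                const char *one = "42b";
                char *end;

                errno = 0;                       /* clear stale errors first */
                long i = strtol(one, &end, 10);
                if (errno || end == one || *end) /* range error, no digits, or trailing junk */
                    fprintf(stderr, "invalid input: %s\n", one);
                else
                    printf("parsed %ld\n", i);
                return 0;
            }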
            
            1. 21

              Coding in C is like camping. It’s fun for a while, but eventually you really miss things like flushing toilets and grocery stores. You can fancy-up your sleeping bag + tent with a huge motor home/caravan by using C++. It’s like you have this house on wheels, but if you just want to take a sleeping bag + tent with you on a trip - you can. Take whatever you need.

              That being said… you could, and probably should (unless you’re doing embedded/retro-coding?), compile most C code with a C++ compiler, since C++ does enforce additional type safety/const correctness. The simple code examples in this article do not compile with a C++ compiler. Of course, C++ compilers cannot compile all C code due to this type safety, so you’ll perhaps have to add casts to all your malloc/realloc/calloc calls or convert them to a modern C++ equivalent. Honestly, if your goal is to reduce the mistakes you make in a C code base and you’re not interested in rewriting it in Zig or Rust, consider using a C++ compiler to compile it.

              1. 16

                I’m sorry, but I have C code, right now, that won’t compile with C++. There are features that are in C99 that aren’t in C++ yet, like designated initializers, or the ability to inline a structure inside a function call (I can never remember what this is called), and NULL. You can pry NULL from my cold dead hands, because it stands out way more than a bare 0 does (who thought that was a good idea for C++?). I do, however, crank the warnings up as high as they go and fix the warnings.

                1. 11

                  You can pry NULL from my cold dead hands, because it stands out way more than a bare 0 does (who thought that was a good idea for C++?)

                  Null pointers are complicated. In C, they are a horrible compromise because not all targets used a zero bit pattern to represent null. Any integer constant expression that evaluates to zero, and any such expression cast to any pointer type, are defined to be null pointers by the C spec. The following are all null pointers (or, at least, may be depending on context):

                  0;
                  (void*)0;
                  1-1;
                  (char*)(42 - 12 - 30);
                  

                  The following is not a null pointer:

                  int x = 1;
                  int *notNull = (int*)(x-1);
                  

                  Though in most implementations it will happen to have the same bit pattern as a null pointer and so will compare equal to one. Relying on this will bite you on some mainframes where the null pointer bit pattern is not 0 and address 0 is a valid address.
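
                  A concrete way to see the distinction (a sketch, wrapped in an illustrative function; on mainstream platforms both pointers will happen to compare equal):

                  #include <string.h>

                  void example(void)
                  {
                      int *p = 0;               /* a genuine null pointer, whatever its bit pattern */
                      int *q;
                      memset(&q, 0, sizeof q);  /* all-bits-zero: not guaranteed to be null */
                      /* On the mainframes mentioned above, p == q may be false. */
                  }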

                  There is also a macro in C called NULL that must be defined as a valid null pointer constant. Any of the examples above is a valid definition of this macro. Most implementations define it as (void*)0 because of one very painful corner case in C. Consider the following C function declaration and call:

                  /// Append a null-terminated list of pointers to the container.
                  void appendValues(struct SomeContainer *c, ...);
                  
                  appendValues(&container, &a, &b, &c, &d, NULL);
                  

                  On most 64-bit (LP64) C ABIs, if NULL is defined as 0 (which, remember, is a valid definition) then this will fail. The compiler will pass five pointers and one 32-bit integer to the variadic function and the high or low (depending on the target endian) 32 bits of the final pointer will be undefined and are likely to not be zero (this gets more likely as you add more arguments - in a register the value may end up being sign-extended, on the stack it will not, and so you’ll read 4 bytes from outside of the argument frame).

                  To prevent this kind of breakage, the null pointer macro must be defined with a cast to a pointer type (and I’ve never seen an implementation where it wasn’t). In C++, this introduced some difficulty. C permits implicitly casting from void* to any pointer type (which is a big part of the reason that the author of the article refers to it as ‘The Type Safety Is For Losers Language’). This means that it is completely valid to write:

                  int *x = (void*)0;
                  

                  C++ does not have this escape hatch. If you’re doing dangerous things with pointers in C++ then you must do it explicitly. You can cast any pointer type to a void* (mostly for compatibility with C APIs) but the converse cast must be explicit, to highlight in the code that you’re doing a dangerous thing.

                  In C++98, this meant that a definition of NULL as (void*)0 required a cast on every use. This was annoying and so the recommendation was to use 0 and an explicit cast to a pointer type if necessary (for example, as in appendValues). This was not very satisfactory because in a language that at least sometimes pretends to be type safe, it’s a good idea if integers and pointers are not confused, even in the specific case of null.

                  C++11 introduced nullptr and std::nullptr_t to address this. In C++11, null is not just some constant integer value, it is a singleton value, nullptr (which may, as with C, have any bit pattern representation) and it is of the type nullptr_t, which may be cast to any pointer type. Because it is a separate type, it also participates in overload resolution, which is useful for some compile-time checks because it allows you to specialise methods or templated functions differently if one parameter is known at compile time to be null (including static_asserting that it shouldn’t be).

                  In most C++11 headers, the C NULL macro is defined to be nullptr and so NULL can be used in both languages.

                  1. 2

                    The following are all null pointers (or, at least, may be depending on context):

                    0;
                    (void*)0;
                    1-1;
                    (char*)(42 - 12 - 30);
                    

                    Some of these are not null pointers (0 and 1-1 are not, they don’t have pointer type), and some of these are not null pointer constants ((char *)(42 - 12 - 30) is not).

                    There is also a macro in C called NULL that must be defined as a valid null pointer constant. Any of the examples above is a valid definition of this macro.

                    Not quite. While your example (char*)(42 - 12 - 30) is a null pointer, it is not a null pointer constant, since it does not have integer type or type void *. If NULL had type char *, then you’d need an explicit cast every time you used an operator with NULL and a non-char pointer.

                    Your example about variadic functions is a good one, but I think the takeaway is not that you should assume that implementations use (void *)0 as NULL, but that you must use an explicit cast when calling such functions (such as (char *)0 when using execl).
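
                    For instance, with execl the list terminator must be cast explicitly (a sketch; the run_echo wrapper is just for illustration):

                    #include <unistd.h>

                    void run_echo(void)
                    {
                        /* execl is variadic: an uncast 0 would be passed as an int,
                         * so the terminator needs an explicit pointer cast. */
                        execl("/bin/echo", "echo", "hello", (char *)0);
                    }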

                    1. 2

                      Some of these are not null pointers (0 and 1-1 are not, they don’t have pointer type), and some of these are not null pointer constants ((char *)(42 - 12 - 30) is not).

                      C11 Section 6.3.2.3, paragraph 3 says:

                      An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant

                      So you’re right about the char* version (though it is a null pointer constant cast to a different type) but not about 0 and 1-1. You are correct that the char* one isn’t a valid definition of NULL though.

                      1. 2

                        The definition of a null pointer is a null pointer constant converted to a pointer type. 0 and 1-1 are null pointer constants, but they have type int, so they are not null pointers.

                        1. 2

                          I don’t see that definition anywhere in the C11 spec. Can you give me a section and paragraph reference?

                          1. 2

                            Sure, it’s the same paragraph you quoted earlier, C11 6.3.2.3p3:

                            If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

                  2. 3

                    Eh, what’s the problem with NULL in C++? I thought C++ standard library header cstddef defines NULL.

                    1. 1

                      C and C++ define NULL differently.

                      In C, I think NULL is (void *)0 but in C++ it’s just 0. This is fixed in C++11 and higher with nullptr.

                      1. 4

                        How does nullptr fix anything for people who are targeting the common subset of C and C++ so that one can use C++ compiler to compile C? As I understand nullptr is not in C, is it?

                        1. 7
                          #ifdef __cplusplus
                          # ifdef NULL
                          #  undef NULL
                          # endif
                          # define NULL nullptr
                          #endif
                          
                        2. 3

                          In C, NULL is an implementation-defined null pointer constant, which is any integer constant expression with value 0, possibly cast to void *. So it could be 0, ((void *)0), or even something crazy like (1 - sizeof(char))

                    2. 10

                      The suggestion to take your C codebase and compile it as C++ doesn’t make sense. You can’t compile C as C++. You will need to port it. This will take a lot of work and the final result will not have the advantages of a program designed and implemented in C++ from the start - because it was originally written with C programming patterns. I do not think this approach is worth it. If you wanted to integrate C with C++ you can do that with linking, extern C etc.

                      1. 4

                        I’ll add it’s the same with Rust. You can run a C codebase through c2rust, but then you’re merely replacing gcc with rustc. For real improvements you need to refactor the code to use safe patterns.

                        I’ve ported a few C libraries to Rust, and I found it’s hard. There’s a mismatch on a high level in how programs are architected. It’s normal for C programs to be pointer-heavy, rely on implicit sizes of buffers (a function gets a pointer to a buffer of unknown length, and just assumes it’s large enough), and vague runtime-dynamic memory ownership (you free some pointer only if some other flag is set, or just set it to null after free, so that it doesn’t crash when you try to free it for a second time).

                        1. 2

                          You will need to port it. This will take a lot of work and the final result will not have the advantages of a program designed and implemented in C++ from the start - because it was originally written with C programming patterns

                          I half agree, having worked on some truly terrible C-to-C++-converted codebases. The minimal port is usually just adding a load of explicit casts, particularly on malloc calls. Turning the malloc and free into new and delete avoids these casts but requires you to do it everywhere (since allocating with new and freeing with free is UB - it might work, but if you’re allocating with new you might be tempted to add a destructor and then the free calls will deallocate but not destroy the objects).

                          That said, if you write C that compiles with a C++ compiler, then you get C that has a lot more type checking at compile time. I’ve seen some folks do this (though not for over 10 years) for code that had to run on microcontrollers. They’d write it in the common subset of C and C++, compile as C++ on a PC for extra type checking, and then as C for deployment.

                          These days, when every platform that has a C compiler but not a C++ compiler is a tiny embedded system that also lacks most of the C standard library, I can’t see any good reason for not just writing C++, and if you have a C codebase then incrementally transitioning it to C++ is easier than doing the same with any other language. You can keep the same struct definitions and extern "C" definitions for C APIs and gradually refactor internal things into idiomatic C++ one file and then one component at a time. A lot of the things in modern C++, such as std::unique_ptr, are ways of expressing things in the type system that are implicitly part of the API contract in C, so you can move to using smart pointers internally and wrap them for the C interface, then remove the C interface when all callers are gone.

                        2. 7

                          Coding in C is like camping.

                          Coding in C is like tightrope walking.

                          It can be fun to get good at, and it can be very impressive to watch. But please don’t do it while you’re carrying my breakable stuff!

                        1. 3

                          The article doesn’t seem to explain why this should be avoided. It claims that it has “harmful side effects ranging from subtle breakage to miscompilation”, but has no examples to back it up. If it is indeed harmful, wouldn’t this also cause problems with software built with one set of defines, linked against libraries built with another?

                          Sometimes, if my project uses POSIX in only one or two files, I’ll prefer the #define approach rather than global -D flags. This way, it’s clear that use of POSIX functions is expected to be contained within those files and prevents them from getting used accidentally in the others.

                          Similarly, if a project is entirely POSIX compliant, but one file needs a single function guarded by _GNU_SOURCE (for instance, pipe2, which will be in POSIX issue 8 but is still guarded in glibc), I’ll define it only in that one file, since I don’t intend to use GNU APIs elsewhere.

                          The main problem with feature-test macros that I see in the wild is people thinking that they are supposed to check them to see whether those features are supported, when they are really supposed to define them to tell libc to expose those features. For example, it is fairly common to see code like

                          #if _POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700
                          

                          intending to check whether st_mtim is available in struct stat, but the project never defined those feature-test macros anywhere.

                          1. 3

                            I believe the problem from the pattern in the link is specific to things like this:

                            #include <stdlib.h>
                            #include <unistd.h>
                            #define _XOPEN_SOURCE
                            #include <string.h>
                            

                            The first two headers may include something that the third one also includes. When string.h is included, those shared headers hit their include guards and are not re-parsed, so anything they would have exposed differently under _XOPEN_SOURCE is not picked up.

                            This doesn’t matter if you do the define at the top of the file (-DFoo is equivalent to adding #define Foo at the start of the file). But if you need to change the flags depending on the target, it’s much better to have that centralised in the build system than scattered in a load of different files.
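
                            The safe per-file pattern is simply (a sketch):

                            #define _XOPEN_SOURCE 700  /* before the first #include, so every header sees it */

                            #include <stdlib.h>
                            #include <string.h>
                            #include <unistd.h>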

                            The article also doesn’t discuss the fact that glibc does this the opposite way around to BSD libcs (including Darwin libc). In GNU libc, everything except a core set is hidden by default and must be exposed by feature test macros. In BSD libcs, everything is exposed by default and the feature test macros are used for writing portable code by restricting you to a standard subset. This means that, in code, you typically want something like:

                            #ifdef __linux__
                            #  define _XOPEN_SOURCE
                            #endif
                            

                            Unfortunately, that’s really a glibc-specific thing, not a Linux-specific thing, and compilers don’t define a pre-defined macro for glibc, only for operating systems.

                            1. 2

                              Yes, defining in between includes is clearly wrong, but the article calls out any sort of #define _XOPEN_SOURCE as bad practice. As you said, -D is equivalent to adding a #define at the start of the file, which is why I am a bit skeptical.

                              1. 1

                                My rule of thumb is that it’s okay to add one or two of these in a codebase, but if you have more than a couple of files containing them then it belongs in the build system. “Don’t put them in files” is probably a better rule of thumb than “always put them in files”, but the article misses a lot of nuance.

                          1. 5

                            I’m pretty sure the first one is wrong. I thought the spec said explicitly that evaluating sizeof may not have side effects. This construct seems to be accepted by GCC and Clang but I think it’s actually UB. In C++ decltype may not have side effects and I believe this follows from the same rule.

                            VLA typedefs are useful for the same reason that VLAs are useful. If you are using a VLA, it’s useful to be able to name the type.

                            Array designators were a GNU extension but were added to C in C11, I think. They’re really useful and it’s a shame that they’re not in C++.

                            The preprocessor is a functional language and you can pass a macro to another macro, but evaluation may not happen when you think it does. The most common case where this bites people is in stringify macros.
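
                            The classic illustration (a sketch):

                            #define STR(x)  #x
                            #define XSTR(x) STR(x)      /* extra level lets the argument expand first */
                            #define VERSION 42

                            /* STR(VERSION)  -> "VERSION"  (# suppresses expansion of the argument)
                             * XSTR(VERSION) -> "42" */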

                            The switch thing comes with the same caveat as normal switch: It’s basically goto and you must be really careful not to branch over variable initialisation. I can’t remember if C rejects constructs that do or just makes them UB. It also doesn’t generate better code with a modern compiler than writing it the sane way - it’s just basic blocks and branches at the IR layer.
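
                            For example (a sketch; C accepts this, but the skipped initialisation leaves x holding an indeterminate value, while jumping into the scope of a VLA is an outright constraint violation):

                            #include <stdio.h>

                            void f(int n)
                            {
                                switch (n) {
                                case 0:;
                                    int x = 42;          /* skipped entirely when n == 1 */
                                    /* fall through */
                                case 1:
                                    printf("%d\n", x);   /* x is in scope, but indeterminate if we jumped here */
                                }
                            }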

                            The a[b] thing is actually even more fun. Addition in C is commutative, so a+b is equivalent to b+a in all contexts. Pointer + integer is defined as address + (size of pointee) * integer, and because addition is commutative it doesn’t matter whether you put the pointer on the left or the right side. The same rule is why integer promotion works in C. If I were trying to design a language feature that looked sensible at first glance but would lead to many bugs, this is probably about as good as anything I could come up with.
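
                            Concretely (a sketch):

                            #include <assert.h>

                            int main(void)
                            {
                                char s[] = "hello";
                                assert(&s[1] == s + 1);   /* a[b] is defined as *(a + b) */
                                assert(s + 1 == 1 + s);   /* addition commutes */
                                assert(&1[s] == &s[1]);   /* so b[a] names the same object as a[b] */
                                return 0;
                            }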

                            1. 4

                              I thought the spec said explicitly that evaluating sizeof may not have side effects.

                              Here’s what it says in section 6.5.3.4 paragraph 2 (about the sizeof and _Alignof operators) in my copy of the C11 spec:

                              If the type of the operand is a variable length array type, the operand is evaluated;

                              Then in section 6.7.6.2 paragraph 5 (Array declarators):

                              If the size is an expression that is not an integer constant expression: if it occurs in a declaration at function prototype scope, it is treated as if it were replaced by *; otherwise, each time it is evaluated it shall have a value greater than zero.

                              I can’t find any mention of side-effects in array declarator expressions.
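
                              So, reading those two paragraphs together, something like this appears to be legal, and GCC and Clang do evaluate the side effect (a sketch):

                              #include <stdio.h>

                              int main(void)
                              {
                                  int i = 1;
                                  size_t s = sizeof(int [i++]);  /* VLA type: the size expression is evaluated */
                                  printf("%zu %d\n", s, i);      /* typically prints "4 2" for a 4-byte int */
                                  return 0;
                              }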

                              1. 2

                                Array designators were a GNU extension but were added to C in C11, I think. They’re really useful and it’s a shame that they’re not in C++.

                                They were introduced in C99 at the same time as member designators (https://port70.net/~nsz/c/c99/n1256.html#6.7.8p6). The obsolete GNU syntax for designators is x: 123 for members and [10] "foo" for indices.

                                I am surprised that they aren’t supported in C++, I wonder why that is. Perhaps they conflict with some other C++ syntax?

                                1. 1

                                  As of C++11, initialiser lists are an object type and can be forwarded and so on. The array indexing extension doesn’t play very nicely with this because it’s special constructor syntax of a single kind of object. It also doesn’t play so nicely with type deduction.

                                2. 1

                                  I’m pretty sure the first one is wrong.

                                  I think so too, but mainly as a bit of semantic hair-splitting. Technically, it’s the type that is causing the side effect, not sizeof.

                                  1. 1

                                    Damn, I was about to ask whether int b[static 12][*][*] is valid C++.

                                    I realize you can use std::array to express this in C++, but a lot of my work involves public C APIs around C++ implementations, so making those APIs safer and more expressive would be great.

                                  1. 10

                                    Are they finally going to fix the abomination that is C11 atomics? As far as I can tell, WG14 copied atomics from WG21 without understanding them and ended up with a mess that causes problems for both C and C++.

                                    In C++11 atomics, std::atomic<T> is a new, distinct type. An implementation is required to provide a hardware-enforced (or, in the worst case, OS-enforced) atomic boolean. If the hardware supports a richer set of atomics, then it can be used directly, but a std::atomic<T> implementation can always fall back to using std::atomic_flag to implement a spinlock that guards access to larger types. This means that std::atomic<T> can be defined for all types and be reasonably efficient (if you have a futex-like primitive then, in the uncontended case, it’s almost as fast as T and, in the contended state, it doesn’t consume much CPU time or power spinning).

                                    Then WG14 came along and wanted to define _Atomic(T) to be compatible with std::atomic<T>. That would require the C compiler and C++ standard library to agree on data layout and locking policy for things larger than the hardware-supported atomic size, but it’s still feasible. Then they completely screwed up by making all of the arguments to the functions declared in stdatomic.h take a volatile T* instead of an _Atomic(T)*. For historical reasons, the representations of volatile T and T have to be the same, which means that _Atomic(T) and T must have the same representation and there is nowhere that you can stash a lock. The desire to make _Atomic(T) and std::atomic<T> interchangeable means that C++ implementers are stuck with this.

                                    Large atomics are now implemented by calls to a library, but there is no way to implement this in a way that is both fast and correct, so everyone picks fast. The atomics library provides a pool of locks and acquires one keyed on the address. That’s fine, except that most modern operating systems allow virtual addresses to be aliased, and so there are situations (particularly in multi-process situations, but also when you have a GC or similar doing exciting virtual memory tricks) where simple operations on _Atomic(T) are not atomic. Fixing that would require asking the OS if a particular page is aliased before performing an operation (and preventing it from becoming aliased during the operation), at which point you may as well just move atomic operations into the kernel anyway, because you’re paying a system call for each one.
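
                                    Something like this address-keyed pool (a sketch, not any library’s actual code) is the usual shape of the fallback:

                                    #include <stdint.h>
                                    #include <threads.h>

                                    #define NLOCKS 64
                                    static mtx_t locks[NLOCKS];

                                    /* Pick a lock based on the object's virtual address. Two virtual
                                     * mappings of the same physical object have different addresses,
                                     * may pick different locks, and so are not atomic with respect
                                     * to each other. */
                                    static mtx_t *lock_for(const volatile void *obj)
                                    {
                                        return &locks[((uintptr_t)obj >> 4) % NLOCKS];
                                    }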

                                    C++20 has worked around this by defining std::atomic_ref, which provides the option of storing the lock out-of-line with the object, at the expense of punting the determination of the sharing set for an object to the programmer.

                                    Oh, and let’s not forget the mtx_timedlock fiasco. Ignoring decades of experience in API design, WG14 decided to make the timeout for a mutex the wall-clock time, not the monotonic clock. As a result, it is impossible to write correct code using C11’s mutexes because the wall-clock time may move arbitrarily. You can wait on a mutex with a 1ms timeout and discover that, because the clock was wrong and was reset in the middle of your ‘get time, add 1ms, timedwait’ sequence, you’re now waiting a year (more likely, you’re waiting multiple seconds and now the tail latency of your distributed system has weird spikes). The C++ version of this API gets it right and allows you to specify the clock to use; pthread_mutex_timedlock got it wrong and ended up with platform-specific work-arounds. Even pthreads got it right for condition variables; C11 predictably got it wrong.
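
                                    To make the race concrete, the only sequence C11 offers looks like this (a sketch; the lock_with_timeout wrapper is just for illustration):

                                    #include <threads.h>
                                    #include <time.h>

                                    void lock_with_timeout(mtx_t *m)
                                    {
                                        struct timespec ts;
                                        timespec_get(&ts, TIME_UTC);  /* wall clock: the only base C11 provides */
                                        ts.tv_nsec += 1000000;        /* + 1ms (nsec overflow handling omitted) */
                                        /* If the wall clock is stepped between timespec_get and here,
                                         * the wait can last arbitrarily long instead of ~1ms. */
                                        mtx_timedlock(m, &ts);
                                    }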

                                    C is completely inappropriate as a systems programming language for modern hardware. All of these tweaks are nice cleanups but they’re missing the fundamental issues.

                                    1. 3

                                      Then they completely screwed up by making all of the arguments to the functions declared in stdatomic.h take a volatile T* instead of an _Atomic(T)*. For historical reasons, the representation of volatile T and T have to be the same, which means that _Atomic(T) and T must have the same representation and there is nowhere that you can stash a lock.

                                      I’m not too familiar with atomics and their implementation details, but my reading of the standard is that the functions in stdatomic.h take a volatile _Atomic(T) * (i.e. a pointer to volatile-qualified atomic type).

                                      They are described with the syntax volatile A *object, and earlier on in the stdatomic.h introduction it says “In the following synopses: An A refers to one of the atomic types”.

                                      Maybe I’m missing something?

                                      1. 2

                                        Huh, it looks as if you’re right. That’s how I read the standard in 2011 when I added the atomics builtins to clang, but I reread it later and thought that I’d initially misunderstood. It looks as if I get to blame GCC for the current mess then (their atomic builtins don’t require _Atomic-qualified types and their stdatomic.h doesn’t check it).

                                        Sorry WG14, you didn’t get atomics wrong, you just got mutexes and condition variables wrong.

                                        That said, I’ve no idea why they felt the need to make the arguments to these functions volatile and _Atomic. I am not sure what a volatile _Atomic(T)* actually means. Presumably the compiler is not allowed to elide the load or store even if it can prove that no other thread can see it?

                                        1. 1

                                          I’ve no idea why they felt the need to make the arguments to these functions volatile and _Atomic

                                          I’ve no idea; but a guess: they want to preserve the volatility of arguments to atomic_*. That is, it should be possible to perform operations on variables of volatile type without losing the ‘volatile’. I will note that the C++ atomics contain one overload with volatile and one without. But if that’s the case, why the committee felt they could get away with being polymorphic wrt type, but not with being polymorphic wrt volatility, is beyond me.

                                          There is this stackoverflow answer from a committee member, but I did not find it at all illuminating.

                                          not allowed to elide the load or store even if it can prove that no other thread can see it?

                                          That would be silly; a big part of the impetus for atomics was to allow the compiler to optimize in ways that it couldn’t using just volatile + intrinsics. Dead loads should definitely be discarded, even if atomic!


                                          One thing that is clear from this exchange: there is a massive rift between specifiers, implementors, and users. Thankfully the current spec editor (JeanHeyd Meneide, also the author of the linked post) seems to be aware of this and to be acting to improve the situation; so we will see what (if anything) changes.

                                          1. 3

                                            One thing that is clear from this exchange: there is a massive rift between specifiers, implementors, and users. Thankfully the current spec editor (JeanHeyd Meneide, also the author of the linked post) seems to be aware of this and to be acting to improve the situation; so we will see what (if anything) changes.

                                            It’s not really clear to me how many implementers are left that care:

                                            • MSVC is a C++ compiler that has a C mode. The authors write in C++ and care a lot about C++.
                                            • Clang is a C++ compiler that has C and Objective-C[++] modes. The authors write in C++ and care a lot about C++.
                                            • GCC includes C and C++ compilers with separate front ends, it’s primarily C so historically the authors have cared a lot about C, but for new code it’s moving to C++ and so the authors increasingly care about C++.

                                            That leaves things like PCC, TCC, and so on, and a few surviving 16-bit microcontroller toolchains, as the only C implementations that are not C++ with C as an afterthought.

                                            I honestly have no idea why someone would choose to write C rather than C++ these days. You end up writing more code, you have a higher cognitive load just to get things like ownership right (even if you use nothing from C++ other than smart pointers, your life is significantly better than that of a C programmer), you don’t get generic data structures, and you don’t even get more efficient code because the compilers are all written in C++ and so care about C++ optimisation because it directly affects the compiler writers.

                                            C++ is not seeing its market eroded by C but by things like Rust and Zig (and, increasingly, Python and JavaScript, since computers are fast now). C fits in a niche that doesn’t really exist anymore.

                                            1. 2

                                              I honestly have no idea why someone would choose to write C rather than C++ these days.

                                              For applications, perhaps, but for libraries and support code, ABI stability and ease of integration with the outside world are big ones. It’s also a much less volatile language in ways that start to really matter if you are deploying code across a wide range of systems, especially if old and/or embedded ones are included.

                                              Avoiding C++ (and especially bleeding edge revisions of it) avoids a lot of real life problems, risks, and hassles. You lose out on a lot of power, of course, but for some projects the kind of power that C++ offers isn’t terribly important, but the ability to easily run on systems 20 years old or 20 years into the future might be. There’s definitely a sort of irony in C being the real “write once, run anywhere” victor, but… in many ways it is.

                                              C fits in a niche that doesn’t really exist anymore.

                                              It might not exist in the realm of trendy programming language debates on the Internet, but we’re having this conversation on systems largely implemented in it (UNIX won after all), so I think it’s safe to say that it very much exists, and will continue to for a long time. That niche is just mostly occupied by people who don’t tend to participate in programming language debates. One of the niche’s best features is being largely insulated from all of that noise, after all.

                                              It’s a very conservative niche in a way, but sometimes that’s appropriate. Hell, in the absolute worst case scenario, you could write your own compiler if you really needed to. That’s of course nuts, but it is possible, which is reassuring compared to languages like C++ and Rust where it isn’t. More realistically, diversity of implementation is just a good indicator of the “security” of a language “investment”. Those implementations you mention might be nichey, but they exist, and you could pretty easily use them (or adapt them) if you wanted to. This is a good thing. Frankly I don’t imagine any new language will ever manage to actually replace C unless it pulls the same thing off. Simplicity matters in the end, just in very indirect ways…

                                              1. 4

                                                For applications, perhaps, but for libraries and support code, ABI stability and ease of integration with the outside world are big ones. It’s also a much less volatile language in ways that start to really matter if you are deploying code across a wide range of systems, especially if old and/or embedded ones are included.

                                                I’d definitely have agreed with you 10 years ago, but the C++ ABI has been stable and backwards compatible on all *NIX systems, and fairly stable on Windows, for over 15 years. C++ provides you with some tools that allow you to make unstable ABIs for your libraries, but it also provides tools for avoiding these problems. The same problems exist in C: you can’t add a field to a C structure without breaking the ABI, just as you can’t add a field to a C++ class without breaking the ABI.

                                                I should point out that most of the things that I work on these days are low-level libraries and C++17 is the default tool for all of these.

                                                You lose out on a lot of power, of course, but for some projects the kind of power that C++ offers isn’t terribly important, but the ability to easily run on systems 20 years old or 20 years into the future might be.

                                                Neither C nor C++ guarantees this; in my experience, old C code needs just as much updating as C++ code, and it’s often harder to do because C code does not encourage clean abstractions. This is particularly true when talking about running on new platforms. From my personal experience, we and another group have recently written memory allocators. Ours is written in C++, theirs in C. This is what our platform and architecture abstractions look like. They’re clean, small, and self-contained. Theirs? Not so much. We’ve ported ours to CHERI, where the hardware enforces strict provenance and bounds on pointers, with quite a small set of changes, made possible (and maintainable when most of our targets don’t have CHERI support) by the fact that C++ lets us define pointer wrapper types that describe high-level semantics of the associated pointer and a state machine for which transitions are permitted. Porting theirs would require invasive changes.

                                                It might not exist in the realm of trendy programming language debates on the Internet, but we’re having this conversation on systems largely implemented in it (UNIX won after all), so I think it’s safe to say that it very much exists, and will continue to for a long time.

                                                I’m writing this on a Windows system, where much of the kernel and most of the userland is C++. I also post from my Mac, where the kernel is a mix of C and C++, with more C++ being added over time, and the userland is C for the old bits, C++ for the low-level new bits, and Objective-C / Swift for the high-level new bits. The only places either of these systems chose C were parts that were written before C++11 was standardised.

                                                Hell, in the absolute worst case scenario, you could write your own compiler if you really needed to.

                                                This is true for ISO C. In my experience (based in part on building a new architecture designed to run C code in a memory-safe environment and working on defining a formal model of the de-facto C standard), there is almost no C code that is actually ISO C. The language is so limited that anything nontrivial ends up using vendor extensions. ‘Portable’ C code uses a load of #ifdefs so that it can use two or more different vendor extensions. There’s a lot of GNU C in the world, for example.

                                                Reimplementing GNU C is definitely possible (clang, ICC, and XLC all did it, with varying levels of success) but it’s hard, to the extent that of these three none actually achieve 100% compatibility to the degree that they can compile, for example, all of the C code in the FreeBSD ports tree out of the box. They actually have better compatibility with C++ codebases, especially post-C++11 codebases (most of the C++ codebases that don’t work are ones that are doing things so far outside the standard that they have things like ‘works with G++ 4.3 but not 4.2 or 4.4’ in their build instructions).

                                                More realistically, diversity of implementation is just a good indicator of the “security” of a language “investment”. Those implementations you mention might be nichey, but they exist, and you could pretty easily use them (or adapt them) if you wanted to.

                                                There are a few niche C compilers (e.g. PCC / TCC), but almost all of the mainstream C compilers (MSVC, GCC, Clang, XLC, ICC) are C++ compilers that also have a C mode. Most of them are either written in C++ or are being gradually rewritten in C++. Most of the effort in ‘C’ compiler is focused on improving C++ support and performance.

                                                By 2018, C++17 was pretty much universally supported by C++ compilers. We waited until 2019 to move to C++17 for a few stragglers; we’re now pretty confident about being able to move to C++20. The days when a new standard took 5+ years to support are long gone for C++. Even a decade ago, C++11 got full support across the board before C11.

                                                If you want to guarantee good long-term support, look at what the people who maintain your compiler are investing in. For C compilers, the folks that maintain them are investing heavily in C++ and in C as an afterthought.

                                                1. 3

                                                  I’d definitely have agreed with you 10 years ago, but the C++ ABI has been stable and backwards compatible on all *NIX systems, and fairly stable on Windows, for over 15 years. C++ provides you with some tools that allow you to make unstable ABIs for your libraries, but it also provides tools for avoiding these problems. The same problems exist in C: you can’t add a field to a C structure without breaking the ABI, just as you can’t add a field to a C++ class without breaking the ABI.

                                                  The C++ ABI is stable now, but the problem is binding it from other languages (i.e. try binding a mangled symbol), because C is the lowest common denominator on Unix. Of course, with C++, you can just define a C-level ABI and just use C++ for everything.

                                                  edit

                                                  Reimplementing GNU C is definitely possible (clang, ICC, and XLC all did it, with varying levels of success) but it’s hard, to the extent that of these three none actually achieve 100% compatibility to the degree that they can compile, for example, all of the C code in the FreeBSD ports tree out of the box. They actually have better compatibility with C++ codebases, especially post-C++11 codebases (most of the C++ codebases that don’t work are ones that are doing things so far outside the standard that they have things like ‘works with G++ 4.3 but not 4.2 or 4.4’ in their build instructions).

                                                    It’s funny that no one ever complains about GNU’s extensions to C being so prevalent that they make implementing other C compilers hard, yet people lose their minds over, say, a Microsoft extension.

                                                  1. 2

                                                    The C++ ABI is stable now, but the problem is binding it from other languages (i.e. try binding a mangled symbol), because C is the lowest common denominator on Unix. Of course, with C++, you can just define a C-level ABI and just use C++ for everything.

                                                    That depends a lot on what you’re binding. If you’re using SWIG or similar, then having a C++ API can be better because it can wrap C++ types and get things like memory management for free if you’ve used smart pointers at the boundaries. The binding generator doesn’t care about name mangling because it’s just producing a C++ file.

                                                    If you’re binding to Lua, then you can use Sol2 and directly surface C++ types into Lua without any external support. With something like Sol2 in C++, you write C++ classes and then just expose them directly from within C++ code, using compile-time reflection. There are similar things for other languages.

                                                    If you’re trying to import C code into a vaguely object-oriented scripting language then you need to implement an object model in C and then write code that translates from your ad-hoc language into the scripting language’s one. You have to explicitly write all memory-management things in the bindings, because they’re API contracts in C but part of the type system in C++.

                                                    From my personal experience, binding modern C++ to a high-level language is fairly easy (though not quite free) if you have a well-designed API, binding Objective-C (which has rich run-time reflection) is trivial to the extent that you can write completely generic bridges, and binding C is possible but requires writing bridge code that is specific to the API for anything non-trivial.

                                                    1. 1

                                                      Right; I suspect it’s actually better with a binding generator or environments where you have to write native binding code (e.g. JNI/PHP). It’s just annoying for the ad-hoc cases (e.g. .NET P/Invoke).

                                                      1. 2

                                                        On the other hand, if you’re targeting .NET on Windows then you can expose COM objects directly to .NET code without any bridging code and you can generate COM objects directly from C++ classes with a little bit of template goo.

                                      2. 2

                                        Looks like Hans Boehm is working on it, as mentioned at the bottom of the article. They are apparently “bringing it back up to parity with C++”, which should fix the problems you mentioned.

                                        1. 4

                                          That link is just Hans adding a <cstdatomic> to C++ that adds a #define _Atomic(T) std::atomic<T>. This ‘fixes’ the problem by letting you build C code as C++; it doesn’t fix the fact that C is fundamentally broken and can’t be fixed without breaking backwards source and binary compatibility.

                                      1. 7

                                        The quote from the C standard is not quite the right one. %d is a perfectly valid conversion specification. The problem arises with the sentence afterward:

                                        If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

                                        Also note that the cast syntax in this example is only valid in C++. If you try to compile that code as C, you’ll get an error. It’s a bit strange to me that the author used some ambiguous looking C++-specific syntax to argue against writing C.

                                        That said, I agree that variadic functions can be very error-prone. You miss out on type-checking for the variadic arguments, so they should be only used sparingly and with extra caution.

                                        1. 11

                                          the code, as presented, won’t compile under any actual C compiler. The line:

                                          printf("%d\n",double(42));
                                          

                                            is not valid C syntax. The fact that he used #include <cstdio> tells me he used a C++ compiler, which may very well allow such dubious syntax (I don’t know, I don’t program in C++). This blog post to me reads as an anti-C screed by someone who doesn’t program in C.

                                          1. 3

                                              double(1) is a call to the double primitive type’s intrinsic constructor. It’s not really a type cast, though that’s effectively what happens (at least as I understand it).

                                            1. 1

                                              It’s completely legit syntax in C++.

                                              The principle behind it is that user-defined types should generally be able to do everything primitive types can do, and to some extent vice-versa. This is why C++ supports operator overloading, for example.

                                              In this case allowing primitive types to use constructor-like syntax means you can write something like T(42) in a template and it will compile and do the expected thing whether T is a user-defined class or the primitive double type.

                                              1. 3

                                                Nobody said it was legitimate syntax in C++. They said it was illegitimate syntax in C.

                                          1. 1

                                            I think using plain char for something that is not text was the first mistake. If you use unsigned char you can also cast the value to the size you need and shift.

                                            static uint32_t read32be(const unsigned char *p)
                                            {
                                            	return ((uint32_t) p[0] << 24)
                                            	     | ((uint32_t) p[1] << 16)
                                            	     | ((uint32_t) p[2] << 8)
                                            	     | ((uint32_t) p[3]);
                                            }
                                            

                                            Interestingly (unlike Clang and GCC) MSVC does not appear to be able to recognize these patterns and generate the bswap.
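
                                              If it matters, one could do the swap explicitly with MSVC’s intrinsic (a sketch; read32be_msvc is a hypothetical name, and the memcpy assumes a little-endian target, which all current MSVC targets are):

                                              #ifdef _MSC_VER
                                              #include <stdint.h>
                                              #include <stdlib.h>
                                              #include <string.h>

                                              static uint32_t read32be_msvc(const unsigned char *p)
                                              {
                                                  uint32_t v;
                                                  memcpy(&v, p, sizeof v);    /* native little-endian load */
                                                  return _byteswap_ulong(v);  /* compiles to a single bswap */
                                              }
                                              #endif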

                                            1. 2

                                               If you want bytes, use uint8_t, not unsigned char. See, the width of char is not fully specified in C. Some actual architectures in current use (DSP) do not support byte-level addressing, and on those machines the width of char can actually be 32 bits. (Of course, on those machines uint8_t would not even compile, but that’s kind of the point: if you can’t have bytes, you need to rethink your serialization code.)

                                              1. 1

                                                While I agree in theory, I believe the standard does not guarantee that uint8_t is a character type, which means you could get in trouble with strict aliasing if a compiler vendor goes crazy. For storing bytes uint8_t is great, but for accessing bytes (like in the function above), unsigned char is safer. You can always check if CHAR_BIT is 8.
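
                                                 E.g. (a sketch):

                                                 #include <limits.h>

                                                 #if CHAR_BIT != 8
                                                 #error "this code assumes 8-bit bytes"
                                                 #endif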

                                                1. 3

                                                  I believe the standard does not guarantee that uint8_t is a character type,

                                                  It indeed does not guarantee that, and in practice sanitisers do warn me about careless casting from uint8_t.

                                                  which means you could get in trouble with strict aliasing if a compiler vendor goes crazy.

                                                  It can indeed be a problem if we do something like this:

                                                  void transform(uint32_t *out, const uint8_t *in); // strict aliasing
                                                  
                                                  uint8_t data[32];
                                                  read(file, data, 32);
                                                  transform((uint32_t *)data, data); // strict aliasing violation!!
                                                  

                                                  To get to that however, we’d have to be a little dirty. And to be honest, as much as I hate having way too much undefined behaviour in C, I do like the performance improvements that come with strict aliasing. Besides, while we could turn off strict aliasing by using unsigned char here, there’s no way we could turn it off in a case like this:

                                                  void transform2(struct foo *out, const uint32_t *in);
                                                  

                                                  Now some C user might indeed be surprised by the fact that strict aliasing applies to uint8_t, even though it invariably has the same representation as unsigned char (at least on 2’s complement machines, which comprise every single machine in active use). That is indeed unfortunate. An API designer however may still set those expectations right:

void transform(uint32_t *out, const uint8_t * restrict in);
                                                  
                                                  1. 1

Where is that written? “A typedef declaration does not introduce a new type” and is “for syntactic convenience only”, quoth ANSI X3.159-1988. The uint8_t type isn’t the uint_least8_t type, so if it’s available then it must be char, unless your environment defines char as fewer than 8 bits and defines either short int or long to be 8 bits, which is about as likely as your code being compiled on a Setun.

                                                    1. 2

                                                      You’d have to guarantee that uint8_t comes from a typedef in the first place, and the standard provides no such guarantee. Yes, in practice this will be a typedef, but that typedef is defined in a standard header, so I’m not sure that actually counts. As far as I know, compilers are allowed to special-case this type and pretend it does not come from a typedef, so they can enable strict aliasing.

                                                      1. 1

                                                        Where is that written?

                                                        1. 1

                                                          It’s not, that I know of. And with the C standard, if it’s not written, it’s not guaranteed.

                                                          1. 1

                                                            How would you know? You’re only speaking for yourself.

                                                            1. 1

You go find that place in the standard that says uint8_t is a character type. I’m not going to copy & paste 700 pages to show you it’s not there. You wouldn’t read them even if I did. You on the other hand could easily disprove my claim with a couple short citations. Please take the effort to do so.

Ninja Edit: what do you know, it looks like we can disprove my claim after all. From the C11 standard, §7.20p4:

For each type described herein that the implementation provides, <stdint.h> shall declare that typedef name and define the associated macros. […]

That seems to mean that uint8_t has to be provided via typedef, and as far as I can tell your reasoning that it then has to be unsigned char is sound. I’ve tested the following under all sanitizers I could find (including the TIS interpreter), and they find nothing wrong with it:

                                                              #include <stdio.h>
                                                              #include <string.h>
                                                              #include <inttypes.h>
                                                              
                                                              int main()
                                                              {
                                                                  uint32_t x = 42;
                                                                  uint32_t y;
                                                                  uint8_t t8[4];
                                                                  memcpy(t8, &x, sizeof(uint32_t));
                                                                  memcpy(&y, t8, sizeof(uint32_t));
                                                                  printf("x = %" PRIu32 "\n", x);
                                                                  printf("y = %" PRIu32 "\n", y);
                                                                  return 0;
                                                              }
                                                              

(Now my problem is that they find nothing wrong with it even if I replace the uint8_t buffer with a uint16_t buffer.)

                                                              I stand corrected, my apologies.

                                                              1. 2

Thanks for researching that. I did a bit more research and I think uint8_t being non-char is unlikely for different reasons now. The standard says char can be 7+ bits[1] and that short/int/long can be any size greater than or equal to char, but must be a multiple of the size of char. Therefore uint8_t and uint_least8_t can only be defined in an environment where char is eight bits, because if char were 7 bits then short couldn’t be 8 bits, since 8 is not a multiple of 7. The only legal way for uint8_t to be short would be if the environment defined both char and short as 8 bits and the C library author chose short for the typedef just to torture us. Here is the relevant text from the standard:

                                                                 * Byte --- the unit of data storage in the execution environment
                                                                   large enough to hold any member of the basic character set of the
                                                                   execution environment.
                                                                ...
                                                                   Both the basic source and basic execution character sets shall
                                                                have at least the following members: the 26 upper-case letters of
                                                                the English alphabet [...] the 26 lower-case letters of the English
                                                                alphabet [...] the 10 decimal digits [...] the following 29 graphic
                                                                characters [...] the space character, and control characters
                                                                representing horizontal tab, vertical tab, and form feed. In both
                                                                the source and execution basic character sets, the value of each
                                                                character after 0 in the above list of decimal digits shall be one
                                                                greater than the value of the previous. [...] In the execution
                                                                character set, there shall be control characters representing alert,
                                                                backspace, carriage return, and new line.
                                                                ...
                                                                 * Character --- a single byte representing a member of the basic
                                                                   character set of either the source or the execution environment.
                                                                ...
                                                                   There are four signed integer types, designated as signed char,
                                                                short int, int, and long int.
                                                                ...
                                                                In the list of signed integer types above, the range of values of
                                                                each type is a subrange of the values of the next type in the list.
                                                                ...
                                                                   For each of the signed integer types, there is a corresponding (but
                                                                different) unsigned integer type (designated with the keyword unsigned)
                                                                that uses the same amount of storage (including sign information)
                                                                ...
                                                                2 The sizeof operator yields the size (in bytes) of its operand, which may
                                                                be an expression or the parenthesized name of a type. The size is
                                                                ...
                                                                3 When [sizeof is] to an operand that has type char, unsigned char, or
                                                                signed char, (or a qualified version thereof) the result is 1.
                                                                ...
                                                                requirement that objects of a particular type be located on storage
                                                                boundaries with addresses that are particular multiples of a byte address
                                                                

[1] char must be 7+ bits because the standard specifies exactly 100 values which it says need to be representable in char. Fun fact: that set of legal characters per ANSI X3.159-1988 is basically everything you’d expect from ASCII except $, @, and backtick, which the standard leaves undefined. Maybe C20 or whatever the next one is should use those for bsr, bsf, and popcnt.

Edit: It makes sense that @ and backtick weren’t required, since their positions in the ASCII table kept being redefined between the ASA X3.4-1963 and USAS X3.4-1967 standards. Not sure what the rationale is for dollar. The ANSI C89 standard also has text saying that dollar may be used in identifiers, along with anything else, but it isn’t mandatory. GNU lets us use dollar signs in identifiers, which is cool, although I wish they let us use Unicode symbols too.

                                                                1. 2

Note that, although it has to be a typedef, it doesn’t have to be a typedef of a standard type. For example, in CHERI C, intptr_t is a typedef of the built-in type __intcap_t. This is permitted by the standard (as far as we could tell) in the same way that it’s permitted for intmax_t to be __int128_t or __int256_t or whatever on systems that expose these as non-standard types.

                                                                  1. 1

Shit, so that means we could have a built-in __unsigned_octet_t type that’s not unsigned char, and alias uint8_t to that?

                                                                    That would invalidate the whole aliasing assumption.

                                                                    1. 1

intmax_t being 64-bit in GNU System V environments always seemed to me like the biggest contradiction with the wording of the standard. Cosmopolitan Libc defines intmax_t as __int128 for that reason, but I’ve often wondered if that’s wise. Do you know offhand if any other environments do that?

                                                                      1. 2

intmax_t is defined as int64_t because __int128 didn’t exist in the ‘90s (when most things were 32-bit and a lot of platforms that GNU and BSD systems supported couldn’t even do 64-bit arithmetic without calling out to a software implementation) and it’s an ABI-breaking change to change it. It’s a shame that it exists at all, because it’s predicated on the assumption that your library will never be linked into a program compiled for a newer system that supports a wider integer type. On a modern x86 system with AVX-512, you could store a 512-bit integer in a register and write a fairly fast set of operations on it, so should intmax_t be 512 bits?

                                                                        1. 1

__int128 is a recycling of the 32-bit code for having 64-bit integers. Why throw away all that effort after the move to 64-bit systems? As for AVX-512, as far as I know SSE and AVX do not provide arithmetic types that are wider than 64 bits.

                                                                          1. 2

                                                                            Most 64-bit ABIs were defined before __int128 came along. AVX-512 doesn’t natively support 512-bit integers, but it does support 512-bit data in registers. You can implement addition by doing vector addition and then applying the carry bits. You can implement in-register multiply in a similar way. This makes a 512-bit integer a more realistic machine type than __int128, which is generally stored in a pair of registers (if you’re going to have an integer type that doesn’t fit in one register, why stop at two registers? Why not have a type split between four or more integer registers?).
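To illustrate the register-pair point, a minimal sketch (my own illustration, with hypothetical names) of double-word addition using two 64-bit limbs, which is essentially what a pair-of-registers __int128 add compiles to: add the low limbs, detect the carry via unsigned wraparound, and fold it into the high limbs.

#include <stdint.h>

/* hypothetical two-limb type, analogous to how __int128 is stored
   in a pair of 64-bit registers */
typedef struct { uint64_t lo, hi; } u128;

static u128 u128_add(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;
    /* unsigned addition wraps, so the low sum is smaller than an
       operand exactly when a carry out of the low limb occurred */
    r.hi = a.hi + b.hi + (r.lo < a.lo);
    return r;
}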

                                                                            1. 1

                                                                              Could you teach me how to add the carry bits in SSE vectors? I know how to do it with VPTERNLOGD but it sounds like you know a more general approach than me.

                                                    2. 1

                                                      See, sizeof(char) is not fully specified in C.

I think what you mean is that CHAR_BIT (the number of bits in a byte) is not fully specified. sizeof(char)==1 by C11 6.5.3.4p4:

                                                      When sizeof is applied to an operand that has type char, unsigned char, or signed char, (or a qualified version thereof) the result is 1.

                                                      1. 1

                                                        Whoops, my bad. Makes sense. Oh my, I guess that means sizeof(uint32_t) might be like 1. Goodness.

                                                      2. 1

                                                        Changing jibsen’s function to use uint8_t* instead will simply make the code refuse to compile in those kinds of environments. That’s why the blog post recommends mask+shift. The linked macros would work on those DSPs provided you store one octet per word.

                                                        1. 2

                                                          As I said in italics, refusing to compile was the point.

                                                          1. 1

That sort of attitude leads to about a third of all GitHub issues for C projects like stb, last time I checked. There’s always some finger-wagger who adds a compiler warning that breaks code over things like unused parameters because they feel it’s time for us to rethink things. If it’s possible and legal it should be permitted.

                                                            1. 1

                                                              Monocypher uses uint8_t for every buffer, and many (possibly most) of its users are in the embedded space.

                                                              I don’t recall having even a single complaint about it.

                                                              1. 1

                                                                Yeah if you’re writing crypto libraries I can see the reluctance to accommodate weird architectures. Is valiant wolf your legal name? Makes the sheep like me a bit afraid to trust a library like that.

                                                                1. 1

                                                                  It is my legal name, believe it or not. And if you don’t trust my work, trust its audit.

                                                                  1. 1

                                                                    Carry on then. I looked through your code and it looked like you’re doing things the right way.

                                                        2. 1

                                                          See, sizeof(char) is not fully specified in C.

                                                          So wrong. N1570 §6.5.3.4p4. sizeof (char) and sizeof (unsigned char) are defined to be 1.

                                                          Of course, on those machines uint8_t would not even compile, but that’s kind of the point: if you can’t have bytes, you need to rethink your serialization code.

People generally don’t run their code on DSPs, but let’s say that a popular machine architecture came out with 9-bit bytes. It would be incredibly unusual if that architecture exposed data streams coming over the internet by spreading a sequence of nine 8-bit bytes across eight 9-bit bytes. It’s more likely that this architecture would put the nine 8-bit bytes in nine 9-bit bytes with the MSB unset. It’s entirely possible to write code which handles this correctly and portably.

                                                          That being said, if you’re of the opinion that it’s not worth worrying about machines where uint8_t is not defined then you probably don’t care about this hypothetical scenario in which case your entire point about using uint8_t over unsigned char is moot since it won’t matter anyway.

                                                          1. 1

                                                            N1570 §6.5.3.4p4. sizeof (char) and sizeof (unsigned char) are defined to be 1.

                                                            Yeah, I was confusing sizeof and the width of bytes. (An interesting consequence is that sizeof(uint32_t) is one on machines with 32-bit bytes.)

                                                            People generally don’t run their code on DSPs

The reason I’ve even heard of DSPs with 32-bit bytes is that a colleague of mine had to write a C program for one, and he ran into all sorts of interesting problems because of that unusual byte size. Sure, the purpose of the chip was probably to do some simple and very fast signal processing, but if you can get away with cramming more general-purpose code in there as well, you can lower the manufacturing costs.

                                                            It would be incredibly unusual if that architecture exposed data streams coming over the internet by spreading a sequence of 9 8 bit bytes across 8 9 bit bytes.

                                                            It would indeed. I was more thinking of the (real) machines that have 32-bit bytes. It makes more sense for them to pack 4 network octets into a single 32-bit byte.

                                                            That being said, if you’re of the opinion that it’s not worth worrying about machines where uint8_t is not defined

I’m of the opinion that we should worry about them, which is why I advocate explicit exclusion by using uint8_t.

                                                            1. 2

                                                              An interesting consequence is that sizeof(uint32_t) is one on machines with 32-bit bytes.

                                                              uint32_t doesn’t exist on machines where CHAR_BIT is not 8.

                                                              The reason I’ve even heard of DSPs with 32-bit bytes was because a colleague of mine had to write a C program for it, and he ran into all sorts of interesting problems because of that unusual byte size. Sure, the purpose of the chip was probably to do some simple and very fast signal processing, but if you can get away with cramming in more general purpose code in there as well, you could lower the manufacturing costs.

Oh for sure, I write my code with the idea that if someone wanted to run it on a DSP for some reason they would at least get a predictable result. The only problem is that when CHAR_BIT is not 8, it’s difficult to know whether an input data stream from some unknown source will arrive as octets merged into a bitstream and spread over bytes, or with one octet per byte.

                                                              It would indeed. I was more thinking of the (real) machines that have 32-bit bytes. It makes more sense for them to pack 4 network octets into a single 32-bit byte.

                                                              So in this case a lot of the serialisation/deserialisation code I write deals with one octet per byte. You would need to write a frontend to translate from whatever packed representation appears inside a byte into separate octets per byte to use that code on a machine where octets are merged over bytes.
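To illustrate, a sketch of such a frontend (my own example, assuming CHAR_BIT is 32 and big-endian packing of four octets per char; the packing order is exactly the kind of choice that varies):

#include <stddef.h>

/* split each 32-bit char holding four packed octets into one octet
   per char; dst must have room for 4 * nwords elements */
static void unpack_octets(unsigned char *dst, const unsigned char *src,
                          size_t nwords)
{
    for (size_t i = 0; i < nwords; ++i) {
        dst[4*i + 0] = (src[i] >> 24) & 0xFF;
        dst[4*i + 1] = (src[i] >> 16) & 0xFF;
        dst[4*i + 2] = (src[i] >> 8) & 0xFF;
        dst[4*i + 3] = src[i] & 0xFF;
    }
}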

                                                              The problem with writing general purpose code targeting this is that it’s incredibly difficult to make the code clean, auditable and simple while giving enough options to cover the wealth of possible ways you may take octets and fit them into non-8-bit-bytes (even if you just restrict yourself to power-of-two machines).

                                                              At the point where you have a general purpose serialisation/deserialisation library which handles this in a way which is flexible enough to handle all possible cases the code will get complicated enough that it would probably be less error prone to modify the original code to specifically work for the intended architecture. Especially when such a modification will actually be quite minor in what would also be quite a tiny codebase.

                                                              I’m on the opinion that we should worry about them. Which is why I advocate explicit exclusion by using uint8_t.

                                                              I personally think that in that case it is easier, clearer and more explicit to just write:

                                                              #include <limits.h>
                                                              #if CHAR_BIT != 8
                                                              #error "This codebase does not support non-octet bytes."
                                                              #endif
                                                              

                                                              I would then agree that in cases where you’re explicitly operating on the assumption that chars hold 8 bits you should then use uint8_t or a typedef of octet. In general I think because of C’s lacking type system people are reluctant to rely on typenames to add clarity to codebases where it may actually bring a lot of benefit.

Personally I handle this by using unsigned char everywhere inside [de]serialisation functions and using masking and shifting to treat a char of any width as only holding octets. Since this kind of problem only occurs at interfaces, I document this assumption in the API documentation.
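To make that concrete, a rough sketch of the masking-and-shifting style (my own illustration, assuming the stream stores one octet per char, however wide char is):

/* unsigned long is guaranteed to be at least 32 bits, so this works
   even where uint32_t does not exist; the & 0xFF masks strip any
   high bits when CHAR_BIT > 8 */
static unsigned long read32be_portable(const unsigned char *p)
{
    return ((unsigned long)(p[0] & 0xFF) << 24)
         | ((unsigned long)(p[1] & 0xFF) << 16)
         | ((unsigned long)(p[2] & 0xFF) << 8)
         | ((unsigned long)(p[3] & 0xFF));
}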

                                                      1. 9

                                                        I heard the news last week, and while I was using LibreSSL, I was really using it for libtls. I then saw https://git.causal.agency/libretls/ which is libtls for OpenSSL. I made the switch, and the software just worked like it always has. I think some people from the Gentoo project are behind this.

                                                        For me, libtls is what most people should be using, not OpenSSL or LibreSSL.

                                                        1. 4

                                                          I think some people from the Gentoo project are behind this.

                                                          Where did you hear that? LibreTLS was written by June of ascii.town to make it easier to use pounce and catgirl on systems without libressl (also motivated by some problems with libressl’s TLS 1.3 implementation related to the status_request extension, which have since been fixed).

                                                          For me, libtls is what most people should be using, not OpenSSL or LibreSSL.

                                                          Agreed. It is a much nicer API to use, and at this point has three implementations I’m aware of (the third being my own libtls-bearssl).

                                                          1. 2

                                                            I recall reading the Gentoo mailing list archive about them dropping LibreSSL, and I recall them keeping libretls. I might have mistaken that as a Gentoo initiative.

                                                          2. 3

LibreSSL-portable can also be built to provide a self-contained libtls. When I write code touching those libraries, I too wish to focus on libtls instead of caring about the underlying SSL library versions.

                                                          1. 2

another interesting subject is zero-initializing without memset():

                                                            struct foo x = {};

                                                            or

                                                            x = (struct foo){};

                                                            or

                                                            x = (struct foo){ .a = 23, };

                                                            though it may omit padding bytes

                                                            1. 10

                                                              Note that empty initializer lists (struct foo x = {}) are not valid in ISO C. The portable equivalent is struct foo x = {0}, which initializes the first field to 0, and all unspecified fields are initialized to 0. This works even for non-aggregate types, for example int x = {0}.

                                                              1. 1

                                                                didn’t know this works for int, too, wow!

                                                                what about this case though:

                                                                struct foo { struct { int x; int y; } sub; int z; };

                                                                struct foo f = {0}; f = (struct foo){0};

Since assigning 0 to an aggregate type (f.sub = 0) does not make sense, so far I thought I would spare myself compiler errors by just omitting that zero. Colleagues did complain, but the point that {0} seems to work no matter the nature of the type was never made. Does x = (struct foo){0}; always work for all thinkable aggregate and nested aggregate types??
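For concreteness, here is the case in question as a complete sketch (my own example; per C99 6.7.8, the 0 initializes the first scalar subobject reached recursively, and all remaining subobjects are zero-initialized):

#include <stdio.h>

struct foo { struct { int x; int y; } sub; int z; };

int main(void)
{
    struct foo f = {0};   /* 0 goes to f.sub.x; the rest is zeroed */
    f = (struct foo){0};  /* same rule applies to compound literals */
    printf("%d %d %d\n", f.sub.x, f.sub.y, f.z); /* prints 0 0 0 */
    return 0;
}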

                                                            1. 2

                                                              also known as memset_explicit() or memset_s() in other libc

                                                              Note that memset_s isn’t just an “other libc” thing, it’s part of the C11 standard.

                                                              1. 1

                                                                To be clear, it is part of the optional Annex K, which has a number of problems and MSVC is the only major implementation to support it: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm#impementations

                                                                So I think it’s fair to call it an “other libc” thing.

                                                                1. 2

                                                                  MSVC doesn’t even support it; they implement a pre-specification version of it, and they refuse to update their implementation due to backwards compatibility.

That said, though Annex K is maligned, memset_s has seen wider adoption and interest. There have been talks in WG14 to standardize the function itself (but not the rest of Annex K); or to standardize another function with equivalent behaviour. FreeBSD implements memset_s, but none of the other Annex K functions.

                                                                  1. 1

                                                                    There have been talks in WG14 to standardize the function itself; or to standardize another function with equivalent behaviour.

                                                                    I’d be very happy to see that. For anyone interested, it looks like the most recent document is http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2599.htm
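In the meantime, a common portable fallback where neither memset_s nor memset_explicit is available is the volatile-function-pointer idiom (a sketch; widely relied upon in practice, though the guarantee is conventional rather than spelled out in the standard):

#include <string.h>

/* reading a volatile pointer forces the call, so the compiler cannot
   prove the memset is dead and elide it */
static void *(*const volatile memset_v)(void *, int, size_t) = memset;

static void secure_clear(void *p, size_t n)
{
    memset_v(p, 0, n);
}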

                                                              1. 23

                                                                There’s also the ninja-compatible samurai. Written in C instead of C++, and significantly less of it.

                                                                1. 4

                                                                  What makes samurai so much smaller? I’m not familiar with either codebase, but I would guess that Ninja has more complex optimizations for fast start up for massive builds. I vaguely recall reading something about lots of effort going into that.

                                                                  I have no bias either way, just curious.

                                                                  1. 21

                                                                    I would guess that Ninja has more complex optimizations for fast start up for massive builds

In my personal benchmarks, samurai uses less memory and runs just as fast as (or slightly faster than) ninja, even on massive builds like chromium. If you find a case where that’s not true, I’d be interested in hearing about it.

As for why the code is smaller, I think it’s a combination of a few things. Small code size and simplicity were deliberate goals of samurai. As a result, it uses a more concise and efficient coding style. It also lacks certain inessential features like running a Python web server to browse the dependency graph, spell checking for target and tool names, and graphviz output. samurai also only supports POSIX systems currently, while ninja supports Windows as well.

In some cases, samurai uses simpler algorithms that are easier to implement. For example, when a job is finished, ninja looks up whether each dependent job has an entry in a map, and if it does, it checks each other input of that job to see if it is finished, and if all of them are ready it starts the job. samurai, on the other hand, just keeps a count of pending inputs for each edge, and when an output is built, it decreases that count for every dependent job, starting those that reach 0. This approach is thanks to @orib, from his work on Myrddin’s mbld.
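A rough sketch of that counting scheme (names are made up for illustration; this is not samurai’s actual code):

#include <stddef.h>

struct job {
    struct job **dependents;  /* jobs consuming this job's outputs */
    size_t ndependents;
    size_t npending;          /* number of inputs not yet built */
};

/* hypothetical hook into the scheduler's ready queue */
extern void queue_start(struct job *);

/* called once when a job's outputs finish building */
static void job_done(struct job *j)
{
    for (size_t i = 0; i < j->ndependents; ++i) {
        struct job *d = j->dependents[i];
        if (--d->npending == 0)
            queue_start(d);
    }
}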

                                                                    I have no bias either way, just curious.

                                                                    As the author of samurai, I am a bit biased of course :)

                                                                    1. 2

If you find a case where that’s not true, I’d be interested in hearing about it.

                                                                      I don’t have any observations. It was just an off-the-cuff guess based on an article I read about ninja’s internals, and the possibility that you may have traded a little performance for simpler code. But on the contrary, your example of a simpler algorithm also sounds more efficient!

                                                                      Thanks for the detailed reply. I didn’t even realize ninja had all those extra features, so naturally I can see why you’d omit them. I just installed samurai on all my machines! :)

                                                                      1. 1

                                                                        samurai also only supports POSIX systems currently, while ninja supports Windows as well.

Any chance of fixing this and adding Windows support?

                                                                  1. 3

I know I’m doing my part to kill this beast as fast as possible. At work, there is still this giant ~25k LoC code-generating Perl script that I haven’t even begun to unravel.

                                                                    Perl and Python have always come pre-installed on Linux. If I remember well, there was actually a Posix specification or something that had it as a requirement.

I think the author was referring to the Linux Standard Base.

                                                                    In terms of the question:

                                                                    It’s only a matter of time before other distributions follow. The next logical step is for Perl to go away, the only question is when?

As long as Git still depends on Perl-based modules, I don’t think it will be permanently gone any time soon. Maybe someone will reimplement them in something other than Perl, but I don’t know if anyone is working on that. Actually, I’m not even sure Perl ships out of the box in RHEL.

                                                                    1. 8

                                                                      Git has been steadily rewriting the parts in shell script and Perl in C for years. I haven’t been following closely, so I don’t honestly know where that situation is, but I somewhat doubt that Git’s going to be Perl’s saviour by 2023.

                                                                      1. 11

Current git master contains 13k lines of Perl code in 41 files, which is actually not all that much compared to 281k lines of C code. Much of it actually seems related to git-svn, git-cvsserver, gitweb, and other stuff that’s not used all that frequently. The only fairly common things are git send-email and git add --interactive, but I doubt vast swaths of people have ever used either.

                                                                        Either way, from the looks of it, you can create a perfectly functioning git build without Perl already. The Makefile has a NO_PERL flag for it.

                                                                        1. 3

                                                                          I recently discovered the add.interactive.usebuiltin setting, which uses a C implementation for git add --interactive. It is not enabled by default since I guess it is still experimental, but I haven’t run into any problems so far.

                                                                        2. 5

                                                                          OpenBSD’s packages system is largely written in Perl IIRC.

                                                                      1. 2

That’s all cool and all, but my biggest concern with statically linked binaries is: how does ASLR even work? What mechanism can a static binary use to make sure the libc it shoved into itself isn’t predictably located?

                                                                        1. 5

                                                                          Look into static PIE. gcc has had support for a few years now, and musl even before that (musl-cross-make patched gcc before support was upstreamed in version 8).

                                                                          1. 2

                                                                            Does ASLR work?

                                                                          1. 5

                                                                            Not the main topic but I just discovered libtls-bearssl and it looks super cool! It’s not easy to distribute the original libtls since it doesn’t really work with OpenSSL, and using the raw OpenSSL API is pure madness.

                                                                            1. 3

                                                                              Thanks! I’m very happy with how libtls-bearssl turned out.

                                                                              It didn’t really occur to me at the time that it might be easier to distribute (my goal was just to port various libtls applications to BearSSL), but someone else recently mentioned this to me as well. It makes sense, and could be a good way to expand adoption of the libtls API where it otherwise wouldn’t be available.

                                                                            1. 21

                                                                              Hi, this is my crazy Linux OS project. I wasn’t really prepared for this to be shared today, so documentation is a bit lacking. I’m happy to answer any questions.

                                                                              I just updated the screenshot on the wiki, now featuring the oasis netsurf frontend. I also added a script to build QEMU images using builds.sr.ht. The latest one is available here: https://patchouli.sr.ht/builds.sr.ht/artifacts/~mcf/226248/1b6626238a895943/oasis-qemu.tar.xz

                                                                              If you want to try it out, just extract the tarball and run ./run (graphics mode) or ./run -s (serial mode). More information can be found in README.md alongside the image (including how to rebuild from source), which is also available in the home directories inside the image.

                                                                              1. 6

Has any thought been given to the security downsides of using static linking? Since Linux doesn’t support static PIE binaries, ASLR is made ineffectual for statically compiled applications.

                                                                                1. 10

                                                                                  Linux doesn’t support static PIE binaries

                                                                                  musl and gcc fully support static PIE. If you have a toolchain that supports it, you just need to put -fPIE in your CFLAGS and -static-pie in your LDFLAGS.

This used to be the default actually, but I just changed it in case someone might try to build with a toolchain from musl.cc, which does not build libc.a with -fPIE, so it can’t be linked into a static PIE.
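If anyone wants to verify the effect, a trivial check (my own sketch, assuming a toolchain with static PIE support) is to build with those flags and print an address; differing values across runs indicate that ASLR is relocating the executable:

#include <stdio.h>

static int anchor; /* lives in the executable's data segment */

int main(void)
{
    /* with -fPIE and -static-pie this address should vary per run */
    printf("%p\n", (void *)&anchor);
    return 0;
}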

                                                                                  1. 1

                                                                                    Awesome! I didn’t know musl supported static PIE. I haven’t really paid much attention to musl (and if I’m being honest, Linux in general.)

                                                                                2. 5

                                                                                  I’m really a big fan of what you have done, everything fits together in such a tidy way, thanks so much!

                                                                                  1. 1

                                                                                    I thought it was neat to see both your projects on the same day because they solve some similar problems in different ways. Both are really neat.

                                                                                    1. 1

                                                                                      This one isn’t my project, but I agree it solves similar problems in a more idealistic way.

                                                                                      1. 2

That was ambiguous. I meant “both your” as in you and the other person.

                                                                                  2. 4

                                                                                    Looks neat. As I understand it, your model is closer to a firmware image than a traditional Linux distro (i.e. you build a set of components you want and install them as an atomic set). I can see that being really useful for cloud / server deployments.

                                                                                    Are you using anything like crunchgen to get some of the benefits of dynamic linking in your programs, or do they all carry copies of the same libraries? I’d love to see a system that generated a single binary for all of the programs in the image, did dead-code elimination and link-time optimisation across the entire thing.

                                                                                    (Totally unrelated, but I’m oddly pleased by all of the things using NetSurf suddenly. I remember using it on RiscOS back when AltaVista was an exciting new competitor to Yahoo! and Lycos)

                                                                                    1. 3

                                                                                      Thanks! Yeah, that seems like a fair comparison. The idea for that stemmed from dissatisfaction with how typical Linux distributions split up source packages into several binary packages (if they even do that at all). With this approach, you select the contents based on whatever criteria you want. Anything that doesn’t get selected doesn’t even get built. Due to the use of static linking, you don’t really have to worry about runtime dependencies. This gives you a lot of control depending on your use case. For example, on my VPS, I use something like

                                                                                      fs = {
                                                                                      	-- I need development files from these libraries to rebuild the kernel
                                                                                      	{'linux-headers', 'musl', 'ncurses', 'elftoolchain', 'libressl', 'zlib'},
                                                                                      	-- I want the st terminfo file, but I don't need st itself
                                                                                      	{'st', include={'^share/terminfo/'}},
                                                                                      	{
                                                                                      		sets.core, sets.extra,
                                                                                      		'acme-client', 'dnssec-rr', 'make', 'nginx', 'nsd', 'pounce',
                                                                                      		exclude={'^include/', 'lib/.*%.a$'},
                                                                                      	},
                                                                                      }
                                                                                      

                                                                                      On my desktop, I use fs = {exclude={}}, which builds every package, excluding nothing.

                                                                                      I’m not using anything like crunchgen, so everything carries a copy of everything it links to. However, due to the use of lightweight packages, most binaries are really small anyway. Only a few packages such as mpv or mupdf which link in a lot of libraries have huge binaries (and by huge I still mean < 10M).

                                                                                      Yes, I’m a big fan of NetSurf. It’s quite a capable browser considering their resources. Unfortunately, more and more sites require the monstrosity that is the modern web browser, so I installed firefox via pkgsrc for those.

                                                                                      1. 1

                                                                                        on my VPS

                                                                                        How do you install your custom image on your VPS? I ask because most VPS providers give you a selection of OS images (built with varying levels of care) that you have to start with.

                                                                                        1. 4

                                                                                          I started with a Debian image and used that to install oasis on a separate partition. Then I used a rescue image to move/expand oasis to fill the whole drive. I don’t remember the exact procedure I used, it was a few years ago.

                                                                                          1. 2

Several providers allow you to upload an ISO; Vultr, for example. AWS also allows it, I think.

                                                                                            1. 1

On top of providers that support it, there are tricks to get OSes onto VMs of providers that don’t. I’ll be even more impressed when I see someone get something running on an unsupported ISA. I already have an idea about how that might happen.

                                                                                        2. 1

                                                                                          Velox seems to be a window manager, not a display server, unless there’s something I missed?

                                                                                          Forgive me, I misread the description.

                                                                                        1. 7

                                                                                          I was curious about how the CommonMark specification deals with this problem. Looks like it is quite tricky, but they did manage to come up with a bunch of rules to eliminate the ambiguity, while still allowing efficient parsing: https://spec.commonmark.org/0.29/#emphasis-and-strong-emphasis

                                                                                          It has a whopping 130 examples to show the corner cases!

                                                                                          1. 11

                                                                                            I’ve been using this since it was merged into OpenSSH, and can confirm how well this works.

                                                                                            It is quite refreshing compared to using gpg, which required 3 (!) daemons to work. You had to run a gpg-agent in ssh-agent mode, which spawned its own scdaemon, which communicated with pcscd (which you had to start yourself), which dynamically loaded a CCID module, which used libusb to finally talk to your device.

                                                                                            I saw that OpenSSH also supports ed25519-sk keys. Do you know which security key models support this? I have a super old Yubikey NEO (U2F only) and a Solo key, and neither of them do. Perhaps one of the newer Yubikey models?

                                                                                            1. 7

It looks like YubiKey firmware v5.2.3 from January 2020 includes Curve25519 support; it’s not clear to me if you can get this on the U2F-only models.

                                                                                              1. 1

                                                                                                Thanks for the link. It sounds like any recent YubiKey 5 series should have this support.

                                                                                                The U2F protocol is specified to use P-256 keys[0], so you’d definitely need a FIDO2-capable device.

                                                                                                [0] https://fidoalliance.org/specs/fido-u2f-v1.2-ps-20170411/fido-u2f-raw-message-formats-v1.2-ps-20170411.html#registration-messages

                                                                                              2. 2

You could already use many YubiKeys’ PIV application with OpenSSH through OpenSC’s PKCS#11 support [1]. Since some recent version of OpenSSH (8.0 IIRC), you could also use some elliptic curve keys through that route.

The U2F key support is cooler, because the keys are much cheaper. However, the PKCS#11 route has the benefit that it also supports authenticating with servers that have older OpenSSH versions, whereas the U2F support requires that the server is on OpenSSH 8.2 as well.

                                                                                                [1] https://developers.yubico.com/PIV/Guides/SSH_with_PIV_and_PKCS11.html

                                                                                                1. 1

                                                                                                  I only have to run gpg-agent on my Mac to use my Yubikey as an ssh key. I don’t manage any other daemons. Everything just works.

                                                                                                  1. 2

This is because macOS is always running a smart card services daemon to support e.g. smart-card logins and code signing. So on the Mac only gpg-agent and scdaemon are needed, which are automatically started by gpg.