1. 48
    1. 25

      Literal 0 for null pointers.

      Oh dear. I’m glad I never had to work with the author’s code then. NULL signifies intent in a way 0 doesn’t. The distinction is important when you’re exploring an unknown codebase. And no, IDEs and LSP will not always be able to help.

      1. 28

        Here’s a really fun case:

        Look at a variadic function that takes a null terminator to indicate the end of an argument list. You pass a load of pointers and then 0. What happens?

        The last argument will be passed as an int. On 32-bit platforms, this will (almost always) be the same size as a pointer and so everything works. Popping the argument as a void* (or any other pointer type) will give null. On 64-bit platforms, if the argument is passed in a register, the ABI will often require it to be sign extended and so it will also be fine. But if you have enough arguments that the null is passed on the stack (I believe Apple’s AArch64 ABI always passes variadic on the stack, but I might have misremembered) then the store will do a 32-bit store, but the load will do a 64-bit load and will get four bytes of zero and four bytes of whatever happened to be on the stack. This might be zero, so you may find that it even works intermittently. But then a compiler upgrade will change code generation slightly and it will crash. Or you will add another argument pushing the 0 onto the stack and then it crashes.

        Debugging this is painful if you haven’t seen it before (I, sadly, speak from firsthand experience). This is one of the reasons that C++11 introduced nullptr.
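
        A minimal sketch of the trap, with a hypothetical variadic function (the names are mine, not the thread’s): the call that terminates with a cast null pointer is fine, while the one that terminates with a bare 0 passes an int and hits exactly the problem described above.

        #include <stdarg.h>
        #include <stdio.h>

        /* Hypothetical helper: prints each string until a null pointer terminator. */
        static void print_all(const char *first, ...)
        {
            va_list ap;
            va_start(ap, first);
            for (const char *s = first; s != NULL; s = va_arg(ap, const char *)) {
                puts(s);
            }
            va_end(ap);
        }

        int main(void)
        {
            print_all("a", "b", (char *)0); /* fine: the terminator is pointer-sized */
            print_all("a", "b", 0);         /* risky: 0 is an int, which may be narrower
                                               than a pointer in the variadic slot */
            return 0;
        }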

        1. 6

          Another reason is that sometimes (not always!) in C NULL is defined as ((void*)0) to help with things like variadic arguments. But this definition does not work in C++ because C++ does not auto-convert void* like C does, so C++ needed a different workaround.

          1. 6

            It does work for variadics in C++, because you’re never (in a way visible to the type system) casting from the null to a T*. It doesn’t work for overloads. With overloads, you can have:

            void foo(T*);
            void foo(U*);
            

            And then if you call it with (void*)0 then neither is allowed. With nullptr, it can be converted to any pointer type and so both are viable and the compile will fail with an ambiguity error. You can add another overload that takes a std::nullptr_t to address this (often as an inline function that dispatches to a specific one). You don’t want to add a void* wrapper because anything is convertible to void* and so that would allow it to be called with any pointer type. Now imagine that you also have:

            void foo(int);
            

            If you used 0 (which C++ used to recommend before realising that it was a terrible idea) then this is the overload that would be called because it’s the most specific. That is almost certainly surprising.

            Similar problems arise in template specialisation.

            In C, (void*)0 is fine, but 0 is a problem because it is a valid null pointer constant but will only be interpreted as one if the compiler infers that the expected type is a pointer type.

        2. 2

          My reason for using 0 instead of NULL in Monocypher was avoiding the stdio.h dependency. Though in practice it didn’t really matter: I rarely use the null pointer, and almost never test for it (inputs are assumed correct, and I make sure pointers to a zero-length buffer are never dereferenced). In application-level C code I use NULL. In C++ I use nullptr.

          Hmm, looking at my code I only see one comparison with the null pointer, in an 1800-line library. Perhaps I should start using (void*)0; the noise isn’t nearly as bad as I thought it might be.

          1. 10

            My reason for using 0 instead of NULL in Monocypher was avoiding the stdio.h dependency

            NULL is also required to be defined if you include stdlib.h or stddef.h. The latter is required to exist in all free-standing C implementations (as well as hosted ones). Anything that can reasonably claim to be a C implementation defines NULL and there are very good reasons for using it instead of 0.

            1. 5

              It’s in stddef.h? That’s perfect! Adding a note to fix my code.

              1. 3

                Yup. In FreeBSD libc, it’s actually in sys/_null.h, which is included from several of the standard headers, but it’s guaranteed to be in stddef.h (cppreference is a surprisingly good reference for the C standard). It’s actually guaranteed to be defined in several headers, but I think stddef.h is the only one that’s guaranteed to be present in a freestanding C environment.

                1. 1

                  Do note that it’s perfectly legal for NULL to be defined as just 0, rather than (void*)0, so it might be a better idea to just make your own define.

                  C23 adds nullptr, which is guaranteed to be nice and well behaved, but obviously this is not widely supported yet.
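
                  If you do roll your own, the usual shape is a guarded, fully parenthesised definition (a minimal sketch, not from the thread):

                  /* Define NULL only if no standard header has already provided it. */
                  #ifndef NULL
                  #define NULL ((void *)0)
                  #endif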

              2. 1

                You can also have a #define NULL (void *)0 on the top of your header, if you want to avoid a dependency on stdio.h. Though, I’d just use it anyway, even for things that can be compiled without a libc: in that case, I’d provide a local stdio.h with the necessary defines for the non-libc compilation case.

            2. 15

              While UTF-16 might seem niche, it’s a necessary evil when dealing with Win32, so c16 (“16-bit character”) has made a frequent appearance. I could have based it on uint16_t, but putting the name char16_t in its “type hierarchy” communicates to debuggers, particularly GDB, that for display purposes these variables hold character data. Officially Win32 uses a type named wchar_t, but I like being explicit about UTF-16.

              Careful about this one – wchar_t represents the UCS-* family of encodings, not UTF-*. On Windows wchar_t is 16 bits (UCS-2), but on most Unix platforms it’s 32 bits (UCS-4 / UTF-32). It’s similar to int vs long where using the wrong name will work all of the time on some platforms, then either fail to compile or silently produce incorrect behavior on others. And since the difference between UCS-2 and UTF-16 only matters for non-BMP data, developers in Latin-orthography countries usually don’t notice or care[0].
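
              A quick way to see the difference on a given platform (a minimal sketch; the exact numbers depend on the ABI):

              #include <stdio.h>
              #include <uchar.h>  /* char16_t, C11 */
              #include <wchar.h>  /* wchar_t */

              int main(void)
              {
                  /* wchar_t is 2 bytes on Windows but 4 on most Unix platforms;
                     char16_t is at least 16 bits everywhere. */
                  printf("sizeof(wchar_t)  = %zu\n", sizeof(wchar_t));
                  printf("sizeof(char16_t) = %zu\n", sizeof(char16_t));
                  return 0;
              }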

              [0] Recent example of a UTF handling bug I hit in a widely-used PDF library: https://bugs.ghostscript.com/show_bug.cgi?id=706551

              1. 5

                This hasn’t been true since Windows 2000. From Microsoft:

                The wchar_t type is an implementation-defined wide character type. In the Microsoft compiler, it represents a 16-bit wide character used to store Unicode encoded as UTF-16LE, the native character type on Windows operating systems.

                1. 3

                  There is one version of truth that is written in documentation, and there is another version of truth written in bug reports.

                  I’m sure that for any given Win32 API written and tested by Microsoft a wchar_t* will be treated as UTF-16. I’m also sure, based on personal experience, that there are lots and lots and lots of third-party libraries and applications written for the Windows platform by developers who have never heard of the term “surrogate pair”.

                  You can tell those people until you’re blue in the face that Microsoft has declared UTF-16 to be the native character encoding, they’ll still try to count codepoints with wcslen().
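
                  A minimal sketch of that mistake: with a 16-bit wchar_t, one codepoint outside the BMP is stored as a surrogate pair, so wcslen() counts 2 code units rather than 1 codepoint.

                  #include <stdio.h>
                  #include <wchar.h>

                  int main(void)
                  {
                      /* U+1F600 is outside the BMP.  On Windows (16-bit wchar_t) it is a
                         surrogate pair and wcslen() returns 2; on a 32-bit wchar_t platform
                         it returns 1.  Either way it is a single codepoint. */
                      const wchar_t *grin = L"\U0001F600";
                      printf("%zu\n", wcslen(grin));
                      return 0;
                  }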

                  1. 3

                    Fair enough: I don’t doubt that there’s loads of buggy code out there that gets wide strings wrong. Still, this isn’t true:

                    wchar_t represents the UCS-* family of encodings, not UTF-*.

                    wchar_t definitely is used to represent UTF-16 within Microsoft APIs and - as you said - you’ll have bugs if you assume it’s UCS-2 and encounter characters outside the BMP.

                    1. 1

                      wchar_t definitely is used to represent UTF-16 within Microsoft APIs and - as you said - you’ll have bugs if you assume it’s UCS-2 and encounter characters outside the BMP.

                      You’ll also have bugs if you assume that Microsoft APIs use UTF-16, as they are perfectly happy with unpaired surrogates in many cases.

                      1. 3

                        AKA WTF-16, with its close cousin WTF-8, i.e. UTF-8 with bare surrogates.

                        I am a bit sad I never submitted my version of WTF-8 to the RFC Editor back in the day (because I never came up with a joke for section 7). It isn’t quite as funny now as it was when I wrote it, when UTF-8 had not yet completely taken over the world and email mojibake was still common. (Also the jokes are more obvious if you read it alongside the UTF-8 RFC, which is perhaps asking a bit too much of the reader.)

                2. 2

                  Luckily (or unluckily, if you like UCS-2), due to the rise of emojis, most of which are in the Supplementary Multilingual Plane, even users of Latin-orthography languages are increasingly going to notice incomplete Unicode support.

                3. 10

                  Have they ever looked at Zig? This feels like an attempt to bend C into something like Zig… It’s really nice, but it could be nicer if they just used Zig :D.

                  1. 11

                    Switching languages is a much more significant change than what the author describes as their personal tweaks on how they write C.

                    Besides, I don’t think anything said in the article would qualify as “bending” the C language; in fact they seem pretty standard things to do. Some #define’s for common types, some useful macros, using a length-aware string representation, etc. I have almost no experience programming in C and I have seen almost all of these things in the wild before.

                    1. 4

                      Bend is the wrong word, you’re right.

                      Why is such boilerplate standard to do? If it was so standard, why isn’t it in the standard?

                      For example:

                      typedef all structures. I used to shy away from it, but eliminating the struct keyword makes code easier to read.

                      Is a sample of trying to change the language for convenience. Actually reviewing the post, it seems most of these are.

                      So maybe “bend” is an extreme way to describe this, but it is somewhat close to what I meant to express.

                      They also go on to invent slices, or ‘fat pointers’, a common thing in languages these days as first class citizens:

                      #define s8(s) (s8){(u8 *)s, lengthof(s)}

                      typedef struct {
                          u8  *data;
                          size len;
                      } s8;

                      I’m sorry but this looks a lot like Zig to me at least… A better variant of C… I think the author could really like the language! That’s all I really meant. :)

                    2. 2

                      Every new section made me more convinced that it’s all a setup for a conclusion “now that we’re so close to language X, we can just switch to it instead”. Alas it was not…

                    3. 9

                      signed sizes are the way

                      Only in C where they made the incorrect decision to make unsigned integers have wrapping semantics and signed integers UB on overflow.

                      In a good programming language, you can use information theory to make natural decisions about types, such as actually matching your integer type range to the possible values being stored there.

                      I think Chris Wellons has reached a local maximum of programming. I follow and enjoy his posts. He’s reached the peak of what you can do with C. But I’m over here halfway up Mt Everest looking down.
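
                      For what it’s worth, here is a minimal sketch (names mine) of the pitfall that makes signed sizes attractive in C specifically, given its unsigned wrapping semantics:

                      #include <stddef.h>

                      /* Unsigned size: n - 1 wraps to SIZE_MAX when n == 0, the condition is
                         always true, and the loop runs off the end of the buffer. */
                      void zero_unsigned(char *buf, size_t n)
                      {
                          for (size_t i = 0; i <= n - 1; i++) {
                              buf[i] = 0;
                          }
                      }

                      /* Signed size: n - 1 is just -1 when n == 0, so the body never runs. */
                      void zero_signed(char *buf, ptrdiff_t n)
                      {
                          for (ptrdiff_t i = 0; i <= n - 1; i++) {
                              buf[i] = 0;
                          }
                      }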

                      1. 3

                        The history of how signed integer overflow became actual critical CVE nasal demon kind of UB doesn’t give me much hope for humanity:

                        1. Some obscure platforms go bananas when signed integers overflow, so the standard made it UB.
                        2. Compiler writers interpreted “UB” in the most expansive way possible for their pet optimisations.
                        3. Code broke.
                        4. Compiler writers blamed the user for daring to rely on UB.

                        This is malpractice, plain and simple. Instead they should have fixed the standard. Too bad for the obscure platforms nobody uses any more, but let’s be honest 2’s complement has won for decades now, it should have been standardised for C99 already.

                        (As for the usual “why don’t you use -fwrapv?” quip, I can’t do that when my product is the source code of a library.)

                        1. 2

                          It is literally 100% impossible to write any useful optimizations whatsoever without that supposed ‘expansive’ view of UB. Though overflow being UB is not very useful for optimization, UB in general has to work like this unless you want really slow machine code (and even then, UB could still blow up in your face).

                          Too bad for the obscure platforms nobody uses any more, but let’s be honest 2’s complement has won for decades now, it should have been standardised for C99 already.

                          2’s complement has been standardized in C for a while, UB on overflow remains. Actually, since you’re replying to Andrew, I want to point out that Zig defaults to overflow being UB for both signed and unsigned integers.

                          1. 2

                            I want to point out that Zig defaults to overflow being UB for both signed and unsigned integers.

                            I have a hard time believing it at face value, can you provide a link?


                            It is literally 100% impossible to write any useful optimizations whatsoever without that supposed ‘expansive’ view of UB.

                            The problem here is distinguishing UB that has been put explicitly for portability reasons (signed integer overflow), from UB that has been put there explicitly for performance (aliasing rules & restrict), from UB that has been put there to allow the platforms to trap. (That’s another one there, the standard doesn’t distinguish trapping behaviour from fully fledged nasal demon UB.)

                            Overall there’s just too much damn undefined behaviour in C, to the point the behaviour of the standard body is almost incoherent: on the one hand they’re extremely conservative about evolving C, and on the other extremely reckless at letting those undefined behaviours run amok. I can only conclude that their main goal (if you’ll allow me to anthropomorphise it) is to make compiler writers look good. Compilers generate bloody fast code, so we’re the best. UB is not on us, just git gud, U loser.

                            Thank goodness they also gave us sanitizers. Monocypher would still be fraught with UB without them, and I can tell you constant time crypto is easy on that front.

                            1. 4

                              lonjil is correct. Note that OP is using signed integers as “the way” specifically because the fact that overflow is UB is desirable. With unsigned overflow being UB as well, you can start using unsigned integers again. https://godbolt.org/z/n_nLEU

                              1. 2

                                In your experience, does this have a non-negligible impact on performance? I’ve never been convinced by what I’ve seen, but you’ve probably seen things I have not.

                                1. 1

                                  Not really, in the sense that unsigned math is only slightly faster; however, it leads to using accurate data types and Data Oriented Design, making it type safe to organize your memory layout just so. In other words, the language cooperates with you rather than fighting you when you start taking performance seriously.

                                2. 2

                                  I confess this choice makes me very uneasy. I guess overflow traps in debug mode, but if one forgets to test that, in production we get the same kind of nasal demon I hate C for. Wrapping behaviour on the other hand makes it so much easier to test for overflow: we can do it after it happened. And some algorithms (ChaCha20, BLAKE2, SHA-2…) need wrapping semantics.

                                  Your example optimisation is cute, but brittle. I can’t infer a substantial enough benefit from that kind of thing. Do you have a more detailed rationale written somewhere?

                                  On the practical side, could I implement Monocypher in Zig? Can I distribute it as a Zig source file, let users chose their compiler flags, and still have access to the wrapping semantics some of my code needs? Failing that, does the default distribution model let me specify the compiler flags that must be used for my library?
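
                                  For reference, the post-hoc overflow check that wrapping semantics allow looks roughly like this (a minimal sketch, names mine):

                                  #include <stdint.h>
                                  #include <stdbool.h>

                                  /* Unsigned wraparound is well defined, so overflow can be detected
                                     after the addition; the same modular arithmetic is what ChaCha20,
                                     BLAKE2 and SHA-2 style code relies on. */
                                  static bool add_overflows(uint32_t a, uint32_t b, uint32_t *out)
                                  {
                                      *out = a + b;     /* wraps modulo 2^32 */
                                      return *out < a;  /* true iff the addition wrapped */
                                  }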

                                  1. 2

                                    Zig has operator variants that are defined to wrap or saturate, so you can gain access to easy non-UB behavior without any compilation flags. Postfix any math operator with % to make it wrapping, and (IIRC) | to make it saturating. So a +% b gives you the behavior you want. And these work for both signed and unsigned types, avoiding the casting dance you sometimes have to do in C.
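
                                    The C “casting dance” mentioned here looks roughly like this (a minimal sketch, names mine): do the arithmetic in unsigned, where wraparound is defined, then convert back.

                                    #include <stdint.h>

                                    /* Roughly what Zig's a +% b expresses directly for a signed type. */
                                    static int32_t wrapping_add_i32(int32_t a, int32_t b)
                                    {
                                        /* Converting an out-of-range value back to int32_t is
                                           implementation-defined in C, but two's complement in practice. */
                                        return (int32_t)((uint32_t)a + (uint32_t)b);
                                    }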

                                    1. 2

                                      Still not sure about UB being the default, but if I have the option this is no longer a deal breaker.

                                3. 1

                                  Clarity comes when you realize that the C committee is composed mostly of people who contribute to C compilers and C Standard Library implementations. And most of them have a veto button against any proposal they don’t like. They have to be conservative because some of the compilers represented in the committee are about as advanced as what a 90s grad student might’ve put together on a lark. So to fix anything, you must find a solution that is easy to implement for the crappiest of compilers, and even worse, will not cause backwards compatibility issues for anyone. So for instance, they can’t add any bigger integer types if that would cause intmax_t to be bigger, because hey, that already has a defined size in platform ABIs and those ain’t changing until we switch to RISC-6 or something.

                                4. 1

                                  I will take that “literally 100%” as humorous hyperbole because it is obviously not true. What I would like to see is a closer examination of how much optimization a compiler might lose if all UB was instead implementation defined behaviour.

                                  1. 4

                                    Much UB in C could be removed with little to no performance impact, but as Andrew says, some UB is utterly fundamental to the ability to optimize. Pretty much anything involving pointer provenance is a must, or your C implementation would be about as fast as a safety checking interpreter. Also, since C does not have a perfectly rigorous mathematical definition, not all UB is known, making it categorically impossible to define all UB, even if you wanted.

                                    If you want to really avoid all UB, you only have two choices: restrict the programmer, such as in Safe Rust, or put on them the burden to provide machine checkable proofs. (note: all useful Safe Rust code relies on at least some code using Unsafe Rust, most of which has been looked at very carefully by humans, but not been proven correct. Much work is going into various mechanisms for proving uses of unsafe safe, so hopefully this will improve over time.)

                                    1. 4

                                      It’s true. For example, even mem2reg, arguably the most basic optimization pass, depends on it being UB for stack variables to get modified from a pointer that wasn’t made from the stack variable directly.
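
                                      A minimal sketch (names mine) of the kind of code that rewrite relies on:

                                      /* x's address never escapes, so the compiler may keep it in a register
                                         and fold this to `return 3;`.  That is only legal because a store
                                         through p is not allowed to modify x; without that rule, any store
                                         through any pointer could clobber any stack slot. */
                                      int no_escape(int *p)
                                      {
                                          int x = 1;
                                          *p = 2;
                                          return x + 2;
                                      }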

                                    2. 1

                                      2’s complement has been standardized in C for a while

                                      Citation needed. I know the C2x standardized 2’s complement, but I’m not sure if any prior version did.

                                      1. 2

                                        C17 aka C18 allows sign-magnitude and one’s complement - see page 35 section 6.2.6.2

                                        1. 2

                                          To be clear, I did not mean that all integer types have been specified as such. What I meant is that types that are guaranteed to be 2’s complement and also UB on overflow have existed in C for a while now. Namely, the intN_t family of typedefs. Here is the relevant sentence from C11:

                                          The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation.

                                  2. 3

                                    No const. It serves no practical role in optimization, and I cannot recall an instance where it caught, or would have caught, a mistake…Dropping const has made me noticeably more productive by reducing cognitive load and eliminating visual clutter. I now believe its inclusion in C was a costly mistake…I’ll cast away the const if needed.

                                    I have a lot of sympathy with the author’s view here, since the language doesn’t generally allocate readonly memory, so the cognitive load of an edge case seems quite high.

                                    Currently string literals are readonly by default, and failure to declare arguments as const means functions can’t accept these literals. These could be cast away, but doing so brings back visual clutter.

                                    It’d also be possible to tell the compiler to put these in a non-readonly section in the binary, although that seems a little hazardous - do you really want your printf format strings to be writable? When memory is readonly, there’s at least a good reason for the compiler to complain if it’s modified, and the cast is suppressing a crash condition.

                                    1. 12

                                      I use const to signify intent—that this function won’t change the structure (or array) being passed in. For me, that’s less cognitive load than worrying about the function possibly changing data unexpectedly. I still use NULL because it’s visually distinctive, unlike a bare “0” in the code (I thought we weren’t supposed to use bare magic values anyway).

                                      To me, his coding style screams “Windows C++ programmer writing in C” but with less uppercase.

                                      1. 1

                                        (I thought we weren’t supposed to use bare magic values anyway).

                                        To me, 0, 1, and -1 are perfectly acceptable magic numbers, because they are generally tied to the structure of the surrounding code and rarely have a reason to change without the code around them changing as well. I also throw in the occasional 1024 to denote a kibibyte, but most of the time it ends up being used to define a constant anyway, like bufsize = 1024 * 32.

                                      2. 7

                                        I have a lot of sympathy with the author’s view here, since the language doesn’t generally allocate readonly memory, so the cognitive load of an edge case seems quite high.

                                        In most implementations, const variables with static storage duration are allocated in the read-only data section and are immutable. It is undefined behaviour to try to mutate them and the typical interpretation of that undefined behaviour is a segmentation fault.

                                        If you declare a variable const and then operate on it via functions that the compiler is able to analyse, it will often constant propagate all of the way through. If you don’t declare your arguments as const then converting to a non-const pointer requires an explicit cast. If you then mutate the object inside the call, the compiler won’t complain (if you mutate the variable inside the call and it’s declared const, the compiler will error).
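
                                        A minimal sketch (names mine) of both effects:

                                        /* Placed in the read-only data section by most implementations. */
                                        static const int table_size = 64;

                                        int folded(void)
                                        {
                                            return table_size * 2;   /* typically compiled as `return 128;` */
                                        }

                                        void broken(void)
                                        {
                                            *(int *)&table_size = 0; /* needs the cast to compile; undefined
                                                                        behaviour, usually a segmentation fault */
                                        }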

                                        1. 1

                                          In most implementations, const variables with static storage duration are allocated in the read-only data section and are immutable.

                                          The author is suggesting that it’s feasible to just never use const and have things work. It’s true that if you use const to make a static readonly, you have to deal with the consequences of having made it readonly, but that is circular. The reason string literals are interesting is because they are made const implicitly (at least on Visual C++.)

                                          If the other comment is right (that other compilers don’t do this), then the original author’s model becomes more viable: if you don’t use const, you don’t have to use const.

                                          1. 3

                                            If you don’t use const, then the compiler has to figure out that it can’t change. That’s a complex alias analysis problem and, if the variable isn’t static, not actually possible without LTO. If you do use const, the compiler can treat a read of a field of the object as a read of the value shown in the initialiser and constant propagate it.

                                            Claiming that const makes no difference in optimisation is the kind of thing that people say only if they’ve never looked at the output of their compiler. There was a nice talk posted here a couple of years back with someone using modern C++ features to target the 6502. They showed an example where a single const turned something from KiBs of code and data into a handful of instructions. That situation is not particularly rare.

                                        2. 5

                                          It would help if at least the standard library itself upheld const correctness. But it can’t! Consider strchr:

                                          char *strchr(const char *str, int c);
                                          

                                          This signature is const-incorrect, but the signature you’d really want is impossible to express in C: there is no way to say “the return type should be const iff the str argument is const”.

                                          strchr and many more fundamental library functions just silently launder away const-ness, making it more of a half-broken lint than a real type-system feature in practice.
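
                                          A minimal sketch (names mine) of the laundering in action:

                                          #include <string.h>

                                          void launder(const char *ro)
                                          {
                                              /* strchr accepts const char * but returns plain char *, so a
                                                 writable pointer into read-only data falls out with no cast. */
                                              char *rw = strchr(ro, 'x');
                                              if (rw) {
                                                  *rw = 'y';  /* no diagnostic; UB if ro points at truly read-only data */
                                              }
                                          }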

                                          1. 3

                                            This has been fixed in C23, strchr and other functions now magically have a return type depending on the argument type.

                                          2. 1

                                            Currently string literals are readonly by default, and failure to declare arguments as const means functions can’t accept these literals. These could be cast away, but doing so brings back visual clutter.

                                            This isn’t true. String literals are read-only, but they’re not const; compilers are required to let you pass them to things that want char *.

                                            1. 4

                                              A minor addendum: modern compilers can warn about this case when run with appropriate flags (e.g. -Wwrite-strings for GCC or Clang).

                                              There’s a whole bunch of warning-enablement flags that need to be used when compiling C, due to a culture of not wanting compiler upgrades to generate new warnings in existing codebases (regardless of accuracy).

                                              strlit.c: In function ‘main’:
                                              strlit.c:6:16: warning: passing argument 1 of ‘myfunc’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
                                                  6 |         myfunc("hello");
                                                    |                ^~~~~~~
                                              strlit.c:1:19: note: expected ‘char *’ but argument is of type ‘const char *’
                                                  1 | void myfunc(char *x) {
                                                    |             ~~~~~~^
                                              
                                              1. 2

                                                Yeah, I’m aware. -Wwrite-strings isn’t a real warning, though, it changes the type of string literals. Since discarding const is a constraint violation, -pedantic-errors (which everyone should use) turns it into an error. (It could also change a weird, but well-defined, program to have UB. e.g. typeof("hello") y; *(char *)y = 0;).

                                                I don’t mind adding warnings in general, but I think I would want -Wwrite-strings to be an exception.

                                              2. 3

                                                This isn’t true. String literals are read-only, but they’re not const; compilers are required to let you pass them to things that want char *.

                                                Huh, it’s const in C++ and I assumed it was also in C. I guess it must be almost 20 years since I compiled any C code that didn’t have ‘-Wwrite-strings’ enabled by something in the build system and I assumed it was part of the language by now. It turns out that you can write "hello"[4] = 'a'; and it compiles with no warnings, it just crashes at run time (writing to a string literal is UB, but the type pretends it’s mutable).

                                                I guess that this was because K&R C didn’t have const and retrofitting it to everything that took string literals was too painful at the time. There’s no good excuse for it still being allowed in C23, but I guess that’s another good reason to not go back to using C.

                                                1. 1

                                                  The reason to allow it today is that it’s still widely used. If you break most C programs are you still a C?

                                                  There are a handful of things in C that are obviously mistakes, and that would be easy to fix it you didn’t mind breaking code. I don’t know that there’d be any point. If my code is going to be broken anyway maybe I should just rewrite it in a language that tries to fix the hard problems with C, too.

                                                  1. 1

                                                    The reason to allow it today is that it’s still widely used.

                                                    Is it? I’ve not seen a C project that didn’t enable that warning for a good 15 years (which is why I thought it was part of the language).

                                                    If you break most C programs are you still a C?

                                                    If they modify the string, they’re already broken (they’ll abort at run time). If they don’t, then adding const is a trivial mechanical fix.

                                                    C23 removed K&R function definitions for a similar reason. Any use of them is almost certainly an error and it’s easy to fix by adding void if that’s what you actually meant. I’ve seen far more K&R function definitions than things assigning string literals to non-const variables / parameters in the last few years.

                                                    1. 2

                                                      If they modify the string, they’re already broken (they’ll abort at run time).

                                                      If you’re running on an embedded platform (no MMU) and your .ro section is in RAM then you won’t get an error. If you’re running in a single-threaded environment, it’s fine to mutate strings as long as you restore them later. For example,

                                                      void *ret;
                                                      char *dot = strchr(name, '.');
                                                      
                                                      /* Only use the part before the dot */
                                                      if (dot)
                                                              *dot = '\0';
                                                      ret = hm_get(hash_map, name);
                                                      
                                                      if (dot)
                                                              *dot = '.';
                                                      return ret;
                                                      

                                                      Of course, the best approach is to add an hm_getn which takes a length.

                                                      1. 2

                                                        If you’re running on an embedded platform (no MMU) and your .ro section is in RAM then you won’t get an error

                                                        That’s true (well, except on the one that I’m responsible for, where we have no MMU but still give you read-only rodata via capability permissions), but you will be deeply surprised when the value of a constant changes. Especially since the linker may do suffix merging and so you might have just changed the value of a completely unrelated string that happens to be a suffix of this one, which might be used in the hm_get function. Oh and your compiler may assume that, because modifying a string literal is undefined behaviour, name doesn’t alias any string literal and so any address comparisons between name and a string literal will be false and your hash map gets optimised away entirely.

                                                        1. 1

                                                          Oh and your compiler may assume that, because modifying a string literal is undefined behaviour, name doesn’t alias any string literal and so any address comparisons between name and a string literal will be false and your hash map gets optimised away entirely.

                                                          Well, given that (in the omitted context) name is a function parameter, if the function containing the above block isn’t static, then unless you have LTO on, the compiler can’t tell whether name is a string literal or something else.

                                                          Especially since the linker may do suffix merging and so you might have just changed the value of a completely unrelated string that happens to be a suffix of this one, which might be used in the hm_get function.

                                                          Just don’t access strings in your hash function :)

                                                        2. 2

                                                          Some embedded platforms have a memory protection unit, like an MMU with access control but no remapping. e.g. this epic hack to attach QSPI RAM to an RP2040 which relies on the MPU to make things work.

                                                        3. 2

                                                          If they modify the string, they’re already broken (they’ll abort at run time). If they don’t, then adding const is a trivial mechanical fix.

                                                          So modulo Forty-Bot’s thing, yes the first thing is true, but the second part is a little bit more challenging because the same pointer can be used for both read-only and read-write things. I’m pretty sure I saw this in ircd, but if it’s all the same to you I’d rather not go looking. (I’ve also done it in ircd, but I used a const pointer and subtracted it twice from a non-const pointer that wouldn’t exist unless the thing was writable. That thing does weird things to your brain.)

                                                          1. 4

                                                            To me, that’s a really bad code smell. If you have a thing that you are not mutating, it can be const. If you have a thing that you are mutating, you should never assign an immutable object (or one that, via your API contract, you have promised not to mutate) to it. Even if code like that doesn’t contain bugs now, it is a shape that is very easy for the next person to introduce because they have to notice (or read in a comment) that mutating this thing that you hold a mutable reference to is not allowed.

                                                2. 2
                                                  • Better is different.
                                                  • Different is worse.
                                                  • QED. Better is worse.
                                                  1. 1

                                                    For a while I capitalized type names as that effectively put them in a kind of namespace apart from variables and functions, but I eventually stopped. I may try this idea in a different way in the future.

                                                    Lower-case type names are one of the biggest mistakes in C and C++ that affect readability. I wonder what the “different way” he might try in the future is.

                                                    1. 1

                                                      The header-only library that I’d love to see for C++ is one that created a Standard namespace and imported everything in std with the type names capitalised. Unfortunately, a load of the types are properties of objects, so I don’t think it’s actually feasible.

                                                    2. 1

                                                      Allow me to add one more thing I hate that the author didn’t mention: the alignment of variable/function names.

                                                      What happens if you have a type & qualifier that’s longer than what’s already there and you need to move the variable/function alignment further right? Do you add more spaces to all the variables/functions to keep them aligned?

                                                      1. 2

                                                        I know a lot of C programmers do not like giving up vertical space, but some years ago I started putting the return type on its own line before the name in function declarations/definitions:

                                                        static unsigned int
                                                        foo(const char *s, int n);
                                                        

                                                        It has the benefit that when you are scrolling through a file the function names are always in the same place (also, you can search for ^foo).

                                                        1. 4

                                                          The BSD style does this, for the latter reason: you can grep with a start-of-line pattern to find the definition. It’s less important with things like clangd cross-referencing your code and providing jump-to-definition in your editor, but still quite useful.

                                                        2. 1

                                                          What a total horror show. I already wouldn’t trust any significantly large C codebase, but now I also have disdain for the people choosing to write the stuff.