Threads for drobilla

  1. 9

    This seems like a misplaced rant against coworkers who don’t format things with trailing commas like the author would like, and weird tooling issues (I certainly don’t run clang-format via Docker and can’t imagine why I would), rather than much of a criticism of clang-format itself. Even the “self-evident” example is a format that clang-format can and does use! Eliminating the comma “trick” would just mean you have even less control over formatting, which contradicts the point.

    The main reason people (myself included) like machine-dictated formatting is to reduce/eliminate time wasted on bikeshedding. If the main issue is that developers won’t place a comma to your liking, not using an automatic formatter certainly isn’t going to make anything better: that just means there are many, many more things that other developers can do that isn’t to your liking.

    This sort of thing easily gets toxic even in corporate environments where developers are expected to deal with their coworker’s nitpicks (because they get paid at least in part to do so), and is practically a non-starter in an open source context where you simply are not going to get submissions in exactly the style you want. You can then either manually reformat it all yourself, or berate people for it, certainly just driving potential contributors away. That is the main point of automated formatting, not “it always makes the code as pretty as possible”. clang-format is popular because it’s more or less Good Enough for everyone to tolerate, not because it’s anyone’s ideal, and having a tool to do that job eliminates a ton of wasted time which can be spent on something actually important.

    1. 1

      I think you’re trying to misunderstand. I’m trying to address an industry-wide problem, and I don’t like bikeshedding.

      The trailing comma is just a detail that happens to work in curly braces (which I’m glad for), but doesn’t apply anywhere else, like function arguments, so it’s not like the “self-evident” format is supported everywhere.

      1. 2

        To me, it sounds like you are saying: “I have strong opinions on the right way C++ code should be formatted, and I am unable to force clang-format to use it, so therefore, don’t use it.”

    1. 2

      Seems like a lot of situational (“nuclear apocalypse”) roleplaying to say that OpenBSD has better manpages than Linux. With FreeBSD my experience has been that the manpages are about equally terse as Linux, but the Handbook makes the system a joy to read about and use. Are OpenBSD’s manpages that much better? xinit has the same manpage on OpenBSD as it does on Ubuntu. mail has a good manpage. Is there a good example of these differences between Linux and OpenBSD in a manpage? I’d love to see it.

      OpenBSD was created for free men like you, enjoy it.

      Lol

      1. 2

        Is there a good example of these differences between Linux and OpenBSD in a manpage? I’d love to see it.

        I find OpenBSD’s manpages better written. Some examples I routinely consult (OpenBSD vs. Linux):

        As you will notice above, some of OpenBSD’s manpages are shorter than Linux’s (or GNU’s), notably awk(1) vs. gawk(1). On the other hand, awk(1) links to script(7), which, as far as I know, doesn’t have an equivalent on Linux.

        However, in general, I tend to find the information I want in OpenBSD’s manpages much quicker than in Linux’s, and I usually find the explanations better. And the fact that almost all of them have an Examples section is very handy.

        As a side effect of their readability, I tend to read OpenBSD’s manpages more often, and more thoroughly, than I used to on other systems (including other BSDs).

        I wonder whether mdoc(7) is the reason OpenBSD’s manpages are so uniformly better than their equivalents.

        P.S.: I linked to Ubuntu’s manpages instead of man7.org’s because man 1 ed on my (Ubuntu) system shows the same telegraphic manpage linked above, while man7.org’s version is the POSIX Programmer’s Manual page, which isn’t even installed on my system.

        1. 3

          I wonder whether mdoc(7) is the reason OpenBSD’s manpages are so uniformly better than their equivalents

          Having somewhat recently switched to mdoc from classic/Linux man macros, it certainly doesn’t hurt. It’s dramatically better in every way. Semantic markup, built-in decent HTML output, tool-enforced uniformity, etc. It shows that it was built by people who actually think man pages are good.

          1. 1

            I decided to go a little further and compare equivalent manpages that exist only on OpenBSD against those that exist only on Linux, prompted by your comment about the manpage of xinit(1) being the same on OpenBSD and Linux. It is the same because xinit is imported code on both systems, so it makes sense that it matches. (The same happens, for example, with tmux(1).)

            So I compared OpenBSD’s ktrace(1) against Linux’s strace(1), and, in my opinion, it illustrates the difference between those systems well: the tools provide the same functionality, but the manpage of the latter is overwhelming and abstruse compared to that of the former.

            Thus, I think OpenBSD’s manpages are a great example of its KISS attitude, without sacrificing completeness of information.

          2. 2

            I’m not sure to what degree it’s enforced, but OpenBSD used to refuse to merge any change affecting an interface that didn’t come with updates to the man page. FreeBSD was never quite as aggressive, and most Linux things don’t come close to either. For example, consider something like _umtx_op, which describes the FreeBSD futex analog. Compare this to the futex man page and you’ll notice two things: first, it’s a lot less detailed; second, it has an example of a C wrapper around the system call that isn’t actually present in glibc or musl. OpenBSD’s futex man page isn’t that great either: there are a bunch of corner cases that aren’t explicit.

            Or kqueue vs. epoll: the pages are a similar length, but I found the kqueue one was the only reference I needed, whereas the epoll one had me searching Stack Overflow.

            The real difference between *BSD and Linux is in the kernel APIs. For example, let’s look up how to do memory allocation in the kernel. FreeBSD has malloc(9), OpenBSD has malloc(9), both with a description of the APIs. The level of detail seems similar. Linux has no kmalloc man page.

          1. 4

            There are a lot of difficult trade-offs here, but IMO the clear winner for “functionalish thing that will still be around more or less as-is for the indefinite future” is Scheme. Probably R5RS specifically.

            It just is what it is. The spec is quite small and understandable. Countless people have implemented it, in various ways, to various degrees of completion and/or quality (no doubt several people reading this have had a go at some point). It’s pretty aggressively distilled down to its essence and has a philosophy of not bolting on genuine new language features unless it’s completely unavoidable.

            Of course, it’s Scheme, so… yes, parentheses, and dynamically typed, and many of the above things that make it a good contender for this “long-lived stability” criterion are also the reasons why it’s not the most practical language to use, and there is quite a lot of implementation variability if you’re not careful to restrict yourself to the actual standard, and and and…

            Still though, I have a very hard time imagining a future world where one couldn’t grab a copy of SICP or whatever and fire up some ~r5rs environment or another and get on with programming in much the same way they could have any number of decades in the past, much like C. Scheme is forever, and the almost religious philosophy of not adding special case syntax and piling on complexity to solve problems keeps it that way. One could make similar arguments for many other languages, especially those with a more academic background, but Scheme seems relatively unique in being something defined precisely in what are more or less timeless papers, while also being widely implemented and not a dead language in practice (like, say, SML).

            Being a dead language could be considered an advantage here (as another thread is diving into), but I don’t think that is universally true. A living language community doesn’t necessarily mean that the language itself will tend to break. That’s at least partially an outcome of the language design itself. Love it or hate it, the hyper-minimalist “lambda is all you get” anti-syntax s-expressions thing of the Scheme world keeps that at bay much more than, say, in the ML-derived world.

            Long-term language stability, I think, mainly derives from having a relatively small set of core concepts that everything else is built on top of. You can learn and understand more or less everything that’s in C, or in Scheme, now or 30 years ago or 30 years from now and they’ll still be more or less the same as they’ve always been. It’s the “concept” part that really clarifies this for me: you can even see it in the discussions in programming language communities. C and Scheme forums almost never have topics about entirely new concepts in the language (beyond the superficial anyway, I’m using a sort of capital-C “Concepts” here), the language just is what it is. They are also very stable languages. Meanwhile, C++ and Rust forums are constantly packed with whole new concepts for the language, or why existing ones are bad, or (etc etc etc). They are notoriously unstable languages. This is not a coincidence.

            I guess that rambling rant boils down to: find a language where the answer to “what fundamental or syntactic constructs is the community likely to try and add to the language in the future?” is “probably none at all”.

            1. 5

              This is silly/obvious (and, as others have pointed out, probably a straw-man motivated by self-interest). If you want to do this, obviously MIT/BSD/etc isn’t the license you use for the “open edition”.

              That’s why it is, somewhat ironically, much easier to get code released as GPL in most organizations, despite those same organizations being strictly anti-GPL for others’ code they want to use. That tells you all you need to know if you want to make such organizations pay you to use your open source software, really…

              1. 2

                The C99 designated initialiser version also works well in combination with C++11 default initialisers. For example:

                struct Options
                {
                        bool flag = true;
                        int count = 42;
                };
                
                void doThing(Options = {});
                

                This can be called as any of the following:

                        doThing();
                        doThing({.flag = false});
                        doThing({.count = 12});
                        doThing({.flag = true, .count = 12});
                

                Clang and GCC both accept this in C++14 mode or later. You can add new fields to Options without breaking the API, but if you want to preserve the ABI, you can take advantage of overloading: add a new overload with an Options2 structure that adds more things, and remove the old version from the header. Existing code will call the old overload, which can then forward to the new one; new code will only see the new overload and so will call that.
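
                As a hedged sketch of that evolution trick (Options2, the field names, and the return values are invented for illustration, not from any real API):

                ```cpp
                // Illustrative sketch: preserving ABI by overloading on a grown struct.
                struct Options
                {
                        bool flag = true;
                        int count = 42;
                };

                struct Options2 // the replacement; only this appears in the new header
                {
                        bool flag = true;
                        int count = 42;
                        int verbosity = 0; // field added later
                };

                // New overload: newly compiled code sees and calls this one.
                int doThing(Options2 o)
                {
                        return o.count + o.verbosity; // stand-in for real work
                }

                // Old overload kept (out of line) so existing binaries still link;
                // it simply forwards to the new one.
                int doThing(Options o)
                {
                        return doThing(Options2{o.flag, o.count});
                }
                ```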

                I use something similar to this in snmalloc at the class, rather than function, level. Each back end is required to provide an Options static constexpr field that describes the features that it supports. Implementations that want non-default behaviour just need to specify the flags that specify the non-default behaviour that they want. The individual flags can be queried by if constexpr, so the allocator can be specialised at compile time based on the behaviour that the back end requests.
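
                A minimal sketch of that pattern (the names here are invented, not snmalloc’s actual API):

                ```cpp
                // Invented illustration: a back end declares its features as a
                // constexpr Options field, which the allocator queries with if constexpr.
                struct BackendOptions
                {
                        bool use_guard_pages = false;
                        bool zero_on_free = true;
                };

                struct DefaultBackend
                {
                        static constexpr BackendOptions Options{}; // all defaults
                };

                struct HardenedBackend
                {
                        static constexpr BackendOptions Options{.use_guard_pages = true};
                };

                template <typename Backend>
                int free_path()
                {
                        // The untaken branch is discarded at compile time.
                        if constexpr (Backend::Options.use_guard_pages)
                                return 1; // guard-page code path
                        else
                                return 0; // plain code path
                }
                ```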

                From another lobste.rs story today, there’s another proposal for named parameters that seems to be getting some traction.

                1. 1

                  It’s pretty nice, actually; I’d consider using it in real code. Having to define a separate struct for the parameters is clunky, but the call-site is pretty ergonomic, aside from the need for braces.

                  The declaration could be cleaned up a bit with some preprocessor abuse, but that’s getting pretty hacky and would probably confuse IDEs.

                  1. 2

                    I also quite like it (and it’s about time C++ finally added designated initializers), but in C++ the initializers must appear in the same order as the members in the struct, which isn’t great for named-parameter ergonomics.
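
                    A quick illustration of the restriction, with a hypothetical struct:

                    ```cpp
                    // C++ designated initializers must follow declaration order.
                    struct Point
                    {
                            int x = 0;
                            int y = 0;
                    };

                    Point a{.x = 1, .y = 2};   // OK: declaration order
                    // Point b{.y = 2, .x = 1}; // error in C++ (valid in C99): out of order
                    Point c{.y = 5};           // OK: fields may be skipped, but order is preserved
                    ```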

                1. 17

                  So it surprised me to learn that some (many?) folks in the open source and academic world hate -Werror with passion.

                  That’s because the open source world deploys source code to the entire world. You cannot possibly come even close to covering all the systems people will try to build your code on, so making the mistake of having -Werror on by default means you get a ton of feedback about things not building for unimportant reasons. That gets old very quickly. Warning flags are also quite volatile over time. In short, the in-house proprietary way of thinking just doesn’t work here, because the deployment scenario is entirely different, and you have no control over it.

                  Since this is a “strong opinions, loosely held” blog, I’ll follow suit:

                  The objectively correct approach here is to have a developer mode which enables ultra strict warnings (far more than mentioned in the post) as errors, and use those on CI, but have the default be a more conservative set of warnings with broad compatibility, and without Werror. That way the code quality is as strict as you like (which is, after all, the point), but you aren’t breaking things for users because of some silly warning. No good comes from that - the feedback is almost always useless, and the initial impression of your software is “it doesn’t even compile”.

                  1. 3

                    a ton of feedback about things not building for unimportant reasons

                    And specifically, this usually happens when building with a newer or just a different compiler. (So many projects just test on whatever gcc they have; every time I upgrade clang I get new stupid -Werror failures…)

                    Werror as-is is a complete disaster. “Only use this for development” flags inevitably end up in shipped build systems. No flag should be this fragile.

                    If only compiler developers got together and agreed on common warning sets. If one could say -Werror=2021 and any future version of clang and gcc interpreted this as “enable whatever warnings we agreed on in 2021”, it would be usable.

                    1. 3

                      Yeah, I suspect people with the luxury of only “supporting” some small set of compilers/versions really underestimate how hard it is to get a warning-free build across a vast swath of versions, especially if you’re being strict about it, and double plus especially if you’re using Weverything in clang with explicit exceptions (which is obviously absolute madness in conjunction with default Werror, but I’ve seen it…).

                      “Only use this for development” flags inevitably end up in shipped build systems. No flag should be this fragile.

                      Meh. Here I think I disagree. Any build system used to deploy code to users needs to have some kind of configuration mechanism, and if you have to actively opt-in to warnings being errors, well… you specifically asked for warnings to be errors, so of course they are? Maybe I’ve been lucky, but I’ve never really been bothered by people doing this.

                      That said, flag stability is annoying, but I think that’s a different problem: -Werror just does what it says on the tin. Compiler authors could never fully agree on such a thing; it’s more or less equivalent to agreeing on a common set of implementation-specific details. The current system of GCC and clang mostly agreeing where possible is as good as that’s going to get, I’d say. Even if they agreed on a common subset, you’d end up using the extra ones anyway (because many are useful but compiler-exclusive), and we’re back to the same problem. MSVC is off in another universe entirely, but it always is.

                      1. 1

                        If one could say -Werror=2021

                        You can! You just have to specify which warnings those are:

                        -Werror=format-security -Werror=switch -Werror=maybe-uninitialized …
                        

                        This decouples the “which warnings are errors” question from the general warning level question, which is subject to those volatile warning categories (like -Wall, -Wextra, -Wpedantic). I think this is the only sane way to use -Werror, at least as a default build option in open source.

                        1. 1

                          Sure, but aside from having an ugly loooooong list in CFLAGS, the problem is that I don’t know of a resource that answers questions like “give me the warnings supported by both the N last gcc versions and the last N clang versions”. At least having that as a website would be something.

                    1. 104

                      I’m not a big fan of pure black backgrounds, it feels a bit too « high contrast mode » instead of « dark mode ». I think a very dark gray would feel better to the eye. n=1 though, that’s just a personal feeling.

                      Thanks for the theme, it’s still great!

                      1. 29

                        Agreed, background-color: #222 is better than #000.

                        1. 15

                          I’ll just put my +1 here. The pure black background with white text isn’t much better than the opposite to me (bright room, regular old monitor). I’ve been using a userstyle called “Neo Dark Lobsters” that overall ain’t perfect, but is background: #222, and I’ll probably continue to use it.

                          On my OLED phone, pure black probably looks great, but that’s the last place I’d use lobste.rs, personally.

                          1. 18

                            Well, while we’re bikeshedding: I do like true black (especially because I have machines with OLED displays, but it’s also a nice non-decision, the best kind of design decision), but the white foreground here is a bit too intense for my taste. I’m no designer, but I think it’s pretty standard to use significantly lower contrast foregrounds for light on dark to reduce the intensity. It’s a bit too eye-burney otherwise.

                            1. 7

                              You have put your finger on something I’ve seen a few times in this thread: The contrast between the black background and the lightest body text is too high. Some users’ wishes to lighten the background are about that, and others’ are about making the site look like other dark mode windows which do not use pure black, and therefore look at home on the same screen at the same time. (Both are valid.)

                              1. 10

                                For me, pure white on pure black is an accessibility nightmare: that high contrast triggers my dyslexia and the text starts to jump around, which induces migraines.

                                As I default to dark themes system-wide and couldn’t find a way to override the detected theme, this site is basically unusable for me right now. Usually in these cases I just close the tab and never come back; for this site I decided to type this comment before doing that. Maybe some style change will happen, a manual override will be implemented, or maybe I’ll care enough to set up a user stylesheet… but otherwise my visits will stop.

                                1. 1

                                  No need to be so radical; you still have several options. Not sure what browser you’re using, but Stylus is available for Chrome/FF:

                                  https://addons.mozilla.org/en-US/firefox/addon/styl-us/

                                  It allows you to override the stylesheet for any website with just a few clicks (and a few CSS declarations ;))

                                  1. 9

                                    I don’t mind the comment. There’s a difference between being radical because of a preference and having an earnest need. Access shouldn’t require certain people to go out of their way on a per-website basis.

                                    1. 6

                                      It’s not radical, it’s an accessibility problem.

                              2. 8

                                That’s great, thank you.

                                I wonder if I am an outlier in using the site on my phone at night frequently. Alternatively, maybe we could keep the black background only for the mobile style, where it’s more common to have an OLED screen and no other light sources in your environment.

                                1. 2

                                  I don’t use my phone much, especially not for reading long-form content, so I wouldn’t be surprised if I was the outlier. That sounds like a reasonable solution, but it’s not going to affect me (since I can keep using a userstyle), so I won’t push either way. I will +1 the lower-contrast comments that others have posted if it remains #000, though: the blue links are intense.

                                  1. 1

                                    The blue link color brightness is a point that not many have made. I think the reason I didn’t expect it is that I usually use Night Shift on my devices, which makes blue light less harsh at night. Do you think we should aim to solve this problem regardless of whether users apply nighttime color adjustment? Another way to ask this question: What do you think about dark mode blue links in the daytime?

                                    1. 2

                                      Sorry if I’m misunderstanding, but to clarify, my above comment is in a bright room; I try to avoid looking at screens in dim light/darkness. The blue links just look kind of dark, and intensely blue. Just a wee reduction in saturation or something makes it easier to read.

                                      Thanks for your work on this, btw. I looked into contributing something a while back, but was put off after it looked like the previous attempt stalled out from disagreement. I’d take this over the bright white any day (and it turns out this really is nice on my phone, dark blue links notwithstanding). The CSS variables also make it relatively easy for anyone here to make their own tweaks with a userstyle.

                                      I feel like I’ve taken up enough space complaining here, so I’ll leave a couple nitpicks then take my leave: the author name colour is a little dark (similar to links, it’s dark blue on black), and the byline could do with a brightness bump to make it more readable, especially when next to bright white comment text.

                                      1. 1

                                        I appreciate the clarification and other details :)

                                  2. 1

                                    My laptop is OLED, and I’d still appreciate #000 there.

                                    1. 1

                                      +1 to separate mobile style.

                                  3. 4

                                    I strongly agree.

                                    I can’t put my finger on why, but I find very dark gray easier.

                                    1. 1

                                      #222 is way better! thank you

                                    2. 14

                                      I strongly disagree, and this black background looks and feels great to me! No one can ever seem to agree on the exact shade or hue of grey in their dark themes, so if you have the general UI setting enabled, you end up with a mishmash of neutral, cooler, hotter, and brighter greys that don’t look cohesive at all. But black is always black!

                                      For lower contrast, I have my text color set to #ccc in the themes I have written.

                                      1. 6

                                        Another user pointed out that pure black is pretty rare in practice, which makes this site stand out in an environment with other dark mode apps:

                                        Here’s a desktop screenshot with lobste.rs visible: notice that it’s the only black background on the screen.

                                        Does that affect your opinion like it did mine? I do see value in pure black, but suppose we treated the too-high-contrast complaint as a separate issue: Darkening the text could make the browser window seem too dim among the other apps.

                                        1. 3

                                          I prefer the black even in that scenario. The contrast makes it easier to read imo.

                                          1. 2

                                            Not at all. If it gets swapped out for grey, I will simply go back to my custom CSS, which I have used to black out most of the sites I visit, so no hard feelings.

                                        2. 8

                                          Feedback is most welcome! Would you please include the type of screen you’re using (OLED phone, TFT laptop…) and the lighting environment you’re in (dark room, daytime indoors with a window, etc.)? And do you feel differently in different contexts?

                                          I’ve got some comments about how I selected the colors in the PR, if that helps anyone think through what they would prefer.

                                          1. 4

                                            Sure! I’m on my iPhone 12, so an OLED phone. I tried it with dimmed lights and in the dark, but in both cases I think I’d prefer a lighter background color.

                                          2. 7

                                            I disagree. Black is black. These off-gray variants just look dirty and wrong to me.

                                            I love this theme.

                                          1. 9

                                            I have a different take on the long lines (but with the same conclusion). One line per paragraph essentially means that diffs, and therefore revision control, are unusable. One line per sentence (or perhaps per clause; it doesn’t need to be strict) is a convention in areas where people work on text together (such as writing papers in LaTeX) for good reason.

                                            If that’s not a concern, sure, wrapping things to look nice in a text editor as the author wants to do here is reasonable. Which of these two makes the most sense depends on the context. Gemini’s approach is the worst of both worlds and does not make sense in any context. It’s just wrong.

                                            1. 6

                                              Sounds like our diffing and merging tools are too basic. There is no great reason for line-based merging other than that it seems to work well enough for code and reduces the computational complexity (mostly in pathological cases).

                                              I much prefer soft wrapping because it looks nice whatever the current size of my editor or browser window happens to be.

                                              1. 8

                                                Fair enough, but what nearly all actually existing tools do seems like a pretty significant consideration for a platform that exists for the sole purpose of being simple…

                                                1. 3

                                                Well, git diff supports word diffs out of the box (--word-diff).

                                                And of course, very few tools support rendering hard-wrapped lines nicely. The options are basically either having a screen that is larger than what the author wrapped to, or rendering as Markdown or HTML.

                                                  So it seems like you need to pick between nice viewing and nice merging with the current tools. For me viewing has generally been more important.

                                              2. 3

                                                Is your issue that two people editing one paragraph will always be a merge conflict, specifically?

                                                1. 2

                                                  Yeah, that’s the big one. Although it also makes diffs a nightmare to read in general.

                                                  1. 2

                                                    Markdown has the same wrapping behavior. And systems that handle text not written by programmers (wikis, blogs, CMSs) tend to assume there will be lines of any length, because nobody but programmers uses hard line breaks anymore.

                                                    I’ve seen many tools that display reasonable diffs of single-line paragraphs, for example GitHub. Usually they use color or strike through to indicate the diffs within one paragraph.

                                                    1. 1

                                                      Well, it makes git diffs in the default output style a nightmare to read, and it seems a little unfair to hold Gemini responsible for that. I’m curious: in what sort of workflow would you be reading diffs of gemtext?

                                                      1. 7

                                                        It also seems a little unfair to hold HTTP/HTML responsible for advertising or whatever, but here we are :)

                                                        I’m curious, in what sort of workflow would you be reading diffs of gemtext?

                                                        Software documentation seems the most likely, but indeed: probably not many. My point is really that Gemini’s approach is bad for this and for wrapped text (e.g. at 80 columns). It’s just plain bad.

                                                  2. 2

                                                    I was against the use of long lines for paragraphs in the original discussions of it on the Gemini mailing list. The problem is that there were basically two different approaches to handling running text containing newlines on offer, which I will call the Markdown approach and the Gopher approach. The Markdown approach means separating paragraphs by blank lines, and then on the client un-wrapping and re-wrapping paragraphs to the available display area. This is good, but it means more client complexity (more about this in a minute). The Gopher approach is to hard-break lines, and separate paragraphs by blank lines, but then never re-format, which means that if the client is too narrow for 80 character lines, you’ll get even worse ragged edges than the other way around.

                                                    Now, it’s not hard at all to unwrap and re-wrap paragraphs, but you need to keep track of the beginning and end of a paragraph and do the wrapping before you display, which means that you can’t parse line-by-line. One easy thing about the gemtext format is that you can always tell how to display a line by looking only at the first three characters. The only state you ever need to keep in your parser is whether you’re inside a literal block or not, but even within a literal block, you can display line-by-line.
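
                                                    The line-by-line dispatch described above can be sketched roughly like this (a loose illustration of the gemtext line types, not a complete parser):

                                                    ```cpp
                                                    #include <string>

                                                    // Rough gemtext dispatch: the display style of a line is decided
                                                    // by a short prefix, so a client can render line-by-line. The only
                                                    // parser state is whether we are inside a preformatted block.
                                                    std::string line_type(const std::string& line, bool& preformatted)
                                                    {
                                                            if (line.rfind("```", 0) == 0) // toggle literal mode
                                                            {
                                                                    preformatted = !preformatted;
                                                                    return "pre-toggle";
                                                            }
                                                            if (preformatted) return "pre";
                                                            if (line.rfind("=>", 0) == 0) return "link";
                                                            if (line.rfind("#", 0) == 0) return "heading";
                                                            if (line.rfind("* ", 0) == 0) return "list-item";
                                                            if (line.rfind(">", 0) == 0) return "quote";
                                                            return "text";
                                                    }
                                                    ```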

                                                    Early Gemini sites took the Gopher approach, which meant they were broken on phones. The consensus of the group was that long lines and soft wrapping solved that problem without the complexity of the Markdown approach. I don’t know if this was the right decision or not, but it’s the one that stuck.

                                                  1. 8

                                                    GCC has -Wstack-usage=BYTES and -Wframe-larger-than=BYTES warnings which can be used to put some guardrails around stack usage at compile time.

                                                    They aren’t perfect, but I’ve started setting them (usually to the lowest power of 2 that still works) in new projects, and found it handy that they will say “hey, that thing you just did increased the stack usage significantly” when necessary, while being easy to adjust or turn off and not requiring any elaborate mechanism in the code to attempt to do this dynamically.
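As a sketch of what that looks like in practice (the function and threshold here are invented for illustration; the flag names are from the GCC manual): compiling the code below with `-Wframe-larger-than=1024` makes GCC flag the oversized stack buffer at compile time.

```cpp
// Compile with: g++ -c -Wframe-larger-than=1024 big_frame.cpp
// GCC emits: warning: the frame size of N bytes is larger than 1024 bytes
#include <cstring>

int checksum(const char* src) {
    char buf[4096];  // 4 KiB on the stack: trips the 1024-byte guardrail above
    std::strncpy(buf, src, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    int sum = 0;
    for (const char* p = buf; *p; ++p) sum += *p;
    return sum;
}
```

Adding `-Werror` turns the guardrail into a hard failure, which is what makes it useful in CI.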

                                                    1. 5

                                                      I am a big fan of coverage, but feel that a lot of the debate around the practice largely misses the point(s). So, while I agree that complete or high coverage does not automatically mean that a test suite or the software is good… of course it doesn’t? In the extreme, it’s pretty trivial to reach 100% coverage without testing the actual behaviour at all.

                                                      Coverage is useful for other reasons, for example the one this article ends with:

                                                      Our results suggest that coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness.

                                                      Identifying under-tested parts of a program seems like a pretty important part of a testing strategy to me. Like many advantages of coverage, though, you have to have pretty high coverage for it to be useful. There are other “flavours” of this advantage that I find useful all the time, most obviously dead code elimination. High test coverage at the very least signals that the developers are putting effort into testing, and checking that their testing is actually hitting important pieces of the code. Maybe their test suite is in fact nearly useless, but that seems pretty unlikely, and it could be nearly useless without coverage, too. That said, like any metric, it can be gamed, and pursuing the metric itself can easily go wrong. Test coverage is a means to many useful ends, not an end unto itself.

The quest for 100% may be a bit of a wank, but I’ve tried that in a few projects before and actually found it quite useful. In particular it highlights issues with code changes that affect the coverage of the test suite in a very simple way. Day-to-day, this means that you don’t need to meticulously pore over the test suite every time any change is made to make sure that some dead code or dead/redundant branches weren’t added. If you don’t have total coverage, doing that is a chore. If you do, it’s trivial: “oh, the number is not 100% anymore, I should look into why”. I regularly end up significantly improving the code during this process. It’s undeniably a lot of work to get there (depending on the sort of project), but once you do, there are a lot of efficiency benefits to be had. If the project has platform-specific or -dependent aspects, then this is even more useful in conjunction with a decent CI system.

                                                      As to the article itself, the methodology here seems rather… convenient to me:

                                                      • Programs are mutated by replacing a conditional operator with a different one. This mutation does not affect coverage (except perhaps branch coverage, in exactly one case, if you’re replacing > with >= as they are here). It also hardly seems like a common case.

                                                      • The effectiveness of the test suite as a whole is determined by running random subsets of the tests and seeing if they catch the bug. This is absurd. Test suites are called test suites for a reason. The instant you remove arbitrary tests, you are no longer evaluating the effectiveness of the test suite, full stop. You are - obviously - evaluating the effectiveness of a random subset of the test suite. Who cares about that?

                                                      Am I missing something? In short, given this methodology, the only things these results seem to say to me is: “running a random subset of a test suite is not a reliable way to detect random mutations that change one conditional operator to another”. I don’t think this is at all an indicator of overall test suite effectiveness.

                                                      That said, I have not read the actual paper (paywall), and am assuming that the summary in the article is accurate.

                                                      1. 4

                                                        I also find coverage extremely valuable for finding dead or unreachable code.

                                                        I frequently find that unreachable code should be unreachable, e.g. error-handling for a function that doesn’t error when provided with certain inputs; this unreachable-by-design error handling should be replaced with panics since reaching them implies a critical bug. Doing so combines well with fuzz-testing.
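A tiny sketch of that pattern (function and contract invented for illustration): a helper that is only ever called with pre-validated input replaces its dead error path with an assertion, so a fuzzer that somehow reaches it fails loudly instead of being silently "handled":

```cpp
#include <cassert>

// Hypothetical helper: callers are required to pass a character already
// validated as an ASCII digit, so the "not a digit" path is unreachable
// by design. Asserting both documents and enforces the contract; under
// fuzzing, a violated assumption crashes immediately at the source.
int digit_value(char c) {
    assert(c >= '0' && c <= '9');
    return c - '0';
}
```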

                                                        It’s also useful for discovering properties of inputs. Say I run a function isOdd that never returns true and thus never allows a certain branch to be covered. I therefore know that somehow all inputs are even; I can then investigate why this is and perhaps learn more about the algorithms or validation the program uses.

                                                        In other words, good coverage helps me design better programs; it’s not just a bug-finding tool.

                                                        This only holds true if I have a plethora of test cases (esp if I employ something like property testing) and if tests lean a little towards integration on the (contrived) “unit -> integration” test spectrum. I.e. only test user-facing parts and see what gets covered, and see how much code gets covered for each user-facing component.

                                                        1. 1

                                                          This matches my experience very well. Good point that the sort of test suite is relevant here. I get the impression that the article is coming from more of a purist unit-testing perspective, but this dead code elimination thing is mostly useful when you have a pretty integrated test suite (I agree that this axis is largely contrived).

                                                          I find it particularly nice for non-user-facing things with well-defined inputs and outputs like parsers, servers, and so on. If you have a test suite that mostly does the thing the software actually has to do (e.g. read this file with these options and output this file), in my experience, coverage exposes dead code a lot more often than you expect.

                                                          This has the interesting side-effect that unit tests which only exist to cover internal code are actually harmful in a way, because something useless will still be covered.

                                                          1. 1

                                                            I find it particularly nice for non-user-facing things with well-defined inputs and outputs like parsers, servers, and so on. If you have a test suite that mostly does the thing the software actually has to do (e.g. read this file with these options and output this file), in my experience, coverage exposes dead code a lot more often than you expect.

                                                            I think it’s just fine, as long as it’s possible to turn them off and just run the subset of tests for public functions or user-facing code. I typically have a portable Makefile that includes make test-cov, make test, and make test-quick; if applicable, only make test needs to touch all test files.

                                                        2. 2

                                                          I have not read the actual paper (paywall)

                                                          The PDF is on the linked ACM site: https://dl.acm.org/doi/pdf/10.1145/2568225.2568271 – I think you must have misinterpreted something or took a wrong turn somewhere(?)

                                                          Otherwise there is always that certain site run by a certain Kazakhstani :-)

                                                          1. 1

                                                            Paywalled in the typical ACM fashion as far as I can tell?

                                                            That said, sure, there are… ways (and someone’s found an author copy on the open web now). I’m just lazy :)

                                                            1. 1

                                                              Skimmed the paper. It seems the methodology summary in the article is accurate, and I stand by my critique of it. To be fair, doing studies like this is incredibly hard, but I don’t think the suggested conclusions follow from the data. The constructed “suites” are essentially synthetic, and so don’t really say anything about how useful of a quality metric or target coverage is in a real-world project.

                                                              1. 1

                                                                Huh, I can just access it. I don’t know, ACM is weird at times; for a while they blocked my IP because it was “infiltrated by Sci-Hub” 🤷 Don’t ask me what that means exactly, quoting their support department.

                                                                1. 1

                                                                  Hm. Out of curiosity, do you have a lingering academic account, or are you accessing it via some institution’s network? I know I was surprised and dismayed when my magical “free” access to all papers got taken away :)

                                                                  1. 1

                                                                    I only barely finished high school, and that was a long time ago. So no 🙃

                                                                    Maybe they’re providing free access to some developing countries (Indonesia in my case), or they just haven’t fully understood carrier grade NAT (my IP address is shared by hundreds or thousands of people, as is common in many countries). Or maybe both. Or maybe it’s one of those “free access to the first n articles, paywall afterwards” things? I don’t store cookies by default (only whitelisted sites), so that could play a factor too.

                                                            2. 1

                                                              Identifying under-tested parts of a program seems like a pretty important part of a testing strategy to me.

                                                              My interpretation is that test coverage reports can be useful if you look at them in detail to identify specific areas in the code where you thought you were testing it but you were wrong.

                                                              But test coverage reports are completely useless if you just look at a percentage number on its own and say “the tests for project X are better than project Y because their number is higher”. We have a codebase at work with the coverage number around 80%, and having looked at it in detail, I can tell you that we could raise that number to 90% and get absolutely no actual benefit from it.

                                                            1. 1

                                                              In my experience, projects with 100% test coverage tend to have a lot of trivial tests – for example, the function sets foo=1, and the test verifies that foo==1. Higher order tests that actually verify that nontrivial logic is correct are much more valuable, but they are also harder to measure.

                                                              1. 1

                                                                Yeah, that part is tricky. If you want 100% you’re probably going to need at least some of those. I think keeping the number of them down, and trying to make them as non-pointless as possible is just part of the art of coverage-directed testing.

                                                                That said, I see the phenomenon you’re mentioning mostly in the context of very unit-testing focused projects (and especially exclusively unit-testing focused projects). It works much better with high level tests. For example, if you’re really testing your application well and setting foo to 1 is not covered (both by line, and in the higher level sense of the value “1” having some impact)… why is that code even there? Is 1 even correct? Why?

                                                              1. 1

                                                                State serialization and synchronization across a network

                                                                Woah woah woah, I don’t think that’s something the voxel engine needs, just something the game needs that is unrelated to the graphics stack. What am I missing here?

                                                                1. 5

                                                                  Typical modern game style is to use the data model more or less directly everywhere. So, for example, if you have a simple ECS with Position and Speed components, all the code you write that does something with that information has to… well, work with that information, in the format it is stored in (usually quite directly).

                                                                  That’s not true for just rendering, it’s how the entire application is written. If the job is to synchronize state across the network, then that code also needs to understand how to find and manipulate that state. It wouldn’t make sense to try and make some kind of abstraction there: the data model is the abstraction that all the pieces of code use to cooperate. “It’s All About the Data”, as the headline in TFA says.

                                                                  1. 4

                                                                    I don’t understand why you think “voxel engine” means “voxel rendering engine”. The entire point of the article seems to be that a “voxel engine” is (or should be) so much more than a cool voxel renderer.

                                                                  1. 31

                                                                    It’s odd to see C described as boring. How can it be boring if you’re constantly navigating a minefield and a single misstep could cause the whole thing to explode? Writing C should be exhilarating, like shoplifting or driving a motorcycle fast on a crowded freeway.

                                                                    1. 17

                                                                      Hush! We don’t need more reasons for impressionable youngsters to start experimenting with C.

                                                                      1. 11

Something can be boring while still trying to kill you. One example is described in Things I Won’t Work With.

                                                                        1. 1

                                                                          ‘Boring’ is I suspect the author’s wording for ‘I approve of this language based on my experiences’.

                                                                          1. 10

                                                                            I suspect “boring” is used to describe established languages whose strengths and weaknesses are well known. These are languages you don’t spend any “weirdness points” for picking.

                                                                            1. 6

                                                                              ‘Boring’ is I suspect the author’s wording for ‘I approve of this language based on my experiences’.

                                                                              I’m curious if you read the post, and if so, how you got that impression when I said things like “it feels much nicer to use an interesting language (like F#)”, “I still love F#”, etc.

                                                                              Thanks for the feedback.

                                                                              1. 4

                                                                                I found your article pretty full of non-sequiturs and contradictions, actually.

boring languages are widely panned. … One thing I find interesting is that, in personal conversations with people, the vast majority of experienced developers I know think that most mainstream languages are basically fine,

                                                                                Are they widely panned or are they basically fine?

                                                                                But when I’m doing interesting work, the boilerplate is a rounding error and I don’t mind using a boring language like Java, even if that means a huge fraction of the code I’m writing is boilerplate.

                                                                                Is it a rounding error or is it a huge fraction? Once the code has been written down, it doesn’t matter how much effort it was to mentally wrestle with the problem. That was a one-time effort, you don’t optimize for that. The only thing that matters is clearly communicating the code to readers. And if it’s full of boilerplate, that is not great for communication. I want to optimize for clear, succinct communication.

                                                                                Of course, neither people who are loud on the internet nor people I personally know are representative samples of programmers, but I still find it interesting.

                                                                                I’m fairly sure, based on this, that you are just commenting based on your own experiences, and are not claiming to have an unbiased sample?

                                                                                To me it basically seems that your argument is, ‘the languages which should be used are the ones which are already used’. The same argument was used against C, C++, Java, Python, and every other boring language you can think of.

                                                                                1. 3

                                                                                  Are they widely panned or are they basically fine?

                                                                                  I think the point is that the people who spend a lot of time panning boring languages (and advocating their favourite “interesting” one) are not representative of “experienced developers”. They’re just very loud and have an axe to grind.

                                                                                  1. 1

                                                                                    Having a tough time reconciling this notion that a narrow section of loudmouths criticize ‘boring languages’, against ‘widely panned’, which to me means ‘by a wide or significant section’.

But it’s really quite interesting how the experienced programmers who like ‘boring languages’ are the ones being highlighted here. This raises the question: what about the experienced programmers who don’t? Are they just not experienced enough? Sounds like an unquestionable dogma to me. If you don’t like the boring languages in the list, you’re just not experienced enough to realize that languages ultimately don’t matter.

                                                                                    Another interesting thing, some essential languages of the past few decades are simply not in this list. E.g. SQL, JavaScript, shell. Want to use a relational database, make interactive web pages, or just bash out a quick script? Sorry, can’t, not boring enough 😉

                                                                                    Of course that’s a silly argument. The point is to use the right tool for the job. Sometimes that’s a low-level real-time stuff that needs C, sometimes it’s safety-critical high-perf stuff that needs Ada or Rust, sometimes you need a performant language with good domain modelling and safety properties like OCaml or F#. Having approved lists of ‘boring languages’ is a silly situation to get into.

                                                                                    1. 2

                                                                                      To be honest, I don’t really see why that’s hard to reconcile at all. Take an extreme example:

                                                                                      Let’s say programming language X is used for the vast majority of real world software development. Through some strange mechanism (doesn’t matter), programmers who write language X never proselytize programming languages on the Internet. Meanwhile, among the set of people who do, they almost always have nasty things to say about X. So, all the articles you can find on the general topic are at least critical of X, and a lot of them are specifically about how X is the devil.

                                                                                      Is saying that X is “widely panned” accurate? Yes.

                                                                                      Of course that’s a silly argument.

                                                                                      Yes it is.

                                                                                      The point is to use the right tool for the job.

                                                                                      Indeed.

                                                                              2. 5

                                                                                Normally I’d lean towards this interpretation, but I’ve read many other posts by this author and he strikes me as being more thoughtful than that. Perhaps a momentary lapse in judgement; happens to everyone I suppose.

                                                                              3. 1

                                                                                That does not sound any different from most other languages. You have described programming.

                                                                                To expand a bit on that, GNOME is full of assertions, and it’s quite hard to make it crash internally.

                                                                              1. 1

                                                                                Since struct and class are so similar, I choose to consider class to be the keyword in excess, simply because struct exists in C and not class, and that it is the process of the keyword class that brought them both so close.

                                                                                This is an interesting perspective on the history. I would consider struct to be the keyword worth removing, since that would change the default access qualifiers to be safer.

                                                                                1. 5

I may be misremembering but I am reasonably sure that backwards compatibility with C was one of the early design goals of C++. Removing struct would quickly break compatibility. That is, presumably, why the default access qualifier is different from class’s (and identical to C’s struct).
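A minimal illustration of those differing defaults (type names invented): members of a struct are public unless stated otherwise, exactly as in C, while members of a class are private:

```cpp
struct Point { int x; int y; };   // members public by default, as in C

class Counter {
    int n;                        // private by default
public:
    Counter() : n(0) {}
    void bump() { ++n; }
    int value() const { return n; }
};

// Point{1, 2}.x compiles fine; accessing a Counter's n directly would not.
```

This is the only difference between the two keywords in standard C++, which is why a C header full of structs can be included unchanged.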

                                                                                  1. 1

                                                                                    It’s always irked me that this C compatibility was only one-way because of support for member functions (at least).

                                                                                  2. 3

                                                                                    Removing struct would create a lot more C code that is not C++, and making the default “safer” doesn’t improve things since, as noted, it’s standard practice to be explicit with access qualifiers.

                                                                                    1. 4

Yeah, I don’t think that can be overstated. This would destroy one of the biggest reasons C++ was successful, and one of its main advantages to this day. It would even make most C headers not C++ compatible, which would be an absolute catastrophe. Even if the committee did something so egregious, no compiler could or would ever implement it (beyond perhaps a performative warning).

                                                                                      I think the real mistake is that the keywords are redundant at all. We’ve ended up with this near-universal convention that struct is for bags of data (ideally POD or at least POD-ish) because that’s a genuinely useful and important distinction. Since C++ somehow ended up with the useless “class except public by default” definition, we all simply pretend that it has a useful (if slightly fuzzy) one.

                                                                                      1. 1

                                                                                        Because of its incremental design and the desire to make classes seem like builtin types, C++ has a Moiré pattern-like feel. A lot of constructs that are exceedingly close, yet different.

                                                                                  1. 11

                                                                                    These aren’t for comments, but rather for replies on a review thread. I think it is unwise to overload the term ‘comment’ in computing.

                                                                                    For code comments, I have been using https://www.python.org/dev/peps/pep-0350/ for this for a long time, and recommend it to others.

For review responses, I suppose this looks decent enough, although the use of bold assumes styled text; I would prefer all-caps, as has been conventional in unstyled text for quite a while. When styles are available, bold and all-caps is quite visually distinct.

                                                                                    1. 4

                                                                                      These aren’t for comments, but rather for replies on a review thread. I think it is unwise to overload the term ‘comment’ in computing.

                                                                                      These are comments, readers are supposed to understand that we’re talking about something different from code comments by context. This is absolutely not an unreasonable expectation. Both my kids have understood contextual words without being taught. Context really is intuitive to human nature and it’s perfectly reasonable to use the same word in different contexts to mean something different.

                                                                                      1. 4

                                                                                        “Reviews” is the standard word for this.

                                                                                        1. 1

                                                                                          The only real context here outside of TFA itself is computing, or maybe slightly more broadly, technology. All we see on this site, which is generally full of programming minutia, is “comments”, in both the title and domain name. The use of the word “conventional” only makes it worse: conventions in code comments are an almost universally recognized and common thing, conventions in reviews, not so much. One might even argue that this is nearing territory considered off-topic on this site (being not particularly technical).

                                                                                          I’d be low-key surprised if anyone here assumed differently. This is actually my second click through to this article, because although I read the whole thing the first time, it didn’t even occur to me that this link was to that article, and not something on code commenting practices or whatever that I missed before.

                                                                                          Sure, anyone who actually reads the whole thing and comes away confused… well, has bigger problems… but it’s still a poor choice of words. Maybe this is a superficial bikeshed, but that sort of thing is pretty important when the whole point is to define a soft standard for things with a standard name. Even in the context this is specifically intended for (code review), I’d assume that “conventional comments” was something about the code (did I get the Doxygen tags wrong or something?), because of course I would. That’s what a code review is.

                                                                                      1. 1

                                                                                        I haven’t ever done anything with entity-component systems. I am curious about how broadly this could be applied.

So, as I understand it, you have an entity and you give it a position, so presumably this component is like a property of the entity. Why not just give the entity a position directly?

                                                                                        1. 2

There are a few reasons. On the software engineering side of things, it avoids a lot of issues where class hierarchies are too rigid or code becomes too tightly coupled, but there are also performance benefits, which is largely why the game development universe is so into the idea.

                                                                                          If you have a system that is calculating collisions, for example, perhaps you only need that position to do the calculation. If you “just” give the entity a position “directly” (assuming Entity is a class and you just jam fields in there), then you will also “just” give it other things directly, and eventually it grows to have a huge number of fields. So, your collision algorithm is scanning huge chunks of fragmented memory only to read a single position variable, which is extremely cache-inefficient.

                                                                                          In contrast, with an ECS you can implement that so that scanning all of the positions is just a linear scan of a contiguous array. Depending on the data type it may even be vectorized. The way you realize this is to not make the component a property of the entity in the sense that it is stored “in” the entity, but to instead store the components by type, completely separate from entities, and associate them with IDs. In the simplest ideal vision, there is no Entity class at all, an entity is simply an integer.

                                                                                          1. 1

                                                                                            In contrast, with an ECS you can implement that so that scanning all of the positions is just a linear scan of a contiguous array. Depending on the data type it may even be vectorized.

                                                                                            I definitely get this along with the cache argument.

                                                                                            The thing I am not really sure about is perhaps more generally, outside of games. I don’t really do game design, but I do things with web applications (angular).

                                                                                            1. 1

                                                                                              It’s a technique that’s mainly for performance; it’s also kinda specific to languages (read: C/C++/Java/C#) that don’t have an easy way of doing dynamic compositions/mixins. Like, in Ruby or JS I don’t think it’s as big a win.

                                                                                          2. 1

                                                                                            It’s a framework that helps reinforce a good separation of concerns. You can mix and match any type of components across your entities, and your systems only care about their specific types of components. It’s a lifesaver in instances where you need an oddball case later in development: “Gee, I really need this Sword Item class to be able to talk to the player, but only the Character class has Talk()!” Rather than trying to figure out how to shoehorn your Sword into a different class hierarchy, you would start by just adding a Talk component to the Sword entity.

                                                                                            It flattens out the logic: any “thing” in your world has the capability to do any action.

                                                                                          1. 4

                                                                                            getting rid of footguns like parameter-less/takes-any-argument functions

                                                                                            Wow, finally. A few times I’ve been told to “make my functions ANSI” by reviewers, which always got me like “WHAT? I don’t use the weird K&R style decls before the opening {, this is ANSI??” and a minute later “oh, the stupid (void) parameter, argh” >_<

                                                                                            1. 1

                                                                                              There’s a Warning For That™

                                                                                            1. 10

                                                                                              Are they finally going to fix the abomination that is C11 atomics? As far as I can tell, WG14 copied atomics from WG21 without understanding them and ended up with a mess that causes problems for both C and C++.

                                                                                              In C++11 atomics, std::atomic<T> is a new, distinct type. An implementation is required to provide a hardware-enforced (or, in the worst case, OS-enforced) atomic boolean. If the hardware supports a richer set of atomics, then it can be used directly, but a std::atomic<T> implementation can always fall back to using std::atomic_flag to implement a spinlock that guards access to larger types. This means that std::atomic<T> can be defined for all types and be reasonably efficient (if you have a futex-like primitive then, in the uncontended case it’s almost as fast as T and in the contended state it doesn’t consume much CPU time or power spinning).

                                                                                              Then WG14 came along and wanted to define _Atomic(T) to be compatible with std::atomic<T>. That would require the C compiler and C++ standard library to agree on data layout and locking policy for things larger than the hardware-supported atomic size, but it’s still feasible. Then they completely screwed up by making all of the arguments to the functions declared in stdatomic.h take a volatile T* instead of an _Atomic(T)*. For historical reasons, the representation of volatile T and T have to be the same, which means that _Atomic(T) and T must have the same representation and there is nowhere that you can stash a lock. The desire to make _Atomic(T) and std::atomic<T> interchangeable means that C++ implementers are stuck with this.

                                                                                              Large atomics are now implemented by calls to a library, but there is no way to implement this in a way that is both fast and correct, so everyone picks fast. The atomics library provides a pool of locks and acquires one keyed on the address. That’s fine, except that most modern operating systems allow virtual addresses to be aliased, and so there are situations (particularly in multi-process settings, but also when you have a GC or similar doing exciting virtual memory tricks) where simple operations on an _Atomic(T) are not atomic. Fixing that would require asking the OS whether a particular page is aliased before performing an operation (and preventing it from becoming aliased during the operation), at which point you may as well just move atomic operations into the kernel anyway, because you’re paying for a system call on each one.

                                                                                              C++20 has worked around this by defining std::atomic_ref, which provides the option of storing the lock out-of-line with the object, at the expense of punting the determination of the sharing set for an object to the programmer.

                                                                                              Oh, and let’s not forget the mtx_timedlock fiasco. Ignoring decades of experience in API design, WG14 decided to make the timeout for a mutex the wall-clock time, not the monotonic clock. As a result, it is impossible to write correct code using C11’s mutexes, because the wall-clock time may move arbitrarily. You can wait on a mutex with a 1ms timeout and discover that, because the clock was wrong and was reset in the middle of your ‘get time, add 1ms, timedwait’ sequence, you’re now waiting a year (more likely, you’re waiting multiple seconds, and now the tail latency of your distributed system has weird spikes). The C++ version of this API gets it right and allows you to specify the clock to use; pthread_mutex_timedlock got it wrong and ended up with platform-specific work-arounds. Even pthreads got it right for condition variables; C11 predictably got it wrong.

                                                                                              C is completely inappropriate as a systems programming language for modern hardware. All of these tweaks are nice cleanups but they’re missing the fundamental issues.

                                                                                              1. 3

                                                                                                Then they completely screwed up by making all of the arguments to the functions declared in stdatomic.h take a volatile T* instead of an _Atomic(T)*. For historical reasons, the representation of volatile T and T have to be the same, which means that _Atomic(T) and T must have the same representation and there is nowhere that you can stash a lock.

                                                                                                I’m not too familiar with atomics and their implementation details, but my reading of the standard is that the functions in stdatomic.h take a volatile _Atomic(T) * (i.e. a pointer to volatile-qualified atomic type).

                                                                                                They are described with the syntax volatile A *object, and earlier on in the stdatomic.h introduction it says “In the following synopses: An A refers to one of the atomic types”.

                                                                                                Maybe I’m missing something?

                                                                                                1. 2

                                                                                                  Huh, it looks as if you’re right. That’s how I read the standard in 2011 when I added the atomics builtins to clang, but I reread it later and thought that I’d initially misunderstood. It looks as if I get to blame GCC for the current mess then (their atomic builtins don’t require _Atomic-qualified types and their stdatomic.h doesn’t check it).

                                                                                                  Sorry WG14, you didn’t get atomics wrong, you just got mutexes and condition variables wrong.

                                                                                                  That said, I’ve no idea why they felt the need to make the arguments to these functions volatile and _Atomic. I am not sure what a volatile _Atomic(T)* actually means. Presumably the compiler is not allowed to elide the load or store even if it can prove that no other thread can see it?

                                                                                                  1. 1

                                                                                                    I’ve no idea why they felt the need to make the arguments to these functions volatile and _Atomic

                                                                                                    I’ve no idea, but a guess: they want to preserve the volatility of arguments to atomic_*. That is, it should be possible to perform operations on variables of volatile type without losing the ‘volatile’. I will note that the C++ atomics contain one overload with volatile and one without. But if that’s the case, why the committee felt they could get away with being polymorphic wrt type, but not with being polymorphic wrt volatility, is beyond me.

                                                                                                    There is this stackoverflow answer from a committee member, but I did not find it at all illuminating.

                                                                                                    not allowed to elide the load or store even if it can prove that no other thread can see it?

                                                                                                    That would be silly; a big part of the impetus for atomics was to allow the compiler to optimize in ways that it couldn’t using just volatile + intrinsics. Dead loads should definitely be discarded, even if atomic!


                                                                                                    One thing that is clear from this exchange: there is a massive rift between specifiers, implementors, and users. Thankfully the current spec editor (JeanHeyd Meneide, also the author of the linked post) seems to be aware of this and to be acting to improve the situation; so we will see what (if anything) changes.

                                                                                                    1. 3

                                                                                                      One thing that is clear from this exchange: there is a massive rift between specifiers, implementors, and users. Thankfully the current spec editor (JeanHeyd Meneide, also the author of the linked post) seems to be aware of this and to be acting to improve the situation; so we will see what (if anything) changes.

                                                                                                      It’s not really clear to me how many implementers are left that care:

                                                                                                      • MSVC is a C++ compiler that has a C mode. The authors write in C++ and care a lot about C++.
                                                                                                      • Clang is a C++ compiler that has C and Objective-C[++] modes. The authors write in C++ and care a lot about C++.
                                                                                                      • GCC includes C and C++ compilers with separate front ends, it’s primarily C so historically the authors have cared a lot about C, but for new code it’s moving to C++ and so the authors increasingly care about C++.

                                                                                                      That leaves things like PCC, TCC, and so on, and a few surviving 16-bit microcontroller toolchains, as the only C implementations that are not C++ with C as an afterthought.

                                                                                                      I honestly have no idea why someone would choose to write C rather than C++ these days. You end up writing more code, you have a higher cognitive load just to get things like ownership right (even if you use nothing from C++ other than smart pointers, your life is significantly better than that of a C programmer), you don’t get generic data structures, and you don’t even get more efficient code, because the compilers are all written in C++ and so care about C++ optimisation because it directly affects the compiler writers.

                                                                                                      C++ is not seeing its market eroded by C but by things like Rust and Zig (and, increasingly, Python and JavaScript, since computers are fast now). C fits in a niche that doesn’t really exist anymore.

                                                                                                      1. 2

                                                                                                        I honestly have no idea why someone would choose to write C rather than C++ these days.

                                                                                                        For applications, perhaps, but for libraries and support code, ABI stability and ease of integration with the outside world are big ones. It’s also a much less volatile language in ways that start to really matter if you are deploying code across a wide range of systems, especially if old and/or embedded ones are included.

                                                                                                        Avoiding C++ (and especially bleeding edge revisions of it) avoids a lot of real life problems, risks, and hassles. You lose out on a lot of power, of course, but for some projects the kind of power that C++ offers isn’t terribly important, but the ability to easily run on systems 20 years old or 20 years into the future might be. There’s definitely a sort of irony in C being the real “write once, run anywhere” victor, but… in many ways it is.

                                                                                                        C fits in a niche that doesn’t really exist anymore.

                                                                                                        It might not exist in the realm of trendy programming language debates on the Internet, but we’re having this conversation on systems largely implemented in it (UNIX won after all), so I think it’s safe to say that it very much exists, and will continue to for a long time. That niche is just mostly occupied by people who don’t tend to participate in programming language debates. One of the niche’s best features is being largely insulated from all of that noise, after all.

                                                                                                        It’s a very conservative niche in a way, but sometimes that’s appropriate. Hell, in the absolute worst case scenario, you could write your own compiler if you really needed to. That’s of course nuts, but it is possible, which is reassuring compared to languages like C++ and Rust where it isn’t. More realistically, diversity of implementation is just a good indicator of the “security” of a language “investment”. Those implementations you mention might be nichey, but they exist, and you could pretty easily use them (or adapt them) if you wanted to. This is a good thing. Frankly I don’t imagine any new language will ever manage to actually replace C unless it pulls the same thing off. Simplicity matters in the end, just in very indirect ways…

                                                                                                        1. 4

                                                                                                          For applications, perhaps, but for libraries and support code, ABI stability and ease of integration with the outside world are big ones. It’s also a much less volatile language in ways that start to really matter if you are deploying code across a wide range of systems, especially if old and/or embedded ones are included.

                                                                                                          I’d definitely have agreed with you 10 years ago, but the C++ ABI has been stable and backwards compatible on all *NIX systems, and fairly stable on Windows, for over 15 years. C++ provides you with some tools that allow you to make unstable ABIs for your libraries, but it also provides tools for avoiding these problems. The same problems exist in C: you can’t add a field to a C structure without breaking the ABI, just as you can’t add a field to a C++ class without breaking the ABI.

                                                                                                          I should point out that most of the things that I work on these days are low-level libraries and C++17 is the default tool for all of these.

                                                                                                          You lose out on a lot of power, of course, but for some projects the kind of power that C++ offers isn’t terribly important, but the ability to easily run on systems 20 years old or 20 years into the future might be.

                                                                                                          Neither C nor C++ guarantees this; in my experience, old C code needs just as much updating as C++ code, and it’s often harder to do because C code does not encourage clean abstractions. This is particularly true when talking about running on new platforms. From my personal experience, we and another group have recently written memory allocators. Ours is written in C++, theirs in C. This is what our platform and architecture abstractions look like. They’re clean, small, and self-contained. Theirs? Not so much. We’ve ported ours to CHERI, where the hardware enforces strict integrity and bounds checks on pointers, with quite a small set of changes, made possible (and maintainable when most of our targets don’t have CHERI support) by the fact that C++ lets us define pointer wrapper types that describe high-level semantics of the associated pointer and a state machine for which transitions are permitted. Porting theirs would require invasive changes.

                                                                                                          It might not exist in the realm of trendy programming language debates on the Internet, but we’re having this conversation on systems largely implemented in it (UNIX won after all), so I think it’s safe to say that it very much exists, and will continue to for a long time.

                                                                                                          I’m writing this on a Windows system, where much of the kernel and most of the userland is C++. I also post from my Mac, where the kernel is a mix of C and C++, with more C++ being added over time, and the userland is C for the old bits, C++ for the low-level new bits, and Objective-C / Swift for the high-level new bits. The only places either of these systems chose C were parts that were written before C++11 was standardised.

                                                                                                          Hell, in the absolute worst case scenario, you could write your own compiler if you really needed to.

                                                                                                          This is true for ISO C. In my experience (based in part on building a new architecture designed to run C code in a memory-safe environment and working on defining a formal model of the de-facto C standard), there is almost no C code that is actually ISO C. The language is so limited that anything nontrivial ends up using vendor extensions. ‘Portable’ C code uses a load of #ifdefs so that it can use two or more different vendor extensions. There’s a lot of GNU C in the world, for example.

                                                                                                          Reimplementing GNU C is definitely possible (clang, ICC, and XLC all did it, with varying levels of success) but it’s hard, to the extent that of these three none actually achieve 100% compatibility to the degree that they can compile, for example, all of the C code in the FreeBSD ports tree out of the box. They actually have better compatibility with C++ codebases, especially post-C++11 codebases (most of the C++ codebases that don’t work are ones that are doing things so far outside the standard that they have things like ‘works with G++ 4.3 but not 4.2 or 4.4’ in their build instructions).

                                                                                                          More realistically, diversity of implementation is just a good indicator of the “security” of a language “investment”. Those implementations you mention might be nichey, but they exist, and you could pretty easily use them (or adapt them) if you wanted to.

                                                                                                          There are a few niche C compilers (e.g. PCC / TCC), but almost all of the mainstream C compilers (MSVC, GCC, Clang, XLC, ICC) are C++ compilers that also have a C mode. Most of them are either written in C++ or are being gradually rewritten in C++. Most of the effort in ‘C’ compiler is focused on improving C++ support and performance.

                                                                                                          By 2018, C++17 was pretty much universally supported by C++ compilers. We waited until 2019 to move to C++17 for a few stragglers, we’re now pretty confident being able to move to C++20. The days when a new standard took 5+ years to support are long gone for C++. Even a decade ago, C++11 got full support across the board before C11.

                                                                                                          If you want to guarantee good long-term support, look at what the people who maintain your compiler are investing in. For C compilers, the folks that maintain them are investing heavily in C++ and in C as an afterthought.

                                                                                                          1. 3

                                                                                                            I’d definitely have agreed with you 10 years ago, but the C++ ABI has been stable and backwards compatible on all *NIX systems, and fairly stable on Windows, for over 15 years. C++ provides you with some tools that allow you to make unstable ABIs for your libraries, but it also provides tools for avoiding these problems. The same problems exist in C: you can’t add a field to a C structure without breaking the ABI, just as you can’t add a field to a C++ class without breaking the ABI.

                                                                                                            The C++ ABI is stable now, but the problem is binding it from other languages (i.e. try binding a mangled symbol), because C is the lowest common denominator on Unix. Of course, with C++, you can just define a C-level ABI and just use C++ for everything.

                                                                                                            edit

                                                                                                            Reimplementing GNU C is definitely possible (clang, ICC, and XLC all did it, with varying levels of success) but it’s hard, to the extent that of these three none actually achieve 100% compatibility to the degree that they can compile, for example, all of the C code in the FreeBSD ports tree out of the box. They actually have better compatibility with C++ codebases, especially post-C++11 codebases (most of the C++ codebases that don’t work are ones that are doing things so far outside the standard that they have things like ‘works with G++ 4.3 but not 4.2 or 4.4’ in their build instructions).

                                                                                                            It’s funny no one ever complains about GNU’s extensions to C being so prevalent that it makes implementing other C compilers hard, yet loses their minds over say, a Microsoft extension.

                                                                                                            1. 2

                                                                                                              The C++ ABI is stable now, but the problem is binding it from other languages (i.e. try binding a mangled symbol), because C is the lowest common denominator on Unix. Of course, with C++, you can just define a C-level ABI and just use C++ for everything.

                                                                                                              That depends a lot on what you’re binding. If you’re using SWIG or similar, then having a C++ API can be better because it can wrap C++ types and get things like memory management for free if you’ve used smart pointers at the boundaries. The binding generator doesn’t care about name mangling because it’s just producing a C++ file.

                                                                                                              If you’re binding to Lua, then you can use Sol2 and directly surface C++ types into Lua without any external support. With something like Sol2 in C++, you write C++ classes and then just expose them directly from within C++ code, using compile-time reflection. There are similar things for other languages.

                                                                                                              If you’re trying to import C code into a vaguely object-oriented scripting language then you need to implement an object model in C and then write code that translates from your ad-hoc language into the scripting language’s one. You have to explicitly write all memory-management things in the bindings, because they’re API contracts in C but part of the type system in C++.

                                                                                                              From my personal experience, binding modern C++ to a high-level language is fairly easy (though not quite free) if you have a well-designed API, binding Objective-C (which has rich run-time reflection) is trivial to the extent that you can write completely generic bridges, and binding C is possible but requires writing bridge code that is specific to the API for anything non-trivial.

                                                                                                              1. 1

                                                                                                                Right; I suspect it’s actually better with a binding generator or environments where you have to write native binding code (i.e. JNI/PHP). It’s just annoying for the ad-hoc cases (i.e. .NET P/Invoke).

                                                                                                                1. 2

                                                                                                                  On the other hand, if you’re targeting .NET on Windows then you can expose COM objects directly to .NET code without any bridging code and you can generate COM objects directly from C++ classes with a little bit of template goo.

                                                                                                2. 2

                                                                                                  Looks like Hans Boehm is working on it, as mentioned at the bottom of the article. They are apparently “bringing it back up to parity with C++”, which should fix the problems you mentioned.

                                                                                                  1. 4

                                                                                                    That link is just Hans adding a <cstdatomic> header to C++ that provides #define _Atomic(T) std::atomic<T>. This ‘fixes’ the problem by letting you build C code as C++; it doesn’t fix the fact that C is fundamentally broken and can’t be fixed without breaking backwards source and binary compatibility.

                                                                                                1. 4

                                                                                                  And people wonder why RDF didn’t become popular. I can’t imagine why…

                                                                                                  1. 2

                                                                                                    I don’t think one can hold the fact that such a discussion is possible with RDF against RDF. RDF is a tool: you may stick to representing your CSV in RDF or go further, and RDF has nothing to do with that choice. I actually don’t see the future of IoT/Industry 4.0/buzzwordhere without RDF. Its core premise is a federation of vocabulary definitions, where classes and properties are identified by their URIs instead of bespoke literals (and anyone can reuse other definitions by linking to them). E.g. ssn, s_s_n, and all the other variants become ‘https://ns.irs.gov/core/#ssn’. I can’t imagine how we are going to build a future of “connected everything” if we can’t even get things to use the same terms for the same things, and just keep inventing new JSON structures for every single API.
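                                                                                                    To make the federation idea concrete, the ssn term above could be defined once and reused like this in Turtle (the irs namespace is the commenter’s hypothetical; foaf is a real vocabulary):

```turtle
@prefix irs:  <https://ns.irs.gov/core/#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Two different systems can both say “this person’s SSN” with the
# same globally unique property, instead of bespoke ssn/s_s_n keys:
<https://example.org/people/jane>
    foaf:name "Jane Doe" ;
    irs:ssn   "123-45-6789" .
```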

                                                                                                    1. 4

                                                                                                      I remember the heyday of RDF in “Web 2.0” - Flickr, Friend-of-a-Friend, all sorts of API-driven sites with sharing. In principle I really appreciate the affordances these services offered, but they were a bit too hard for most developers to wrap their heads around.

                                                                                                      The vision for XML was similar: that organizations would define careful schemas, files would conform to these schemas, and liberal use of XSLT would transform data from one form to another. Reality was that people shoved CSV data willy-nilly into XML, shipped it over the wire, and relied on copious amounts of hand-filed ETL code at the receiving end to handle it correctly.

                                                                                                      RDF is the same - it’s a beautiful creation but too good for this fallen world. The future is an endless pile of JSON.

                                                                                                      1. 3

                                                                                                        That’s the fundamental tragedy of RDF, to me. The idea and most of the basic concepts are mostly great and sorely needed, but the technology stack and documentation are a 90’s W3C nightmare, and there is all this academic logic/theory stuff around it that almost nobody in the real world cares about, which only makes the entire thing seem hopelessly intimidating. There is some good technology hidden in there, but if you just search the web for “RDF” as a curious potential adopter, you’ll almost certainly end up saying “yeeeeeeeah, no thanks”.

                                                                                                        As the maintainer of a project (LV2) that “forces” RDF on developers who just want to get something done (write audio plugins), I’m painfully aware of these problems. While some aspects of the technology are concretely useful there, it /really/ hurts to have RDF be what it is. In recent years I’ve been trying to mitigate this on a soft level by distancing from “RDF” and the mess the W3C and the semantics people have made, including avoiding using the term “RDF” at all. The situation is so bad that I think using the term at all only does damage.

                                                                                                        In an ideal universe maybe there’d be a vaguely WHATWG-like splinter group of people trying to build up these ideas in a more practical way (we want to chuck quasi-schemaless data around but still have it make some sense, and we need nice tools with a low barrier to entry to do so, etc.), but I imagine that ship has sailed.

                                                                                                        At least JSON-LD provides a pretty viable bridge between something most developers these days are comfortable with (JSON) and the Linked Data ideal, at the cost of the extra work of writing context definitions and so on. Although I have my gripes with JSON-LD, it at least gives us an option to provide something that is superficially uncontroversial (“it’s just JSON”) without throwing the meaningful baby out with the bathwater…
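                                                                                                        As a minimal illustration of that bridge (using real schema.org terms; the document is plain JSON until a consumer applies the context):

```json
{
  "@context": {
    "name": "http://schema.org/name",
    "homepage": { "@id": "http://schema.org/url", "@type": "@id" }
  },
  "name": "Jane Doe",
  "homepage": "https://example.org/jane"
}
```

                                                                                                        A developer who ignores "@context" just sees ordinary JSON keys, while a Linked Data consumer can expand them to full URIs.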

                                                                                                        While I have my gripes with JSON-LD, I think it’s the only hope at this point. The RDF project did such a bad job on the practical, developer-facing side of things that the only way out is to present a veneer that almost completely abstracts it away and makes it essentially an implementation detail. I’m pretty convinced at this point that the only way to make RDF-based technology palatable to random developers in the trenches is to make it so that they can’t even tell they’re using RDF-based technology at all, unless they actively dig into it.

                                                                                                        JSON won because it’s simple. A developer who knows nothing at all about it can get the basic idea with a single web search and maybe 5 or 10 minutes, and probably achieve their goal (which is probably just chucking some data around) shortly thereafter. Meanwhile, it would probably take days if not weeks to initially figure out what RDF even /is/, to say nothing of actually achieving anything with it. There’s a lesson to be learned in there somewhere. I wrote a fully conformant (and very fast) JSON parser in C with no dependencies in one weekend. I’ve been writing an RDF implementation for over a decade with no end in sight. That kind of thing really matters.

                                                                                                        … I probably should have used this as my entry in the “what would you rewrite from scratch?” thread from a few weeks ago. Such a missed opportunity.