Threads for Student

    1. 2

      these impressive performance gains turned out to be primarily due to inadvertently working around a regression in LLVM 19. When benchmarked against a better baseline (such as GCC, clang-18, or LLVM 19 with certain tuning flags), the performance gain drops to 1-5% or so depending on the exact setup.

      5% for free isn’t bad.

      1. 1

        free only for users, not for devs. The code is now more complicated.

        1. 10

          I find it slightly odd that an “epic treatise on error models” would fail to mention Common Lisp and Smalltalk, whose error models provide a facility that all others lack: resuming from an error.

          1. 7

            Hi, author here, the title also does say “for systems programming languages” :)

            For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations. It’s unclear to me as to how one-shot continuations can be integrated into a systems language where you want to ensure careful control over lifetimes. Perhaps you (or someone else here) knows of some research integrating ownership/borrowing with continuations/algebraic effects that I’m unfamiliar with?

            The closest exception to this that I know of is Haskell, which has support for both linear types and a primitive for continuations. However, I haven’t seen anyone integrate the two, and I’ve definitely seen some soundness-related issues in various effect systems libraries in Haskell (which doesn’t inspire confidence), but it’s also possible I missed some developments there as I haven’t written much Haskell in a while.

            1. 10

              I’m sorry for the slightly snarky tone of my original reply, but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems, I would have expected an epic treatise to at least mention that error resumption exists – especially since academia is now rediscovering this topic as effect handlers (typically without any mention of the prior art).

              For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations.

              This misconception is so common (and dear to my heart) that I have to use bold:

              Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.

              To take the example I posted earlier about writing to a full disk: https://lobste.rs/s/az2qlz/epic_treatise_on_error_models_for_systems#c_ss3n1k

              ... outer stack ...
                  write()
                      signal_disk_is_full()
                          disk_is_full_handler()
              

              Suppose write() discovers that the disk is full (e.g. from an underlying primitive). This causes it to call signal_disk_is_full(). Note that the call to signal_disk_is_full() happens inside the stack of write() (obviously).

              Now signal_disk_is_full() looks for a handler and calls it: disk_is_full_handler(). Again, the call to the handler happens inside the stack of signal_disk_is_full() (and write()). The handler can return normally to write() once it has cleaned up space.

              write() is never popped off the stack. It always stays on the stack. IOW, there is never a need to capture a continuation, and never a need to reinstate one. The disk_is_full_handler() runs inside the stack of the original call to write().
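A minimal sketch of the call structure described above, in Python. The `handlers` list and the function bodies are assumptions made for illustration; only the function names come from the example. It records which functions are live on the call stack while the handler runs, to show that the handler is reached by ordinary nested calls and that `write()` is still on the stack at that point.

```python
import inspect

# Hypothetical handler registry: the innermost handler is the last element.
handlers = []
observed_stack = []

def disk_is_full_handler():
    # Record the names of the functions that are live while the handler runs.
    observed_stack.extend(frame.function for frame in inspect.stack(0))
    return "space freed"

def signal_disk_is_full():
    # An ordinary call: look up the innermost handler and invoke it.
    handler = handlers[-1]
    return handler()

def write(data):
    # Pretend the disk was found to be full. write() stays on the stack
    # while the handler runs, then simply carries on.
    outcome = signal_disk_is_full()
    return f"wrote {data!r} after: {outcome}"

handlers.append(disk_is_full_handler)
result = write("hello")
```

Nothing here captures or reinstates a continuation: the handler returns normally through `signal_disk_is_full()` into `write()`.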

              effect systems

              A side note: most effect systems do use and even require first-class continuations, but IMO that’s completely overkill and only needed for rarely used effects like nondeterminism. For simple effects, like resumable exceptions, no continuations are needed whatsoever.

              1. 2

                but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems

                I provided the working definition of “systems programming language” that I used in the blog post. It’s a narrow one for sure, but I have to put a limit somewhere. My point is not trying to exclude the work done by smart people; but I need a stopping point somewhere after 100~120 hours of research and writing.

                Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.

                Thank you for writing down a detailed explanation with a concrete example. I will update the post with some of the details you shared tomorrow.

                You will notice that my comment does not use the phrase “first-class” anywhere; that was deliberate, but perhaps I should’ve been more explicit about it. 😅

                As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. So in that sense, it’s a bit difficult for me to understand where exactly you disagree, perhaps you’re working with a different definition of “continuation”? Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?

                If I look at Chapter 3 in Advances in Exception Handling Techniques, titled ‘Condition Handling in the Lisp Language Family’ by Ken M. Pitman, that states:

                At the time of the Common Lisp design, Scheme did not have an error system, and so its contribution to the dialog on condition systems was not that of contributing an operator or behavior. However, it still did have something to contribute: the useful term continuation […] This metaphor was of tremendous value to me socially in my efforts to gain acceptance of the condition system, because it allowed a convenient, terse explanation of what “restarts” were about in Common Lisp. [..] And so I have often found myself thankful for the availability of a concept so that I could talk about the establishment of named restart points as “taking a continuation, labeling it with a tag, and storing it away on a shelf somewhere for possible later use.”

                So it might be the case that the mismatch here is largely due to language usage, or perhaps my understanding of continuations is lacking.


                I’m also a little bit confused as to why your current comment (and the linked blog post) focus on unwinding/stack representation. For implementing continuations, there are multiple possible implementation strategies, sure, and depending on the exact restrictions involved, one can potentially use more efficient strategies. If a continuation is second-class in the sense that it must either be immediately invoked (or discarded), it makes sense that the existing call stack can be reused.


                Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.

                1. 4

                  As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. … Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?

                  Typically, there are two notions of continuations:

                  1. Continuations as an explanatory or semantic concept. E.g. consider the expression f(x + y). To evaluate this, we first need to compute x + y. At this point our continuation is f(_), where _ is the place into which we will plug the result of x + y. This is the notion of a continuation as “what happens next” or “the rest of the program”.

                  2. Continuations as an actually reified value/object in a programming language, i.e. first-class continuations. You can get such a first-class continuation e.g. from Scheme’s call/cc or from delimited control operators. This typically involves copying or otherwise remembering some part of the stack on the part of the language implementation.

                  Resumable exceptions have no need for first-class continuations (2). Continuations as an explanatory concept (1) of course still apply, but only because they apply to every expression in a program.
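To make notion (1) concrete, here is a small sketch in Python using the `f(x + y)` expression from above. The body of `f` and the helper `add_cps` are made up for illustration. In the direct-style call the continuation `f(_)` is implicit; the second version only makes it visible by writing it as an explicit function argument.

```python
def f(n):
    # Stand-in for whatever f does; the body is an assumption.
    return n * 10

def add_cps(x, y, k):
    # k plays the role of "what happens next" after the addition, i.e. f(_).
    return k(x + y)

# Direct style: while x + y is being evaluated, the continuation is f(_),
# but it exists only as an explanatory concept.
direct = f(2 + 3)

# The same computation with the continuation spelled out as an argument.
explicit = add_cps(2, 3, f)
```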

                  I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.

                  The example I used has no non-local control flow at all. write() calls signal_disk_is_full() and that calls the disk_is_full_handler(), and that finally returns normally to write(). This is my point: resumption does not require any non-local control flow.

                  1. 4

                    As well as what @manuel wrote, it’s worth noting that basically every language has second-class continuations: a return statement skips to the current function’s continuation.

                    Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.

                    1. 1

                      it’s worth noting that basically every language has second-class continuations: a return statement skips to the current function’s continuation.

                      In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it “support for a second-class continuation”?

                      Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.

                      I understand your and @manuel’s points that the common usage may very well be that “one-shot delimited continuation” implies “first-class” (TIL, thank you).

                      We can make this same point about functions where generally functions are assumed to be first class. However, it’s not unheard of to have second-class functions (e.g. Osvald et al.’s Gentrification gone too far? and Brachthäuser et al.’s Effects, Capabilities, and Boxes describe such systems). I was speaking in this more general sense.

                      As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.

                      1. 5

                        In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it as “support for a second-class continuation”?

                        That you can’t pass it as an argument is exactly why it’s called second-class. Only a first-class continuation is reified into a value in the language, and therefore usable as an argument.

                        As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.

                        One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?

                        1. 1

                          One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?

                          Here is the wording from Strachey’s paper, as linked by @fanf

                          they always have to appear in person and can never be represented by a variable or expression (except in the case of a formal parameter) [emphasis added]

                          Isn’t this “except in the case of a formal parameter” exactly what is used by Osvald et al. and Brachthäuser et al. in their papers? Here is the bit from Osvald et al.’s paper:

                          Our solution is a type system extension that lets us define file as a second-class value, and that ensures that such second-class values will not escape their defining scope. We introduce an annotation @local to mark second-class values, and change the signature of withFile as follows:

                          def withFile[U](n: String)(@local fn: (@local File) => U): U
                          

                          [..] Note that the callback function fn itself is also required to be second-class, so that it can close over other second-class values. This enables, for example, nesting calls to withFile

                          In the body of withFile, fn is guaranteed to have several restrictions (it cannot be escaped, it cannot be assigned to a mutable variable etc.). But the type system (as in the paper) cannot prevent the implementation of withFile from invoking fn multiple times. That would require an additional restriction – that fn can only be invoked 0-1 times in the body of withFile.
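The extra "invoked 0-1 times" restriction mentioned above would be a static guarantee in a type system. As a rough dynamic analogue only, the sketch below enforces it at run time in Python. `OneShot` and `with_file_twice` are hypothetical names, not anything from the papers.

```python
class OneShot:
    """Wrap a callback so that a second invocation is rejected at run time."""

    def __init__(self, fn):
        self._fn = fn
        self._used = False

    def __call__(self, *args, **kwargs):
        if self._used:
            raise RuntimeError("one-shot callback invoked more than once")
        self._used = True
        return self._fn(*args, **kwargs)

def with_file_twice(fn):
    # A hypothetical, misbehaving withFile body that calls fn twice.
    fn("handle")
    fn("handle")  # rejected by the guard

seen = []
try:
    with_file_twice(OneShot(seen.append))
except RuntimeError as err:
    error_message = str(err)
```

A run-time check like this only reports the violation when it happens; it does not give the compile-time assurance that a type-level restriction would.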

                        2. 2

                          @manuel wrote most of what I was going to (thanks, @manuel!) but I think it’s worth quoting the relevant passage from Strachey’s fundamental concepts in programming languages

                          3.5. Functions and routines as data items.

                          3.5.1. First and second class objects.

                          In ALGOL a real number may appear in an expression or be assigned to a variable, and either may appear as an actual parameter in a procedure call. A procedure, on the other hand, may only appear in another procedure call either as the operator (the most common case) or as one of the actual parameters. There are no other expressions involving procedures or whose results are procedures. Thus in a sense procedures in ALGOL are second class citizens—they always have to appear in person and can never be represented by a variable or expression (except in the case of a formal parameter), while we can write (in ALGOL still)

                          (if x > 1 then a else b) + 6
                          

                          when a and b are reals, we cannot correctly write

                          (if x > 1 then sin else cos)(x)
                          

                          nor can we write a type procedure (ALGOL’s nearest approach to a function) with a result which is itself a procedure.

                      2. 2

                        Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.

                        That’s a concern, sure, but most “systems” languages have non-local control flow, right? C++ has exceptions, and Rust panics can be caught and handled. It would be very easy to implement a Common Lisp-like condition system with nothing more than thread local storage, function pointers (or closures) and catch/throw.

                        (And I’m pretty sure you can model exceptions / anything else that unwinds the stack as essentially being a special form of “return”, and handle types, ownership, and lifetimes just the same as you do with the ? operator in Rust)

                        1. 1

                          My point is not about ease of implementation, it’s about usability when considering type safety and memory safety. It’s not sufficient to integrate a type system with other features – the resulting thing needs to be usable…

                          I’ve added a section at the end, Appendix A8 describing the concrete concerns.

                          Early Rust did have conditions and resumptions (as Steve pointed out elsewhere in the thread), but they were removed because of usability issues.

                    2. 5

                      If you dig into the code a bit, you discover that SEH on Windows has full support for Lisp-style restartable and resumable exceptions in the lower level, they just aren’t exposed in the C/C++ layer. The same component is used in the NT kernel and so there’s an existence proof that you can support both of these models in systems languages, I just don’t know of anyone who does.

                      The SEH model is designed to work in systems contexts. Unlike the Itanium model (used everywhere except Windows) it doesn’t require heap allocation. The throwing frame allocates the exception and metadata and then invokes the unwinder. The unwinder then walks the stack and invokes ‘funclets’ for each frame being unwound. A funclet is a function that runs on the top of the stack but with access to another frame’s stack pointer and so can handle all cleanup for that frame without actually doing the unwind. As with the Itanium model, this is a two-stage process, with the first determining what needs to happen on the unwind and the second running cleanup and catch logic.

                      This model is very flexible because (as with the Lisp and Smalltalk exception models) the stack isn’t destroyed until after the first phase. This means that you can build any kind of policy on top quite easily.

                      1. 3

                        Oh yes, that reminds me, Microsoft’s Annex K broken C library extensions have a runtime constraint handler that is vaguely like a half-arsed Lisp condition.

                        1. 2

                          Yes. However, even the Itanium model supports it: https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html

                          A two-phase exception-handling model is not strictly necessary to implement C++ language semantics, but it does provide some benefits. For example, the first phase allows an exception-handling mechanism to dismiss an exception before stack unwinding begins, which allows resumptive exception handling (correcting the exceptional condition and resuming execution at the point where it was raised). While C++ does not support resumptive exception handling, other languages do, and the two-phase model allows C++ to coexist with those languages on the stack.

                          1. 1

                            If you dig into the code a bit

                            Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere? I briefly looked at the microsoft/STL repo, and the exception handling machinery seems to be linked to vcruntime which is closed-source AFAICT.

                            The SEH model is designed to work in systems contexts [..]

                            Thanks for the context, I haven’t seen a simple explanation of how SEH works elsewhere, so this is good to know. I have one follow-up question:

                            it doesn’t require heap allocation. The throwing frame allocates the exception and metadata

                            So the exception and metadata is statically sized (and hence space for it is already reserved on the throwing frame’s stack frame)? Or can it be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?

                            The same component is used in the NT kernel and so there’s an existence proof that you can support both of these models in systems languages, I just don’t know of anyone who does.

                            As Steve pointed out elsewhere in the thread, Rust pre-1.0 did support conditions and resumptions, but they removed it.

                            To be clear, I don’t doubt whether you can support it, the question in my mind is whether you can support it in a way that is usable.

                            1. 1

                              Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere?

                              I thought I read it in a public repo, but possibly it was a MS internal one.

                              So the exception and metadata is statically sized (and hence space for it is already reserved on the throwing frame’s stack frame)? Or can it be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?

                              The throwing context allocates the exception on the stack. The funclet can then use it in place. If it needs to persist beyond the catch scope, the funclet can copy it elsewhere.

                              This can lead to stack overflow (which is fun because stack overflow is, itself, handled as an SEH exception).

                            1. 3

                              Incidentally, Rust had conditions long ago. They were removed because users preferred Result.

                              1. 1

                                Is there any documentation or code examples of how they worked?

                                1. 1

                                  https://github.com/rust-lang/rust/issues/9795 Here’s the bug about removing them. There was some documentation in those early releases, I don’t have the time to dig right now.

                            2. 2

                              I’ve only dabbled slightly with both - how is resuming from an error different from catching it? Is it that execution restarts right after the line that threw the error?

                              1. 7

                                Consider the following:

                                A program wants to write() something to a file, but – oops – the disk is full.

                                In ordinary languages, this means write() will simply fail, signal an error (via error code or exception or …), and unwind its stack.

                                In languages with resumable or restartable errors, something entirely different happens: write() doesn’t fail, it simply pauses and notifies its calling environment (i.e. outer, enclosing layers of the stack) that it has encountered a DiskIsFull situation.

                                In the environment, there may be programmed handlers that know how to deal with such a DiskIsFull situation. For example, a handler may try to empty the /tmp directory if this happens.

                                Or there may be no such handler, in which case an interactive debugger is invoked and presented to the human user. The user may know how to make space such as deleting some no longer needed files.

                                Once a handler or the user has addressed the DiskIsFull situation, it can tell write() to try writing again. Remember, write() hasn’t failed, it is still paused on the stack.

                                Well, now that space is available, write() succeeds, and the rest of the program continues as if nothing had happened.

                                Only if there is no handler that knows how to deal with DiskIsFull situations, or if the user is not available to handle the situation interactively, would write() fail conclusively.
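The scenario above can be sketched in Python with a toy in-memory disk. `FakeDisk`, `empty_tmp`, and the convention that a handler returns `True` when it has freed some space are all assumptions made for this illustration; the interactive-debugger fallback is left out.

```python
class DiskIsFull(Exception):
    pass

class FakeDisk:
    """Toy in-memory disk with a fixed capacity in characters."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.files = {}

    def used(self):
        return sum(len(data) for data in self.files.values())

    def raw_write(self, name, data):
        # The underlying primitive: reports failure instead of raising.
        if self.used() + len(data) > self.capacity:
            return False
        self.files[name] = data
        return True

handlers = []  # hypothetical registry; innermost handler is last

def write(disk, name, data):
    while not disk.raw_write(name, data):
        # write() has not failed yet: it is paused here, asking the
        # handlers (innermost first) whether any of them can free space.
        if not any(handler(disk) for handler in reversed(handlers)):
            raise DiskIsFull(name)  # nobody could help: fail conclusively
    return len(data)

def empty_tmp(disk):
    # Handler: delete everything under /tmp/ and report whether that helped.
    tmp_files = [name for name in disk.files if name.startswith("/tmp/")]
    for name in tmp_files:
        del disk.files[name]
    return bool(tmp_files)

handlers.append(empty_tmp)
disk = FakeDisk(capacity=10)
disk.files["/tmp/cache"] = "x" * 8

# The first attempt does not fit; the handler frees /tmp and the retry succeeds.
written = write(disk, "/home/a", "hello")

# Nothing is left to clean up, so this write fails conclusively.
try:
    write(disk, "/home/b", "y" * 11)
    failed = False
except DiskIsFull:
    failed = True
```

From the caller's point of view the first `write()` simply succeeds; the exception is raised only once no handler can make progress.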

                                1. 5

                                  Is it that execution restarts right after the line that threw the error?

                                  Yes. Common Lisp and Smalltalk use condition systems, where the handler gets executed before unwinding.

                                  So unwinding is just one possible option (one possible restart), other common ones are to start a debugger, to just resume, to resume with a value (useful to provide e.g. default values, or replacement for invalid values), etc… the signalling site can provide any number of restart for the condition they signal.

                                  It’s pretty cool in that it’s a lot more flexible, although because it’s adjacent to dynamic scoping it can make the program’s control flow much harder to grasp if you start using complex restarts or abusing conditions.

                                  1. 2

                                    Exactly. For example “call with current continuation” or call-cc allows you to optionally continue progress immediately after the throw. It’s a generalization of the callback/continuation style used in async-await systems.

                                    (There’s also hurl, which I think was intended as an esolang but stumbled upon something deep (yet already known): https://ntietz.com/blog/introducing-hurl/)

                                    1. 7

                                      You don’t need continuations to implement resumable errors. The trick is simply to not unwind the stack when an error happens. I wrote an article about how it works a while ago: http://axisofeval.blogspot.com/2011/04/whats-condition-system-and-why-do-you.html

                                      1. 2

                                        Even if you want to do stack unwinding, you don’t need continuations. Catch and throw are adequate operations to implement restarts that unwind the stack to some point first.

                                  2. 5

                                    I’ve invested many hundreds of hours into Lisps, cut my teeth with them, love them dearly, and still have romantic feelings about them. But I heard something on Twitter once that somewhat haunts me:

                                    Lisp did everything first and better; except “get used and adopted”

                                    I think there’s something to that. I think Lisps are wonderful dopamine factories for curious and playful programmers, and you could use it to ship world-class software, but I’ve struggled to believe its feature set meaningfully provides a whole lot more leverage? Maybe I’m just too smooth-brained to use it, but I have a big appetite for different approaches to programming and it mostly results in “I have more fun,” not “I ship more value, or higher value.”

                                    While doing things in Scheme or CL always felt like I was close to something True about computing, if I consider what the experience has been like delivering features, I don’t think it gave me functionally a whole lot more than, say, Python. I think “dynamically-typed, garbage collected language with decent performance” was more of a draw in the 90’s, and even then, Perl was more popular for that use case. I tell myself I was learning lessons that I could take with me to the next codebase where I’d get that great boon of productivity, or that eventually it’d all become a bit more comfortable, but it never came, even after stretching myself.

                                    In theory macros are a game-changer and unique to the language, but even in those communities I’ve met so, so few people who deploy them, and in my experience, it’s hard to find a use case for them that’s not well-served by a collection of functions. It reminds me of the myths of those people who claim to be 10x better in Forth. I feel like in order to harvest that your brain has to be just so different.

                                    Some of the features in the Land of Lisp promo comic (the blinking blue ones) are pretty rare in other places, like restarts and continuations; I get a lot of power of restarts from BEAM languages, and continuations are still a black magic to me other than things like exception handlers. It always felt like a solution in search of a problem. Has anyone here used continuations to great effect, for things that weren’t developing language features like cooperative scheduling, or nondeterminism? (e.g. amb)

                                    Not saying any of this to agitate. I’m just stuck in this place of “should I sink more time into it and try to Make It Happen (because it is great fun, even if I don’t ship much in it) or play somewhere else, like logic programming, array programming, concatenative programming…?”

                                    Anyways, wonderful article, so much to follow-up on.

                                    1. 6

                                      Lisp did everything first and better; except “get used and adopted”

                                      This is a good quote, and it reminded me of one I heard from a Roger Hui talk (creator of J) about some advice he had been given:

                                      Don’t worry about people stealing your idea. If it’s any good, you’ll have to force it down their throats.

                                      I think we all have some in-built feeling that good ideas are like sparks that catch fire, but sometimes they never do, and that’s not really because of the quality of the idea.

                                      Not saying any of this to agitate. I’m just stuck in this place of “should I sink more time into it and try to Make It Happen (because it is great fun, even if I don’t ship much in it) or play somewhere else, like logic programming, array programming, concatenative programming…?”

                                      Interesting post, and I understand what you mean - I’ve had a lot of the same thoughts myself. I don’t really see Lisp(s) as the be-all-end-all for me personally. They’re something I find neat and they’ve helped change how I view software development, but I can see there’s also plenty of other types of programming that will probably have a similar effect. The next thing I want to get into are array based languages: APL, J, Uiua etc. Variety is the spice of life and all that.

                                      1. 2

                                        Don’t worry about people stealing your idea. If it’s any good, you’ll have to force it down their throats.

                                        Funny, this is the second time I’ve seen this quote this week. Apparently it is from H. H. Aiken.

                                      2. 4

                                        I don’t think it gave me functionally a whole lot more than, say, Python. I think “dynamically-typed, garbage collected language with decent performance” was more of a draw in the 90’s, and even then, Perl was more popular for that use case

                                        I am myself super glad it gives me this over Python and other dynamically-typed languages:

                                        • compile-time warnings and errors, function by function, instantaneous, built-in, no config or external tools required, with a keybinding (SBCL is good and constantly improves. It finds: typos, unused code path, (some) bad argument types, (some) bad returned types… for a Haskell on top of CL, we now have Coalton)
                                        • compiling to a self-contained binary
                                          • include static assets, rsync to my server, I deployed.
                                          • (SBCL binaries rely on the glibc)
                                        • stability (while the implementations and the libraries improve)
                                        • performance
                                        • light in resources
                                        • can install libraries from within the running image
                                        • interactive and fun development / debugging
                                        • restarts: fix a bug and resume the program from where it failed, don’t restart from scratch: do this every day, for small or long computations, and you save time.
                                        • I never wait for my webserver to restart: I work from within the image
                                        • nice OO, more functional-y orientation if I wish
                                        1. 3

                                          As a hobby programmer I feel privileged as I’m not constrained by the necessities of production software in an industrial setting. I’m free to pick whatever is most fun and playful to me. Over the past decades I checked out and read about many great languages but, so far, what really resonates with me, fulfills these needs, and sticks is Lisp.

                                          1. 3

                                            In theory macros are a game-changer and unique to the language, but even in those communities I’ve met so, so few people who deploy them, and in my experience, it’s hard to find a use case for them that’s not well-served by a collection of functions

                                            This is consistent with my experience of 20+ years writing lisps. Some lisp communities tend to treat macros with a kind of holy reverence, because they know that macros are the one thing inherent to lisps that can’t be copied by non-lisps. But the cases where they’re justified are … few.

                                            In my mind the benefits of lisp syntax have more to do with consistent notation and structural editing than the macro system.

                                            Has anyone here used continuations to great effect, for things that weren’t developing language features like cooperative scheduling, or nondeterminism? (e.g. amb)

                                            I’ve never found a use for full continuations, but “full” or “stackful” coroutines offer you a subset of their power and allow you to do some really cool things that most languages don’t support: https://technomancy.us/202 (Continuations can be used to implement coroutines but can be thought of as a superset.)

                                            I’m just stuck in this place of “should I sink more time into it and try to Make It Happen (because it is great fun, even if I don’t ship much in it)

                                            If you’re not doing it for career reasons, then do it for as long as it’s fun, then stop once it isn’t! =)

                                            1. 1

                                              I just don’t agree that macros are unique to lisp. Rust, scala, and presumably other ml descendants have full featured macros.

                                              1. 2

                                                Yes, I didn’t mean to imply macros are unique to lisps; I meant that lisp’s approach to macros cannot be copied by non-lisps, because any language that uses lisp’s approach to macros (where programs are written using the same notation as data structures) by definition is already a lisp.

                                                (But my initial comment was written in haste and glossed over that nuance.)

                                            2. 2

                                              I have a big appetite for different approaches to programming and it mostly results in “I have more fun,” not “I ship more value, or higher value.”

                                              It’s quite wonderful that when you’re so passionate about the field, the lessons that you can share become very valuable to the “trade” of programming, in addition to the art. I’ve added your article to my “PLT” bookmark list, but it’s really a “how I learn” type of thing. Muscle memory, documentation, invariants, tooling and automation. Thanks for sharing, very inspiring!

                                              1. 2

                                                hard to find a use case for them that’s not well-served by a collection of functions

                                                I use them a lot for data DSLs (e.g. full financial models in a single line, so you can fit many models in a document), especially for expanding those into more data (e.g. let 1-5 expand to 1, 2, 3, 4, 5) or slots for fuzzing, calling APIs to get data, preparing an environment etc. The goal is condensing everything, to make it easier to read, reconfigure etc., although my notations aren’t an optimal tool of thought.

                                                I think the lisp tools empower library/framework writers, while library-connectors/application makers only enjoy better notation, but a good framework is well fitted to its problem, whether in Python, Lisp or anything else. The lisp promise is that it’s easier to write well-fitted frameworks/libraries/DSLs, hence Practical Common Lisp demonstrating so much library writing instead of sewing outside libraries together.

                                                Non-opinionated, Common Lisp and Racket give you more tools than you need, so you can utilize your preferred approach. I’m sure if I embraced an image-based deployment, restarts etc. functions would make more sense for a lot of this but most of my time is building new models, so most of my code condenses their representation.

                                              2. 1

                                                @mitchellh tangential to this, why did you abandon Ruby as used in virtualbox for HCL?

                                                1. 1

                                                  I’m still looking for a way to do local CI using vm’s. Indeed my 6 year old laptop is powerful enough to start vm’s for several Linux distro’s (old and new) and BSDs. I imagine it will work by spinning up a vm with qemu (or the vmm’s on the BSDs) and using SSH to clone my repo into it and running a command inside. There should probably be a mechanism to fetch and/or build latest vm images, and probably a pre-build step that installs toolchains needed for my project so it doesn’t have to happen each time and the test vm’s can be started more quickly.

                                                  I hope someone can point me to software doing this. I recently saw a post about nix(os) and starting Linux vm’s (don’t recall anything about BSDs), perhaps that’s the way?

                                                  1. 2

                                                    It isn’t complete, but this is a tool I am working on to spin up VMs locally for testing: https://github.com/stacktide/fog

                                                    I use QEMU to spin up the VMs. The serial console is passed to a socket file for initial output. QMP can be used for controlling the machine, but that isn’t fully implemented yet. I pass in cloud-init configuration to setup SSH. For a pre-build step my plan is to use QEMU’s snapshot functionality and create a qcow2 file.
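As a rough illustration of the setup described above (not taken from the fog source code), the QEMU invocation could be assembled along these lines. The helper name `qemu_argv`, the file names, the memory size, and the forwarded SSH port are all assumptions made for this sketch; the QEMU options themselves (`-drive`, `-snapshot`, `-serial`, `-qmp`, `-nic`) are standard, though the `server=on,wait=off` spelling assumes a reasonably recent QEMU.

```python
def qemu_argv(disk: str, seed: str, run_dir: str, ssh_port: int = 2222) -> list[str]:
    """Build a QEMU command line for a throwaway test VM (hypothetical helper)."""
    return [
        "qemu-system-x86_64",
        "-m", "2048",
        "-display", "none",
        # Main disk image; -snapshot redirects writes to a temporary file,
        # so the base qcow2 image is left untouched between runs.
        "-drive", f"file={disk},format=qcow2,if=virtio",
        "-snapshot",
        # cloud-init NoCloud seed image (user-data and meta-data) used to set up SSH.
        "-drive", f"file={seed},format=raw,if=virtio",
        # Serial console exposed on a Unix socket for early boot output.
        "-serial", f"unix:{run_dir}/serial.sock,server=on,wait=off",
        # QMP control channel on a second Unix socket.
        "-qmp", f"unix:{run_dir}/qmp.sock,server=on,wait=off",
        # User-mode networking with the guest's SSH port forwarded to localhost.
        "-nic", f"user,hostfwd=tcp:127.0.0.1:{ssh_port}-:22",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without QEMU installed.
    print(" ".join(qemu_argv("base.qcow2", "seed.iso", "/tmp/vm")))
```

The list can be handed to `subprocess.Popen` as-is; keeping it as a pure function makes it easy to inspect or test without booting anything.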

                                                    You may also be interested in multipass which works similarly. I ran into some issues like templating not working that led me to build my own tool.

                                                    1. 1

                                                      Nice, this is about what I had in mind. I also figured the disk snapshot could be used to good effect, especially for the quick ephemeral snapshots. I hadn’t thought about the console yet, QMP seems like a good mechanism.

                                                      It looks like you’re aiming to “up” a machine and ssh into it and do things, keeping it around. I was thinking about starting, running a build and stopping, and the vm’s being ephemeral. But keeping them around is a great feature too.

                                                      I’m not fond of the idea of having some place where prebuilt images are pulled from. I would really like getting base images from official sources and applying changes locally. E.g. the freebsd vm images (that are signed), for openbsd an iso and auto installer script may be needed. The “library of vm images” would instead be a library of scripts to create vm images locally. The scripts shouldn’t be too complicated, and easy to inspect.

                                                      Would like to hear more about your plans!

                                                      1. 1

                                                        I want to support ephemeral workflows too. The plan is to add a “run” command that starts a machine, runs a command, and then destroys the machine when the command exits.

                                                        For the images I am only pulling prebuilt images from official sources. There is no signature verification yet, only checksums. The official library is just a bunch of YAML files shipped with the binary. You can see them in the ./images directory. I’m only supporting images that use cloud-init, so install scripts are regular cloud_config YAML with the nocloud datasource.

                                                        I’m hoping to resume work on this project in the next 3-6 months. The project I am focusing on right now needs this one for testing.

                                                        1. 2
                                                          1. 4

                                                            Is gluten the problem, or something else? My coeliac friend likes a bottle of gluten-free Old Speckled Hen.

                                                            1. 1

                                                              to be honest, I’m not certain - my doctor thinks it’s an immune condition of some kind. I’m allergic to most American staples - eggs, dairy, most forms of gluten, etc. I’ll have to give old speckled hen a try!!

                                                            2. 2

                                                              I suspect you’ve probably stumbled on Burning Brothers, and Sociable Ciderwerks; for anyone who hasn’t, there are actually good GF beer/beer-adjacent options these days! ☺

                                                              All the best,

                                                              1. 2

                                                                Thanks, my mom is gluten-intolerant and has often lamented being unable to have a beer anymore.

                                                            3. 1

                                                              Wheat beers must have barley as well

                                                          2. 16

                                                            My intent with “X as Code” was always to get knowledge out of people’s heads and into a more inscribed system. Once inscribed, knowledge and process can be shared, versioned, iterated upon, etc.

                                                            This has also been my understanding of X as Code for the better part of a decade - it wasn’t until this year that I realized there are people who understood this differently and disagreed with this definition as passionately as they do.

                                                            1. 3

                                                              Interesting! I’ve never cared to wade into it, but I always read “config as code” to mean turing complete as opposed to declarative. Another score for using ambiguous terms.

                                                              1. 1

                                                                What is the contrary view? Who are its proponents?

                                                                1. 8

                                                                  The contrary view is mostly a variation of “configuration is not programming”, which I guess defines “code” vaguely as something Turing complete?

                                                                  For example, there was this article posted here not long ago which made the point that Terraform is configuration, not code: https://xeiaso.net/blog/2025/yoke-k8s/. Somewhat expectedly, this led to some discussion here: https://lobste.rs/s/t0uh3q/yoke_is_really_cool

                                                                  1. 7

                                                                    Ugh these are all the same statement

                                                                    • HTML is not programming
                                                                    • CSS is not programming
                                                                    • Perl is not programming
                                                                    • Terraform is not programming
                                                                    1. 7

                                                                      I’ll add a bit of substance for anyone not convinced. For context, I’m a programmer that knows things like assembly, C, OCaml, Rust and what have you. I can promise you I have done Terraform programming using its more advanced features.
                                                                      Terraform/HCL has:

                                                                      Anyways, all this is beside the point: using code/programming in a way that excludes things that don’t look like your favorite language is unnecessary gatekeeping.
                                                                      “Code” doesn’t necessarily mean “programming language.” I see code as a superset of programming and configuration languages (and other things like spreadsheets or small plastic bricks).
                                                                      And the programming and config lang sets have some overlap because, as I said in the other thread, configuration is semantics and can be done in basically any language. For instance, Lua is not always configuration, but with Neovim it mostly is.

                                                                      1. 5

                                                                        I think Mitchell’s aim was to get a description of the system

                                                                        • written down
                                                                        • in a formal notation
                                                                        • in version control

                                                                        But there are things that can be programmed which cannot be programmed in this kind of code. Pegboard programming of synthesizers or the ENIAC, visual programming like Labview, mechanisms in Lego or Meccano.

                                                                        (fun fact! the Computer Lab in Cambridge was founded in 1937 starting with equipment such as differential analysers built using meccano.)

                                                                        There are other non-programming activities that can be turned into code, such as writing (are you using a word processor? or LaTeX?) or diagramming (are you using Inkscape? or Graphviz?).

                                                                      2. 2

                                                                        Where “programming” and “code” is used interchangeably so they can gatekeep the profession.

                                                                        1. 2

                                                                          Is this sentiment a consequence of people misusing “coding” and “coder” to mean programming and programmer? i.e. “all programming is coding” transforms into “all coding is programming” (not to say that declarative code isn’t programming). That word choice has always bothered me, it missed the point of what programming’s about.

                                                                          How long before we can add

                                                                          • Morse code is not programming

                                                                          to your list?

                                                                          1. 1

                                                                            I’m not sure what your point is. Surely you don’t think writing words down is inherently programming?

                                                                        2. 2

                                                                          FWIW I divided config languages into 5 categories here, I view them as a spectrum from data to code

                                                                          https://github.com/oils-for-unix/oils/wiki/Survey-of-Config-Languages

                                                                          And I do think there are 2 separate ideas in “X as code”:

                                                                          • X as “text files versioned with git” – as opposed to some database behind a GUI, which may be in a cloud
                                                                          • X as “programmable text files”, which is code
                                                                            • again, the line is fuzzy, although I think the taxonomy of 5 categories does a decent job
                                                                          1. 1

                                                                            I’d also differentiate between code executed by courts and executed with determinism

                                                                            1. 1

                                                                              What about code executed by multiple implementations of the interpreter?

                                                                    2. 6

                                                                      More like American programmers

                                                                      1. 22

                                                                        You’d think so! I did too.

                                                                        I was the tech lead of the internationalization effort for a popular website a number of years back. This was in the US. The site was English-only and we wanted to make it available in a wide variety of languages. We wanted to make it feel as native to each language as we could, rather than feeling like a translation of a foreign site.

                                                                        My team and I came up with a bunch of internal tools and a flexible library we could use to make our code work in multiple languages. When I say “flexible” I mean it went way beyond simple token replacement; it could do things like look up different variants of sentences depending on whether a caller-supplied place name referred to a city or a country and whether that distinction mattered in the target language, could use different pluralization rules for different languages, took gender into account if a sentence mentioned a person whose gender was known, and so on. We had people with linguistics backgrounds making sure we didn’t fall into any obvious traps.
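To make the pluralization point concrete, here is a minimal sketch of per-language plural rules. It is not the library described above; the function names and the message table are invented for illustration, and the rules cover only non-negative integers in English and Russian (following the usual one/few/many categories).

```python
def plural_category(lang: str, n: int) -> str:
    """Return the plural category for a non-negative integer count."""
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "ru":
        # 1, 21, 31, ... but not 11
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        # 2-4, 22-24, ... but not 12-14
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "many"
    raise ValueError(f"no plural rules for {lang!r}")


# Hypothetical message catalogue: one template per plural category.
MESSAGES = {
    "en": {"one": "{n} file", "other": "{n} files"},
    "ru": {"one": "{n} файл", "few": "{n} файла", "many": "{n} файлов"},
}


def format_count(lang: str, n: int) -> str:
    """Pick the template matching the count's plural category and fill it in."""
    return MESSAGES[lang][plural_category(lang, n)].format(n=n)


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21):
        print(format_count("en", n), "|", format_count("ru", n))
```

A simple `count == 1` check, which is all English needs, would get 21 and 22 wrong in Russian; that is the kind of assumption string concatenation tends to bake in.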

                                                                        The code base was far too big for my little team to update on our own, so an early goal was to give the rest of the engineering team all the resources they needed to do a really good job of updating their own corners of the code. In addition to thoroughly documenting our tools and libraries, we wrote up a set of annotated examples of how to change existing English-only code to be translation-friendly, and we made sure it covered all the common patterns in the code base (including visual design things like assuming a button only needed to be exactly big enough to hold an English label) and included examples of what could go wrong in different languages if people decided to just do string concatenation instead.

                                                                        Then we started rolling it out. My expectation going into it was like yours: that the monoglot American devs would struggle to embrace all the techniques because English-specific assumptions would be too deeply ingrained.

                                                                        But once I started doing code reviews of people’s changes, the reality was different. It turned out there was no measurable relationship between how good someone was at making their part of the site translatable into a wide variety of languages and which language(s) they spoke. Americans who’d never spoken anything but English were just as good at it, on average, as trilingual Europeans or people whose native languages were very different from English.

                                                                        The thing that floored me was seeing people from other countries repeatedly make mistakes that would have made it impossible to correctly translate part of the site into their own native languages. This happened a lot, and it happened across multiple native languages. It seemed to me like some people were able to put their brains in “human language is highly variable and the code needs to act accordingly” mode, and some people were stuck in “I am working in English right now, so everything is English” mode, and it barely mattered if they happened to speak some other language or not.

                                                                        Maybe the situation would have been different if the site had been in more than one language from the get-go; I don’t know. But that experience really shattered some of my preconceptions about the advantages of speaking multiple languages. (For the record: I still think it’s worthwhile to be multilingual!)

                                                                        1. 5

                                                                          The first job I had in Germany, having moved here as a fresh-faced monolingual foreigner, was kind of like this. We were building an internal tool that was used by warehouse workers in various European countries. We knew we had a lot of fairly monolingual users in a variety of different languages, so when we decided to redesign the tool, I pushed really hard for making sure that every part of the UI was fully translated into all the relevant languages. I was amazed by how much my German colleagues pushed back against this, saying it would be a lot of effort, and people could just learn what the different English-language messages meant over time.

                                                                        2. 7

                                                                          There were things a decade ago that my non-American English native speaker coworkers (Romanian, Croatian, Indian [Hindi & Marathi]) learned alongside me, who’d been doing localization for a while (back then, I natively spoke American English but had six school years of Latin, 10+ years of self-driven Esperanto, and a smattering of Mexican Spanish and Canadian French), when we did a big project targeting 10 languages on release day and 22 within four weeks in a patch release. I’ve picked up Dutch and some Korean since then and I’m constantly learning new things about language, having gotten into the linguist sector of Instagram, Threads, Bluesky, and the fediverse.

                                                                          Unless you’re a linguist, you’re always learning surprising new things about language, discovering new tools in the toolbox, per se. If you’re a linguist, you’re learning which has/does what because you’re more familiar with what’s in the toolbox.

                                                                          1. 1

                                                                            Cool you’ve learned so many languages. I’m currently trying to learn a new language and struggling a bit so maybe you can help. What are your preferred ways to learn a new language and make it stick?

                                                                            1. 4

                                                                              Consistent practice. Try to experience all modes - reading, writing, listening, speaking, and conversing.

                                                                              Find media you like. You don’t have to be even at 50% comprehension to listen to some audio, but obviously you’re only going to get little bits.

                                                                              Adverts are much simpler than anything except children’s media.

                                                                              Experiencing the language is generally the best way to internalize the rules, but reading up on complex rules to help you practice them is also important.

                                                                              1. 1

                                                                                Regular practice. Duolingo is fine if all you can put into it is 10-15 minutes per day. That’s better than 0, and 5 minutes isn’t doing much. Most of my focus was on reading and writing until I started learning Dutch in 2023. You have to read a ton and listen a lot. I don’t listen as much as I should, but there are plenty of Dutch teachers and comedians on social media that I’ve come to enjoy.

                                                                                I think it’s important to remember your purpose. I like learning languages because I like linguistics and language, not because I have an acute need to interact in another language. Honestly, some trips to Belgium and The Netherlands in the last few years have been the most immersive foreign language environment I’ve been in… and any Belgian or Dutch person can tell you that you can get along just fine in both of those countries speaking just English. I was able to use nothing but Dutch to get lunch in my great grandmother’s small hometown, though!

                                                                                1. 2

                                                                                  Thanks, I’ll keep practicing! :)

                                                                            2. 3

                                                                              Absolutely not, I’ve seen a couple of these with German software, with English/French just an afterthought. And that’s already two languages where half of the stuff doesn’t even apply because they are both LTR languages in latin script. I will admit that it’s of course more likely to be a US/UK dev team.

                                                                            3. 31

                                                                              A good “falsehoods” list needs to include specific examples of every falsehood.

                                                                              1. 28

                                                                                Yours doesn’t! And I maintain that it’s still a good boy list.

                                                                                1. 2

                                                                                  That doesn’t look to me like it’s meant to be an example of a good falsehoods list.

                                                                                    1. 3

                                                                                      In addition, it’s worth knowing that dogs up to 2 years of age exhibit the halting problem.

                                                                                2. 27

                                                                                  I’ll make an attempt, with the caveat that this list seems so obvious to me that I’m worried I might be missing some nuance (imagine a similar list about cooking utensils with “people think knives can only be used for butter, but in reality they can also be used to cut bread, meat, and even vegetables!!!”).

                                                                                  Sentences in all languages can be templated as easily as in English: {user} is in {location} etc.

                                                                                  Both the substitutions and the surrounding text can depend on each other. The obvious example is languages where nouns have gender, but you might also have cases like Japanese where “in” might be へ, で, or に to indicate relative precision of the location.
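A small sketch of the same effect in French, where the word for “in” depends on what kind of place is substituted. The `Place` type, its `kind` labels, and the function names are made up for this example, and the preposition table is deliberately simplified (for instance, it ignores masculine countries starting with a vowel, which also take “en”).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    kind: str  # "city", "country_f", "country_m" or "country_pl" (illustrative labels)


# Simplified French rules: the preposition is chosen by the place, not by the template.
PREPOSITION = {
    "city": "à",
    "country_f": "en",
    "country_m": "au",
    "country_pl": "aux",
}


def naive(user: str, place: Place) -> str:
    """English-style template with a fixed preposition."""
    return f"{user} est à {place.name}"


def located(user: str, place: Place) -> str:
    """Template where the surrounding text depends on the substitution."""
    return f"{user} est {PREPOSITION[place.kind]} {place.name}"


if __name__ == "__main__":
    for place in (Place("Paris", "city"), Place("France", "country_f"),
                  Place("Japon", "country_m"), Place("États-Unis", "country_pl")):
        print(naive("Marie", place), "->", located("Marie", place))
```

The naive version is only right for cities; for the other three places it produces ungrammatical French such as “Marie est à France”.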

                                                                                  Words that are short in English are short in other languages too.

                                                                                  German is the classic example of using lengthy compound words where English would use a shorter single-purpose word, “Rindfleisch” vs “beef” or “Lebensmittel” vs “food” (why yes I haven’t had lunch yet, why do you ask…?).

                                                                                  For any text in any language, its translation into any other language is approximately as long as the original.

                                                                                  See above – English -> German tends to become longer, English -> Chinese tends to become shorter.

                                                                                  For every lower-case character, there is exactly one (language-independent) upper-case character, and vice versa.

                                                                                  Turkish and German are famous counter-examples, with Turkish 'i' / 'I' being different letters, or German ß capitalizing to "SS" (though I think this is now considered somewhat old-fashioned?).
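Python’s built-in, locale-independent case mapping shows both effects. The `turkish_upper` helper below is a hypothetical workaround written for this sketch, not a standard library function; real code would normally use a locale-aware library such as ICU.

```python
def turkish_upper(text: str) -> str:
    """Uppercase using Turkish dotted/dotless i rules (illustrative helper only)."""
    # Map the two Turkish lowercase i letters explicitly before the generic mapping.
    return text.replace("i", "İ").replace("ı", "I").upper()


if __name__ == "__main__":
    # German sharp s: one lowercase character becomes two uppercase characters,
    # and the round trip does not give the original back.
    print("ß".upper(), "ß".upper().lower())

    # The default mapping sends both "i" and dotless "ı" to "I" ...
    print("i".upper(), "ı".upper())

    # ... and lowercasing dotted capital "İ" yields "i" plus a combining dot (two code points).
    print(len("İ".lower()))

    # With Turkish rules, "istanbul" keeps its dot when uppercased.
    print("istanbul".upper(), turkish_upper("istanbul"))
```

So uppercasing is neither one-to-one nor language-independent: the correct result for “i” depends on whether the text is Turkish.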

                                                                                  The lower-case/upper-case distinction exists in all languages.

                                                                                  Not true in Chinese, Japanese, Korean.

                                                                                  All languages have words for exactly the same things as English.

                                                                                  Every language has words that don’t exist in any other language. Sometimes because the concept is alien (English has no native word for 寿司), sometimes because a general concept has been subdivided in a different way (English has many words for overcast misty weather that don’t translate easily into languages from drier climates).

                                                                                  Every expression in English, however vague and out-of-context, always has exactly one translation in every other language.

                                                                                  I’m not sure what this means because many expressions in English don’t even have a single explanation in English, but in any case, idioms and double entendres often can’t be translated directly.

                                                                                  All languages follow the subject-verb-object word order.

                                                                                  If one’s English to SVO order is limited, limited too must their knowledge of literature be.

                                                                                  When words are to be converted into Title Case, it is always the first character of the word that needs to be capitalized, in all languages.

                                                                                  Even English doesn’t follow a rule of capitalizing the first character of every word. Title Casing The First Letter Of Every Word Is Bad Style.

                                                                                  Every language has words for yes and no.

                                                                                  One well-known counter-example being languages where agreement is by repeating a verb:

                                                                                  A: “Do you want to eat lunch together?” B: “Eat.”

                                                                                  In each language, the words for yes and no never change, regardless of which question they are answering.

                                                                                  See above.

                                                                                  There is always only one correct way to spell anything.

                                                                                  Color / colour, aluminum / aluminium

                                                                                  Each language is written in exactly one alphabet.

                                                                                  Not sure exactly what this means – upper-case vs lower-case? Latin vs Cyrillic? 漢字 vs ひらがな カタカナ ? 简化字 vs 繁体字 ? Lots of counter-examples to choose from, Kazakh probably being a good one.

                                                                                  All languages (that use the Latin alphabet) have the same alphabetical sorting order.

                                                                                  Lithuanian sorts 'y' between 'i' and 'j': https://stackoverflow.com/questions/14458314/letter-y-comes-after-i-when-sorting-alphabetically

                                                                                  Some languages special-case ordering of letter combinations, such as ij in Dutch.

                                                                                  And then there’s the dozens of European languages that have their own letters outside the standard 26. Or diacritics.
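For what it's worth, here is a toy Python sketch of the Lithuanian case. It ranks letters by their position in the Lithuanian alphabet instead of by code point. The names `LT_ALPHABET` and `lt_key` are made up for illustration, and real collation (the CLDR rules that ICU implements) is more subtle than a per-letter ranking, so treat this as a demonstration of the problem rather than a fix.

```python
# Lithuanian alphabet order; note that 'y' sits between 'į' and 'j'.
LT_ALPHABET = "aąbcčdeęėfghiįyjklmnoprsštuųūvzž"
LT_RANK = {ch: i for i, ch in enumerate(LT_ALPHABET)}


def lt_key(word: str) -> list[int]:
    """Sort key: rank each letter by its alphabet position (unknown letters go last)."""
    return [LT_RANK.get(ch, len(LT_ALPHABET)) for ch in word.lower()]


words = ["jis", "yra", "ir"]
by_code_point = sorted(words)               # plain sort: 'y' lands after 'j'
by_lt_alphabet = sorted(words, key=lt_key)  # alphabet-aware: 'y' lands before 'j'

print(by_code_point)
print(by_lt_alphabet)
```

The same list comes out in two different orders, which is exactly the kind of thing that surprises people when a "sorted" dropdown looks wrong to native speakers.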

                                                                                  All languages are written from left to right.

                                                                                  Arabic, Hebrew.

                                                                                  Even in languages written from right to left, the user interface still “flows” from left to right.

                                                                                  Not sure what “flows” means here, but applications with good RtL support usually flip the entire UI – for example a navigational menu that’s on the right in English would be on the left in Arabic.

                                                                                  Every language puts spaces between words.

                                                                                  Segmenting a sentence into words is as easy as splitting on whitespace (and maybe punctuation).

                                                                                  Chinese, Japanese.

                                                                                  Segmenting a text into sentences is as easy as splitting on end-of-sentence punctuation.

                                                                                  English: "Dear Mr. Smith".
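A quick Python illustration of both segmentation falsehoods, using only the naive approaches named above. The sample strings are my own; the point is only that whitespace splitting and punctuation splitting give the wrong pieces.

```python
import re

# Word segmentation: Chinese has no spaces, so split() finds a single "word".
chinese = "我喜欢吃寿司"
words = chinese.split()

# Sentence segmentation: splitting after ., ! or ? breaks on the abbreviation "Mr."
text = "Dear Mr. Smith. Thank you for your letter."
sentences = re.split(r"(?<=[.!?])\s+", text)

print(words)      # one element containing the whole sentence
print(sentences)  # three pieces, although the text has two sentences
```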

                                                                                  No language puts spaces before question marks and exclamation marks at the end of a sentence.

                                                                                  No language puts spaces after opening quotes and before closing quotes.

                                                                                  French famously has rules that differ from English regarding spacing around punctuation.

                                                                                  All languages use the same characters for opening quotes and closing quotes.

                                                                                  “ ” in English,「 」in Japanese, « » in French.

                                                                                  Numbers, when written out in digits, are formatted and punctuated the same way in all languages.

                                                                                  European languages that use '.' for thousands separator and ',' for the fractional separator, or languages that group by different sizes (like lakh/crore in Indian languages).
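To make the grouping difference concrete, a small Python sketch. The helper names `group_western` and `group_indian` are hypothetical, and both assume non-negative integers; a real application would more likely lean on locale data than hand-rolled formatters.

```python
def group_western(n: int, sep: str = ",") -> str:
    """Group digits in threes, with a configurable separator."""
    return f"{n:,}".replace(",", sep)


def group_indian(n: int) -> str:
    """Group the last three digits, then groups of two (thousand, lakh, crore)."""
    digits = str(n)
    if len(digits) <= 3:
        return digits
    head, tail = digits[:-3], digits[-3:]
    parts = []
    while len(head) > 2:
        parts.insert(0, head[-2:])
        head = head[:-2]
    parts.insert(0, head)
    return ",".join(parts + [tail])


print(group_western(12345678))       # 12,345,678
print(group_western(12345678, "."))  # 12.345.678
print(group_indian(12345678))        # 1,23,45,678
```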

                                                                                  No two languages are so similar that it would ever be difficult to tell them apart.

                                                                                  Many languages are considered distinct for political reasons, even if a purely linguistic analysis would consider them the same language.

                                                                                  Languages that have similar names are similar.

                                                                                  English (as spoken in Pittsburgh), English (as spoken in Melbourne), and English (as spoken in Glasgow).

                                                                                  More seriously, Japanese and Javanese.

                                                                                  Icons that are based on English puns and wordplay are easily understood by speakers of other languages.

                                                                                  Often they’re difficult to understand even for English speakers (I once saw a literal hamburger used to signify a collapsible sidebar).

                                                                                  Geolocation is an accurate way to predict the user’s language.

                                                                                  Nobody who has ever travelled would think this. And yet. AND YET!

                                                                                  C’mon Google, I know that my IP is an airport in Warsaw but I really don’t want the Maps UI to switch to Polish when I’m trying to find a route to my hotel.

                                                                                  Country flags are accurate and appropriate symbols for languages.

                                                                                  You can roughly gauge where you are in the world by whether the local ATMs offer “🇬🇧 English”, “🇺🇸 English”, or “🇦🇺 English”.

                                                                                  Every country has exactly one “national” language.

                                                                                  Belgium, Luxembourg, Switzerland.

                                                                                  Every language is the “national” language of exactly one country.

                                                                                  English, again.

                                                                                  1. 14

                                                                                    Turkish and German are famous counter-examples, with Turkish ‘i’ / ‘I’ being different letters, or German ß capitalizing to “SS” (though I think this is now considered somewhat old-fashioned?).

                                                                                    The German ß has history.

                                                                                    The old rule is that ß simply has no uppercase. Capitalizing it as “SS” was the default fallback rule if you had to absolutely capitalize everything and the ß would look bad (such as writing “STRAßE” => “STRASSE”). Using “SZ” was also allowed in some cases.

                                                                                    The new rule is to use the uppercase ß: ẞ. So instead of “STRASSE” you now write “STRAẞE”.

                                                                                    The usage of “SZ” was disallowed in 2006. The East Germans had an uppercase ß since 1957; the West German rules basically said “Uppercase ß is in development”, and that was dropped in 1984 for the rule to use SS or SZ as the uppercase variant. The new uppercase ß has been in the rules since 2017, and since 2024 it is preferred over SS.

                                                                                    The ISO DIN 5008 was updated in 2020.

                                                                                    This means that, depending on what document you’re processing, when it was created and WHERE it was created, its writing of the uppercase ß may be radically different.

                                                                                    It should also be noted that if you’re in Switzerland, ß is not used at all, here the SS substitute is used even in lower case.

                                                                                    Family names may also have custom capitalization rules, where ß can be replaced by SS, SZ, ẞ or even HS, so “Großmann” can become “GROHSMANN”. Note that this depends on the person: while Brother Großmann may write “GROHSMANN”, Sister Großmann may write “GROSSMANN” and their mother may use “GROẞMANN”, and these are all valid and equivalent.

                                                                                    Umlauts may also be uppercased without the diacritic umlaut and with an E suffix; ä becomes “AE”. In some cases even lowercase input does the translation because older systems can’t handle special characters, though this is not GDPR compliant.
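If you want to see how this plays out in code, here is a short Python sketch. Python's `str.upper()` and `str.lower()` take no locale and apply Unicode's default case mappings, so they show both the ß expansion and the Turkish problem. The variable names are just for illustration.

```python
word = "straße"
upper = word.upper()               # "STRASSE": one character became two
round_trip = upper.lower()         # "strasse": the ß is not restored
capital_eszett = "STRAẞE".lower()  # "straße": ẞ (U+1E9E) does map back to ß

# Turkish: without locale information, the default mappings are used.
dotless_problem = "i".upper()      # "I", where Turkish expects "İ"
dotted_lower = "İ".lower()         # "i" followed by U+0307 COMBINING DOT ABOVE

print(upper, round_trip, capital_eszett)
print(len(word), len(upper))       # the length changes under uppercasing
print(dotless_problem, [hex(ord(c)) for c in dotted_lower])
```

So upper-casing can change the length of a string, is not reversible, and never produces ẞ by default. Anything following the newer ẞ preference or a per-person family-name rule needs its own handling.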

                                                                                    No two languages are so similar that it would ever be difficult to tell them apart.

                                                                                    Many languages are considered distinct for political reasons, even if a purely linguistic analysis would consider them the same language.

                                                                                    If you ever want to have fun, the politics and regionality of German dialects could be enough to drive some linguists up the wall.

                                                                                    Bavarian is recognized as a language and a dialect at the same time. It can be subdivided into dozens and dozens of subdialects, which are all similar, yet whose speakers may struggle to understand each other.

                                                                                    As someone who grew up in Swabian Bavaria, my dialect is a mix of both Swabian and Bavarian. I struggle to understand Northern Bavarian, but I struggle much less with Basel Swiss German (which is distinct from the rest of Swiss German in that it originates from Low Alemannic instead of High Alemannic), which is quite close in a lot of ways.

                                                                                    And the Swiss then double down on making things confusing by sometimes using French language constructs in German words, or straight up importing French or Italian words.

                                                                                    1. 2

                                                                                      East Germans had an uppercase ß since 1957

                                                                                      What should I read to learn more about this? Why wasn’t the character in Unicode 1.0, then?

                                                                                      1. 5

                                                                                        East Germany added the uppercase ß in 1957 and removed it in 1984. The spelling rules weren’t updated, so despite the presence of an uppercase ß, it would have been wrong to use it in any circumstances. Since Unicode 1.0 is somewhere around 1992, with some early drafts in 1988, it basically missed the uppercase ß being in the dictionary.

                                                                                        The uppercase ß itself has been around since 1905 and we’ve tried to get it into Unicode since roughly 2004.

                                                                                        1. 1

                                                                                          Is this more like there being an attested occurrence in a particular dictionary in East Germany in 1957 rather than common usage in East Germany?

                                                                                      2. 7

                                                                                        Every expression in English, however vague and out-of-context, always has exactly one translation in every other language.

                                                                                        I’m not sure what this means because many expressions in English don’t even have a single explanation in English, but in any case, idioms and double entendres often can’t be translated directly.

                                                                                        A good example of this is a CMS I used to work on. The way it implemented translation was to define everything using English[0], then write translations as a mapping from those English snippets to the intended language. This is fundamentally flawed, e.g. by homonyms:

                                                                                        Subject            From       Flags              Actions
                                                                                        ----------------------------------------------------------------
                                                                                        Project update     Alice      Unread, Important  [Read] [Delete]
                                                                                        Welcome            HR         Read               [Read] [Delete]
                                                                                        

                                                                                        Here, the “Read” flag means “this has been read”, whilst the “Read” button means “I want to read this”. Using the English as a key forces the same translation on both.

                                                                                        [0] We used British English, except for the word “color”; since we felt it was better to match the CSS keywords (e.g. when defining themes, etc.).
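A minimal Python sketch of the usual way out, assuming you can change the lookup key: key each string by a context as well as the English text, so the two “Read”s can diverge. The catalog, the context names and `translate` are hypothetical (this is not the CMS's API); gettext's message contexts, which Python exposes as `gettext.pgettext`, follow the same idea.

```python
# Hypothetical German catalog keyed by (context, English text).
DE = {
    ("flag", "Read"): "Gelesen",      # state: this message has been read
    ("action", "Read"): "Lesen",      # button: I want to read this message
    ("flag", "Unread"): "Ungelesen",
}


def translate(context: str, text: str, catalog: dict) -> str:
    """Look up a context-qualified string, falling back to the English text."""
    return catalog.get((context, text), text)


print(translate("flag", "Read", DE))
print(translate("action", "Read", DE))
print(translate("action", "Delete", DE))  # no entry, so English is kept
```

With English-only keys the first two lookups would be forced to return the same string, which is the flaw described above.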

                                                                                        1. 4

                                                                                          One trick is to use a different word on the asset: Reviewed(adj) and Review(v) don’t have the same problem that Read(adj) and Read(v) do. Seen(adj) and See(v); Viewed(adj) and View(v). And so on. Then you can “translate” to English to actually use Unread/Read/[Read] if you still like it, without confusing the translators, who need to know you want something more like e.g. Lido/Ler or 阅读/显示 and so on.

                                                                                        2. 3

                                                                                          Much better than the original article. Also love how many of the counter examples come from English.

                                                                                        3. 16

                                                                                          My bar for these lists is https://yourcalendricalfallacyis.com/ and most “falsehoods programmers believe” lists don’t meet it.

                                                                                          1. 6

                                                                                            The number of exceptions caused by the Hebrew calendar makes me shed a tear of joy.

                                                                                            Here’s one falsehood they missed: the length of a year varies by at most one day. True in the Gregorian calendar, apparently true in the Islamic calendar, but not true in the Hebrew calendar: leap years are 30 days longer than regular years.
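For the curious, a small Python sketch of where the extra month comes from. The leap years follow a 19-year cycle (years 3, 6, 8, 11, 14, 17 and 19 of each cycle), which the well-known arithmetic test below encodes. The function names are mine, and this only covers the leap rule: working out the exact length of a given year takes the full calendar calculation, which is not attempted here.

```python
def is_hebrew_leap_year(year: int) -> bool:
    """True if the Hebrew year has a 13th month (7 leap years per 19-year cycle)."""
    return (7 * year + 1) % 19 < 7


def months_in_hebrew_year(year: int) -> int:
    """Leap years insert an extra 30-day month."""
    return 13 if is_hebrew_leap_year(year) else 12


leap_positions = [y for y in range(1, 20) if is_hebrew_leap_year(y)]
print(leap_positions)
print(months_in_hebrew_year(5784), months_in_hebrew_year(5785))
```

So code that assumes "a year is 12 months" or "365 days, give or take one" breaks on this calendar by a whole month, not by a day.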

                                                                                            1. 2

                                                                                              They sorta cover it on the “days” section, by way of mentioning that the Hebrew calendar has leap months.

                                                                                              They also miss Byzantine calendars which are still used by many churches, related to the Jewish Greek calendar from the Septuagint. It’s of course complicated by the fact that many churches & groups do not agree on what year was the start, so it’s complex to use (but still in somewhat fairly common liturgical use).

                                                                                              1. 1

                                                                                                Wow, 30? I need to read more about this.

                                                                                            2. 10

                                                                                              Here’s a fun (counter)example of (something like) this one from my heritage language:

                                                                                              In each language, the words for yes and no never change, regardless of which question they are answering.

                                                                                              (Context: the word for enjoy/like is the same in the language, so when translating to English, I choose whichever sounds most natural in each given example sentence.)

                                                                                              When someone says, “do you (enjoy/)like it?”, if you want to say “yes, I like it”, that’s fine, but if you want to say you don’t like it, you would say, “I don’t want it”; if you were to say, “I don’t like it” in that situation, it would mean, “I don’t want it”. The same reversal happens if they ask, “do you want it?”, and you want to respond in the negative.

                                                                                              So someone would say, “do you want a chocolate bar?”, and you’d say, “no, I don’t want it”, and that would mean, “no, (I already know) I don’t (usually/habitually) enjoy it (when I have it), (therefore I don’t want it)”, whereas, “no, I don’t enjoy it” would just straightforwardly mean, “I don’t want it”.

                                                                                              (You can also respond with just, “no!” instead of using a verb in the answer.)

                                                                                              This only happens in the habitual present form. Someone might ask, “do you like chocolate?” before they offer you some, and you can say, “no, I don’t want it”, but if they already gave you a chocolate bar to try, they may ask, “did you like it?” in the past tense, and you’d have to respond with, “I didn’t like it” instead of, “I didn’t want it”. And, “do you want chocolate?” would be met with, “no, I don’t like it”, but “did you want chocolate?” would be met with, “no, I didn’t want it”, and that second one would just mean what it straightforwardly sounds like in English.

                                                                                              (Strictly speaking, it doesn’t need to be a response to a question, I’m just putting it into a context to show that the verb used in the answer isn’t just a negative form of the same verb used in the question.)

                                                                                              (It’s hard to explain because if you were to translate this literalistically to English, it wouldn’t even be noticed, since saying, “no, I don’t like it” in response to, “do you want it?” is quite natural, but does literally just mean, “I don’t like it”, in the sense of, “no, (I already know) I don’t (usually/habitually) enjoy it (when I have it), (therefore I don’t want it)”. Even, “no, I don’t want it“ in response to, “do you like it?” is fairly natural in English, if a little presumptive-sounding.)

                                                                                              1. 4

                                                                                                In Polish, when someone asks you “Czy chcesz cukru do kawy?” (“Do you want sugar in your coffee?”), you can respond with “Dziękuję”, which can mean two opposite things: “Yes, please” or “No, thank you”.

                                                                                              2. 6

                                                                                                The original ones, like “…Names”, don’t; part of what I find fun about them is trying to think of counterexamples.

                                                                                                1. 6

                                                                                                  I think if you want them to be useful they need to include counterexamples. If it’s just a vent post then it’s fine to leave them.

                                                                                                  1. 4

                                                                                                    The first one gets a pass because it was the first one, and even then, I think it’s better to link people one of the many explainers people wrote about it.

                                                                                                2. 1

                                                                                                  Building Tangled with @op—a social git collaboration platform built on atproto, that’s designed to be decentralised since day one!

                                                                                                  Also, quit my job (today was my last day) to work on startups (exploring various ideas) for the next three months. Excited to see where I end up.

                                                                                                  1. 2

                                                                                                    What are your other startups?

                                                                                                  2. 7

                                                                                                    The biggest one is that Git is really difficult to use. If you think otherwise, you probably don’t understand it very well.

                                                                                                    No, git is deep and has a long and sometimes steep learning curve, but if you do understand it very well then it is not difficult to use.

                                                                                                    People make it look difficult to use by lacking the knowledge required to use it.

                                                                                                    I honestly think most people shouldn’t be using git, a simpler VCS with an easier learning curve is what I would rather see people use. Silly jokes like XKCD #1597 are a symptom of people who are not interested in learning git being forced to use git in situations where (due to the fact that nobody seems to know git) it clearly is overkill and isn’t being used even close to its full potential.

                                                                                                    And to respond to the referenced claim of: “Git is not a success story. Git is a failure as a system with a crap user experience that forces you to learn more about the tool you’re using than about getting your work done.”

                                                                                                    Git is successful for the people who made it to manage the task they needed it to manage. The UI could do with some improvement, but its complexity is mostly necessary for the task it’s for. It’s not “a failure” just because it’s over-complicated for your workflow and just because you don’t need most of the features it provides. Just like vim would not be a failure if someone insisted on replacing windows notepad.exe with it and then everyone who was previously happy with the simple and limited feature-set of notepad.exe started complaining.

                                                                                                    The real failure is in all the people who popularized git as a solution for all the 99% of use-cases where it’s extremely overkill or just mismatched.

                                                                                                    1. 9

                                                                                                      I agree about Git being easy to use if you understand it well.
                                                                                                      That’s exactly why they’re saying it’s a failure: the deep knowledge is necessary. Good tools don’t require understanding their inner workings to be used efficiently.
                                                                                                      You can see that as failure of UX or of abstraction, but a tool’s value is not dissociable from that UX so such a failure is a failure of the tool.

                                                                                                      TL;DR: you and I both have Stockholm syndrome* and enjoy using Git, but it’s not a tool design success.

                                                                                                      * the actual existence of the phenomenon is disputed, but you get what I mean

                                                                                                      1. 1

                                                                                                        Agreed. Git is the first version control system I’ve used that mostly just works and has all the features I need. (And yes I’ve used mercurial).

                                                                                                        Which I guess is why we’re currently entering a phase of many tools built on top of git.

                                                                                                      2. 5

                                                                                                        I am bothered by this llm tool taking its name directly from the generic concept.

                                                                                                        1. 9

                                                                                                          When I released the first version in April 2023 the acronym LLM was still pretty obscure.

                                                                                                          I was excited to find a three letter tool name that was still available on PyPI.

                                                                                                          1. 5

                                                                                                            I’m not gonna lie, the three letter name is a compelling reason.

                                                                                                            And I hope it’s coming across that I’m not ascribing any kinda intent to you. I think it makes sense as a name for the tool and I think that it might cause some who’s-on-first type shenanigans.

                                                                                                            1. 5

                                                                                                              This stops making sense if you take into consideration most classic UNIX utilities.

                                                                                                              I am bothered by this “sort” tool taking its name directly from the generic concept.

                                                                                                              Why can’t we have a de-facto utility for interacting with LLMs?

                                                                                                              1. 3

                                                                                                                You’re using that term incorrectly, this is a better example of “de jure,” in that a person has by decree seized ownership of a generic term. De facto would be like, I dunno, WinRAR being the de facto archive utility for Windows.

                                                                                                                Sort was released 53 years ago, we’ve learned lessons since then. This project takes ownership of a term used by millions (billions?) of people. At the time sort was written the author probably had the phone number of every possibly immediately impacted person.

                                                                                                                1. 2

                                                                                                                  At least with tools it’s your choice whether to introduce the tool into your PATH. But the phenomenon is a continual annoyance to me in package managers without mandatory namespacing. I don’t know if it’s hubris or ignorance that leads people to do this. But they do, so if you’re making a package repository, I beg of you, require namespacing.

                                                                                                                  Just yesterday I wanted a nice wrapper for bitfields in Rust and found the bitfield crate, which provides a somewhat inelegant macro, and doesn’t actually work with another crate that it claims to support. There is a bitfield-struct crate that is a bit better, possibly named that because the other one already took bitfield. But there just shouldn’t be a way for someone to squat on the name bitfield.

                                                                                                              2. 4

                                                                                                                I’m not bothered because it doesn’t strike me as some moneymaking land-grab. I’d rather the name be used by this agnostic tool than a particular company’s product.

                                                                                                                However I do have to disambiguate a little bit every time I talk about it, by calling it “simonw’s llm” or “the pypi package llm.” That’s cumbersome.

                                                                                                                1. 1

                                                                                                                  That confusion is my concern. I’m not calling anyone an asshole, I think this harm is unintentional but it is still present.

                                                                                                                  1. 4

                                                                                                                    I’m not sure I could call it harm on the basis of needing to say another word or two to speak clearly about something I find so helpful.

                                                                                                                    1. 2

                                                                                                                      It seems silly for me to say this in a discussion about words but: I don’t think you quibbling over the word harm is productive, and I do think it’s appropriate. This doesn’t just make it confusing to discuss the tool, it makes it confusing to NOT discuss the tool. The overload is present when discussing the concept, as well.

                                                                                                                      The confusion isn’t inherent to the utility, it’s a choice. It’s optional. You only have to say more words to speak clearly about it because it is unclear for no good reason. Thing coulda been called llmer, llmo, llimo (gets you there in style), whatever. Just some differentiating mouth noises.

                                                                                                              3. 43

                                                                                                                A good read, though it went in a different direction than I expected. I guessed the article was going to advocate for taking time to step away from a problem and use one’s own brain, which often happens when one relaxes their overworked gray matter while showering.

                                                                                                                1. 12

                                                                                                                  That is a genius way to conclude the post! Mind if I use that idea?

                                                                                                                  1. 4

                                                                                                                    Same. When people ask me about the value I find in LLMs beyond really simple cases, I usually find them about as useful an aid in solving problems as spending an extra minute thinking about it away from the computer.

                                                                                                                    1. 1

                                                                                                                      Similarly, I thought it would be about the qualities and shortcomings of both. Which I believe to be the first and most important question we should ask ourselves.

                                                                                                                      Everyone is just assuming AI can do a bunch of jobs, or at least provide a good enough solution. But I am realizing that oftentimes that is just an illusion of giving you something that kind of works. I am starting to see the pattern that the use of AI results in the expenditure of more man hours than something written by hand.

                                                                                                                      1. 1

                                                                                                                        I think it’s a question of learning how to use it and not use it. I’ve generally found that when I can quickly type a fairly complete description of the happy path of an application and get a fairly small program back, then usually it’s a good start.

                                                                                                                        It’s a good augmentation to what is sometimes called scripting.

                                                                                                                        1. 2

                                                                                                                          I didn’t mean myself so much. I can choose whether I want to use it or not, and when to use it. But I have no control over how other people work. As we speak, AI-generated code is being sneaked in as authored by whoever is committing it. Much of it is unchecked, with the developers not possessing the skill to write it themselves. I think we are just looking at the initial illusion of “free code” without addressing the very real risk that it actually ends up being pretty expensive. I personally have come across plenty of code that was ChatGPT-generated and was the source of all sorts of bugs.

                                                                                                                    2. 69

                                                                                                                      The majority of bugs (quantity, not quality/severity) we have are due to the stupid little corner cases in C that are totally gone in Rust. Things like simple overwrites of memory (not that rust can catch all of these by far), error path cleanups, forgetting to check error values, and use-after-free mistakes. That’s why I’m wanting to see Rust get into the kernel, these types of issues just go away, allowing developers and maintainers more time to focus on the REAL bugs that happen (i.e. logic issues, race conditions, etc.)

                                                                                                                      This is an extremely strong statement.

                                                                                                                      I think a few things are also interesting:

                                                                                                                      1. I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.

                                                                                                                      2. I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.

                                                                                                                      1. 35

                                                                                                                        I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.

                                                                                                                        The Hellwig/Ojeda part of the thread is just frustrating to read because it almost feels like pleading. “We went over this in private” “we discussed this already, why are you bringing it up again?” “Linus said (in private so there’s no record)”, etc., etc.

                                                                                                                        1. 45

                                                                                                                          Dragging discussions out in front of an audience is a pretty decent tactic for dealing with obstinate maintainers. They don’t like to explain their shoddy reasoning in front of people, and would prefer it remain hidden. It isn’t the first tool in the toolbelt but at a certain point there is no convincing people directly.

                                                                                                                          1. 31

                                                                                                                            Dragging discussions out in front of an audience is a pretty decent tactic for dealing with

                                                                                                                            With quite a few things, actually. A friend of mine is contributing to a non-profit, which until recently had this very toxic member (they even attempted a felony). They were driven out of the non-profit very soon after members talked in a thread that was accessible to all members. Obscurity is often one key component of abuse, be it mere stubbornness or criminal behaviour. Shine light, and it often goes away.

                                                                                                                            1. 13

                                                                                                                              IIRC Hintjens noted this quite explicitly as a tactic of bad actors in his works.

                                                                                                                              It’s amazing how quick people are to recognize folks trying to subvert an org piecemeal via one-off private conversations once everybody can compare notes. It’s equally amazing to see how much the same people beforehand will swear up and down “oh no, that’s a conspiracy theory, such things can’t happen here” until they’ve been burned at least once.

                                                                                                                              This is an active, unpatched attack vector in most communities.

                                                                                                                              1. 12

                                                                                                                                I’ve found the lowest example of this is even meeting minutes at work. I’ve observed that people tend to act more collaboratively and seek the common good if there are public minutes, as opposed to trying to “privately” win people over to their desires.

                                                                                                                            2. 5

                                                                                                                              There is something to be said for keeping things between people with skin in the game.

                                                                                                                              It’s flipped over here, though, because more people want to contribute. The question is whether it’ll be stable long-term.

                                                                                                                            3. 18

                                                                                                                              I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.

                                                                                                                              Something I’ve noticed is true in virtually everything I’ve looked deeply at is the majority of work is poor to mediocre and most people are not especially great at their jobs. So it wouldn’t surprise me if Linux is the same. (…and also wouldn’t surprise me if the wonderful Rust rewrite also ends up poor to mediocre.)

                                                                                                                              Yet at the same time, another thing that astonishes me is how much stuff actually does get done and how well things manage to work anyway. And Linux also does a lot and works pretty well. Mediocre over the years can end up pretty good.

                                                                                                                              1. 14

                                                                                                                                After tangentially following the kernel news, I think a lot of churning and death spiraling is happening. I would much rather have a rust-first kernel that isn’t crippled by the old guard of C developers reluctant to adopt new tech.

                                                                                                                                Take all of this energy into RedoxOS and let Linux stay in antiquity.

                                                                                                                                1. 36

                                                                                                                                  I’ve seen some of the R4L people talk on Mastodon, and they all seem to hate this argument.

                                                                                                                                  They want to contribute to Linux because they use it, want to use it, and want to improve the lives of everyone who uses it. The fact that it’s out there and deployed and not a toy is a huge part of the reason why they want to improve it.

                                                                                                                                  Hopping off into their own little projects which may or may not be useful to someone in 5-10 years’ time is not interesting to them. If it was, they’d already be working on Redox.

                                                                                                                                  1. 2

                                                                                                                                    The most effective thing that could happen is for the Linux Foundation, and Linus himself, to formally endorse and run a Rust-based kernel. They can adopt an existing one or make a concerted effort to replace large chunks of Linux’s C with Rust.

                                                                                                                                    IMO the Linux project needs to figure out something pretty quickly because it seems to be bleeding maintainers and Linus isn’t getting any younger.

                                                                                                                                    1. 0

                                                                                                                                      They may be misunderstanding the idea that others are not necessarily incentivized to do things just because it’s interesting for them (the Mastodon posters).

                                                                                                                                    2. 4

                                                                                                                                      Yep, I made a similar remark upthread. A Rust-first kernel would have a lot of benefits over Linux, assuming a competent group of maintainers.

                                                                                                                                      1. 4

                                                                                                                                        along similar lines: https://drewdevault.com/2024/08/30/2024-08-30-Rust-in-Linux-revisited.html

                                                                                                                                        Redox does have the chains of trying to do new OS things. An ABI-compatible Rust rewrite of the Linux kernel might get further along than expected (even if it only runs in virtual contexts at first, with hardware support coming later).

                                                                                                                                        1. 44

                                                                                                                                          Linux developers want to work on Linux, they don’t want to make a new OS. Linux is incredibly important, and companies already have Rust-only drivers for their hardware.

                                                                                                                                          Basically, sure, a new OS project would be neat, but it’s really just completely off topic in the sense that it’s not a solution for Rust for Linux. Because the “Linux” part in that matters.

                                                                                                                                          1. 19

                                                                                                                                            I read a 25+ year old article [1] by Joel Spolsky, about Netscape’s rewrite, that I think applies in part:

                                                                                                                                            The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed. There’s nothing wrong with it. It doesn’t acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that’s kind of gross if it’s not made out of all new material?

                                                                                                                                            Adopting a “rust-first” kernel is throwing the baby out with the bathwater. Linux has been beaten into submission for over 30 years for a reason. It’s the largest collaborative project in human history and over 30 million lines of code. Throwing it out and starting new would be an absolutely herculean effort that would likely take years, if it ever got off the ground.

                                                                                                                                            [1] https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/

                                                                                                                                            1. 33

                                                                                                                                              The idea that old code is better than new code is patently absurd. Old code has stagnated. It was built using substandard, out of date methodologies. No one remembers what’s a bug and what’s a feature, and everyone is too scared to fix anything because of it. It doesn’t acquire new bugs because no one is willing to work on that weird ass bespoke shit you did with your C preprocessor. Au contraire, baby! Is software supposed to never learn? Are we never to adopt new tools? Can we never look at something we’ve built in an old way and wonder if new methodologies would produce something better?

                                                                                                                                              This is what it looks like to say nothing, to beg the question. Numerous empirical claims, where is the justification?

                                                                                                                                              It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?

                                                                                                                                              1. 16

                                                                                                                                                Like most things in life, the truth is somewhere in the middle. There is a reason the semiconductor industry has the concept of a “mature node.” They accept that new is needed for each node, but also that the new thing takes time to iron out the kinks and bugs. This is the primary reason why you see Apple take on new nodes before Nvidia, for example: Nvidia requires much larger die sizes, and so needs fewer defects per square mm.

                                                                                                                                                You can see this sometimes in software, for example X11 vs Wayland, where adoption is slow but most definitely progressing, and nowadays most people can see that Wayland is now, or is going to become, the dominant tech in the space.

                                                                                                                                                1. 16

                                                                                                                                                  The truth lies where it lies. Maybe the middle, maybe elsewhere. I just don’t think we’ll get to the truth with rhetoric.

                                                                                                                                                    1. 7

                                                                                                                                                      I don’t think this would qualify as dialectic; it lacks any internal debate and it leans heavily on appeals by analogy and intuition/emotion. The post itself makes a ton of empirical claims without justification even beyond the quoted bit.

                                                                                                                                                2. 15

                                                                                                                                                  “Good” is subjective, but there is real evidence that older code does contain fewer vulnerabilities: https://www.usenix.org/conference/usenixsecurity22/presentation/alexopoulos

                                                                                                                                                  That means we can probably keep a lot of the old trusty Linux code around while making more of the new code safe by writing it in Rust in the first place.

                                                                                                                                                  1. 10

                                                                                                                                                    I don’t think that’s a fair assessment of Spolsky’s argument or of CursedSilicon’s application of it to the Linux kernel.

                                                                                                                                                    Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in than new code (and that the older code is, the less likely it is to be buggy).

                                                                                                                                                    Secondly, this discussion is mainly around entire codebases, not just existing code. Codebases usually have an entire infrastructure around them for verifying that the behaviour of the codebase has not changed. This is often made up of tests, but it’s also made up of the users who try out a release of a codebase and determine whether it’s working for them. The difference between making a change to an existing codebase and releasing a new project largely comes down to whether this verification (both in terms of automated tests and in terms of users’ ability to use the new release) works for the new code.

                                                                                                                                                    Given this difference, if I want to (say) write a new OS completely in Rust, I need to choose: Do I want to make it completely compatible with Linux, and therefore take on the significant challenge of making sure everything behaves truly the same? Or do I make significant breaking changes, write my own OS, and therefore force potential adopters to rebuild their entire Linux workflows in my new OS?

                                                                                                                                                    The point is not that either of these options is bad, it is that they represent significant risks to a project. Added to the general risk that is writing new code, this produces a total level of risk that might be considered the baseline risk of doing a rewrite. Now risk is not bad per se! If the benefits of being able to write an OS in a language like Rust outweigh the potential risks, then it still makes sense to perform the rewrite. Or maybe the existing Linux kernel is so difficult to maintain that a new codebase really would be the better option. But the point that CursedSilicon was making by linking the Spolsky piece was, I believe, that the risks for a project like the Linux kernel are very high. There is a lot of existing, old code. And there is a very large ecosystem where either breaking or maintaining compatibility would each come with significant challenges.

                                                                                                                                                    Unfortunately, it’s very difficult to measure the risks and benefits here in a quantitative, comparable way, so I think where you fall on the “rewrite vs continuity” spectrum will depend mostly on what sort of examples you’ve seen, and how close you think this case is to those examples. I don’t think there’s any objective way to say whether it makes more sense to have something like R4L, or something like RedoxOS.

                                                                                                                                                    1. 7

                                                                                                                                                      Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in than new code (and that the older code is, the less likely it is to be buggy).

                                                                                                                                                      I haven’t read it yet, but I haven’t made an argument about that, I just created a parody of the argument as presented. I’ll be candid, I doubt that the research is going to compel me to believe that newer code is inherently buggier; it may compel me to confirm my existing belief that testing software in the field is one good method to find some classes of bugs.

                                                                                                                                                      Secondly, this discussion is mainly around entire codebases, not just existing code.

                                                                                                                                                      I guess so, it’s a bit dependent on where we say the discussion starts. Three things are relevant: RFL, which is not a wholesale rewrite; a wholesale rewrite of the Linux kernel; and Netscape. RFL is not about replacing the entire Linux kernel, although perhaps “codebase” here refers to some sort of unit, like a driver. Netscape wanted a wholesale rewrite, based on the linked post, so perhaps that’s what’s really “the single worst strategic mistake that any software company can make”, but I wonder what the boundary here is? Also, the article immediately mentions that Microsoft tried to do this with Word but it failed, but that Word didn’t suffer from this because it was still actively developed. I wonder if it really “failed” just because Pyramid didn’t become the new Word? Did Microsoft have some lessons learned, or incorporate some of that code? Dunno.

                                                                                                                                                      I think I’m really entirely justified when I say that the post is entirely emotional/ intuitive appeals, rhetoric, and that it makes empirical claims without justification.

                                                                                                                                                      There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:

                                                                                                                                                      This is rhetoric. These are unsubstantiated empirical claims. The article is all of this. It’s fine as an interesting, thought provoking read that gets to the root of our intuitions, but I think anyone can dismiss it pretty easily since it doesn’t really provide much in the form of an argument.

                                                                                                                                                      It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time.

                                                                                                                                                      Again, totally unsubstantiated. I have MANY reasons to believe that, it is simply question begging to say otherwise.

                                                                                                                                                      That’s all this post is. Over and over again making empirical claims with no evidence and question begging.

                                                                                                                                                      We can discuss the risks and benefits, I’d advocate for that. This article posted doesn’t advocate for that. It’s rhetoric.

                                                                                                                                                      1. 11

                                                                                                                                                        existing code has fewer bugs in than new code (and that the older code is, the less likely it is to be buggy).

                                                                                                                                                        This is a truism. It is survival bias. If the code was buggy, it would eventually be found and fixed. So all things being equal newer code is riskier than old code. But it’s also been empirically shown that using Rust for new code is not “all things being equal”. Google showed that new code in Rust is as reliable as old code in C. Which is good news: you can use old C code from new Rust projects without the risk that comes from new C code.

                                                                                                                                                        1. 5

                                                                                                                                                          But it’s also been empirically shown that using Rust for new code is not “all things being equal”.

                                                                                                                                                          Yeah, this is what I’ve been saying (not sure if you’d meant to respond to me or the parent, since we agree) - the issue isn’t “new” vs “old” it’s things like “reviewed vs unreviewed” or “released vs unreleased” or “tested well vs not tested well” or “class of bugs is trivial to express vs class of bugs is difficult to express” etc.

                                                                                                                                                          1. 2

                                                                                                                                                            I don’t disagree that the rewards can outweigh the risks, and in this case I think there’s a lot of evidence that suggests that memory safety as a default is really important for all sorts of reasons. Let alone the many other PL developments that make Rust a much more suitable language to develop in than C.

                                                                                                                                                            That doesn’t mean the risks don’t exist, though.

                                                                                                                                                      2. 4

                                                                                                                                                        It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?

                                                                                                                                                        Nobody would call an old codebase with a handful of fixes a new codebase, at least not in the contexts in which those terms have been used here.

                                                                                                                                                          1. 6

                                                                                                                                                            It’s a Ship of Theseus: at no point can you call it a “new” codebase, but after a period of time, it could be completely different code. I have a C program I’ve been using and modifying for 25 years. At any given point, it would have been hard to say “this is now a new codebase”, yet not one line of code in the project is the same as when I started (even though it does the same thing as it always has).

                                                                                                                                                            1. 4

                                                                                                                                                              I don’t see the point in your question. It’s going to depend on the codebase, and on the nature of the changes; it’s going to be nuanced, and subjective at least to some degree. But the fact that it’s prone to subjectivity doesn’t mean that you get to call an old codebase with a single fixed bug a new codebase, without some heavy qualification which was lacking.

                                                                                                                                                              1. 1

                                                                                                                                                                If it requires all of that nuance and context maybe the issue isn’t what’s “old” and what’s “new”.

                                                                                                                                                                  1. 4

                                                                                                                                                                    What’s old and new is poorly defined and yet there’s an argument being made that “old” and “new” are good indicators of something. If they’re so poorly defined that we have to bring in all sorts of additional context like the nature of the changes, not just when they happened or the number of lines changed, etc, then it seems to me that we would be just as well served to throw away the “old” and “new” and focus on that context.

                                                                                                                                                                    1. 2

                                                                                                                                                                      I feel like enough people would agree more-or-less on what was an “old” or “new” codebase (i.e. they would agree given particular context) that they remain useful terms in a discussion. The general context used here is apparent (at least to me) given by the discussion so far: an older codebase has been around for a while, has been maintained, has had kinks ironed out.

                                                                                                                                                                      1. 3

                                                                                                                                                                        There’s a really important distinction here though. The point is to argue that new projects will be less stable than old ones, but you’re intuitively (and correctly) bringing in far more important context - maintenance, testing, battle testing, etc. If a new implementation has a higher degree of those properties then it being “new” stops being relevant.

                                                                                                                                                                        1. 2

                                                                                                                                                                          Ok, but:

                                                                                                                                                                          It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?

                                                                                                                                                                          My point was that this statement requires a definition of “new codebase” that nobody would agree with, at least in the context of the discussion we’re in. Maybe you are attacking the base proposition without applying the surrounding context, which might be valid if this were a formal argument and not a free-for-all discussion.

                                                                                                                                                                          If a new implementation has a higher degree of those properties

                                                                                                                                                                          I think that it would be considered no longer new if it had had significant battle-testing, for example.

                                                                                                                                                                          FWIW the important thing in my view is that every new codebase is a potential old codebase (given time and care), and a rewrite necessarily involves a step backwards. The question should probably not be, which is immediately better?, but, which is better in the longer term (and by how much)? However your point that “new codebase” is not automatically worse is certainly valid. There are other factors than age and “time in the field” that determine quality.

                                                                                                                                                          2. 1

                                                                                                                                                            Methodologies don’t matter for quality of code. They could be useful for estimates, cost control, figuring out whom you shall fire etc. But not for the quality of code.

                                                                                                                                                            1. 4

                                                                                                                                                              You’re suggesting that the way you approach programming has no bearing on the quality of the produced program?

                                                                                                                                                              1. 3

                                                                                                                                                                I’ve never observed a programmer become better or worse by switching methodology. Dijkstra would not have become better if you had made him do daily standups or go through code reviews.

                                                                                                                                                                There are ways to improve your programming by choosing a different approach, but these are very individual. Methodology is mostly a beancounting tool.

                                                                                                                                                                1. 3

                                                                                                                                                                  When I say “methodology” I’m speaking very broadly - simply “the approach one takes”. This isn’t necessarily saying that any methodology is better than any other. The way I approach a task today is better, I think, than the way that I would have approached that task a decade ago - my methodology has changed, the way I think has changed. Perhaps that might mean I write more tests, or I test earlier, but it may mean exactly the opposite, and my methods may only work best for me.

                                                                                                                                                                  I’m not advocating for “process” or ubiquity, only that the approach one takes may improve over time, which I suspect we would agree on.

                                                                                                                                                          3. 28

                                                                                                                                                            If you take this logic to its end, you should never create new things.

                                                                                                                                                            At one point in time, Linux was also the new kid on the block.

                                                                                                                                                            The best time to plant a tree is 30 years ago. The second best time is now.

                                                                                                                                                            1. 7

                                                                                                                                                              I read a 25+ year old article [1] from a former Netscape developer that I think applies in part

                                                                                                                                                              I don’t think Joel Spolsky was ever a Netscape developer. He was a Microsoft developer who worked on Excel.

                                                                                                                                                              1. 2

                                                                                                                                                                My mistake! The article contained a bit about Netscape and I misremembered it

                                                                                                                                                              2. 5

                                                                                                                                                                It’s the largest collaborative project in human history and over 30 million lines of code.

                                                                                                                                                                How many of those lines are part of the core? My understanding was that the overwhelming majority was driver code. There may not be that much core subsystem code to rewrite.

                                                                                                                                                                1. 5

                                                                                                                                                                  For a previous project, we included a minimal Linux build. It was around 300 KLoC, which included networking and the storage stack, along with virtio drivers.

                                                                                                                                                                  That’s around the size a single person could manage and quite easy with a motivated team.

                                                                                                                                                                  If you started with DPDK and SPDK then you’d already have filesystems and a copy of the FreeBSD network stack to run in isolated environments.

                                                                                                                                                                  1. 2

                                                                                                                                                                    Once many drivers share common rust wrappers over core subsystems, you could flip it and write the subsystem in Rust. Then expose C interface for the rest.

                                                                                                                                                                    1. 3

                                                                                                                                                                      Oh sure, that would be my plan as well. And I bet some subsystem maintainers see this coming, and resist it for reasons that aren’t entirely selfless.

                                                                                                                                                                      1. 3

                                                                                                                                                                        That’s pretty far into the future, both from a maintainer acceptance PoV and from a rustc_codegen_gcc and/or gccrs maturity PoV.

                                                                                                                                                                        1. 4

                                                                                                                                                                          Sure. But I doubt I’ll be running a different kernel 10y from now.

                                                                                                                                                                          And like us, those maintainers are not getting any younger, and if they need a hand, I am confident I’ll get into it faster with a strict type checker.

                                                                                                                                                                          I am also confident nobody in our office would be able to help out with C at all.

                                                                                                                                                                    2. 4

                                                                                                                                                                      It’s the largest collaborative project in human history

                                                                                                                                                                      This cannot possibly be true.

                                                                                                                                                                      1. 5

                                                                                                                                                                        It’s the largest collaborative project in human history

                                                                                                                                                                        It’s the largest collaborative open source OS kernel project in human history

                                                                                                                                                                        1. 4

                                                                                                                                                                          It’s been described as such based purely on the number of unique human contributions to it

                                                                                                                                                                      2. 7

                                                                                                                                                                        I see that Drew proposes a new OS in that linked article, but I think a better proposal in the same vein is a fork. You get to keep Linux, but you can start porting logic to Rust unimpeded, and it’s a manageable amount of work to keep porting upstream changes.

                                                                                                                                                                        Remember when libav forked from ffmpeg? Michael Niedermayer single-handedly ported every single libav commit back into ffmpeg, and eventually, ffmpeg won.

                                                                                                                                                                        At first there will be an extremely high C percentage and a low Rust percentage, so porting is trivial: just git merge and there will be no conflicts. As the fork ports more and more C code to Rust, however, you start to have to do porting work by inspecting the C code and determining whether the fixes apply to the corresponding Rust code. By that point, though, you should start seeing productivity gains, community gains, and feature gains from using a better language than C. At this point the community growth should be able to keep up with the extra porting work required. And this is when distros will start sniffing around, at first offering variants of the distro that use the forked kernel, and if they like what they taste, they might even drop the original.

                                                                                                                                                                        I genuinely think it’s a strong idea, given the momentum and potential amount of labor the Rust community has at its disposal.

                                                                                                                                                                        I think the competition would be great, especially in the domain of making it more contributor friendly to improve the kernel(s) that we use daily.

                                                                                                                                                                        1. 15

                                                                                                                                                                          I certainly don’t think this is impossible, for sure. But the point ultimately still stands: Linux kernel devs don’t want a fork. They want Linux. These folks aren’t interested in competing, they’re interested in making the project they work on better. We’ll see if some others choose the fork route, but it’s still ultimately not the point of this project.

                                                                                                                                                                        2. 5

                                                                                                                                                                          Linux developers want to work on Linux, they don’t want to make a new OS.

                                                                                                                                                                          While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux. Most of the time I strive for portability, and so abstract myself from the OS whenever I can get away with it. And when I can’t, I have to say Linux’s API isn’t always that great, compared to what the BSDs have to offer (epoll vs kqueue comes to mind). Most annoying though is the lack of documentation for the less used APIs: I’ve recently worked with Netlink sockets, and for the proc stuff so far the best documentation I found was the freaking source code of a third party monitoring program.

                                                                                                                                                                          I was shocked. Complete documentation of the public API is the minimum bar for a project as serious of the Linux kernel. I can live with an API I don’t like, but lack of documentation is a deal breaker.

                                                                                                                                                                          1. 10

                                                                                                                                                                            While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux.

                                                                                                                                                                            I think they mean that Linux kernel devs want to work on the Linux kernel. Most (all?) R4L devs are long time Linux kernel devs. Though, maybe some of the people resigning over LKML toxicity will go work on Redox or something…

                                                                                                                                                                            1. 5

                                                                                                                                                                              I’m talking about the people who develop the Linux kernel, not people who write userland programs for Linux.

                                                                                                                                                                          2. 2

                                                                                                                                                                            Re-Implementing the kernel ABI would be a ton of work for little gain if all they wanted was to upstream all the work on new hardware drivers that is already done - and then eventually start re-implementing bits that need to be revised anyway.

                                                                                                                                                                        3. 3

                                                                                                                                                                          If the singular required Rust toolchain didn’t feel like such a ridiculous to bootstrap 500 ton LLVM clown car I would agree with this statement without reservation.

                                                                                                                                                                            1. 4

                                                                                                                                                                              Zig is easier to implement (and I personally like it as a language) but doesn’t have the same safety guarantees and strong type system that Rust does. It’s a give and take. I actually really like Rust and would like to see a proliferation of toolchain options, such as what’s in progress in GCC land. Overall, it would just be really nice to have an easily bootstrapped toolchain that a normal person can compile from scratch locally, although I don’t think it necessarily needs to be the default, or that using LLVM generally is an issue. However, it might be possible that no matter how you architect it, Rust might just be complicated enough that any sufficiently useful toolchain for the language could just end up being a 500 ton clown car of some kind anyways.

                                                                                                                                                                              1. 2

                                                                                                                                                                                Depends on which parts of GP’s statement you care about: LLVM or bootstrap. Zig still depends on LLVM (for now), but it is no longer bootstrappable in a limited number of steps (because they switched from a bootstrap C++ implementation of the compiler to keeping a compressed WASM build of the compiler as a blob).

                                                                                                                                                                                1. 2

                                                                                                                                                                                  Yep, although I would also add it’s unfair to judge Zig in any case on this matter now given it’s such a young project that clearly is going to evolve a lot before the dust begins to settle (Rust is also young, but not nearly as young as Zig). In ten to twenty years, so long as we’re all still typing away on our keyboards, we might have a dozen Zig 1.0 and a half dozen Zig 2.0 implementations!

                                                                                                                                                                          1. 6

                                                                                                                                                                            Yeah, the absurdly low code quality and toxic environment make me think that Linux is ripe for disruption. Not like anyone can produce a production kernel overnight, but maybe a few years of sustained work might see a functional, production-ready Rust kernel for some niche applications and from there it could be expanded gradually. While it would have a lot of catching up to do with respect to Linux, I would expect it to mature much faster because of Rust, because of a lack of cruft/backwards-compatibility promises, and most importantly because it could avoid the pointless drama and toxicity that burn people out and prevent people from contributing in the first place.

                                                                                                                                                                            1. 14

                                                                                                                                                                              the absurdly low code quality

                                                                                                                                                                              What is this, some kind of new meme? Where did you hear it first?

                                                                                                                                                                              1. 22

                                                                                                                                                                                From the thread in OP, if you expand the messages, there is wide agreement among the maintainers that all sorts of really badly designed and almost impossible to use (safely) APIs ended up in the kernel over the years because the developers were inexperienced and kind of learning kernel development as they went. In retrospect they would have designed many of the APIs very differently.

                                                                                                                                                                                1. 4

                                                                                                                                                                                  Someone should compile everything to help future OS developers avoid those traps! There are a lot of exieting non-posix experiments though.

                                                                                                                                                                                2. 14

                                                                                                                                                                                  It’s based on my forays into the Linux kernel source code. I don’t doubt there’s some quality code lurking around somewhere, but the stuff I’ve come across (largely filesystem and filesystem adjacent) is baffling.

                                                                                                                                                                                  1. 7

                                                                                                                                                                                    Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry, if nothing else taught me how online discussions are a huge game of Chinese whispers where most participants don’t have a clue of what they are talking about.

                                                                                                                                                                                    1. 15

                                                                                                                                                                                      I doubt that maintainers are “only caring about their job security and keeping code bad” but with all due respect: you’re also just taking arguments out of thin air right now. What I do believe is what we have seen: pretty toxic responses from some people and a whole lot of issues trying to move forward.

                                                                                                                                                                                      1. 8

                                                                                                                                                                                        Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry

                                                                                                                                                                                        Huh, I’m not seeing any claim to this end from the GP, or did I not look hard enough? At face value, saying that something has an “absurdly low code quality” does not imply anything about nefarious motives.

                                                                                                                                                                                        1. 10
                                                                                                                                                                                          1. 7

                                                                                                                                                                                            Indeed that remark wasn’t directly referring to GP’s comment, but rather to the range of confidently incorrect comments that I read in the previous episodes, and to the “gatekeeping greybeards” theme that can be seen elsewhere on this page. First occurrence, found just by searching for “old”: Linux is apparently “crippled by the old guard of C developers reluctant to adopt new tech”, to which GP replied in agreement in fact. Another one, maintainers don’t want to “do the hard work”.

                                                                                                                                                                                            Still, in GP’s case the Chinese whispers have reduced “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” to “absurdly low quality”. To which I ask: what is more likely? 1) That 30 million lines of code contain various levels of technical debt of which maintainers are aware, and that said maintainers are worried even about code where the technical debt is real but not causing substantial issues in practice? Or 2) that a piece of software gets to run on literally billions of devices of all sizes and prices just because it’s free and in spite of its “absurdly low quality”?

                                                                                                                                                                                            Linux is not perfect, neither technically nor socially. But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.

                                                                                                                                                                                            1. 11

                                                                                                                                                                                              GP here: I probably should have said “shockingly” rather than “absurdly”. I didn’t really expect to get lawyered over that one word, but yeah, the idea was that for a software that runs on billions of devices, the code quality is shockingly low.

                                                                                                                                                                                              Of course, this is plainly subjective. If your code quality standards are a lot lower than mine then you might disagree with my assessment.

                                                                                                                                                                                              That said, I suspect adoption is a poor proxy for code quality. Internet Explorer was widely adopted and yet it’s broadly understood to have been poorly written.

                                                                                                                                                                                              But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face

                                                                                                                                                                                              I’m sure self-righteousness could get you to the same place, but in my case I arrived by way of experience. You can relax, I wasn’t attacking Linux—I like Linux—it just has a lot of opportunity for improvement.

                                                                                                                                                                                              1. 5

                                                                                                                                                                                                I guess I’ve seen the internals of too much proprietary software now to be shocked by anything about Linux per se. I might even argue that the quality of Linux is surprisingly good, considering its origins and development model.

                                                                                                                                                                                                I think I’d lawyer you a tiny bit differently: some of the bugs in the kernel shock me when I consider how many devices run that code and fulfill their purposes despite those bugs.

                                                                                                                                                                                                1. 7

                                                                                                                                                                                                  FWIW, I was not making a dig at open source software, and yes plenty of corporate software is worse. I guess my expectations for Linux are higher because of how often it is touted as exemplary in some form or another. I don’t even dislike Linux, I think it’s the best thing out there for a huge swath of use cases—I just see some pretty big opportunities for improvement.

                                                                                                                                                                                              2. 4

                                                                                                                                                                                                But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.

                                                                                                                                                                                                Or actual benchmarks: the performance the Linux kernel leaves on the table in some cases is absurd. And sure it’s just one example, but I wouldn’t be surprised if it was representative of a good portion of the kernel.

                                                                                                                                                                                                1. 3

                                                                                                                                                                                                  absurdly low quality

                                                                                                                                                                                                  Well not quite but still “considered broken beyond repair by many people related to life time management” - which is definitely worse than “hard to formalize” when “the way ever[y]body does it” seems to vary between each user.

                                                                                                                                                                                                  1. 4

                                                                                                                                                                                                    I love Rust but still, we’re talking of a language which (for good reasons!) considers doubly linked lists unsafe. Take an API that gets a 4 on Rusty Russell’s API design scale (“Follow common convention and you’ll get it right”), but which was designed for a completely different programming language if not paradigm, and it’s not surprising that it can’t easily be transformed into a 9 (“The compiler/linker won’t let you get it wrong”). But at the same time there are a dozen ways in which, according to the same scale, things could actually be worse!

                                                                                                                                                                                                    What I dislike is that people are seeing “awareness of complexity” and the message they spread is “absurdly low quality”.

                                                                                                                                                                                                    1. 13

                                                                                                                                                                                                      Note that doubly linked lists are not a special case at all in Rust. All the other common data structures like Vec, HashMap etc. also need unsafe code in their implementation.

                                                                                                                                                                                                      Implementing these datastructures in Rust, and writing unsafe code in general, is indeed roughly a 4. But these are all already implemented in the standard library, with an API that actually is at a 9. And std::collections::LinkedList is constructive proof that you can have a safe Rust abstraction for doubly linked lists.

                                                                                                                                                                                                      Yes, the implementation could have bugs, thus making the abstraction leaky. But that’s the case for literally everything, down to the hardware that your code runs on.
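To make the point about std::collections::LinkedList concrete, here is a minimal sketch (an illustration added here, not code from the comment above). It uses only the stable, safe API; the helper names `build_list` and `forward_and_backward` are made up for the example. The caller never writes `unsafe`, even though the standard library's implementation does internally.

```rust
use std::collections::LinkedList;

/// Builds a small doubly linked list using only the safe, stable API.
/// (`build_list` is a hypothetical helper name for this sketch.)
fn build_list() -> LinkedList<u32> {
    let mut list = LinkedList::new();
    list.push_back(2);
    list.push_back(3);
    list.push_front(1); // insertion at the head, no pointer juggling by the caller
    list
}

/// Walks the list in both directions. Because the list is doubly linked,
/// its iterator can also be reversed.
fn forward_and_backward(list: &LinkedList<u32>) -> (Vec<u32>, Vec<u32>) {
    let forward: Vec<u32> = list.iter().copied().collect();
    let backward: Vec<u32> = list.iter().rev().copied().collect();
    (forward, backward)
}
```

Calling `forward_and_backward(&build_list())` yields the elements in insertion order and in reverse. The borrow checker rules out dangling node references at compile time, which is roughly what a 9 on the API design scale means in practice.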

                                                                                                                                                                                                      1. 4

                                                                                                                                                                                                        You’re absolutely right that you can build abstractions with enough effort.

                                                                                                                                                                                                        My point is that if a doubly linked list is (again, for good reasons) hard to make into a 9, a 20-year-old API may very well be even harder. In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition. That’s the conundrum that maintainers face and, if they realize that, it’s a good thing. I would be scared if maintainers handwaved that away.

                                                                                                                                                                                                        Yes, the implementation could have bugs, thus making the abstraction leaky.

                                                                                                                                                                                                        Bugs happen, but if the abstraction is downright wrong then that’s something I wouldn’t underestimate. A lot of the appeal of Rust in Linux lies exactly in documenting/formalizing these unwritten rules, and wrong documentation can be worse than no documentation (cue the negative parts of the API design scale!); even more so if your documentation is a formal model like a set of Rust types and functions.

                                                                                                                                                                                                        That said, the same thing can happen in a Rust-first kernel, which will also have a lot of unsafe code. And it would be much harder to fix it in a Rust-first kernel, than in Linux at a time when it’s just feeling the waters.

                                                                                                                                                                                                        1. 7

                                                                                                                                                                                                          In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition.

                                                                                                                                                                                                          At the same time, it was included almost as like, half a joke, and nobody uses it, so there’s not a lot of pressure to actually finish off the cursor API.

                                                                                                                                                                                                          It’s also not the kind of linked list the kernel would use, as they’d want an intrusive one.

                                                                                                                                                                                                      2. 12

                                                                                                                                                                                                        And yet, safe to use doubly linked lists written in Rust exist. That the implementation needs unsafe is not a real problem. That’s how we should look at wrapping C code in safe Rust abstractions.
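As a rough sketch of what "wrapping in a safe abstraction" means (an illustration added here, not kernel code): `raw_sum` below is a hypothetical stand-in for a C-style function whose contract exists only in documentation, and `sum` is the safe wrapper that encodes that contract in the type system. Real kernel bindings would go through FFI and are far more involved.

```rust
/// Hypothetical stand-in for a C-style API. The caller must guarantee that
/// `ptr` points to `len` readable, initialized `u32` values; the compiler
/// cannot check this, so the function is marked `unsafe`.
unsafe fn raw_sum(ptr: *const u32, len: usize) -> u64 {
    let mut total = 0u64;
    for i in 0..len {
        // SAFETY: the caller promised that `ptr` is valid for `len` reads.
        let value = unsafe { *ptr.add(i) };
        total += u64::from(value);
    }
    total
}

/// Safe wrapper: a slice already guarantees a valid pointer/length pair,
/// so callers of `sum` cannot violate the contract of `raw_sum`.
fn sum(values: &[u32]) -> u64 {
    // SAFETY: `values.as_ptr()` is valid for exactly `values.len()` reads.
    unsafe { raw_sum(values.as_ptr(), values.len()) }
}
```

The `unsafe` block is confined to one audited spot; every caller of `sum` gets the guarantee for free. The hard part discussed in this thread is that this only works when the underlying contract can be stated precisely in the first place.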

                                                                                                                                                                                                        1. 3

                                                                                                                                                                                                          The whole comment you replied to, after the one sentence about linked lists, is about abstractions. And abstractions are rarely going to be easy, and sometimes could be hardly possible.

                                                                                                                                                                                                          That’s just a fact. Confusing this fact for something as hyperbolic as “absurdly low quality” is stunning example of the Dunning Kruger effect, and frankly insulting as well.

                                                                                                                                                                                                          1. 9

                                                                                                                                                                                                            I personally would call Linux low quality because many parts of it are buggy as sin. My GPU stops working properly literally every other time I upgrade Linux.

                                                                                                                                                                                                            No one is saying that Linux is low quality because it’s hard or impossible to abstract some subsystems in Rust, they’re saying it’s low quality because a lot of it barely works! I would say that your “Chinese whispers” misrepresents the situation and what people here are actually saying. “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” doesn’t apply if no one can tell you how to use an API, and everyone does it differently.

                                                                                                                                                                                                            1. 3

                                                                                                                                                                                                              I agree, Linux is the worst of all kernels.

                                                                                                                                                                                                              Except for all the others.

                                                                                                                                                                                                              1. 9

                                                                                                                                                                                                                Actually, the NT kernel of all things seems to have a pretty good reputation, and I wouldn’t dismiss the BSD kernels out of hand. I don’t know which kernel is better, but it seems you do. If you could explain how you came to this conclusion that would be most helpful.

                                                                                                                                                                                                                1. 10

                                                                                                                                                                                                                  NT gets a bad rap because of the OS on top of it, not because it’s actually bad. NT itself is a very well-designed kernel.

                                                                                                                                                                                                                  1. 3

                                                                                                                                                                                                                    *nod* I haven’t been a Windows person since shortly after the release of Windows XP (i.e. the first online activation DRM’d Windows) but, whenever I see glimpses of what’s going on inside the NT kernel in places like Project Zero: The Definitive Guide on Win32 to NT Path Conversion, it really makes me want to know more.

                                                                                                                                                                                            2. -1

                                                                                                                                                                                              how low quality the Linux kernel code is

                                                                                                                                                                                              Somewhere else it was mentioned that most developers in the kernel could just not be bothered with checking for basic things.

                                                                                                                                                                                              how much burnout and misery is involved

                                                                                                                                                                                              Nobody is forcing any of these people to do this.

                                                                                                                                                                                            3. 4

                                                                                                                                                                                              What causes bloat

                                                                                                                                                                                              Generally: A lack of vigilance and discipline

                                                                                                                                                                                              Too many reasons, but will discuss a few here: • Features • Layering Dependencies • Open source development

                                                                                                                                                                                              To which should be added hot jazz and short skirts.

                                                                                                                                                                                              Features in this case is really a nod to the fact that software does hugely, vastly more than it did when Rob was using the supercomputer at the university of Toronto.

                                                                                                                                                                                              Layering dependencies and open source are really the triumph of reusable software which allows us to build software from high level components rather than writing our own regex libraries. This and Moore’s Law (rip) is what allows us to build the hugely featureful software of today.

                                                                                                                                                                                              Those of us old enough to have been reading academic writing about software engineering in the 90s will remember the great anticipation of “component software” and “COTS”. What we have today is the worse-is-better version of that, massively tilting the time spent from boilerplate to actual features.

                                                                                                                                                                                              Nevertheless, software is often surprisingly slow. I’d like to propose a series of factors that don’t involve any finger wagging or shaking of fists. I suggest the factors are, in order of importance:

                                                                                                                                                                                              1. He who pays the piper calls the tune
                                                                                                                                                                                              2. Almost no-one whose opinion matters gives even one shit about performance or latency
                                                                                                                                                                                              3. We don’t have a developed way of talking about performance
                                                                                                                                                                                              4. Reasoning about performance is hard, and we can barely be bothered with correctness.

                                                                                                                                                                                              Generally speaking, users want features until a product is unusably slow, product managers want to talk about features, and sales people want to talk about features. UX designers want to draw pretty pictures. Even most engineers don’t actually care that much about performance. Those of us who see the absolutely transformative potential of orders of magnitude performance improvements are in the minority. Frankly, if that comes at a cost of features, there had better be a good story for how this sustainably unlocks a product people want to buy more, with a heuristic for eliminating the worst, e.g., 20% of features. And it would come at the cost of features, because maintaining performance is another stream of constant effort.

                                                                                                                                                                                              Meanwhile as an industry we have much more developed techniques for ensuring correctness (or catching obvious defects), but we generally aren’t that invested in those either, because again mostly people want software that does stuff most of the time.

                                                                                                                                                                                              For anyone who cares about performance, we need to stop finger wagging and develop ways of talking about performance, and measuring performance in ways that engineers and business people understand, to create products that people actually want because being faster makes it obviously better. Actually delivering that will require systems whose performance characteristics can be reasoned about, but we’re not even there in terms of anyone caring to do so.

                                                                                                                                                                                              1. 2

                                                                                                                                                                                                This touches upon a couple of things that have been going in my head for some time.

                                                                                                                                                                                                I have some complementary ideas:

                                                                                                                                                                                                • Note that Microsoft wrote both VS Code and Teams. VS Code might not be as fast as other editors, but it’s IMHO very good. I’d go as far as saying that it’s likely one of the most beloved Electron apps out there. OTOH, Teams is a laggy shitshow. Observe that VS Code has massive competition, whereas Teams is bundled with a larger product and often chosen by the higher-ups, who likely don’t use the product much, because why bother getting another chat client if we have bundled Teams. Theory: Electron can be fast if the people calling the shots in developing the software care.

                                                                                                                                                                                                • What the parent post wrote about dogfooding. I think most software is not dogfooded as much as it should.

                                                                                                                                                                                                • … and complementary: most software devs (and likely people calling the shots) tend to be on great hardware. There’s plenty of software that runs great on expensive recent devices. But good luck if you are using a cheap phone or an old laptop.

                                                                                                                                                                                                I think the combination of all those is unavoidable. Too few of us vote with our wallets to have performant software on long-lived modest hardware. I sadly moved to a new desktop computer because my 10-year-old ThinkPad was becoming a pain when using some specific pieces of software, although it was fine for many compute-intensive tasks!

                                                                                                                                                                                                (Also note website advertisement load!)

                                                                                                                                                                                                1. 1

                                                                                                                                                                                                  Fair points. I consider VS Code to be a classic editor of the same stature as emacs and vim.

                                                                                                                                                                                                2. 1

                                                                                                                                                                                                  I suggest the factors are, in order of importance:

                                                                                                                                                                                                  1. He who pays the piper calls the tune
                                                                                                                                                                                                  2. Almost no-one whose opinion matters gives even one shit about performance or latency
                                                                                                                                                                                                  3. We don’t have a developed way of talking about performance
                                                                                                                                                                                                  4. Reasoning about performance is hard, and we can barely be bothered with correctness.

                                                                                                                                                                                                  The 3 struck me as an undervalued one. Maybe your prioritization comes from the analytical side; but a more prescriptive take (future-oriented or delusional, not sure which) is to re-shuffle it to:

                                                                                                                                                                                                  1. Experts or academia can propose concepts and language to discuss the problem of employee-visible (!) latencies in software,
                                                                                                                                                                                                  2. …which can get attention of big employers (not necessarily Big Tech),
                                                                                                                                                                                                  3. …which can get the big employers to pay the piper and call the tune.

                                                                                                                                                                                                  (I’m deleting point 4, because correctness is orthogonal. It’s up to us techies to fit in our heads both narratives.)

                                                                                                                                                                                                  Arguments why this might work: notorious history of the entire Agile movement.

                                                                                                                                                                                                  So when you see people cargo-culting bloat/latency, it’s probably a very good sign: it means point 1 had delivered!

                                                                                                                                                                                                  1. 2

                                                                                                                                                                                                    Reasoning and talking about a subject are related but distinct. Once we can talk about performance, we’ll at least initially be stuck with tools not designed for ease of reasoning about their performance.

                                                                                                                                                                                                3. 12

                                                                                                                                                                                                  These are all very valid points, and I think the industry took a big swing from “sqlite is useful just for local dev or embedding in mobile/desktop apps” to “you can build your entire SaaS on sqlite” and the truth is somewhere in the middle.

                                                                                                                                                                                                  Where sqlite shines, I find, is small to medium apps that won’t necessarily “scale” in the modern sense. I run a few small-ish directories, forums, and tools that run on sqlite quite happily, but I wouldn’t necessarily choose it for a high-throughput product. I did choose to support it for a product I build (storyden) because at one time I had a really old slow laptop, as my main one got stolen, so I had to run a lightweight database without docker, and sqlite slotted in perfectly (thanks to the ORM I use transparently switching up its query dialect to talk to sqlite). A nice benefit of that is now I can say that product can deploy to small simple environments with no need for a postgres instance, so that’s a nice bonus!

                                                                                                                                                                                                  it does feel like just another case of a hype cycle, it’ll settle over time. sqlite has its place and it is a really nice impressive piece of software that can run server workloads if those workloads are of suitable size.

                                                                                                                                                                                                  1. 4

                                                                                                                                                                                                    I’ve had a half baked idea before about making social software that deliberately runs (ideally very) fast on a single machine but doesn’t trivially scale up to multiple machines. Decisions like using sqlite for backing storage would be part of that.

                                                                                                                                                                                                    1. 2

                                                                                                                                                                                                      I’m currently using sqlite to scale a very large app because Postgres can’t scale out the way we need it to. ¯\_(ツ)_/¯

                                                                                                                                                                                                        1. 2

                                                                                                                                                                                                          I work on a product called Notion that has a user-defined schema feature (“Notion database”) that is impossible to index efficiently with a traditional monolithic DBMS. In addition, we have a fine-grained, graph-like data model where many queries are an infinity of N+1 recursion. SQLite is an interesting tool for tackling both challenges: colocate data with the application so N+1 is efficient, and reify the user-defined schema to an SQLite table for indexing in its natural shape instead of some kind of convoluted EAV or column-packing strategy in Postgres.

                                                                                                                                                                                                          EDIT: the other issue is that we use Amazon Web Services RDS Postgres with EBS volumes, which in my opinion is very slow compared to anything on instance local nvme storage.
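
                                                                                                                                                                                                          A minimal sketch of what “reify the user-defined schema to an SQLite table” could look like, using Python’s built-in sqlite3 module. This is not Notion’s actual implementation: the schema dict, the table name and the helper names (USER_SCHEMA, quote_ident, reify) are all made up for illustration, and the column types are assumed to come from a trusted whitelist.

```python
import sqlite3

# Hypothetical user-defined schema: property name -> SQLite column type.
# Assumption: the types come from a trusted whitelist, not from raw user input.
USER_SCHEMA = {"title": "TEXT", "priority": "INTEGER", "due": "TEXT"}


def quote_ident(name: str) -> str:
    """Quote an identifier so user-chosen names cannot break the SQL."""
    return '"' + name.replace('"', '""') + '"'


def reify(conn: sqlite3.Connection, table: str, schema: dict) -> None:
    """Create one real column and one index per user-defined property."""
    cols = ", ".join(f"{quote_ident(n)} {t}" for n, t in schema.items())
    conn.execute(
        f"CREATE TABLE {quote_ident(table)} (row_id INTEGER PRIMARY KEY, {cols})"
    )
    for name in schema:
        conn.execute(
            f"CREATE INDEX {quote_ident(table + '_' + name)} "
            f"ON {quote_ident(table)} ({quote_ident(name)})"
        )


conn = sqlite3.connect(":memory:")
reify(conn, "tasks", USER_SCHEMA)
conn.executemany(
    'INSERT INTO "tasks" ("title", "priority", "due") VALUES (?, ?, ?)',
    [
        ("write docs", 2, "2025-03-01"),
        ("fix bug", 1, "2025-02-20"),
        ("ship", 1, "2025-02-25"),
    ],
)

# Filter on one property and sort on another: a plain single-table query
# that the planner can serve from ordinary per-column indexes.
rows = conn.execute(
    'SELECT "title" FROM "tasks" WHERE "priority" = ? ORDER BY "due"', (1,)
).fetchall()
titles = [r[0] for r in rows]
index_count = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'index' AND tbl_name = 'tasks'"
).fetchone()[0]
```

                                                                                                                                                                                                          The point of the sketch is only that each user-defined property becomes a real, indexable column, so filters and sorts stay single-table queries.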

                                                                                                                                                                                                          1. 1

                                                                                                                                                                                                            oh OH! that’s interesting! I’ve been implementing a very similar feature for storyden these last couple weeks and i was very closely looking at notion to try infer how they did it! I ended up implementing our user defined schemas in-schema as a basic key-value table with application defined schema validation but dynamically creating a tenanted database is another approach! I didn’t tackle indexing yet and foresee i will run into that exact issue you described. Luckily my product isn’t at Notion scale (yet:)

                                                                                                                                                                                                            1. 1

                                                                                                                                                                                                              With KV table, do you mean a table like CREATE TABLE property (row_id uuid, column_id uuid, value jsonb), i.e. a table with one row for every cell (row x column)? Or a table like CREATE TABLE row (row_id uuid, properties jsonb), i.e. one table row per user row? If you’re using a regular SQL database, stay away from the first option, the “entity attribute value” schema! The database will struggle to produce a good query plan if the user filters on property A and sorts on property B, since it has to decide which to do first, and the join can end up timing out your query if you have Notion-sized data.
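
                                                                                                                                                                                                              To make the filter-on-A, sort-on-B problem concrete, here is a small sketch of the EAV layout in SQLite via Python’s sqlite3 module. The property names (“status”, “due”) and the sample cells are invented for illustration, and plain text and integer columns stand in for the uuid and jsonb types above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# EAV layout: one table row per cell of the user's grid.
conn.execute("CREATE TABLE property (row_id INTEGER, column_id TEXT, value)")
cells = [
    (1, "status", "open"), (1, "due", "2025-03-01"),
    (2, "status", "done"), (2, "due", "2025-02-01"),
    (3, "status", "open"), (3, "due", "2025-02-15"),
]
conn.executemany("INSERT INTO property VALUES (?, ?, ?)", cells)

# Filtering on "status" and sorting on "due" touches the same table twice.
# The two passes are tied together by a self-join on row_id, and the
# planner has to guess which side to drive the join from.
EAV_QUERY = """
    SELECT f.row_id
    FROM property AS f
    JOIN property AS s ON s.row_id = f.row_id
    WHERE f.column_id = 'status' AND f.value = 'open'
      AND s.column_id = 'due'
    ORDER BY s.value
"""
row_ids = [r[0] for r in conn.execute(EAV_QUERY)]
```

                                                                                                                                                                                                              Every additional filtered or sorted property adds another self-join, which is where the planning trouble comes from on large data.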

                                                                                                                                                                                                              1. 1

                                                                                                                                                                                                                yeah the first iteration is an EAV style approach in sqlite+postgres+crdb, good to know it won’t scale! i’ll come back to it in future and see if there’s a nicer approach. I’m curious how jsonb can perform better, if you want to filter/query on it? i’ll have to do some stress tests with large datasets too… so much to do and so little time!

                                                                                                                                                                                                    2. 14

                                                                                                                                                                                                      My brother in christ, please just let me switch on strings.

                                                                                                                                                                                                      1. 6

                                                                                                                                                                                                        The word “just” is sitting on top of a lot of complexity in this statement.

                                                                                                                                                                                                        Pardon my lack of zig experience but I wouldnt expect any reference type to work in switch in a systems language (similar to C)

                                                                                                                                                                                                        1. 10

                                                                                                                                                                                                          I would count rust as a systems language and it works in magch there (not just for &str but any reference).

                                                                                                                                                                                                          1. 2

                                                                                                                                                                                                            Any constant reference.

                                                                                                                                                                                                        2. 5

                                                                                                                                                                                                          Do you want equivalent but differently encoded unicode strings to be considered equal?

                                                                                                                                                                                                          1. 17

                                                                                                                                                                                                            No, just raw byte-for-byte comparison.
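
                                                                                                                                                                                                            For anyone unsure what the difference is in practice, a tiny Python illustration (the sample string is arbitrary): two spellings of the same word that are canonically equivalent under Unicode normalization but not equal byte for byte.

```python
import unicodedata

composed = "caf\u00e9"     # 'é' as a single code point (NFC form)
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent (NFD form)

# Raw byte-for-byte comparison of the UTF-8 encodings: the two differ.
bytewise_equal = composed.encode("utf-8") == decomposed.encode("utf-8")

# Comparison after normalizing both sides to NFC: the two match.
canonical_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)
```

                                                                                                                                                                                                            A byte-wise switch would treat these as two different cases, which is the behaviour being asked for here.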

                                                                                                                                                                                                            1. 3

                                                                                                                                                                                                              None of the workarounds in the article seem to address that either.

                                                                                                                                                                                                              Do you think that not allowing users to switch on strings will avoid canonicalisation issues somehow?

                                                                                                                                                                                                              Can users define their own string type and make switch work on it?

                                                                                                                                                                                                              1. 3

                                                                                                                                                                                                                would it make sense to allow an optional comparator to the switch statement? e.g. switch (mystr, &std.mem.equal) { ... }
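
                                                                                                                                                                                                                A rough sketch of the semantics such a comparator-taking switch might have, written in Python rather than Zig purely for illustration. The switch and nfc_eq helpers and the COMMANDS table are hypothetical names, and this version is a linear scan, not the jump table or hash lookup a compiler could generate for a built-in switch.

```python
import unicodedata


def switch(value, cases, default, eq=lambda a, b: a == b):
    """Return the result of the first case whose pattern matches under `eq`."""
    for pattern, result in cases:
        if eq(value, pattern):
            return result
    return default


def nfc_eq(a: str, b: str) -> bool:
    """Hypothetical comparator: equality after Unicode NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Case table: (pattern, result). The last pattern uses a precomposed 'é'.
COMMANDS = [("start", 1), ("stop", 2), ("caf\u00e9", 3)]

# The input spells 'é' as 'e' plus a combining accent.
exact = switch("cafe\u0301", COMMANDS, default=0)                 # default comparator
canonical = switch("cafe\u0301", COMMANDS, default=0, eq=nfc_eq)  # custom comparator
```

                                                                                                                                                                                                                With the default comparator the decomposed input falls through to the default; with nfc_eq it hits the third case.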

                                                                                                                                                                                                                1. 2

                                                                                                                                                                                                                  I had been imagining something like the ability to extend switch to operate on any aggregate type that has a hash-function definition available, but having a comparator generalizes that, and even supports some interesting use-cases like canonicalization-in-comparison (as @andrewrk asked about) as well as even fancier things like edit distance matching!

                                                                                                                                                                                                                  I find this idea quite compelling…

                                                                                                                                                                                                                  All the best,

                                                                                                                                                                                                                  -HG

                                                                                                                                                                                                                  1. 2

                                                                                                                                                                                                                    Sounds like pattern matching

                                                                                                                                                                                                            2. 2

                                                                                                                                                                                                              Funnily enough the Linux foundation has set itself up as the home of forks. A proper fork would require a huge organizational effort to get distributions using it and manage the whole thing. Which is a shame.

                                                                                                                                                                                                              1. 5

                                                                                                                                                                                                                I thought Drew DeVault’s proposal looked pragmatic, and potentially quite consequential. I wonder if it will get any traction?

                                                                                                                                                                                                                1. 32

                                                                                                                                                                                                                  The Rust for Linux folks are Linux maintainers. They want to keep working on Linux. Building another kernel is not the point.

                                                                                                                                                                                                                  Drew is repeating a framing that is just not correct: that Rust for Linux is about “Rust people” getting involved with Linux because they like Rust. That’s just not the case. They’re Linux people who want to make Linux better. That’s why it’s nonsense.

                                                                                                                                                                                                                  For other people, sure, it makes sense to start your own project. And there are a bunch of them. But it’s basically just not applicable in this case.

                                                                                                                                                                                                                  1. 1

                                                                                                                                                                                                                    It still has the problem that feature sets will diverge so they won’t stay compatible for long regardless of whether they rewrite or fork.

                                                                                                                                                                                                                    In either case there’s going to be a lot of stuff to make sure their version is in major distros as an option.

                                                                                                                                                                                                                    1. 3

                                                                                                                                                                                                                      One amazing thing about the linux kernel is how long they’ve kept the ABI stable and backward-compatible even when introducing new features.

                                                                                                                                                                                                                      I don’t really expect major distros to adopt a RiiR linux clone, but new distros could form around it.

                                                                                                                                                                                                                      1. 2

                                                                                                                                                                                                                        Quite. But a version that maintains the abi by increasingly writing more of the fork in rust has a chance.

                                                                                                                                                                                                                  2. 1

                                                                                                                                                                                                                    wouldn’t distributions see the value in Rust and jump at the opportunity?

                                                                                                                                                                                                                    1. 1

                                                                                                                                                                                                                      Maybe! But in general better technology doesn’t win without a concerted marketing effort.

                                                                                                                                                                                                                      1. 1

                                                                                                                                                                                                                        I don’t get that. linux distributions weigh the tradeoffs in free software projects all the time without the projects committing a “concerted marketing effort.”

                                                                                                                                                                                                                  3. 1

                                                                                                                                                                                                                    I always had a feeling emojis were evil, glad it has been scientifically confirmed now.

                                                                                                                                                                                                                    1. 2

                                                                                                                                                                                                                      You could maybe conclude that Unicode is evil, but as the article says, this technique is not specific to emoji.

                                                                                                                                                                                                                      1. 2

                                                                                                                                                                                                                        There certainly wouldn’t be anything morally objectionable about ASCII supremacism, would there?

                                                                                                                                                                                                                        Let’s just skip to the chase: writing is a mixed bag.

                                                                                                                                                                                                                        1. 1

                                                                                                                                                                                                                          Tbh I’m p sure the attack on writing is not meant to be taken too seriously, for earlier in Phaedrus Socrates says “you have the man himself there” of a manuscript of Lysias’ speech, declining to listen to Phaedrus’ recollection of the speech in question.

                                                                                                                                                                                                                          1. 1

                                                                                                                                                                                                                            Oh, it’s definitely an ironic move. Brings up interesting stuff, though!