Threads for steveklabnik

    1.  

      I was surprised to see such a large performance improvement from shrinking the Object structure. What primarily drives this? My best guess is that you can pack more Objects into a cache line.

      1. 6

        They mention in the post that it ends up being able to fit into a register. That possibility is pretty huge, I’d imagine.

        1.  

          Makes sense! The register possibility was a piece I was missing.

      2. 86

        The TypeScript dev lead posted this response about the language choice on Reddit, for anyone who’s curious:

        (dev lead of TypeScript here, hi!)

        We definitely knew when choosing Go that there were going to be people questioning why we didn’t choose Rust. It’s a good question because Rust is an excellent language, and barring other constraints, is a strong first choice when writing new native code.

        Portability (i.e. the ability to make a new codebase that is algorithmically similar to the current one) was always a key constraint here as we thought about how to do this. We tried tons of approaches to get to a representation that would have made that port approach tractable in Rust, but all of them either had unacceptable trade-offs (perf, ergonomics, etc.) or devolved into “write your own GC”-style strategies. Some of them came close, but often required dropping into lots of unsafe code, and there just didn’t seem to be many combinations of primitives in Rust that allow for an ergonomic port of JavaScript code (which is pretty unsurprising when phrased that way - most languages don’t prioritize making it easy to port from JavaScript/TypeScript!).

        In the end we had two options - do a complete from-scratch rewrite in Rust, which could take years and yield an incompatible version of TypeScript that no one could actually use, or just do a port in Go and get something usable in a year or so and have something that’s extremely compatible in terms of semantics and extremely competitive in terms of performance.

        And it’s not even super clear what the upside of doing that would be (apart from not having to deal with so many “Why didn’t you choose Rust?” questions). We still want a highly-separated API surface to keep our implementation options open, so Go’s interop shortcomings aren’t particularly relevant. Go has excellent code generation and excellent data representation, just like Rust. Go has excellent concurrency primitives, just like Rust. Single-core performance is within the margin of error. And while there might be a few performance wins to be had by using unsafe code in Go, we have gotten excellent performance and memory usage without using any unsafe primitives.

        In our opinion, Rust succeeds wildly at its design goals, but “is straightforward to port to Rust from this particular JavaScript codebase” is very rationally not one of its design goals. It’s not one of Go’s either, but in our case given the way we’ve written the code so far, it does turn out to be pretty good at it.

        Source: https://www.reddit.com/r/typescript/comments/1j8s467/comment/mh7ni9g

        1. 83

          And it’s not even super clear what the upside of doing that would be (apart from not having to deal with so many “Why didn’t you choose Rust?” questions)

          People really miss the forest for the trees.

          I looked at the repo and the story seems clear to me: 12 people rewrote the TypeScript compiler in 5 months, getting a 10x speed improvement, with immediate portability to many different platforms, while not having written much Go before in their lives (although they are excellent programmers).

          This is precisely the reason why Go was invented in the first place. “Why not Rust?” should not be the first thing that comes to mind.

          1. 12

            I honestly do think the “Why not Rust?” question is a valid one to pop into someone’s head before reading the explanation for their choice.

            First of all, if you’re the kind of nerd who happens to follow the JavaScript/TypeScript dev ecosystem, you will have seen a fair number of projects either written, or rewritten, in Rust recently. Granted, some tools are also being written/rewritten in other languages like Go and Zig. But, the point is that there’s enough mindshare around Rust in the JS/TS world that it’s fair to be curious why they didn’t choose Rust while other projects did. I don’t think we should assume the question is always antagonistic or from the “Rust Evangelism Strike Force”.

            Also, it’s a popular opinion that languages with algebraic data types (among other things) are good candidates for parsers and compilers, so languages like OCaml and Rust might naturally rank highly in languages for consideration.

            So, I honestly had the same question, initially. However, upon reading Anders’ explanation, I can absolutely see why Go was a good choice. And your analysis of the development metrics is also very relevant and solid support for their choice!

            I guess I’m just saying, the Rust fanboys (myself, included) can be obnoxious, but I hope we don’t swing the pendulum too far the other way and assume that it’s never appropriate to bring Rust into a dev conversation (e.g., there really may be projects that should be rewritten in Rust, even if people might start cringing whenever they hear that now).

            1.  

              While tweaking a parser / interpreter written in Go a few years ago, I specifically replaced a struct with an ‘interface{}’ in order to exercise its pseudo-tagged-union mechanisms, together with the type-switch form.

              https://github.com/danos/yang/commit/c98b220f6a1da7eaffbefe464fd9e734da553af0

              These days I’d actually make it a closed interface, so that it is more akin to a tagged union. I did that for another project, which was passing around instances of variant structs (i.e. a tagged union) rather than building an AST.

              So it is quite possible to use that pattern in Go as a form of sum-type, if for some reason one is inclined to use Go as the implementation language.
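
              For illustration, here’s a minimal sketch of that pattern with hypothetical Expr / Num / Add types (not from the linked repo): the unexported marker method closes the interface, so a type switch over it behaves like a match on a tagged union.

              package main

              import "fmt"

              // Sketch of a closed interface as a sum type: the unexported
              // marker method means only types in this package can implement Expr.
              type Expr interface{ isExpr() }

              type Num struct{ Value int }
              type Add struct{ Left, Right Expr }

              func (Num) isExpr() {}
              func (Add) isExpr() {}

              // eval dispatches over the variants with a type switch.
              func eval(e Expr) int {
                  switch v := e.(type) {
                  case Num:
                      return v.Value
                  case Add:
                      return eval(v.Left) + eval(v.Right)
                  default:
                      panic("unreachable: Expr is closed")
                  }
              }

              func main() {
                  fmt.Println(eval(Add{Num{1}, Num{2}})) // prints 3
              }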

          2. 36

            That is a great explanation of “Why Go and not Rust?”

            If you’re looking for “Why Go and not AOT-compiled C#?” see here: https://youtu.be/10qowKUW82U?t=1154s

            A relevant quote is that C# has “some ahead-of-time compilation options available, but they’re not on all platforms and don’t really have a decade or more of hardening.”

            1. 9

              That interview is really interesting, worth watching the whole thing.

              1. 11

                Yeah, Hejlsberg also talks about value types being necessary, or at least useful, in making language implementations fast

                If you want value types and automatically managed memory, I think your only choices are Go, D, Swift, and C# (and very recently OCaml, though I’m not sure if that is fully done).

                I guess Hejlsberg is conceding that value types are a bit “second class” in C#? I think I was surprised by the “class” and “struct” split, which seemed limiting, but I’ve never used it. [1]

                And that is a lesson learned from the Oils Python -> C++ translation. We don’t have value types, because statically typed Python doesn’t, and that puts a cap on speed. (But we’re faster than bash in many cases, though slower in some too)


                Related comment about GC and systems languages (e.g. once you have a million lines of C++, you probably want GC): https://lobste.rs/s/gpb0qh/garbage_collection_for_systems#c_rrypks

                Now that I’ve worked on a garbage collector, I see a sweet spot in languages like Go and C# – they have both value types (deallocated on the stack) and GC. Both Java and Python lack this semantic, so their GCs have to do more work, and the programmer has less control.

                There was also a talk that hinted at some GC-like patterns in Zig, and I proposed that TinyGo get “compressed pointers” like Hotspot and v8, and then you would basically have that:

                https://lobste.rs/s/2ah6bi/programming_without_pointers#c_5g2nat


                [1] BTW Guy Steele’s famous 1998 “growing a language” actually advocated value types in Java. AFAIK as of 2025, “Project Valhalla” has not landed yet

                1. 6

                  and very recently OCaml, though I’m not sure if that is fully done

                  Compilers written in OCaml are famous for being super-fast. See eg OCaml itself, Flow, Haxe, BuckleScript (now ReScript).

                  1.  

                    Yeah, I’m kind of curious about whether OCaml was considered at some point (I asked about this in the Reddit thread, haven’t gotten a reply yet).

                    OCaml seems much more similar to TS than Go is, and has a proven track record when it comes to compilers. Maybe portability issues? (Good portability was mentioned as a must-have, IIRC)

                    1.  

                      Maybe, but Flow, its main competitor, distributes binaries for all major platforms: https://github.com/facebook/flow/releases/tag/v0.264.0

                      I’m not sure what more TypeScript would have needed. In fact, Flow’s JavaScript parser is available as a separate library, so they would have shaved off at least a month from the proof of concept…

                  2. 5

                    If you want value types and automatically managed memory, I think your only choices are Go, D, Swift, and C#

                    Also Nim.

                    1. 3

                      Also Julia.

                      There surely are others.

                      1. 4

                        Yes good points, I left out Nim and Julia. And apparently Crystal - https://colinsblog.net/2023-03-09-values-and-references/

                        Although thinking about it a bit more, I think Nim, Julia, (and maybe Crystal) are like C#, in that they are not as general as Go / D / Swift.

                        In those languages you don’t have both a Foo* type and a Foo type, where the layout is orthogonal to whether it’s a value or a reference. Instead, Nim apparently has value objects and reference objects, and I believe C# has “structs” for values and classes for references.

                        I think Hejlsberg was hinting at this category when saying Go wins a bit on expressiveness, and it’s also “as close to native as you can get with GC”.


                        I think the reason Go’s model is uncommon is that it forces the GC to support interior pointers, which is a significant complication (e.g. it is not supported by WASM GC). Go basically has the C memory model, with garbage collection.

                        I think C#, Julia, and maybe Nim/Crystal do not support interior pointers (interested in corrections)
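
                        To make the interior-pointer idea concrete, here’s a small Go example of my own (for illustration): p points into the middle of a slice’s backing array, so the GC must keep the whole array alive for as long as p is reachable.

                        package main

                        import "fmt"

                        func main() {
                            xs := make([]int, 8)
                            p := &xs[3]        // interior pointer into the backing array
                            *p = 42            // the GC must keep all of xs alive while p exists
                            fmt.Println(xs[3]) // 42

                            type Foo struct{ A, B int }
                            f := &Foo{A: 1, B: 2}
                            q := &f.B       // interior pointer to a field keeps *f alive
                            fmt.Println(*q) // 2
                        }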


                        Someone should write a survey of how GC tracing works with each language :) (Nim’s default is reference counting without cycle collection.)

                        1. 3

                          Yeah that’s interesting. Julia has a distinction between struct (value) and mutable struct (reference). You can use raw pointers but safe interior references (to an element of an array for example) include a normal reference to the (start of the) backing array, and the index.

                          I can understand how in Rust you can safely have an interior pointer as the borrow checker ensures a reference to an array element is valid for its lifetime (the array can’t be dropped or resized before the reference is dropped). I’m very curious - I would like to understand how Go’s tracing GC works with interior pointers now! (I would read such a survey).

                          1. 3

                            Ok - Go’s GC seems to track a memory span for each object (struct or array), stored in a kind of span tree (interval tree) for easy lookup given some pointer to chase. Makes sense. I wonder if it is smart enough to deallocate anything dangling from non-referenced elements of an array / fields of a struct, or whether it just chooses to be conservative (and if so, do users end up accidentally creating memory leaks very often)? What’s the performance impact of all of this compared to runtimes requiring non-interior references? The interior pointers themselves will be a performance win, at the expense of using an interval tree during the mark phase.

                            https://forum.golangbridge.org/t/how-gc-handles-interior-pointer/36195/5

                    2.  

                      It’s been a few years since I’ve written any Go, but I have a vague recollection that the difference between something being heap or stack allocated was (sometimes? always?) implicit based on compiler analysis of how you use the value. Is that right? How easy is it, generally, to accidentally make something heap-allocated and GC’d?

                      That’s the only thing that makes me nervous about that as a selling point for performance. I feel like if I’m worried about stack vs heap or scoped vs memory-managed or whatever, I’d probably prefer something like Swift, Rust, or C# (I’m not familiar at all with how D’s optional GC stuff works).

                      1.  

                        Yes, that is a bit of control you give up with Go. Searching for “golang escape analysis”, this article is helpful:

                        https://medium.com/@trinad536/escape-analysis-in-golang-fc81b78f3550

                        $ go build -gcflags "-m" main.go
                        
                        .\main.go:8:14: *y escapes to heap
                        .\main.go:11:13: x does not escape
                        

                        So the toolchain is pretty transparent. This is actually something I would like for the Oils Python->C++ compiler, since we have many things that are “obviously” locals that end up being heap allocated. And some not so obvious cases. But I think having some simple escape analysis would be great.
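
                        As a toy illustration (my own, not from the article), compiling something like this with go build -gcflags "-m" reports both outcomes:

                        package main

                        // escapes returns a pointer to a local, so the compiler
                        // must move y to the heap; stays keeps x on the stack.
                        func escapes() *int {
                            y := 42
                            return &y // reported: &y escapes to heap
                        }

                        func stays() int {
                            x := 42 // reported: x does not escape
                            return x
                        }

                        func main() {
                            _ = escapes()
                            _ = stays()
                        }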

                        1.  

                          Yes, the stack/heap distinction is made by the compiler, not the programmer, in Go.

                        2.  

                          Why did you leave JS/TS off the list? They seem to have left it off too and that confuses me deeply because it also has everything they need

                          1. 7

                            Hejlsberg said they got about 3x performance from native compilation and value types, which also halved the memory usage of the compiler. They got a further 3x from shared-memory multithreading. He talked a lot about how neither of those are possible with the JavaScript runtime, which is why it wasn’t possible to make tsc 10x faster while keeping it written in TypeScript.

                            1.  

                              Yeah but I can get bigger memory wins while staying inside JS by sharing the data structures between many tools that currently hold copies of the same data: the linter, the pretty-formatter, the syntax highlighter, and the type checker

                              I can do this because I make my syntax tree nodes immutable! TS cannot make their syntax tree nodes immutable (even in JS where it’s possible) because they rely on the node.parent reference. Because their nodes are mutable-but-typed-as-immutable, these nodes can never safely be passed as arguments outside the bounds of the TS ecosystem, a limitation that precludes the kind of cross-tool syntax tree reuse that I see as being the way forward

                              1. 5

                                Hejlsberg said that the TypeScript syntax tree nodes are, in fact, immutable. This was crucial for parallelizing tsgo: it parses all the source files in parallel in the first phase, then typechecks in parallel in the second phase. The parse trees from the first phase are shared by all threads in the second phase. The two phases spread the work across threads differently. He talks about that kind of sharing and threading being impractical in JavaScript.

                                In fact he talks about tsc being designed around immutable and incrementally updatable data structures right from the start. It was one of the early non-batch compilers, hot on the heels of Roslyn, both being designed to support IDEs.

                                Really, you should watch the interview https://youtu.be/10qowKUW82U

                                AIUI a typical LSP implementation integrates all the tools you listed so they are sharing a syntax tree already.
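
                                A rough sketch of that two-phase shape in Go (all names here are placeholders, not tsgo’s actual code): parse every file in parallel, then fan out checking over a fixed number of shards that share the read-only trees.

                                package main

                                import (
                                    "fmt"
                                    "sync"
                                )

                                // AST stands in for an immutable parse tree.
                                type AST struct{ file string }

                                func parse(file string) *AST { return &AST{file: file} }

                                // check reads the shared, now read-only ASTs for one shard.
                                func check(asts []*AST, shard int) {
                                    fmt.Println("shard", shard, "checked", len(asts), "files")
                                }

                                func main() {
                                    files := []string{"a.ts", "b.ts", "c.ts", "d.ts"}

                                    // Phase 1: parsing is embarrassingly parallel, one goroutine per file.
                                    asts := make([]*AST, len(files))
                                    var wg sync.WaitGroup
                                    for i, f := range files {
                                        wg.Add(1)
                                        go func(i int, f string) {
                                            defer wg.Done()
                                            asts[i] = parse(f)
                                        }(i, f)
                                    }
                                    wg.Wait()

                                    // Phase 2: type checking in shards, all sharing the same trees.
                                    for s := 0; s < 4; s++ {
                                        wg.Add(1)
                                        go func(s int) {
                                            defer wg.Done()
                                            check(asts, s)
                                        }(s)
                                    }
                                    wg.Wait()
                                }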

                                1.  

                                  It’s true that I haven’t watched the interview yet, but I have confirmed with the team that the nodes are not immutable. My context is different from Hejlsberg’s. For Hejlsberg, if something is immutable within the boundaries of TS, it’s immutable. Since I work on JS APIs, if something isn’t actually locked down with Object.freeze, it isn’t immutable and can’t safely be treated as such. They can’t actually lock their objects down because they don’t completely follow the rules of immutability, and the biggest thing they do that you just can’t do with (real, proper) immutable structures is have a node.parent reference.

                                  So they have this kinda-immutable tech, but those guarantees only hold if all the code that ever holds a reference to the node is TS code. That is why all this other infrastructure that could stand to benefit from a shared standard format for frozen nodes can’t: it’s outside the walls of the TS fiefdom, so the nodes are meant to be used as immutable but any JS code (or any-typed code) the trees are ever exposed to would have the potential to ruin them by mutating the supposedly-immutable data

                                  1.  

                                    To be more specific about the node.parent reference: if your tree is really, truly immutable, then to replace a leaf node you must replace all the nodes on the direct path from the root to that leaf. TS does this, which is good.

                                    The bad part is that then all the nodes you didn’t replace have chains of node.parent references that lead to the old root instead of the new one. Fixing this with immutable nodes would mean replacing every node in the tree, so the only alternative is to mutate node.parent, which means that 1) you can’t actually Object.freeze(node) and 2) you don’t get all the wins of immutability since the old data structure is corrupted by the creation of the new one.

                                    1.  

                                      See https://ericlippert.com/2012/06/08/red-green-trees/ for why Roslyn’s key innovation in incremental syntax trees was actually breaking the node.parent reference by splitting into the red and green trees, or as I call them paths and nodes. Nodes are deeply immutable trees and have no parents. Paths are like an address in a particular tree, tracking a node and its parents.
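
                                      For illustration, here’s a toy version of that split in Go (hypothetical types, not Roslyn’s): green nodes are deeply immutable and parentless, while a path pairs a node with its parent path, so parent links live outside the shared tree.

                                      package main

                                      import "fmt"

                                      // Green nodes: deeply immutable, no parent links, freely shared.
                                      type Green struct {
                                          Kind     string
                                          Children []*Green
                                      }

                                      // Path (a “red” node): an address within one particular tree.
                                      type Path struct {
                                          Node   *Green
                                          Parent *Path // nil at the root
                                      }

                                      func (p *Path) Child(i int) *Path {
                                          return &Path{Node: p.Node.Children[i], Parent: p}
                                      }

                                      func main() {
                                          leaf := &Green{Kind: "leaf"}
                                          root := &Green{Kind: "root", Children: []*Green{leaf}}
                                          lp := (&Path{Node: root}).Child(0)
                                          fmt.Println(lp.Node.Kind, "under", lp.Parent.Node.Kind)
                                      }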

                        3. 8

                          You are not joking, just the hack to make type checking itself parallel is well worth an entire hour!

                          1. 11

                            Hm yeah it was a very good talk. My summary of the type checking part is

                            1. The input to the type checker is immutable ASTs
                              • That is, parsing is “embarrassingly parallel”, and done per file
                            2. They currently divide the program into 4 parts (e.g. 100 files turns into 4 groups of 25 files), and they do what I’d call “soft sharding”.

                            That is, the translation units aren’t completely independent. Type checking isn’t embarrassingly parallel. But you can still parallelize it and still get enough speedup – he says ~3x from parallelism, and ~3x from Go’s better single-core perf, which gives you ~10x overall.

                            What wasn’t said:

                            • I guess you have to de-duplicate the type errors? Because some type errors might come twice, since you are duplicating some work
                            • Why the sharding is in 4 parts, and not # CPUs. Even dev machines have 8-16 cores these days, and servers can have 64-128 cores.

                            I guess this is just because, empirically, you don’t get more than 3x speedup.

                            That is interesting, but now I think it shows that TypeScript is not designed for parallel type checking. I’m not sure if other compilers do better though, like Rust (?) Apparently rustc uses the Rayon threading library. Though it’s hard to compare, since it also has to generate code


                            A separate thing I found kinda disappointing from the talk is that TypeScript is literally whatever the JavaScript code does. There was never a spec and there never will be one. They have to do a line-for-line port.

                            There was somebody who made a lot of noise on the GitHub issue tracker about this, and it was basically closed “Won’t Fix” because “nobody who understands TypeScript well enough has enough time to work on a spec”. (Don’t have a link right now, but I saw it a few months ago.)

                            1.  

                              Why the sharding is in 4 parts, and not # CPUs. Even dev machines have 8-16 cores these days, and servers can have 64-128 cores.

                              Pretty sure he said it was an arbitrary choice and they’d explore changing it. The ~10x optimization they’ve gotten so far is enough by itself to keep the project moving. Further optimization is bound to happen later.

                              1.  

                                I’m not sure if other compilers do better though, like Rust (?) Apparently rustc uses the Rayon threading library.

                                Work has been going on for years to parallelize rust’s frontend, but it apparently still has some issues, and so isn’t quite ready for prime time just yet, though it’s expected to be ready in the near term.

                                Under 8 cores and 8 threads, the parallel front end can reduce the clean build (cargo build with -Z threads=8 option) time by about 30% on average. (These crates are from compiler-benchmarks of rustc-perf)

                                1.  

                                  I guess this is just because, empirically, you don’t get more than 3x speedup.

                                  In my experience, once you start to do things “per core” and want to actually get performance out of it, you end up having to pay attention to caches, and get a bit into the weeds. Given just arbitrarily splitting up the work as part of the port has given a 10x speed increase, it’s likely they just didn’t feel like putting in the effort.

                                2.  

                                  Can you share the timestamp to the discussion of this hack, for those who don’t have one hour?

                                  1.  

                                    I think this one: https://www.youtube.com/watch?v=10qowKUW82U&t=2522s

                                    But check the chapters, they’re really split into good details. The video is interesting anyway, technically focused, no marketing spam. I can also highly recommend watching it.

                              2. 6

                                Another point on “why Go and not C#” is that, he said, their current (TypeScript) compiler is highly functional; they use no classes at all. And Go is “just functions and data structures”, whereas C# has “a lot of classes”. Paraphrasing a little, but that’s roughly what he said.

                              3. 9

                                They also posted a (slightly?) different response on GitHub: https://github.com/microsoft/typescript-go/discussions/411

                                1. 5

                                  Acknowledging some weak spots, Go’s in-proc JS interop story is not as good as some of its alternatives. We have upcoming plans to mitigate this, and are committed to offering a performant and ergonomic JS API.

                                  Yes please!

                              4. 10

                                I find it slightly odd that an “epic treatise on error models” would fail to mention Common Lisp and Smalltalk, whose error models provide a facility that all others lack: resuming from an error.

                                1. 7

                                  Hi, author here, the title also does say “for systems programming languages” :)

                                  For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations. It’s unclear to me how one-shot continuations can be integrated into a systems language where you want to ensure careful control over lifetimes. Perhaps you (or someone else here) know of some research integrating ownership/borrowing with continuations/algebraic effects that I’m unfamiliar with?

                                  The closest exception to this that I know of is Haskell, which has support for both linear types and a primitive for continuations. However, I haven’t seen anyone integrate the two, and I’ve definitely seen some soundness-related issues in various effect systems libraries in Haskell (which doesn’t inspire confidence), but it’s also possible I missed some developments there as I haven’t written much Haskell in a while.

                                  1. 10

                                    I’m sorry for the slightly snarky tone of my original reply, but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems, I would have expected an epic treatise to at least mention that error resumption exists – especially since academia is now rediscovering this topic as effect handlers (typically without any mention of the prior art).

                                    For continuations to work in a systems programming language, you can probably only allow one-shot delimited continuations.

                                    This misconception is so common (and dear to my heart) that I have to use bold:

                                    Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.

                                    To take the example I posted earlier about writing to a full disk: https://lobste.rs/s/az2qlz/epic_treatise_on_error_models_for_systems#c_ss3n1k

                                    ... outer stack ...
                                        write()
                                            signal_disk_is_full()
                                                disk_is_full_handler()
                                    

                                    Suppose write() discovers that the disk is full (e.g. from an underlying primitive). This causes it to call signal_disk_is_full(). Note that the call to signal_disk_is_full() happens inside the stack of write() (obviously).

                                    Now signal_disk_is_full() looks for a handler and calls it: disk_is_full_handler(). Again, the call to the handler happens inside the stack of signal_disk_is_full() (and write()). The handler can return normally to write() once it has cleaned up space.

                                    write() is never popped off the stack. It always stays on the stack. IOW, there is never a need to capture a continuation, and never a need to reinstate one. The disk_is_full_handler() runs inside the stack of the original call to write().

                                    effect systems

                                    A side note: most effect systems do use and even require first-class continuations, but IMO that’s completely overkill and only needed for rarely used effects like nondeterminism. For simple effects, like resumable exceptions, no continuations are needed whatsoever.
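
                                    Here’s a minimal sketch of that mechanism in Go (using an explicit handler stack of my own invention; real condition systems use dynamically scoped handlers): the handler runs as an ordinary call on top of write()’s frame, nothing is unwound, and no continuation is captured.

                                    package main

                                    import (
                                        "errors"
                                        "fmt"
                                    )

                                    var diskFull = true // simulated external condition

                                    // Dynamically scoped handlers, innermost last.
                                    var handlers []func() bool

                                    func withHandler(h func() bool, body func() error) error {
                                        handlers = append(handlers, h)
                                        defer func() { handlers = handlers[:len(handlers)-1] }()
                                        return body()
                                    }

                                    // signalDiskIsFull calls handlers as plain functions on top of
                                    // the current stack; the signaling frame stays put.
                                    func signalDiskIsFull() bool {
                                        for i := len(handlers) - 1; i >= 0; i-- {
                                            if handlers[i]() {
                                                return true // handled: the caller resumes where it paused
                                            }
                                        }
                                        return false // unhandled: the caller fails conclusively
                                    }

                                    func write(data string) error {
                                        for diskFull {
                                            if !signalDiskIsFull() {
                                                return errors.New("disk is full")
                                            }
                                        }
                                        fmt.Println("wrote:", data)
                                        return nil
                                    }

                                    func main() {
                                        err := withHandler(func() bool {
                                            fmt.Println("handler: emptying /tmp")
                                            diskFull = false // pretend cleanup freed space
                                            return true
                                        }, func() error { return write("hello") })
                                        fmt.Println("err:", err)
                                    }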

                                    1. 2

                                      but even if you were to discount the Lisp machines, or all the stuff Xerox and others did with Smalltalk (including today’s Croquet), as somehow not being systems

                                      I provided the working definition of “systems programming language” that I used in the blog post. It’s a narrow one for sure, but I have to put a limit somewhere. My point is not to exclude the work done by smart people; I just need a stopping point somewhere after 100-120 hours of research and writing.

                                      Resumable exceptions do not require first-class continuations, whether delimited or undelimited, whether one-shot or multi-shot. None at all. Nada. Zilch.

                                      Thank you for writing down a detailed explanation with a concrete example. I will update the post with some of the details you shared tomorrow.

                                      You will notice that my comment does not use the phrase “first-class” anywhere; that was deliberate, but perhaps I should’ve been more explicit about it. 😅

                                      As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. So in that sense, it’s a bit difficult for me to understand where exactly you disagree, perhaps you’re working with a different definition of “continuation”? Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?

                                      If I look at Chapter 3 in Advances in Exception Handling Techniques, titled ‘Condition Handling in the Lisp Language Family’ by Kent M. Pitman, which states:

                                      At the time of the Common Lisp design, Scheme did not have an error system, and so its contribution to the dialog on condition systems was not that of contributing an operator or behavior. However, it still did have something to contribute: the useful term continuation […] This metaphor was of tremendous value to me socially in my efforts to gain acceptance of the condition system, because it allowed a convenient, terse explanation of what “restarts” were about in Common Lisp. [..] And so I have often found myself thankful for the availability of a concept so that I could talk about the establishment of named restart points as “taking a continuation, labeling it with a tag, and storing it away on a shelf somewhere for possible later use.”

                                      So it might be the case that the mismatch here is largely due to language usage, or perhaps my understanding of continuations is lacking.


                                      I’m also a little bit confused as to why your current comment (and the linked blog post) focus on unwinding/stack representation. For implementing continuations, there are multiple possible implementation strategies, sure, and depending on the exact restrictions involved, one can potentially use more efficient strategies. If a continuation is second-class in the sense that it must either be immediately invoked (or discarded), it makes sense that the existing call stack can be reused.


                                      Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.

                                      1. 4

                                        As I see it, the notion of a continuation is that of a control operator, which allows one to “continue” a computation from a particular point. … Or perhaps the difference of opinion is because of the focus on first-class continuations specifically?

                                        Typically, there are two notions of continuations:

                                        1. Continuations as an explanatory or semantic concept. E.g. consider the expression f(x + y). To evaluate this, we first need to compute x + y. At this point our continuation is f(_), where _ is the place into which we will plug the result of x + y. This is the notion of a continuation as “what happens next” or “the rest of the program”.

                                        2. Continuations as an actually reified value/object in a programming language, i.e. first-class continuations. You can get such a first-class continuation e.g. from Scheme’s call/cc or from delimited control operators. This typically involves copying or otherwise remembering some part of the stack on the part of the language implementation.

                                        Resumable exceptions have no need for first-class continuations (2). Continuations as an explanatory concept (1) of course still apply, but only because they apply to every expression in a program.

                                        I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.

                                        The example I used has no non-local control flow at all. write() calls signal_disk_is_full() and that calls the disk_is_full_handler(), and that finally returns normally to write(). This is my point: resumption does not require any non-local control flow.

                                        1. 4

                                          As well as what @manuel wrote, it’s worth noting that basically every language has second-class continuations: a return statement skips to the current function’s continuation.

                                          Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.

                                          1. 1

                                            it’s worth noting that basically every language has second-class continuations: a return statement skips to the current function’s continuation.

                                            In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it “support for a second-class continuation”?

                                            Your comment talked about one-shot delimited continuations, which are a kind of first-class continuation in that (per Strachey’s definition of first vs second class) they can be assigned to variables and passed around like other values.

                                            I understand your and @manuel’s points that the common usage may very well be that “one-shot delimited continuation” implies “first-class” (TIL, thank you).

                                            We can make this same point about functions where generally functions are assumed to be first class. However, it’s not unheard of to have second-class functions (e.g. Osvald et al.’s Gentrification gone too far? and Brachthäuser et al.’s Effects, Capabilities, and Boxes describe such systems). I was speaking in this more general sense.

                                            As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.

                                            1. 5

                                              In most languages, a return statement cannot be passed as an argument to a function call. So is it still reasonable to call it as “support for a second-class continuation”?

                                              That you can’t pass it as an argument is exactly why it’s called second-class. Only a first-class continuation is reified into a value in the language, and therefore usable as an argument.

                                              As I see it, the “one-shot delimited” aspect is disconnected from the “second class” aspect.

                                              One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?

                                              1. 1

                                                One-shot strongly implies a first-class continuation. Second-class continuations are always one-shot, since, again, you can’t refer to them as values, so how would you invoke one multiple times?

                                                Here is the wording from Strachey’s paper, as linked by @fanf

                                                they always have to appear in person and can never be represented by a variable or expression (except in the case of a formal parameter) [emphasis added]

                                                Isn’t this “except in the case of a formal parameter” exactly what is used by Osvald et al. and Brachthäuser et al. in their papers? Here is the bit from Osvald et al.’s paper:

                                                Our solution is a type system extension that lets us define file as a second-class value, and that ensures that such second-class values will not escape their defining scope. We introduce an annotation @local to mark second-class values, and change the signature of withFile as follows:

                                                def withFile[U](n: String)(@local fn: (@local File) => U): U
                                                

                                                [..] Note that the callback function fn itself is also required to be second-class, so that it can close over other second-class values. This enables, for example, nesting calls to withFile

                                                In the body of withFile, fn is guaranteed to have several restrictions (it cannot escape, it cannot be assigned to a mutable variable, etc.). But the type system (as in the paper) cannot prevent the implementation of withFile from invoking fn multiple times. That would require an additional restriction – that fn can only be invoked 0-1 times in the body of withFile.

                                              2. 2

                                                @manuel wrote most of what I was going to (thanks, @manuel!) but I think it’s worth quoting the relevant passage from Strachey’s fundamental concepts in programming languages

                                                3.5. Functions and routines as data items.

                                                3.5.1. First and second class objects.

                                                In ALGOL a real number may appear in an expression or be assigned to a variable, and either may appear as an actual parameter in a procedure call. A procedure, on the other hand, may only appear in another procedure call either as the operator (the most common case) or as one of the actual parameters. There are no other expressions involving procedures or whose results are procedures. Thus in a sense procedures in ALGOL are second class citizens—they always have to appear in person and can never be represented by a variable or expression (except in the case of a formal parameter), while we can write (in ALGOL still)

                                                (if x > 1 then a else b) + 6
                                                

                                                when a and b are reals, we cannot correctly write

                                                (if x > 1 then sin else cos)(x)
                                                

                                                nor can we write a type procedure (ALGOL’s nearest approach to a function) with a result which is itself a procedure.

                                            2. 2

                                              Regardless of the specifics of whether we can call Common Lisp style conditions and resumption a form of continuations or not, I believe the concern about non-local control flow interacting with type systems and notions of ownership/regions/lifetimes still applies.

                                              That’s a concern, sure, but most “systems” languages have non-local control flow, right? C++ has exceptions, and Rust panics can be caught and handled. It would be very easy to implement a Common Lisp-like condition system with nothing more than thread local storage, function pointers (or closures) and catch/throw.

                                              (And I’m pretty sure you can model exceptions / anything else that unwinds the stack as essentially being a special form of “return”, and handle types, ownership, and lifetimes just the same as you do with the ? operator in Rust)

                                              1. 1

                                                My point is not about ease of implementation, it’s about usability when considering type safety and memory safety. It’s not sufficient to integrate a type system with other features – the resulting thing needs to be usable…

                                                I’ve added a section at the end, Appendix A8, describing the concrete concerns.

                                                Early Rust did have conditions and resumptions (as Steve pointed out elsewhere in the thread), but they were removed because of usability issues.

                                          2. 5

                                            If you dig into the code a bit, you discover that SEH on Windows has full support for Lisp-style restartable and resumable exceptions in the lower level, they just aren’t exposed in the C/C++ layer. The same component is used in the NT kernel and so there’s an existence proof that you can support both of these models in systems languages, I just don’t know of anyone who does.

                                            The SEH model is designed to work in systems contexts. Unlike the Itanium model (used everywhere except Windows) it doesn’t require heap allocation. The throwing frame allocates the exception and metadata and then invokes the unwinder. The unwinder then walks the stack and invokes ‘funclets’ for each frame being unwound. A funclet is a function that runs on the top of the stack but with access to another frame’s stack pointer and so can handle all cleanup for that frame without actually doing the unwind. As with the Itanium model, this is a two-stage process, with the first determining what needs to happen on the unwind and the second running cleanup and catch logic.

                                            This model is very flexible because (as with the Lisp and Smalltalk exception models) the stack isn’t destroyed until after the first phase. This means that you can build any kind of policy on top quite easily.

                                            1. 3

                                              Oh yes, that reminds me, Microsoft’s Annex K broken C library extensions have a runtime constraint handler that is vaguely like a half-arsed Lisp condition.

                                              1. 2

                                                Yes. However, even the Itanium model supports it: https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html

                                                A two-phase exception-handling model is not strictly necessary to implement C++ language semantics, but it does provide some benefits. For example, the first phase allows an exception-handling mechanism to dismiss an exception before stack unwinding begins, which allows resumptive exception handling (correcting the exceptional condition and resuming execution at the point where it was raised). While C++ does not support resumptive exception handling, other languages do, and the two-phase model allows C++ to coexist with those languages on the stack.

                                                1. 1

                                                  If you dig into the code a bit

                                                  Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere? I briefly looked at the microsoft/STL repo, and the exception handling machinery seems to be linked to vcruntime, which is closed-source AFAICT.

                                                  The SEH model is designed to work in systems contexts [..]

                                                  Thanks for the context, I haven’t seen a simple explanation of how SEH works elsewhere, so this is good to know. I have one follow-up question:

                                                  it doesn’t require heap allocation. The throwing frame allocates the exception and metadata

                                                  So the exception and metadata is statically sized (and hence space for it is already reserved on the throwing frame’s stack frame)? Or can it be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?

                                                  The same component is used in the NT kernel and so there’s an existence proof that you can support both of these models in systems languages, I just don’t know of anyone who does.

                                                  As Steve pointed out elsewhere in the thread, Rust pre-1.0 did support conditions and resumptions, but they removed it.

                                                  To be clear, I don’t doubt whether you can support it, the question in my mind is whether can you support it in a way that is usable.

                                                  1. 1

                                                    Are you referring to some closed-source code here, or is the implementation source-available/open-source somewhere?

                                                    I thought I read it in a public repo, but possibly it was a MS internal one.

                                                    So the exception and metadata is statically sized (and hence space for it is already reserved on the throwing frame’s stack frame)? Or can it be dynamically sized (and hence there is a risk of triggering stack overflow when throwing)?

                                                    The throwing context allocates the exception on the stack. The funclet can then use it in place. If it needs to persist beyond the catch scope, the funclet can copy it elsewhere.

                                                    This can lead to stack overflow (which is fun, because stack overflow is, itself, handled as an SEH exception).

                                                  1. 3

                                                    Incidentally, Rust had conditions long ago. They were removed because users preferred Result.

                                                    1. 1

                                                      Is there any documentation or code examples of how they worked?

                                                      1. 1

                                                        https://github.com/rust-lang/rust/issues/9795 Here’s the bug about removing them. There was some documentation in those early releases, I don’t have the time to dig right now.

                                                  2. 2

                                                    I’ve only dabbled slightly with both - how is resuming from an error different from catching it? Is it that execution restarts right after the line that threw the error?

                                                    1. 7

                                                      Consider the following:

                                                      A program wants to write() something to a file, but – oops – the disk is full.

                                                      In ordinary languages, this means write() will simply fail, signal an error (via error code or exception or …), and unwind its stack.

                                                      In languages with resumable or restartable errors, something entirely different happens: write() doesn’t fail, it simply pauses and notifies its calling environment (i.e. outer, enclosing layers of the stack) that it has encountered a DiskIsFull situation.

                                                      In the environment, there may be programmed handlers that know how to deal with such a DiskIsFull situation. For example, a handler may try to empty the /tmp directory if this happens.

                                                      Or there may be no such handler, in which case an interactive debugger is invoked and presented to the human user. The user may know how to make space such as deleting some no longer needed files.

                                                      Once a handler or the user has addressed the DiskIsFull situation, it can tell write() to try writing again. Remember, write() hasn’t failed, it is still paused on the stack.

                                                      Well, now that space is available, write() succeeds, and the rest of the program continues as if nothing had happened.

                                                      Only if there is no handler that knows how to deal with DiskIsFull situations, or if the user is not available to handle the situation interactively, would write() fail conclusively.

                                                      1. 5

                                                        Is it that execution restarts right after the line that threw the error?

                                                        Yes. Common Lisp and Smalltalk use condition systems, where the handler gets executed before unwinding.

                                                      So unwinding is just one possible option (one possible restart); other common ones are to start a debugger, to just resume, to resume with a value (useful for providing e.g. default values, or replacements for invalid values), etc. The signalling site can provide any number of restarts for the condition it signals.

                                                        It’s pretty cool in that it’s a lot more flexible, although because it’s adjacent to dynamic scoping it can make the program’s control flow much harder to grasp if you start using complex restarts or abusing conditions.

                                                        1. 2

                                                          Exactly. For example “call with current continuation” or call-cc allows you to optionally continue progress immediately after the throw. It’s a generalization of the callback/continuation style used in async-await systems.

                                                          (There’s also hurl, which I think was intended as an esolang but stumbled upon something deep (yet already known): https://ntietz.com/blog/introducing-hurl/)

                                                          1. 7

                                                            You don’t need continuations to implement resumable errors. The trick is simply to not unwind the stack when an error happens. I wrote an article about how it works a while ago: http://axisofeval.blogspot.com/2011/04/whats-condition-system-and-why-do-you.html

                                                            1. 2

                                                              Even if you want to do stack unwinding, you don’t need continuations. Catch and throw are adequate operations to implement restarts that unwind the stack to some point first.

                                                        2. 8

                                                          Not sure I see this in such a rose-tinted way. The linked ‘bug report’ is full of people specifying very exact ways that they want to be warned beforehand: I want a blog post, the post has to be dedicated specifically to this one thing, I don’t read Reddit, the tool itself should loudly warn me several months in advance.

                                                          Notably, none of these comments actually volunteer any help with any of the above. Even the maintainer is talking about rustup as a ‘Rust project product’. I feel like everyone’s framing is wrong here…

                                                          1. 6

                                                          Nothing is perfect, for sure. There was some more spicy commentary on this on social media, for example. But it’s all about the overall ratio, and the outcome.

                                                            Notably, none of these comments actually volunteer any help with any of the above.

                                                            I both agree and disagree. I think it’s normal in bug reports to report the bug, and wait for a decision to be made, before offering help. I’m sure that if they asked for help, they would have gotten it. There’s actually a pretty high degree of social cohesion here, which may not be obvious to outsiders, but I recognize virtually all of the names on that ticket, and so like, I think it’s sort of understood that if active help was desired, all you have to do is ask. The original reporter is on the language team, for example.

                                                            Even the maintainer is talking about rustup as a ‘Rust project product’.

                                                            The Rust Project has long described their output as a product. There’s pros and cons to this, but it’s been the norm for at least 12 years now. I do attribute that style of thinking to be one of the reasons why Rust has been successful.

                                                            1. 5

                                                              Notably, none of these comments actually volunteer any help with any of the above. Even the maintainer is talking about rustup as a ‘Rust project product’. I feel like everyone’s framing is wrong here…

                                                              As an ex-lead of the Rust project: I don’t think the Rust project needs more volunteers all the time. It has capacity for fixes, so just stating problems is fine. A lot of people in the discussion are also people who contribute to Rust at other places, by writing libraries, etc.

                                                              1. 4

                                                                I think treating a tool as a product of some kind is an important part of ensuring it delivers value to users. There are several things this mindset implies, such as having a product vision rather than just exposing your internals and calling it a day.

                                                              2. 2

                                                              Not to downplay the role that thoughtful people played, but I also think a large part of why this went so well is that, unbeknownst to (I think) everyone, rustup actually had (and has) staged rollouts. Specifically, there’s a small number of CI systems that update it promptly, and everyone else gets the update… someday.

                                                                Your average desktop user appears to be similarly insulated. I just checked and I’m still running the release before the problematic one… I’m not sure when/how this automatically updates, but apparently not promptly.

                                                                This meant that the population impacted was relatively small (for such a widely used tool). Which helps a lot with avoiding piling on.

                                                                1. 1

                                                                  It’s a little more complex than that; when you update or install a toolchain, by default, rustup will automatically update itself. This means cases like CI get updates immediately, while individual users may not see them for quite a while. It’s not so much about staging as it is about usage patterns.

                                                                  This meant that the population impacted was relatively small (for such a widely used tool). Which helps a lot with avoiding piling on.

                                                                  I do agree that this plays into things, though.

                                                                2. 2

                                                                  Can I get an invite without having to go through the pain of setting up IRC?

                                                                  1. 2

                                                                    I had the same reaction, but web.libera.chat ended up being fine to use when I dropped by.

                                                                    1. 1

                                                                      I’d suggest using kiwiirc anonymously if you just want to drop in and request access!

                                                                      1. 6

                                                                        mdBook absolutely is one of the inspirations! In particular, the way we got rid of docusaurus metadata is by parsing the ToC information directly from markdown, the way mdBook does it.

                                                                        But I don’t think we’d be able to use it directly — we do need full control over the content. For example, we don’t want to use a SUMMARY.md file, and want to stick with README.md, as it is more conventional and handled specially by GitHub. Similarly, we want full control of the resulting HTML and CSS.

                                                                        To get that level of customizability, you might use some fully general SSG, like Jekyll or Docusaurus, but, at that point, you might as well just roll your own!
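
                                                                        As a rough illustration of the “parse the ToC from markdown” idea, here is a sketch in Rust (the list-item format and function names are my assumptions, not TigerBeetle’s actual code):

                                                                        ```rust
                                                                        // Sketch: derive ToC entries from "- [Title](path)" list items in a
                                                                        // README, instead of keeping a separate SUMMARY.md. Illustrative only.
                                                                        fn toc_entries(readme: &str) -> Vec<(String, String)> {
                                                                            let mut entries = Vec::new();
                                                                            for line in readme.lines() {
                                                                                let line = line.trim_start();
                                                                                if let Some(rest) = line.strip_prefix("- [") {
                                                                                    if let Some((title, rest)) = rest.split_once("](") {
                                                                                        if let Some((path, _)) = rest.split_once(')') {
                                                                                            entries.push((title.to_string(), path.to_string()));
                                                                                        }
                                                                                    }
                                                                                }
                                                                            }
                                                                            entries
                                                                        }

                                                                        fn main() {
                                                                            let readme = "# Docs\n- [Start](./start.md)\n- [Reference](./ref.md)\n";
                                                                            assert_eq!(toc_entries(readme).len(), 2);
                                                                        }
                                                                        ```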

                                                                        1. 5

                                                                          I totally agree mdBook probably wouldn’t suit your needs. I will say that until very recently I didn’t appreciate how customizable it truly is; I’ve been messing with the plugin system and it’s pretty cool.

                                                                        2. 4

                                                                          We’re using mdBook for our documentation and there have been benefits and drawbacks.

                                                                          We’ve customised things quite a lot with extra CSS, JS, and preprocessors, which has worked well. For example, the wavedrom preprocessor generates register diagrams and waveforms without us having to check in SVGs.

                                                                          The main problem has been that we want to mix our documentation into the code. There are README.md files all over the repo and most modules have a doc/ directory. This means turning the whole repo into a book, which doesn’t seem like something mdBook was designed for. When you run mdbook build, it copies the entire repository (including build artifacts) to the output and seems to run it all through the search indexer.

                                                                          There are also limits to how far we want to customise it. It would be nice to be able to search in only the hardware or software sections for example, but we don’t want to mess with the search bar too much. It would also be nice to have the doxygen/rustdoc easily accessible within the book, but we don’t want to mess with the navigation too much.

                                                                          Soon we may need to add a way to navigate to slightly different variants/versions of our documentation and I think this might push mdbook a bit too far. The TigerBeetle docs here look great and I’ll think of them if we need a change. I also think the Pigweed docs are a good example, though I don’t remember if they’re custom-made.

                                                                          1. 3

                                                                            One useful heuristic here is that the tool should bend to the content, not vice versa. It’s easy to change content-processing rules, but it is much harder to change the content to fit the rules.

                                                                            That was actually one of the bigger gripes with docusaurus — it forced us to spill part of the information about the content into docusaurus-specific metadata.

                                                                          2. 3

                                                                            Why? The article explains many of their choices. TigerBeetle is written in Zig; mdBook is Rust.

                                                                            1. 6

                                                                              As a counter-example, we use Haskell’s pandoc for parsing Markdown, not Zig’s SuperMD.

                                                                              1. 2

                                                                                Have you considered using cmark-gfm, a C library for parsing GFM (used by GitHub themselves)?

                                                                                We use it to convert Markdown to XHTML which we then splice into the container document (used to display package README.md in our package repository web interface). Works reasonably well. If I were writing a Markdown-based document system from scratch, that’s what I would likely use.

                                                                                1. 3

                                                                                  Yup! Our gut feeling is that pandoc should be more long-term stable, but this is mostly a coin toss. Using cmark-gfm would’ve also worked fine I suppose, but for our use case I don’t think it makes much of a difference.

                                                                              2. 6

                                                                                Mdbook is a CLI tool. It doesn’t matter what language it’s implemented in. It would be like saying you can’t use grep in a Zig project.

                                                                            2. 4

                                                                              Am I missing something? Where’s the actual proposed standard? They talk about the need for a standard, discuss how it should operate, say they are proposing a standard and then… crickets?

                                                                              1. 4

                                                                                The ACM article this is supporting goes into more depth. You are correct that this is not presenting a standard, it is a rallying cry for said work to begin, and is likely to take a long time (“several decades”) to become mature.

                                                                                1. 1

                                                                                  Thanks for the response Steve. I understand what they’re trying to achieve, but I’m not sure it’s going to turn out how they hope. A lot of ‘safety-washing’ and obfuscation has already happened in the memory safety debate, and it’s not yet clear whether the standard will clarify or obscure those issues.

                                                                              2. 1

                                                                                Is there a specific reason the README is worded in a way that people (see someone’s comment here) have to ask: Is this a web framework?

                                                                                Might be nitpicky, but when I just saw this, I also thought of something slightly different. And yes, of course web framework is just as loose a term as any. Kinda glad the README is not “THIS IS THE NEW BEST WEB FRAMEWORK”, although that would have given away that a) it is indeed a web framework and b) it’s probably not the best.

                                                                                1. 2

                                                                                  I’ve never seen this reaction before, so I’m not sure. Worth thinking about.

                                                                                  1. 1

                                                                                    I think it reminded me of https://docs.postgrest.org/en/v12/ - the “ turns your PostgreSQL database directly into a RESTful API” part, but I’m not claiming I make sense :P

                                                                                2. 3

                                                                                  Why is there no way to add an API handler function that runs on every request?

                                                                                  How has this design choice played out? It’s been a few years; I’m curious to hear lessons learned.

                                                                                  1. 1

                                                                                    I’m into it. I’m a big fan of the typestate pattern, and even though this can feel a bit repetitive for endpoints with less logic, I like that it’s so straightforward. No more worrying about the order various handlers run…

                                                                                    1. 1

                                                                                      Interesting. So is the idea with regards to typestate that you’d ensure that your routes/APIs do X and Y and Z steps before calling into some function F(DidZ)?

                                                                                      1. 1

                                                                                        I haven’t open sourced my codebase yet, but yeah. So like, instead of saying “this API call is guarded by an is_logged_in? handler,” my “save this Foo to the database” function requires an Authorization struct. How can you get one of those? Well, the only way is to call into the authorization subsystem. And doing that requires a User. And you can only get a User by calling into the authentication subsystem. And that happens to take a request context.

                                                                                        Nexus (the control plane API) does it slightly differently, but they have significantly more complexity to their auth than I do. See 24-39 here: https://github.com/oxidecomputer/omicron/blob/main/nexus/auth/src/context.rs

                                                                                        And some examples of using it here: https://github.com/oxidecomputer/omicron/blob/main/nexus/src/external_api/http_entrypoints.rs

                                                                                        You can see how a lot of these handlers have the general form “grab a nexus instance from the Dropshot context, construct a Context from the request context, then call some nexus method, passing the context in.” Same basic idea, except a little cleaner; I’m sort of in an “embrace a little more boilerplate than I’m used to” moment, and so rather than the Context stuff I’m doing the same idea but a bit more “inline” in the handlers. I might remove that duplication soon but I want to sit with it a bit more before I overly refactor.

                                                                                        Anyway, I think the fact that these Nexus handlers have the same general shape but are each a bit different is the strength of this approach. I’ve worked on Rails apps where the before, after, and around request middleware ended up with subtle dependencies between them, and ordering issues, and “do this 80% of the time but not 20% of the time” kinds of things. Doing stuff this way eliminates all of that; it’s just normal code that you read in the handler. I’ve also found that this style is basically the “skinny controller” argument from back in the day, and comes with the same benefits. It’s easier to test stuff without needing Dropshot at all, since Dropshot itself is really not doing any business logic whatsoever, which middlewares can often end up doing.
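
                                                                                        To make that concrete, here’s a minimal, self-contained sketch of the pattern (all of the names here, authn, authz, save_foo, are hypothetical; this is not Dropshot’s or Nexus’s actual API):

                                                                                        ```rust
                                                                                        mod authn {
                                                                                            pub struct RequestContext {
                                                                                                pub session_token: Option<String>,
                                                                                            }

                                                                                            pub struct User {
                                                                                                pub id: u64,
                                                                                            }

                                                                                            // The only way to get a User is to go through authentication.
                                                                                            pub fn authenticate(ctx: &RequestContext) -> Result<User, String> {
                                                                                                match &ctx.session_token {
                                                                                                    Some(_token) => Ok(User { id: 42 }), // real token validation elided
                                                                                                    None => Err("not logged in".into()),
                                                                                                }
                                                                                            }
                                                                                        }

                                                                                        mod authz {
                                                                                            use crate::authn::User;

                                                                                            // The private field means code outside this module cannot construct
                                                                                            // an Authorization by hand; it has to call authorize().
                                                                                            pub struct Authorization {
                                                                                                _user: User,
                                                                                            }

                                                                                            pub fn authorize(user: User) -> Result<Authorization, String> {
                                                                                                Ok(Authorization { _user: user }) // real permission checks elided
                                                                                            }
                                                                                        }

                                                                                        // Business logic demands proof of auth in its signature.
                                                                                        fn save_foo(_authz: &authz::Authorization, _foo: &str) {
                                                                                            // database write elided
                                                                                        }

                                                                                        fn main() {
                                                                                            let ctx = authn::RequestContext { session_token: Some("abc".into()) };
                                                                                            // A handler reads as plain, ordered code: authenticate, authorize, act.
                                                                                            let user = authn::authenticate(&ctx).expect("not logged in");
                                                                                            let grant = authz::authorize(user).expect("not permitted");
                                                                                            save_foo(&grant, "hello");
                                                                                        }
                                                                                        ```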

                                                                                        1. 2

                                                                                          Yep, perfect. Makes sense to me. I’m wondering about composability though. (edit: nvm, code explained my questions)

                                                                                  2. 6

                                                                                    Happy to answer questions about Dropshot; I use it every day.

                                                                                    Specifically, Dropshot can produce OpenAPI documents, and then https://github.com/oxidecomputer/oxide.ts can generate a TypeScript client from that document. I personally am also using sqlx, so I get types the whole way from my database up through into the browser.

                                                                                    It’s not a perfect web server, but it serves us really well.

                                                                                    1. 2

                                                                                      A bit of a random one, but is the name a reference to dropwizard by any chance?

                                                                                      1. 3

                                                                                        It is definitely in part a dropwizard reference. Also part Fishpong

                                                                                        Fishpong is https://github.com/fishpong/docs/wiki/Primer; it’s a ping-pong variant that a lot of folks at Oxide love. A “drop shot” is a concept found across various racket sports: https://en.wikipedia.org/wiki/Drop_shot

                                                                                        It also has the advantage of being short and wasn’t taken at the time.

                                                                                        1. 2

                                                                                          A common pattern I see in HTTP APIs is to treat responses like sum types where the status code determines the structure of the body and so on. For example, the Matrix protocol’s media APIs use 200 when media can be served directly and 307/308 if a redirect is necessary. Is it possible to model this in Dropshot such that it’s reflected in the generated OpenAPI document?

                                                                                          It seems like it probably isn’t, because each request handler function is permitted to have exactly 1 successful response code and several error codes (which must be in the 4XX and 5XX range, see ErrorStatusCode), and each error code shares the same body structure. Looking at the API design, ApiEndpointResponse has an Option<StatusCode> field, HttpCodedResponse has a StatusCode associated constant, and HttpResponseError only has a method for getting the current status code of an error response value. Looking at the generated OpenAPI document for the custom-error example seems to support this conclusion. Am I missing something?

                                                                                          1. 3

                                                                                            Not 100% sure but I believe that’s true today, yes – HttpCodedResponse has a const status code.

                                                                                            Modeling this is an interesting challenge – I can imagine accepting an enum with variants and then producing the corresponding sum type in the OpenAPI document.
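
                                                                                            Something like this hypothetical shape, say (not an existing Dropshot API; a derive or manual impl would map each variant to its status code and OpenAPI response):

                                                                                            ```rust
                                                                                            // Each variant carries the body for exactly one status code.
                                                                                            enum MediaResponse {
                                                                                                Ok(Vec<u8>),               // 200: serve the media bytes directly
                                                                                                TemporaryRedirect(String), // 307: value for the Location header
                                                                                            }

                                                                                            fn status_code(resp: &MediaResponse) -> u16 {
                                                                                                match resp {
                                                                                                    MediaResponse::Ok(_) => 200,
                                                                                                    MediaResponse::TemporaryRedirect(_) => 307,
                                                                                                }
                                                                                            }

                                                                                            fn main() {
                                                                                                let resp = MediaResponse::TemporaryRedirect("https://cdn.example/x".into());
                                                                                                assert_eq!(status_code(&resp), 307);
                                                                                            }
                                                                                            ```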

                                                                                            1. 1

                                                                                              Yeah, that’s the solution I came up with for a similar project I attempted that was more warp-shaped than axum-shaped; here’s an example. The derive macro generates metadata about the response variants and uses the variant names to decide the HTTP status code to use. Maybe useful as prior art, I dunno. The obvious downside is that ? won’t work in this situation without either Try stabilizing or defining a second function where ? works and doing some mapping in the handler function from that function.

                                                                                              1. 2

                                                                                                re ? not working, you can always map a successful response into the corresponding enum variant explicitly, right?

                                                                                                1. 1

                                                                                                  In the design I posted, there aren’t really “successful” or “unsuccessful” responses, just responses. Responses of all status codes go into the same enum, and each handler function directly returns such an enum, not wrapped in Result or anything. So if you want to use ? to e.g. handle errors, you have to define a separate function that returns e.g. Result which uses ? internally, and then call that function in the handler function and then use match or similar to convert the Result‘s inner values into your handler function’s enum.

                                                                                                  I believe that’s a long winded way of answering “yes”, but I thought I’d elaborate further to try to make it clearer what I was trying to say originally just in case.
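
                                                                                                  A tiny sketch of that workaround, with hypothetical names:

                                                                                                  ```rust
                                                                                                  enum Response {
                                                                                                      Ok(String),       // 200
                                                                                                      NotFound(String), // 404
                                                                                                  }

                                                                                                  // Internal function where ? works as usual.
                                                                                                  fn lookup(id: u32) -> Result<String, String> {
                                                                                                      let table = [(1u32, "widget")];
                                                                                                      let found = table.iter().find(|(k, _)| *k == id).ok_or("no such id")?;
                                                                                                      Ok(found.1.to_string())
                                                                                                  }

                                                                                                  // The handler returns the enum directly, so ? is unavailable here;
                                                                                                  // the Result gets converted at the boundary instead.
                                                                                                  fn handler(id: u32) -> Response {
                                                                                                      match lookup(id) {
                                                                                                          Ok(body) => Response::Ok(body),
                                                                                                          Err(msg) => Response::NotFound(msg),
                                                                                                      }
                                                                                                  }

                                                                                                  fn main() {
                                                                                                      assert!(matches!(handler(1), Response::Ok(_)));
                                                                                                      assert!(matches!(handler(2), Response::NotFound(_)));
                                                                                                  }
                                                                                                  ```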

                                                                                                  1. 2

                                                                                                    Ah I see. What do you think of separating out successful and unsuccessful variants into a result type? 2xx and 3xx on one side, 4xx and 5xx on the other.

                                                                                                    1. 1

                                                                                                      That could work. I think doing it that way could be more convenient for users because ? would be usable (though probably still require some map and map_err), but at the cost of some library-side complexity. Using a single enum obviates the need for categorizing status codes. Dropshot has already solved that problem with its ErrorStatusCode (but you’ll probably also need a SuccessStatusCode, which I’m not sure currently exists).

                                                                                                      Personally, I don’t think either way would make much difference for me as user. When working with HTTP libraries, I generally implement actual functionality in separate functions from request handlers anyway, so that such functions have no knowledge of any HTTP stuff. There are various reasons for this, the relevant one being that this minimizes the amount of code that has to actually care how the HTTP library is designed. Not everyone operates this way, though.

                                                                                          2. 1

                                                                                            Something about the headline doesn’t quite click for me: is this a web framework? Namely, does it provide the web server loop as well? Or is it just related to the data model mapping between types and endpoints?

                                                                                            1. 2

                                                                                              Yeah, it has the web server loop too. It’s maybe a bit too bare bones to be a “framework,” like it’s closer to a flask/sinatra than a Django/Rails. But it’s focused on “I want to produce a JSON API with OpenAPI” as a core use case.

                                                                                          3. 8

                                                                                            This crate should really be more well-known in the Rust ecosystem! The following quote on OpenAPI support resonates a lot with me (source):

                                                                                            [An] important goal for us was to build something with strong OpenAPI support, and particularly where the code could be the source of truth and a spec could be generated from the code that thus could not diverge from the implementation. We weren’t sure that would be possible, but it seemed plausible and worthwhile if we could do it. None of the crates we found had anything like this.

                                                                                            1. 1

                                                                                              I use something like this in the Go ecosystem, and generating an OpenAPI spec from source code is much better than generating source code from an OpenAPI spec.

                                                                                              1. 5

                                                                                              One under-appreciated reason why it’s better: you don’t have to handle the entirety of the OpenAPI spec, just the features your server framework wants to produce. We took advantage of this by writing our own code gen for clients too; that’s also easier to make good because you only need to handle that same subset.

                                                                                            2. 5

                                                                                            I wonder how much code in the kernel has also been broken without anyone noticing.

                                                                                              The whole thing also suggests there’s not much testing in the kernel in general, be it automated or manual.

                                                                                              1. 14

                                                                                                Kernel development is pretty unusual in the sense that many things you’d expect to be part of a project are carried out downstream of it.

                                                                                              For one, there’s no CI and not much in the way of testing there. Instead, developers and users do all kinds of testing on their own terms, such as the Linux Test Project or syzbot (which was the thing that found this filesystem bug).

                                                                                              I was even more surprised when I found out that documentation is also largely left to downstream, and so there are a bunch of syscalls that the Linux manpages project simply hasn’t gotten around to adding manpages for.

                                                                                                1. 15

                                                                                                  I was pretty surprised to find that the only documentation for the ext filesystems was just one guy trying to read the (really dodgy) code and infer what happened under different cases. A lot of “I think it does X when Y happens, but I’m not sure” in the documentation. And reading through the code, I understand it. That filesystem is its own spec because no one actually understands it. It’s wild how much this kind of stuff exists near the bottom of our tech stacks.

                                                                                                  1. 8

                                                                                                    Yup. I tried to learn about the concurrency semantics of ext4 - stuff like “if I use it from multiple processes, do they see sequentially consistent operations?”, so I asked around. Nobody was able to give me a good answer.

                                                                                                    Also, one kernel dev in response smugly told me to go read memory-barriers.txt. Which I’d already read and which is useless for this purpose! Because it documents kernel-internal programming practices, not the semantics presented to userland.

                                                                                                  2. 3

                                                                                                    This is also true of stuff like gcc. Just a very different style than more modern projects.

                                                                                                    1. 1

                                                                                                      GCC has quite a bit of upstream documentation.

                                                                                                      1. 2

                                                                                                      I’m talking about the testing; you’re right that every time I’ve looked at its documentation, it feels very thorough.

                                                                                                2. 1

                                                                                                I think this sort of thing, beyond anything technical, is the biggest argument for growing the standard library (or, at least, the crates officially ordained and maintained by the rust-lang org). I can argue about the technical merits of Rust to my Rust-sceptic colleagues all day, but it’s difficult to deny that a worrying number of important dependencies in the ecosystem have a low bus factor.

                                                                                                  1. 8

                                                                                                    I don’t understand how the standard library is any different?

                                                                                                  Python, as one example, has several parts of its standard library that are unmaintained and falling behind. Same for Erlang. How is being part of the standard library helping?

                                                                                                    1. 7

                                                                                                      Indeed, being in the standard library doesn’t guarantee active maintenance.

                                                                                                      There aren’t enough people in the core team to work on everything. There still has to be someone with expertise and dedication to improve the code.

                                                                                                      There aren’t that many people available for reviewing pull requests either. The standard library is focused on preserving backward compatibility and portability, so changes take more time and it’s harder to contribute more than an occasional drive-by tweak.

                                                                                                      In Rust, mpsc channels were an example of this: they were in the standard library, technically maintained, but nobody actively worked on improving the implementation. Eventually the std code got replaced with an implementation developed outside of the standard library.

                                                                                                    2. 4

                                                                                                    This just puts a bit of rusty paint on the problem. The Rust project has had very little capacity for library maintenance, so the bus factor would not differ substantially. Indeed, for that reason, the project has moved more libraries out than in.

                                                                                                      1. 6

                                                                                                        Coming from another language ecosystem, I think this doesn’t acknowledge an extremely common scenario: library maintainer is AWOL, and people show up with patches, and it’s just that there is nobody there to hit merge and do a release.

                                                                                                        There are huge piles of projects where there are a bunch of people who are capable, reviewing each other’s changes, and saying “yes this works, I have a fork and we have it working” etc etc etc, but the release can’t happen because one person holds the keys.

                                                                                                        Having a large team who holds release keys is generally something that can work out! For every “the maintainer is the only one who could actually make a call on this”-style issue there are hundreds of “maintainer just is done with the project” situations.

                                                                                                      I’m not calling for eminent domain, but having lang teams be able to adopt packages (with maintainer consent) and allowing for a light touch can totally work, especially in the age of GitHub & co.

                                                                                                        1. 4

                                                                                                          Bus factor and capacity are different things. I’m not expecting development to go faster or anything: but when something really, really critical pops up that needs fixing with haste, a larger pool of people ready to merge a fix and deploy a release matters. Ossification and glacial development is fine, not having an official patch for a CVE because the one dev is (quite understandably!) on holiday, less so.

                                                                                                          1. 7

                                                                                                            The standard library only releases every six weeks. The more stuff that’s in the standard library, the more stuff that would maybe call for an emergency release, which is far heavier weight and far more disruptive to the ecosystem. People already think Rust releases too often.

                                                                                                            1. 4

                                                                                                            Also, a crypto vulnerability is likely to have a high priority. If I have to urgently release/deploy, I’d much rather update a single crate than the whole compiler and stdlib.

                                                                                                              1. 1

                                                                                                                Not trying to be snarky, but is the complaint about too many releases (and operational churn)? Or too much breakage (from stuff like the “partial ordering for sorts can trigger panics now” saga)?

                                                                                                                Like one, you do land on “we should bundle up releases”, but the other is a different flavor of thing, right? Though I mention the sorting panic thing only because it’s the only case I have even heard of.

                                                                                                                Saying all that, while I like fat standard libraries, it does feel like the weaker version of “it’s in the rust-lang org but still packaged separately” gets things 95% of the way there for these kinds of issues, IMO.

                                                                                                                1. 5

                                                                                                                Operational and “reputational” churn. The time crate is the only breakage I’ve experienced personally in a long time. The process for releasing rustc is involved; the process of releasing a crate is “cargo publish.”

                                                                                                                  I already regularly hear complaints from very conservative type folks who look at the six week cadence and complain about how often releases happen. Adding more to the mix regularly increases the perception that Rust is still mega unstable, when that’s just not the case.

                                                                                                                  1. 1

                                                                                                                  Can definitely see rustc’s release process being involved vs. cargo’s. Having multiple trusted people be able to run cargo publish is the valuable thing here, more so than it being in the standard library itself.

                                                                                                                  I do think it’s important to dig into what the reason for the cadence complaints really is, though. I feel like the biggest lesson to be taken from Chrome showing up and doing releases all the time is that smaller releases (-> more releases) are easier to manage in theory. It would be a shame if those complaints are downstream of other issues that are, in a word, “more” fixable.

                                                                                                                    This is all context free of course. At least in Python land complaints of things moving too fast that I hear tend to be broken down into “too much new stuff to learn” (a reason I guess, but one I don’t care about) and “old code is not working anymore too often” combined with “package maintainers disappearing” (which are realer issues, and can be handled).

                                                                                                                    1. 5

                                                                                                                      It’s just systems people who are used to C89, so they see backwards-compatible but new features as “breaking” because they want to stay on old compilers but compile new code.

                                                                                                                      1. 1

                                                                                                                        OK, I see the case now. The maximally generous version of this is “well there is some issue X that is stopping my upgrade but everything around me is moving so fast that it throws me forward” (in Node this is common because packages aggressively upper-bound Node versions even when they don’t need to, leading to a bunch of incidental things making upgrades hard), but tbh I feel like Rust isn’t doing it that much. Especially nowadays, where people seem pretty serious about not using stuff from nightly, etc.

                                                                                                                        1. 4

                                                                                                                          The maximally generous version of this is “well there is some issue X that is stopping my upgrade but everything around me is moving so fast that it throws me forward”

                                                                                                                          I agree, but the “issue x” is mostly “i want to use a dependency and it uses some rust language feature that is newer than my current stable version” and the “stopping my upgrade” is simply “I don’t want to” not “I tried it and it didn’t work.” That is, it’s perceived instability, rather than actual instability. For these folks, “stable” means “unchanging” not “backwards compatible.”

                                                                                                          2. 2

                                                                                                            The shift from RSA to ECDSA is, relatively speaking, quite recent, and we just had Microsoft announce their crazy qubit capabilities, meaning quantum-resistant cryptography may be required sooner than we previously thought. I think crypto is exactly the kind of thing that needs to be kept out of a standard library to avoid churn/deprecation.

                                                                                                            Some officially blessed crate, sure.

                                                                                                            1. 3

                                                                                                              I’ll leave a top-level comment on it, but aside from having a single “blessed” library, I think consensus on cryptographic trait implementations for backends would be fantastic.

                                                                                                              High-level libraries could be built to consume those traits, and low-level implementations could provide them; that would lead to easier agility when the “blessed” library is no longer the go-to.
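
                                                                                                              As a sketch of what that could look like (these traits are hypothetical, not an existing RustCrypto or ring API):

                                                                                                              ```rust
                                                                                                              trait Signer {
                                                                                                                  type Signature: AsRef<[u8]>;
                                                                                                                  fn sign(&self, msg: &[u8]) -> Self::Signature;
                                                                                                              }

                                                                                                              // A high-level library (a JWT or TLS crate, say) stays generic over
                                                                                                              // the trait, so the backing implementation can be swapped out later.
                                                                                                              fn issue_token<S: Signer>(signer: &S, claims: &[u8]) -> Vec<u8> {
                                                                                                                  let mut token = claims.to_vec();
                                                                                                                  token.extend_from_slice(signer.sign(claims).as_ref());
                                                                                                                  token
                                                                                                              }

                                                                                                              // Toy backend purely so the sketch compiles; emphatically NOT crypto.
                                                                                                              struct ToySigner;

                                                                                                              impl Signer for ToySigner {
                                                                                                                  type Signature = Vec<u8>;
                                                                                                                  fn sign(&self, msg: &[u8]) -> Vec<u8> {
                                                                                                                      msg.iter().map(|b| b ^ 0xAA).collect()
                                                                                                                  }
                                                                                                              }

                                                                                                              fn main() {
                                                                                                                  let token = issue_token(&ToySigner, b"claims");
                                                                                                                  assert_eq!(token.len(), 12); // 6 claim bytes + 6 "signature" bytes
                                                                                                              }
                                                                                                              ```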

                                                                                                            2. 2

                                                                                                              The idea that absorbing something like ring into stdlib would automatically give it new maintainers feels like wishful thinking. Getting maintenance done is a combination of having the required skill (and for crypto, the bar is high) and accepting that it’s not someone else’s problem.

                                                                                                              The “someone else’s problem” feeling can be pretty high for code in a stdlib. Whereas here, in the crate ecosystem, when the ring maintainer gave up we got an advisory that prompted new maintainers to step in. Looks like FOSS is working.

                                                                                                            3. 8

                                                                                                              Hi all! I finally decided to write the monad tutorial of my dreams, using Rust and property-based testing as the vehicle to understand them.

                                                                                                              A lot of monad explanations focus on either the formalism, or an IMHO excessive level of casualness. While those have their place, I think they tend to distract from what I think is an important and profound design pattern. My explanation has:

                                                                                                              • no analogies to food items or anything else
                                                                                                              • no Haskell
                                                                                                              • no category theory

                                                                                                              Instead, this post talks about:

                                                                                                              • real-world problems
                                                                                                              • production-grade Rust
                                                                                                              • with performance numbers, including a very pretty log-log cdf!

                                                                                                              There’s also a small Rust program to accompany the post.

                                                                                                              1. 6

                                                                                                                 Hi, it’s commendable that you set forth to bridge communities and languages with this post, but, like other TCS/algebraic concepts, monads tend to get conflated with their implementation in this or that concrete language. However, you make some pretty strong claims in the post, and some downright inaccurate ones.

                                                                                                                The thing we care about is composition of computations that might have effects. As such bind/flatMap are generalizations of the pure function composition “dot” operator.

                                                                                                                This is the essence of monadic composition: powerful, unconstrained, and fundamentally unpredictable.

                                                                                                                Not really. Lists and Optional computations can be composed with monadic bind, but would you say composition of programs that return these values is unpredictable?

                                                                                                                Some of this misunderstanding is a historical accident; Haskell lets you talk about effect types that do not involve I/O, so it proved to be a good test bench for this concept.

                                                                                                                monadic composition is Turing-complete

                                                                                                                This is a property of the language, not of the composition operator.

                                                                                                                1. 7

                                                                                                                  Thank you for the feedback! For context, I studied some category theory in graduate school and I can also talk about natural transformations and endofunctors, but I made a deliberate decision to avoid all of that. There are a large number of developers who would benefit from recognition of monadic patterns in their daily professional lives, but have been traumatized by overly technical monad tutorials – that’s the audience for my post. If you’re not in that audience, there are plenty of other monad tutorials for you :)

                                                                                                                  Even in the appendix, I talk about the monad laws in terms of their application to strategies with the concrete Just and prop_flat_map. That’s a deliberate pedagogical decision too. I enjoy working in abstract domains, I just think it really helps to be motivated by practical concerns before going deeper into them.

                                                                                                                  but among other TCS/algebraic concepts monads tend to get variously conflated with their implementation in this or that concrete language.

                                                                                                                  This is absolutely true. In my post I was trying to focus on why monadic bind is of deep interest and not just a Haskell/FP curiosity — why working programmers should at least notice it whenever it pops up, no matter what environment or language they are in. (Speaking personally, in one of my first projects at Oxide, very far away from grad school, I had the opportunity to add monadic bind to a system, but deliberately chose not to do so based on this understanding.)

                                                                                                                  I think of it as similar to group theory, where you can either introduce groups formally as a set with an attached operation that has certain properties, or motivate them through an “implementation” in terms of symmetries (maybe even through the specific example of solving the quintic). I personally prefer the latter because it gives me really good reasons for why I should care about them, and also helps explain things like why group theory is central to physics. In my experience teaching people, this preference is shared by most of them. There’s always time later to understand the most general concept.

                                                                                                                  Not really. Lists and Optional computations can be composed with monadic bind, but would you say composition of programs that return these values is unpredictable?

                                                                                                                  Absolutely, relative to fmap in each of the monads. I mention an example at the end with lists, where with bind you don’t know the size of the resulting list upfront – for Rust this has performance implications while collecting into a vector, since you can’t allocate the full capacity upfront (though that’s a tiny difference compared to the exponential behavior of test case shrinking). Even with optionals, fmap means a Some always stays a Some, while bind can turn a Some into a None. Relatively speaking that’s quite unpredictable.

                                                                                                                  The extent to which monadic bind’s unpredictability matters is specific to each monad (it matters less for simpler monads like optionals, and more for strategies, futures and build systems.) But in all cases it is unpredictable relative to functor (or applicative functor) composition within that monad.
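
                                                                                                                   Concretely, for lists and optionals (a quick sketch, not code from the post):

                                                                                                                   ```rust
                                                                                                                   fn main() {
                                                                                                                       let xs = vec![1, 2, 3];

                                                                                                                       // map (fmap): the output length always equals the input length.
                                                                                                                       let mapped: Vec<i32> = xs.iter().map(|x| x * 2).collect();
                                                                                                                       assert_eq!(mapped.len(), xs.len());

                                                                                                                       // flat_map (bind): the output length depends on the closure,
                                                                                                                       // so a collector cannot preallocate the exact capacity upfront.
                                                                                                                       let bound: Vec<i32> = xs.iter().flat_map(|&x| vec![x; x as usize]).collect();
                                                                                                                       assert_eq!(bound.len(), 6); // 1 + 2 + 3

                                                                                                                       // Option: map keeps a Some a Some; and_then may produce None.
                                                                                                                       let some = Some(2);
                                                                                                                       assert_eq!(some.map(|x| x + 1), Some(3));
                                                                                                                       assert_eq!(some.and_then(|x| if x > 10 { Some(x) } else { None }), None);
                                                                                                                   }
                                                                                                                   ```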

                                                                                                                  This is a property of the language, not of the composition operator.

                                                                                                                  This is true. What I was trying to say here is that in a typical Turing-complete environment, the Turing completeness means that introspecting the lambda is impossible in general. (And this is true in non-Turing-complete environments too – for example, in primitive recursive/bounded loop environments it’s still quite unpredictable.) I’ll try and reword this tomorrow.

                                                                                                                  1. 5

                                                                                                                    I appreciate the thoughtful approach. I learn much better starting with the specific and moving to the general than starting with the abstract.

                                                                                                                     I wrapped up the Georgia Tech OMSCS, and one of my favorite classes was Knowledge-Based AI, which focused on how humans learn things. (And the AI is classical, not LLMs.) One core takeaway for me was the general/specific learning dichotomy.

                                                                                                                    A really interesting learning approach was called “version spaces” which acted like bidirectional search by building general and specific info at the same time. Basically, people need both models to fully absorb a concept, but how individuals learn best varies.

                                                                                                                    All that to say: thanks again, I think it takes a lot of work and effort to make something approachable and I appreciate your post.

                                                                                                                    1. 4

                                                                                                                      You may enjoy this post: https://terrytao.wordpress.com/career-advice/theres-more-to-mathematics-than-rigour-and-proofs/

                                                                                                                       This three-stage learning process is very common, in my experience, and I feel like general/specific is similar to intuitive/rigorous.

                                                                                                                    2. 2

                                                                                                                      There are a large number of developers who would benefit from recognition of monadic patterns in their daily professional lives, but have been traumatized by overly technical monad tutorials

                                                                                                                      When intersected with “Rust developer”, do you think that’s still a large group…? If someone finds monad tutorials to be “overly technical” then they’re never going to make it past JavaScript, much less discover systems programming languages like Rust.

                                                                                                                      I’m one of the fools who once upon a time tried to use Haskell for systems programming, and of all of the Rust/C/C++ developers I’ve met, their primary difficulty with monads is that all the documentation was non-technical (i.e. prose) and did the normal academic math thing of using five different words to identify the same concept in different contexts.

                                                                                                                      This article is a new way of writing a confusing explanation of monads, which is that it starts off by diving deep into an obscure testing strategy that’s pretty much only used by people really into functional programming, and then slowly works its way back into the shallow waters of monads, then dives right into a discussion on how function composition is Turing-complete[0] and how exemplar reduction in property-based testing can have unpredictable runtime performance. You’ve got, like, three different articles there!

                                                                                                                      If you want someone who can’t spell “GHC” to understand monads, there’s only three types you need:

                                                                                                                      • Lists (Rust’s Vec, C++ std::vector) with flat_map
                                                                                                                      • Maybe (Option, std::optional or local equivalent) with and_then
                                                                                                                      • Either (Result, std::expected or local equivalent) with and_then

                                                                                                                       Don’t go off into the weeds about property testing; only teach one alien concept at a time. Those three types have super-simple implementations of (>>=), they’re universally familiar to everyone who’s not been frozen in Arctic sea ice since 1995, and it’s easy to go from ? to do-notation if you want to demystify the smug “monads are programmable semicolons” in-joke while you’re at it.
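
                                                                                                                       For instance, a minimal tour of those three in Rust might look like this (my sketch, not the article’s code):

                                                                                                                       ```rust
                                                                                                                       fn main() {
                                                                                                                           // Vec with flat_map: bind for lists.
                                                                                                                           let chars: Vec<char> = vec!["ab", "cd"]
                                                                                                                               .into_iter()
                                                                                                                               .flat_map(|s| s.chars())
                                                                                                                               .collect();
                                                                                                                           assert_eq!(chars, vec!['a', 'b', 'c', 'd']);

                                                                                                                           // Option with and_then: bind for "maybe a value".
                                                                                                                           let parsed: Option<i32> = Some("42").and_then(|s| s.parse().ok());
                                                                                                                           assert_eq!(parsed, Some(42));

                                                                                                                           // Result with and_then: bind for "a value or an error".
                                                                                                                           let halved: Result<i32, String> = Ok(10).and_then(|x: i32| {
                                                                                                                               if x % 2 == 0 { Ok(x / 2) } else { Err("odd".to_string()) }
                                                                                                                           });
                                                                                                                           assert_eq!(halved, Ok(5));
                                                                                                                       }
                                                                                                                       ```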

                                                                                                                      Then, once your reader is comfortable with the concept of nested inline functions as a control flow primitive, you can link them to your follow-up article about the performance implications of monadic combinators or whatever.

                                                                                                                      [0] The article acts as if this is a surprising and meaningful revelation, so I might be misunderstanding what’s actually being discussed, but when you say “monadic composition is Turing-complete” you mean something like “bind :: (a -> T b) -> T a -> T b is Turing-complete”, yes? I feel like among people who know what “Turing-complete” means, the knowledge of a Turing machine’s equivalence to function composition is well-known.

                                                                                                                      1. 7

                                                                                                                        When intersected with “Rust developer”, do you think that’s still a large group…? If someone finds monad tutorials to be “overly technical” then they’re never going to make it past JavaScript, much less discover systems programming languages like Rust.

                                                                                                                        Yes, it’s a large group. The number of Rust people that come from Haskell or other FP languages is tiny.

                                                                                                                        This article is a new way of writing a confusing explanation of monads, which is that it starts off by diving deep into an obscure testing strategy that’s pretty much only used by people really into functional programming

                                                                                                                        Property-based testing is quite widely used by Rust projects! It’s not the most common way to test software, but it’s far from obscure. It’s also really effective at finding bugs in systems code. I haven’t done any professional work in FP languages, but I’ve used PBT extensively.

                                                                                                                         I even picked a very systems-y example, writing a production-grade sort function that’s resilient to comparator misbehavior. This is the kind of thing Rust developers enjoy.

                                                                                                                        and then slowly works its way back into the shallow waters of monads, then dives right into a discussion on how function composition is Turing-complete[0] and how exemplar reduction in property-based testing can have unpredictable runtime performance. You’ve got, like, three different articles there!

                                                                                                                        Yes, deliberately so. This kind of progressive complexity enhancement, with a tutorial for A also being a tutorial for B in disguise, is the style I teach in. It’s one I’ve practiced and refined over a decade. It doesn’t work for everyone (what does?) but it reaches a lot of people who bounce off other explanations.

                                                                                                                        If you want someone who can’t spell “GHC” to understand monads, there’s only three types you need

                                                                                                                        With respect, I quite emphatically disagree. I personally believe this approach is a colossal mistake, and I’m far from the only one to believe this (I’ve had a number of offline conversations about this in the past, and after I published this post a professor reached out to me privately about this as well.) I deliberately picked PBT as one of the simplest examples of monadic bind being weird and fascinating.

                                                                                                                        1. 4

                                                                                                                          With respect, I quite emphatically disagree. I personally believe this approach is a colossal mistake, and I’m far from the only one to believe this (I’ve had a number of offline conversations about this in the past, and after I published this post a professor reached out to me privately about this as well.) I deliberately picked PBT as one of the simplest examples of monadic bind being weird and fascinating.

                                                                                                                          The problem with only using List, Option and (perhaps to a lesser extent) Either, as examples is that they’re “containers” (List is commonly understood; Option can be understood as “a list with length < 2”; Either can be understood as Option whose “empty” case provides a “reason” (e.g. an error message)). Containers come with all sorts of intuitions that make interfaces like Monad less appealing. For example, what’s the point of y = x.flat_map(f) compared to ordinary, everyday, first-order code like for (element : x) { y += f(element); }?[0]
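
                                                                                                                          Spelled out in Rust, the two versions of that example really are interchangeable (toy code of mine):

                                                                                                                          fn main() {
                                                                                                                              let x = vec![1, 2, 3];
                                                                                                                              let f = |n: i32| vec![n, n * 10];

                                                                                                                              // ordinary, everyday, first-order loop
                                                                                                                              let mut y = Vec::new();
                                                                                                                              for element in &x {
                                                                                                                                  y.extend(f(*element));
                                                                                                                              }

                                                                                                                              // monadic version: bind on Vec is flat_map
                                                                                                                              let y2: Vec<i32> = x.iter().flat_map(|&n| f(n)).collect();
                                                                                                                              assert_eq!(y, y2);
                                                                                                                          }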

                                                                                                                          List (and Option) are definitely good anchors for our understanding, e.g. as sanity checks when reading a generic type signature or a commutative diagram; but we also need examples that aren’t “containers”, to show why these interfaces aren’t just some weird alternative to “normal” looping. A set of examples which aren’t “containers”, but may still be familiar, are things which “generate” values; e.g. parsers, random generators, IO, etc.[1]. Those are situations where our intuition probably isn’t “just loop”[2], so other interfaces can get a chance[3]. The fact that Monad & friends apply to both “containers” and “generators” is then a nice justification for their existence. Once we’re happy that this is a reasonable interface, we can go further into the weeds by deriving some examples purely from the interface, to get a better feel for what it does/doesn’t say, in general (e.g. a “container” that’s always empty, i.e. a parameterised unit type; or a Delay type, which captures general recursion; etc.).
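
                                                                                                                          For instance, a parser in the “function returning Option<(value, rest)>” style makes bind feel necessary rather than decorative (a toy sketch; all names are mine):

                                                                                                                          // a parser "generates" a T and hands back the unconsumed input
                                                                                                                          fn digit(input: &str) -> Option<(u32, &str)> {
                                                                                                                              let c = input.chars().next()?;
                                                                                                                              let d = c.to_digit(10)?;
                                                                                                                              Some((d, &input[c.len_utf8()..]))
                                                                                                                          }

                                                                                                                          // sequencing is bind: feed the first parser's value and leftover input
                                                                                                                          // to the second; nothing "runs" until the composed parser is applied
                                                                                                                          fn two_digits(input: &str) -> Option<(u32, &str)> {
                                                                                                                              digit(input).and_then(|(hi, rest)| digit(rest).map(|(lo, rest2)| (hi * 10 + lo, rest2)))
                                                                                                                          }

                                                                                                                          fn main() {
                                                                                                                              assert_eq!(two_digits("42x"), Some((42, "x")));
                                                                                                                              assert_eq!(two_digits("4x"), None);
                                                                                                                          }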

                                                                                                                          [0] Indeed, Scala’s for is syntactic sugar for monadic operations, similar to Haskell’s do. Although that does require extra concepts like yield and <-, which don’t appear in e.g. a Java for-loop; and may require understanding of monad-like-things (I can’t say for sure, since I first tried Scala after many years of Haskell programming!).

                                                                                                                          [1] Parsers and random generators are actually very closely related, e.g. we can think of a random generator as a parser that’s been given a random (but valid) input. Parser combinators are the most obvious way to understand what it means for parsers to be monads: they used to be quite academic, but I think are now common enough to be a motivating example for “why care”, even for those who tend to use other approaches. Random generators like those in QuickCheck (i.e. built using composition) seem much less common than e.g. only generating random ints, and operating on those directly; which makes them a less motivating example up-front. However, passing around PRNGs may be a good motivation for monadic state, and combining this with monadic parsing to get QuickCheck-style generators might be a nice climax :)

                                                                                                                          [2] We can do equivalent things with loops if we’re comfortable with yield; but (a) that’s a smaller subset of those that are comfortable with for, and (b) that rabbit hole leads more towards delimited continuations, algebraic effects, etc. which are alternatives to monads that are just as fascinating to think about :)

                                                                                                                          [3] I think the intuition for “generators” motivates the idea of composition. For “containers”, composition is merely an optimisation: we can just do multiple passes instead. Whereas for “generators”, it feels like we “don’t have the values yet”, so we need some way to plug them together before they’re “run”. IO is not a good motivator here, since imperative languages automatically compose statements for us. It seems better to come back to it after grokking parsers, random generators, etc.; even then it might be better to first describe an abstraction like a “task queue”, and only later introduce IO as a sort of “task queue for the whole language”.
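
                                                                                                                          To make the PRNG-threading point in [1] concrete, a toy QuickCheck-ish generator (all names are mine; the LCG constants are Knuth’s):

                                                                                                                          // a generator maps a seed to a value plus the advanced seed; threading
                                                                                                                          // the seed through is monadic state in disguise
                                                                                                                          type Gen<T> = Box<dyn Fn(u64) -> (T, u64)>;

                                                                                                                          fn gen_u32(max: u32) -> Gen<u32> {
                                                                                                                              Box::new(move |seed| {
                                                                                                                                  let next = seed.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
                                                                                                                                  (((next >> 33) as u32) % max, next)
                                                                                                                              })
                                                                                                                          }

                                                                                                                          // bind lets a generated value pick the next generator
                                                                                                                          fn bind<A: 'static, B: 'static>(g: Gen<A>, f: impl Fn(A) -> Gen<B> + 'static) -> Gen<B> {
                                                                                                                              Box::new(move |seed| {
                                                                                                                                  let (a, seed2) = g(seed);
                                                                                                                                  f(a)(seed2)
                                                                                                                              })
                                                                                                                          }

                                                                                                                          fn vec_of(len: u32) -> Gen<Vec<u32>> {
                                                                                                                              Box::new(move |mut seed| {
                                                                                                                                  let mut v = Vec::new();
                                                                                                                                  for _ in 0..len {
                                                                                                                                      let (x, s) = gen_u32(100)(seed);
                                                                                                                                      v.push(x);
                                                                                                                                      seed = s;
                                                                                                                                  }
                                                                                                                                  (v, seed)
                                                                                                                              })
                                                                                                                          }

                                                                                                                          fn main() {
                                                                                                                              // a vector whose length is itself random: the second generator
                                                                                                                              // depends on the first one's value, which is exactly what bind buys
                                                                                                                              let gen_vec = bind(gen_u32(5), vec_of);
                                                                                                                              println!("{:?}", gen_vec(42).0);
                                                                                                                          }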

                                                                                                                          1. 4

                                                                                                                            Thanks for the thoughtful reply.

                                                                                                                            List (and Option) are definitely good anchors for our understanding, e.g. as sanity checks when reading a generic type signature or a commutative diagram; but we also need examples that aren’t “containers”, to show why these interfaces aren’t just some weird alternative to “normal” looping. A set of examples which aren’t “containers”, but may still be familiar, are things which “generate” values; e.g. parsers, random generators, IO, etc.

                                                                                                                            I really love this way of looking at it, as well as you pointing out later that IO is not a good monad to introduce to people either because it is implicit in imperative code.

                                                                                                                            For me, there were two things that made monads really click:

                                                                                                                            • Build Systems a la Carte, which draws a distinction between monadic and applicative build systems – this one made me first realize that monads are a general system property much more than a Haskell curiosity (a toy contrast is sketched after this list)
                                                                                                                            • The PBT example in the post, where generation isn’t hugely affected by monadic composition, but shrinking is
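
                                                                                                                            A toy rendering of the first bullet’s distinction (my own sketch, not the paper’s actual Task abstraction):

                                                                                                                            // an applicative task can list its dependencies up front, like Make
                                                                                                                            // (unused here; shown only for contrast)
                                                                                                                            struct ApplicativeTask {
                                                                                                                                deps: Vec<&'static str>,
                                                                                                                                build: fn(&[String]) -> String,
                                                                                                                            }

                                                                                                                            // a monadic task asks for keys while running, so its next dependency
                                                                                                                            // can depend on a value it already fetched, like Excel or Shake
                                                                                                                            fn monadic_task(fetch: &mut dyn FnMut(&str) -> String) -> String {
                                                                                                                                let config = fetch("config"); // the value fetched here...
                                                                                                                                fetch(&config)                // ...decides the next dependency
                                                                                                                            }

                                                                                                                            fn main() {
                                                                                                                                let mut fetch = |key: &str| match key {
                                                                                                                                    "config" => "release.flags".to_string(),
                                                                                                                                    other => format!("contents of {other}"),
                                                                                                                                };
                                                                                                                                println!("{}", monadic_task(&mut fetch));
                                                                                                                            }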
                                                                                                                        2. 3

                                                                                                                          If you want someone who can’t spell “GHC” to understand monads, there’s only three types you need:

                                                                                                                          • Lists
                                                                                                                          • Maybe
                                                                                                                          • Either

                                                                                                                          I dunno if it’s because I learned monads when they had only recently been discovered as the solution to purely functional IO, but I expect that many programmers who have vaguely heard about monads know that they are how you do IO in Haskell. So I think a practical monad tutorial should try to bridge the gap between monads as pure algebra (lists, maybe, either) and how they are useful for impure programming.

                                                                                                                          (I have noticed in several cases that many programmers find it hard to get to grips with very abstract ideas, if the ideas don’t come with concrete practical examples of how the abstraction is used in practice and what benefits the abstraction provides. This is especially true the less complicated the abstraction is, which is why monads are so troublesome.)

                                                                                                                          The problem with list/either/maybe is that they are all parametric types, so all the monad is able to do is faff around with the layout of the values. It’s hard for a tutorial to illustrate what benefit you get from an explicit monad as opposed to less polymorphic list/either/maybe combinators.

                                                                                                                          So I think a monad tutorial should show an example of something more imperative such as the state monad. That allows you to show monads in use with functions that do practical things with the container type, and how the monad sequences those functions. (Perhaps also emphasise Either as the exception monad.) It’s then only a small step to monadic IO.
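
                                                                                                                          Sketching that suggestion in Rust rather than Haskell (toy code, all names mine): a stateful computation is a function from a state to a value plus the new state, and bind is what sequences two of them.

                                                                                                                          type State<S, A> = Box<dyn Fn(S) -> (A, S)>;

                                                                                                                          fn pure<S: 'static, A: Clone + 'static>(a: A) -> State<S, A> {
                                                                                                                              Box::new(move |s| (a.clone(), s))
                                                                                                                          }

                                                                                                                          // run one step, then feed its result and the updated state to the next
                                                                                                                          fn bind<S: 'static, A: 'static, B: 'static>(
                                                                                                                              m: State<S, A>,
                                                                                                                              f: impl Fn(A) -> State<S, B> + 'static,
                                                                                                                          ) -> State<S, B> {
                                                                                                                              Box::new(move |s| {
                                                                                                                                  let (a, s2) = m(s);
                                                                                                                                  f(a)(s2)
                                                                                                                              })
                                                                                                                          }

                                                                                                                          // hand out the current counter value and bump the state
                                                                                                                          fn fresh() -> State<u32, u32> {
                                                                                                                              Box::new(|n| (n, n + 1))
                                                                                                                          }

                                                                                                                          fn main() {
                                                                                                                              let pair = bind(fresh(), |a| bind(fresh(), move |b| pure((a, b))));
                                                                                                                              assert_eq!(pair(0), ((0, 1), 2)); // two fresh ids, final counter 2
                                                                                                                          }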

                                                                                                                          1. 3

                                                                                                                            ISTM that now that many languages have features like promises, there’s more relevant common knowledge among imperative programmers than there used to be. This might be an easier on-ramp than what the initial discoverers had to struggle through. A Promise<Promise<a>> can be flattened into a Promise<a>, and you can write a version of bind. Thinking of bind as “map + join” also helps avoid the “but I don’t have an a so how can I run my a -> m b function?” that I struggled with when understanding monads as they applied to things other than concrete data structures.
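
                                                                                                                            In Rust clothing, with Option standing in for Promise (toy code of mine):

                                                                                                                            fn main() {
                                                                                                                                // a nested value flattens, like Promise<Promise<a>> -> Promise<a>
                                                                                                                                let nested: Option<Option<i32>> = Some(Some(5));
                                                                                                                                assert_eq!(nested.flatten(), Some(5));

                                                                                                                                // and bind really is "map + join": and_then == map, then flatten
                                                                                                                                let x = Some(4);
                                                                                                                                let halve = |n: i32| if n % 2 == 0 { Some(n / 2) } else { None };
                                                                                                                                assert_eq!(x.and_then(halve), x.map(halve).flatten());
                                                                                                                            }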

                                                                                                                          2. 2

                                                                                                                            Dealing with your footnote, even as someone fairly familiar with function composition, I wouldn’t immediately notice that “bind :: (a -> T b) -> T a -> T b” qualifies as function composition but “fmap :: (a -> b) -> T a -> T b” does not. Sure, if I sat down and wrote it out, it would become clear quickly, but leaving this as an exercise for the reader is just poor pedagogy.

                                                                                                                            1. 1

                                                                                                                              Would it be clearer if you considered composeM :: (a -> T b) -> (b -> T c) -> (a -> T c)? Because you can write it in terms of bind and vice-versa, provided you also have pure. (Final parens are redundant but added for clarity.)
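
                                                                                                                              Rust can’t easily abstract over “any monad”, but specialised to Option the idea fits in a few lines (toy code of mine):

                                                                                                                              // Kleisli composition: glue two fallible steps into one fallible step
                                                                                                                              fn compose_m<A, B, C>(
                                                                                                                                  f: impl Fn(A) -> Option<B>,
                                                                                                                                  g: impl Fn(B) -> Option<C>,
                                                                                                                              ) -> impl Fn(A) -> Option<C> {
                                                                                                                                  move |a| f(a).and_then(|b| g(b))
                                                                                                                              }

                                                                                                                              fn main() {
                                                                                                                                  let parse = |s: &str| s.parse::<i32>().ok();
                                                                                                                                  let halve = |n: i32| if n % 2 == 0 { Some(n / 2) } else { None };
                                                                                                                                  let parse_then_halve = compose_m(parse, halve);
                                                                                                                                  assert_eq!(parse_then_halve("42"), Some(21));
                                                                                                                                  assert_eq!(parse_then_halve("odd"), None);
                                                                                                                              }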

                                                                                                                          3. 1

                                                                                                                            Relatively speaking that’s quite unpredictable.

                                                                                                                            Yep, thank you for clarifying. But I still think that not preserving the number of elements of a container is not the same thing as being unpredictable. For example, there are things for which you can define monadic bind (e.g. functions, https://hackage.haskell.org/package/base-4.12.0.0/docs/src/GHC.Base.html#line-828 ), for which binding means piling applications on top of each other.
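
                                                                                                                            Transliterated into Rust for one concrete environment type (a toy sketch of the linked instance, names mine):

                                                                                                                            // bind for the function ("reader") monad: m >>= f = |env| f(m(env))(env);
                                                                                                                            // each bind piles another application of the shared environment on top
                                                                                                                            fn bind<Env: Copy, A, B>(
                                                                                                                                m: impl Fn(Env) -> A,
                                                                                                                                f: impl Fn(A) -> Box<dyn Fn(Env) -> B>,
                                                                                                                            ) -> impl Fn(Env) -> B {
                                                                                                                                move |env| f(m(env))(env)
                                                                                                                            }

                                                                                                                            fn step(a: i32) -> Box<dyn Fn(i32) -> i32> {
                                                                                                                                Box::new(move |env| a + env)
                                                                                                                            }

                                                                                                                            fn main() {
                                                                                                                                let double = |env: i32| env * 2;
                                                                                                                                let composed = bind(double, step);
                                                                                                                                assert_eq!(composed(10), 30); // double(10) = 20, then 20 + 10
                                                                                                                            }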

                                                                                                                        3. 2

                                                                                                                          Do you think it’s perverse that when I first read a rust tutorial I was perplexed about not putting semicolons at the end of a block until I decided that semicolons are just monadic bind (I don’t think I got around to writing any rust)

                                                                                                                          1. 3

                                                                                                                            It is true that semicolons are a monadic bind, but I also think that lens confuses more than it enlightens :)

                                                                                                                            1. 3

                                                                                                                              It’s the sunglasses from They Live but they show you the monads underlying all computation

                                                                                                                        4. 69

                                                                                                                          The majority of bugs (quantity, not quality/severity) we have are due to the stupid little corner cases in C that are totally gone in Rust. Things like simple overwrites of memory (not that rust can catch all of these by far), error path cleanups, forgetting to check error values, and use-after-free mistakes. That’s why I’m wanting to see Rust get into the kernel, these types of issues just go away, allowing developers and maintainers more time to focus on the REAL bugs that happen (i.e. logic issues, race conditions, etc.)

                                                                                                                          This is an extremely strong statement.
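
                                                                                                                          For anyone who hasn’t mapped those bug classes onto language features, a minimal illustration (toy code of mine, obviously not kernel code):

                                                                                                                          // error-path cleanup: Drop runs on every exit path, so the release
                                                                                                                          // can't be forgotten on an early return the way a missed "goto err"
                                                                                                                          // label can be in C
                                                                                                                          struct Guard;
                                                                                                                          impl Drop for Guard {
                                                                                                                              fn drop(&mut self) {
                                                                                                                                  println!("cleaned up");
                                                                                                                              }
                                                                                                                          }

                                                                                                                          fn work(fail: bool) -> Result<(), String> {
                                                                                                                              let _guard = Guard;
                                                                                                                              if fail {
                                                                                                                                  return Err("oops".into()); // Drop still runs here
                                                                                                                              }
                                                                                                                              Ok(())
                                                                                                                          }

                                                                                                                          fn main() {
                                                                                                                              // forgetting to check an error value: Result is #[must_use], so a
                                                                                                                              // bare "work(true);" draws a warning; "let _ =" is the explicit opt-out
                                                                                                                              let _ = work(true);

                                                                                                                              // use-after-free is a compile error, not a runtime surprise:
                                                                                                                              // let v = vec![1]; drop(v); println!("{:?}", v); // borrow of moved value
                                                                                                                          }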

                                                                                                                          I think a few things are also interesting:

                                                                                                                          1. I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.

                                                                                                                          2. I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.

                                                                                                                          1. 35

                                                                                                                            I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.

                                                                                                                            The Hellwig/Ojeda part of the thread is just frustrating to read because it almost feels like pleading. “We went over this in private” “we discussed this already, why are you bringing it up again?” “Linus said (in private so there’s no record)”, etc., etc.

                                                                                                                            1. 45

                                                                                                                              Dragging discussions out in front of an audience is a pretty decent tactic for dealing with obstinate maintainers. They don’t like to explain their shoddy reasoning in front of people, and would prefer it remain hidden. It isn’t the first tool in the toolbelt but at a certain point there is no convincing people directly.

                                                                                                                              1. 31

                                                                                                                                Dragging discussions out in front of an audience is a pretty decent tactic for dealing with

                                                                                                                                With quite a few things actually. A friend of mine is contributing to a non-profit, which until recently had a very toxic member (they had even attempted a felony). They were driven out of the non-profit very soon after members talked about it in a thread that was accessible to all members. Obscurity is often one key component of abuse, be it mere stubbornness or criminal behaviour. Shine light, and it often goes away.

                                                                                                                                1. 13

                                                                                                                                  IIRC Hintjens noted this quite explicitly as a tactic of bad actors in his works.

                                                                                                                                  It’s amazing how quick people are to recognize folks trying to subvert an org piecemeal via one-off private conversations once everybody can compare notes. It’s equally amazing to see how much the same people beforehand will swear up and down “oh no, that’s a conspiracy theory, such things can’t happen here” until they’ve been burned at least once.

                                                                                                                                  This is an active, unpatched attack vector in most communities.

                                                                                                                                  1. 12

                                                                                                                                    I’ve found that even something as humble as meeting minutes at work shows this. I’ve observed that people tend to act more collaboratively and seek the common good if there are public minutes, as opposed to trying to “privately” win people over to their desires.

                                                                                                                                2. 5

                                                                                                                                  There is something to be said for keeping things between people with skin in the game.

                                                                                                                                  It’s flipped over here, though, because more people want to contribute. The question is whether it’ll be stable long-term.

                                                                                                                                3. 18

                                                                                                                                  I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.

                                                                                                                                  Something I’ve noticed in virtually everything I’ve looked at deeply is that the majority of work is poor to mediocre, and most people are not especially great at their jobs. So it wouldn’t surprise me if Linux is the same. (…and also wouldn’t surprise me if the wonderful Rust rewrite also ends up poor to mediocre.)

                                                                                                                                  Yet at the same time, another thing that astonishes me is how much stuff actually does get done and how well things manage to work anyway. And Linux also does a lot and works pretty well. Mediocre over the years can end up pretty good.

                                                                                                                                  1. 14

                                                                                                                                    After tangentially following the kernel news, I think a lot of churning and death spiraling is happening. I would much rather have a rust-first kernel that isn’t crippled by the old guard of C developers reluctant to adopt new tech.

                                                                                                                                    Take all of this energy into RedoxOS and let Linux stay in antiquity.

                                                                                                                                    1. 36

                                                                                                                                      I’ve seen some of the R4L people talk on Mastodon, and they all seem to hate this argument.

                                                                                                                                      They want to contribute to Linux because they use it, want to use it, and want to improve the lives of everyone who uses it. The fact that it’s out there and deployed and not a toy is a huge part of the reason why they want to improve it.

                                                                                                                                      Hopping off into their own little projects which may or may not be useful to someone in 5-10 years’ time is not interesting to them. If it was, they’d already be working on Redox.

                                                                                                                                      1. 2

                                                                                                                                        The most effective thing that could happen is for the Linux foundation, and Linus himself, to formally endorse and run a Rust-based kernel. They can adopt an existing one or make a concerted effort to replace large chunks of Linux’s C with Rust.

                                                                                                                                        IMO the Linux project needs to figure out something pretty quickly because it seems to be bleeding maintainers and Linus isn’t getting any younger.

                                                                                                                                        1. 0

                                                                                                                                          They (the Mastodon posters) may be missing the idea that others are not necessarily incentivized to do things just because those things are interesting to them.

                                                                                                                                        2. 4

                                                                                                                                          Yep, I made a similar remark upthread. A Rust-first kernel would have a lot of benefits over Linux, assuming a competent group of maintainers.

                                                                                                                                          1. 4

                                                                                                                                            along similar lines: https://drewdevault.com/2024/08/30/2024-08-30-Rust-in-Linux-revisited.html

                                                                                                                                            Redox does carry the burden of trying to do new OS things. An ABI-compatible Rust rewrite of the Linux kernel might get further along than expected (even if it only ran in virtual contexts at first, without hardware support; that would come later).

                                                                                                                                            1. 44

                                                                                                                                              Linux developers want to work on Linux, they don’t want to make a new OS. Linux is incredibly important, and companies already have Rust-only drivers for their hardware.

                                                                                                                                              Basically, sure, a new OS project would be neat, but it’s really just completely off topic in the sense that it’s not a solution for Rust for Linux. Because the “Linux” part in that matters.

                                                                                                                                              1. 19

                                                                                                                                                I read a 25+ year old article [1] from a former Netscape developer that I think applies in part

                                                                                                                                                The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed. There’s nothing wrong with it. It doesn’t acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that’s kind of gross if it’s not made out of all new material?

                                                                                                                                                Adopting a “rust-first” kernel is throwing the baby out with the bathwater. Linux has been beaten into submission for over 30 years for a reason. It’s the largest collaborative project in human history and over 30 million lines of code. Throwing it out and starting new would be an absolutely herculean effort that would likely take years, if it ever got off the ground.

                                                                                                                                                [1] https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/

                                                                                                                                                1. 33

                                                                                                                                                  The idea that old code is better than new code is patently absurd. Old code has stagnated. It was built using substandard, out of date methodologies. No one remembers what’s a bug and what’s a feature, and everyone is too scared to fix anything because of it. It doesn’t acquire new bugs because no one is willing to work on that weird ass bespoke shit you did with your C preprocessor. Au contraire, baby! Is software supposed to never learn? Are we never to adopt new tools? Can we never look at something we’ve built in an old way and wonder if new methodologies would produce something better?

                                                                                                                                                  This is what it looks like to say nothing, to beg the question. Numerous empirical claims, where is the justification?

                                                                                                                                                  It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?

                                                                                                                                                  1. 16

                                                                                                                                                    Like most things in life, the truth is somewhere in the middle. There is a reason the semiconductor industry has the concept of a “mature node”. They accept that new is needed for each node, but also that the new thing takes time to iron out the kinks and bugs. This is the primary reason you see Apple take new nodes on first, before Nvidia for example: Nvidia requires much larger die sizes, and so needs fewer defects per square mm.

                                                                                                                                                    You can see this sometimes in software too, for example X11 vs Wayland, where adoption is slow but most definitely progressing, and nowadays most people can see that Wayland is, or is going to become, the dominant tech in the space.

                                                                                                                                                    1. 16

                                                                                                                                                      The truth lies where it lies. Maybe the middle, maybe elsewhere. I just don’t think we’ll get to the truth with rhetoric.

                                                                                                                                                        1. 7

                                                                                                                                                          I don’t think this would qualify as dialectic: it lacks any internal debate and leans heavily on appeals to analogy and intuition/emotion. The post itself makes a ton of empirical claims without justification, even beyond the quoted bit.

                                                                                                                                                    2. 15

                                                                                                                                                      “Good” is subjective, but there is real evidence that older code does contain fewer vulnerabilities: https://www.usenix.org/conference/usenixsecurity22/presentation/alexopoulos

                                                                                                                                                      That means we can probably keep a lot of the old trusty Linux code around while making more of the new code safe by writing it in Rust in the first place.

                                                                                                                                                      1. 10

                                                                                                                                                        I don’t think that’s a fair assessment of Spolsky’s argument or of CursedSilicon’s application of it to the Linux kernel.

                                                                                                                                                        Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in it than new code (and that the older the code is, the less likely it is to be buggy).

                                                                                                                                                        Secondly, this discussion is mainly around entire codebases, not just existing code. Codebases usually have an entire infrastructure around them for verifying that the behaviour of the codebase has not changed. This is often made up of tests, but it’s also made up of the users who try out a release of a codebase and determine whether it’s working for them. The difference between making a change to an existing codebase and releasing a new project largely comes down to whether this verification (both in terms of automated tests and in terms of users’ ability to use the new release) works for the new code.

                                                                                                                                                        Given this difference, if I want to (say) write a new OS completely in Rust, I need to choose: Do I want to make it completely compatible with Linux, and therefore take on the significant challenge of making sure everything behaves truly the same? Or do I make significant breaking changes, write my own OS, and therefore force potential adopters to rebuild their entire Linux workflows in my new OS?

                                                                                                                                                        The point is not that either of these options are bad, it is that they represent significant risks to a project. Added to the general risk that is writing new code, this produces a total level of risk that might be considered the baseline risk of doing a rewrite. Now risk is not bad per se! If the benefits of being able to write an OS in a language like Rust outweigh the potential risks, then it still makes sense to perform the rewrite. Or maybe the existing Linux kernel is so difficult to maintain that a new codebase really would be the better option. But the point that CursedSilicon was making by linking the Spolsky piece was, I believe, that the risks for a project like the Linux kernel are very high. There is a lot of existing, old code. And there is a very large ecosystem where either breaking or maintaining compatibility would each come with significant challenges.

                                                                                                                                                        Unfortunately, it’s very difficult to measure the risks and benefits here in a quantitative, comparable way, so I think where you fall on the “rewrite vs continuity” spectrum will depend mostly on what sort of examples you’ve seen, and how close you think this case is to those examples. I don’t think there’s any objective way to say whether it makes more sense to have something like R4L, or something like RedoxOS.

                                                                                                                                                        1. 7

                                                                                                                                                          Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in it than new code (and that the older the code is, the less likely it is to be buggy).

                                                                                                                                                          I haven’t read it yet, but I haven’t made an argument about that; I just created a parody of the argument as presented. I’ll be candid: I doubt that the research is going to compel me to believe that newer code is inherently buggier, though it may compel me to confirm my existing belief that testing software in the field is one good method to find some classes of bugs.

                                                                                                                                                          Secondly, this discussion is mainly around entire codebases, not just existing code.

                                                                                                                                                          I guess so, it’s a bit dependent on where we say the discussion starts - three things are relevant; RFL, which is not a wholesale rewrite, a wholesale rewrite of the Linux kernel, and Netscape. RFL is not about replacing the entire Linux kernel, although perhaps “codebase” here refers to some sort of unit, like a driver. Netscape wanted a wholesale rewrite, based on the linked post, so perhaps that’s what’s really “the single worst strategic mistake that any software company can make”, but I wonder what the boundary here is? Also, the article immediately mentions that Microsoft tried to do this with Word but it failed, but that Word didn’t suffer from this because it was still actively developed - I wonder if it really “failed” just because Pyramid didn’t become the new Word? Did Microsoft have some lessons learned, or incorporate some of that code? Dunno.

                                                                                                                                                          I think I’m really entirely justified when I say that the post is entirely emotional/ intuitive appeals, rhetoric, and that it makes empirical claims without justification.

                                                                                                                                                          There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:

                                                                                                                                                          This is rhetoric. These are unsubstantiated empirical claims. The article is all of this. It’s fine as an interesting, thought provoking read that gets to the root of our intuitions, but I think anyone can dismiss it pretty easily since it doesn’t really provide much in the form of an argument.

                                                                                                                                                          It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time.

                                                                                                                                                          Again, totally unsubstantiated. I have MANY reasons to believe that, it is simply question begging to say otherwise.

                                                                                                                                                          That’s all this post is. Over and over again making empirical claims with no evidence and question begging.

                                                                                                                                                          We can discuss the risks and benefits, I’d advocate for that. This article posted doesn’t advocate for that. It’s rhetoric.

                                                                                                                                                          1. 11

                                                                                                                                                            existing code has fewer bugs in it than new code (and that the older the code is, the less likely it is to be buggy).

                                                                                                                                                            This is a truism. It is survivorship bias: if the code was buggy, the bugs would eventually have been found and fixed. So, all things being equal, newer code is riskier than old code. But it’s also been empirically shown that using Rust for new code is not “all things being equal”: Google showed that new code in Rust is as reliable as old code in C. Which is good news: you can use old C code from new Rust projects without the risk that comes from new C code.

                                                                                                                                                            1. 5

                                                                                                                                                              But it’s also been empirically shown that using Rust for new code is not “all things being equal”.

                                                                                                                                                              Yeah, this is what I’ve been saying (not sure if you’d meant to respond to me or the parent, since we agree) - the issue isn’t “new” vs “old” it’s things like “reviewed vs unreviewed” or “released vs unreleased” or “tested well vs not tested well” or “class of bugs is trivial to express vs class of bugs is difficult to express” etc.

                                                                                                                                                              1. 2

                                                                                                                                                                I don’t disagree that the rewards can outweigh the risks, and in this case I think there’s a lot of evidence that suggests that memory safety as a default is really important for all sorts of reasons. Let alone the many other PL developments that make Rust a much more suitable language to develop in than C.

                                                                                                                                                                That doesn’t mean the risks don’t exist, though.

                                                                                                                                                          2. 4

                                                                                                                                                            It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?

                                                                                                                                                            Nobody would call an old codebase with a handful of fixes a new codebase, at least not in the contexts in which those terms have been used here.

                                                                                                                                                              1. 6

                                                                                                                                                                It’s a Ship of Theseus: at no point can you call it a “new” codebase, but after a period of time, it could be completely different code. I have a C program I’ve been using and modifying for 25 years. At any given point, it would have been hard to say “this is now a new codebase”, yet not one line of code in the project is the same as when I started (even though it does the same thing as it always has).

                                                                                                                                                                1. 4

                                                                                                                                                                  I don’t see the point in your question. It’s going to depend on the codebase, and on the nature of the changes; it’s going to be nuanced, and subjective at least to some degree. But the fact that it’s prone to subjectivity doesn’t mean that you get to call an old codebase with a single fixed bug a new codebase, without some heavy qualification which was lacking.

                                                                                                                                                                  1. 1

                                                                                                                                                                    If it requires all of that nuance and context maybe the issue isn’t what’s “old” and what’s “new”.

                                                                                                                                                                      1. 4

                                                                                                                                                                        What’s old and new is poorly defined and yet there’s an argument being made that “old” and “new” are good indicators of something. If they’re so poorly defined that we have to bring in all sorts of additional context like the nature of the changes, not just when they happened or the number of lines changed, etc, then it seems to me that we would be just as well served to throw away the “old” and “new” and focus on that context.

                                                                                                                                                                        1. 2

                                                                                                                                                                          I feel like enough people would agree more-or-less on what was an “old” or “new” codebase (i.e. they would agree given particular context) that they remain useful terms in a discussion. The general context used here is apparent (at least to me) given by the discussion so far: an older codebase has been around for a while, has been maintained, has had kinks ironed out.

                                                                                                                                                                          1. 3

                                                                                                                                                                            There’s a really important distinction here though. The point is to argue that new projects will be less stable than old ones, but you’re intuitively (and correctly) bringing in far more important context - maintenance, testing, battle testing, etc. If a new implementation has a higher degree of those properties then it being “new” stops being relevant.

                                                                                                                                                                            1. 2

                                                                                                                                                                              Ok, but:

                                                                                                                                                                              It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?

                                                                                                                                                                              My point was that this statement requires a definition of “new codebase” that nobody would agree with, at least in the context of the discussion we’re in. Maybe you are attacking the base proposition without applying the surrounding context, which might be valid if this were a formal argument and not a free-for-all discussion.

                                                                                                                                                                              If a new implementation has a higher degree of those properties

                                                                                                                                                                              I think that it would be considered no longer new if it had had significant battle-testing, for example.

                                                                                                                                                                              FWIW the important thing in my view is that every new codebase is a potential old codebase (given time and care), and a rewrite necessarily involves a step backwards. The question should probably not be “which is immediately better?” but “which is better in the longer term (and by how much)?”. However, your point that “new codebase” is not automatically worse is certainly valid. There are other factors than age and “time in the field” that determine quality.

                                                                                                                                                              2. 1

                                                                                                                                                                Methodologies don’t matter for quality of code. They could be useful for estimates, cost control, figuring out whom to fire, etc. But not for the quality of code.

                                                                                                                                                                1. 4

                                                                                                                                                                  You’re suggesting that the way you approach programming has no bearing on the quality of the produced program?

                                                                                                                                                                  1. 3

                                                                                                                                                                    I’ve never observed a programmer become better or worse by switching methodology. Dijkstra wouldn’t have become better if you’d made him do daily standups or go through code reviews.

                                                                                                                                                                    There are ways to improve your programming by choosing a different approach, but these are very individual. Methodology is mostly a beancounting tool.

                                                                                                                                                                    1. 3

                                                                                                                                                                      When I say “methodology” I’m speaking very broadly - simply “the approach one takes”. This isn’t necessarily saying that any methodology is better than any other. The way I approach a task today is better, I think, than the way I would have approached that task a decade ago - my methodology has changed, the way I think has changed. Perhaps that might mean I write more tests, or I test earlier, but it may mean exactly the opposite, and my methods may only work best for me.

                                                                                                                                                                      I’m not advocating for “process” or ubiquity, only that the approach one takes may improve over time, which I suspect we would agree on.

                                                                                                                                                              3. 28

                                                                                                                                                                If you take this logic to its end, you should never create new things.

                                                                                                                                                                At one point in time, Linux was also the new kid on the block.

                                                                                                                                                                The best time to plant a tree is 30 years ago. The second best time is now.

                                                                                                                                                                1. 7

                                                                                                                                                                  I read a 25+ year old article [1] from a former Netscape developer that I think applies in part

                                                                                                                                                                  I don’t think Joel Spolsky was ever a Netscape developer. He was a Microsoft developer who worked on Excel.

                                                                                                                                                                  1. 2

                                                                                                                                                                    My mistake! The article contained a bit about Netscape and I misremembered it

                                                                                                                                                                  2. 5

                                                                                                                                                                    It’s the largest collaborative project in human history and over 30 million lines of code.

                                                                                                                                                                    How many of those lines are part of the core? My understanding was that the overwhelming majority was driver code. There may not be that much core subsystem code to rewrite.

                                                                                                                                                                    1. 5

                                                                                                                                                                      For a previous project, we included a minimal Linux build. It was around 300 KLoC, which included networking and the storage stack, along with virtio drivers.

                                                                                                                                                                      That’s around the size a single person could manage, and quite easy for a motivated team.

                                                                                                                                                                      If you started with DPDK and SPDK then you’d already have filesystems and a copy of the FreeBSD network stack to run in isolated environments.

                                                                                                                                                                      1. 2

                                                                                                                                                                        Once many drivers share common Rust wrappers over core subsystems, you could flip it and write the subsystem in Rust, then expose a C interface for the rest.
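
                                                                                                                                                                        As a rough sketch of the shape that flip could take (all names here are invented for illustration, and it ignores the kernel’s no_std environment for clarity): the subsystem logic lives in safe Rust, and the not-yet-converted C drivers call in through a thin exported interface.

                                                                                                                                                                            // Hypothetical subsystem core in safe Rust; not real kernel code.
                                                                                                                                                                            pub struct Subsystem {
                                                                                                                                                                                devices: Vec<u32>,
                                                                                                                                                                            }

                                                                                                                                                                            impl Subsystem {
                                                                                                                                                                                pub fn new() -> Self {
                                                                                                                                                                                    Subsystem { devices: Vec::new() }
                                                                                                                                                                                }

                                                                                                                                                                                pub fn register(&mut self, id: u32) {
                                                                                                                                                                                    self.devices.push(id);
                                                                                                                                                                                }
                                                                                                                                                                            }

                                                                                                                                                                            // C-callable entry points over the safe core, for drivers that
                                                                                                                                                                            // haven't been converted yet.
                                                                                                                                                                            #[no_mangle]
                                                                                                                                                                            pub extern "C" fn subsys_new() -> *mut Subsystem {
                                                                                                                                                                                Box::into_raw(Box::new(Subsystem::new()))
                                                                                                                                                                            }

                                                                                                                                                                            #[no_mangle]
                                                                                                                                                                            pub unsafe extern "C" fn subsys_register(s: *mut Subsystem, id: u32) {
                                                                                                                                                                                if let Some(subsys) = s.as_mut() {
                                                                                                                                                                                    subsys.register(id);
                                                                                                                                                                                }
                                                                                                                                                                            }

                                                                                                                                                                            #[no_mangle]
                                                                                                                                                                            pub unsafe extern "C" fn subsys_free(s: *mut Subsystem) {
                                                                                                                                                                                if !s.is_null() {
                                                                                                                                                                                    drop(Box::from_raw(s));
                                                                                                                                                                                }
                                                                                                                                                                            }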

                                                                                                                                                                        1. 3

                                                                                                                                                                          Oh sure, that would be my plan as well. And I bet some subsystem maintainers see this coming, and resist it for reasons that aren’t entirely selfless.

                                                                                                                                                                          1. 3

                                                                                                                                                                            That’s pretty far into the future, both from a maintainer acceptance PoV and from a rustc_codegen_gcc and/or gccrs maturity PoV.

                                                                                                                                                                            1. 4

                                                                                                                                                                              Sure. But I doubt I’ll be running a different kernel 10 years from now.

                                                                                                                                                                              And like us, those maintainers are not getting any younger; if they need a hand, I am confident I’ll get up to speed faster with a strict type checker.

                                                                                                                                                                              I am also confident nobody in our office would be able to help out with C at all.

                                                                                                                                                                        2. 4

                                                                                                                                                                          It’s the largest collaborative project in human history

                                                                                                                                                                          This cannot possibly be true.

                                                                                                                                                                          1. 5

                                                                                                                                                                            It’s the largest collaborative project in human history

                                                                                                                                                                            It’s the largest collaborative open source OS kernel project in human history

                                                                                                                                                                            1. 4

                                                                                                                                                                              It’s been described as such based purely on the number of unique human contributions to it

                                                                                                                                                                          2. 7

                                                                                                                                                                            I see that Drew proposes a new OS in that linked article, but I think a better proposal in the same vein is a fork. You get to keep Linux, but you can start porting logic to Rust unimpeded, and it’s a manageable amount of work to keep porting upstream changes.

                                                                                                                                                                            Remember when libav forked from ffmpeg? Michael Niedermayer single-handedly ported every single libav commit back into ffmpeg, and eventually, ffmpeg won.

                                                                                                                                                                            At first there will be an extremely high C percentage and a low Rust percentage, so porting is trivial: just git merge and there will be no conflicts. As the fork ports more and more C code to Rust, however, you have to start doing porting work by inspecting the C code and determining whether the fixes apply to the corresponding Rust code. At that point, though, you should start seeing productivity gains, community gains, and feature gains from using a better language than C, so community growth should be able to keep up with the extra porting work required. And this is when distros will start sniffing around, at first offering variants of the distro that use the forked kernel, and if they like what they taste, they might even drop the original.

                                                                                                                                                                            I genuinely think it’s a strong idea, given the momentum and the potential amount of labor the Rust community has at its disposal.

                                                                                                                                                                            I think the competition would be great, especially in the domain of making it more contributor friendly to improve the kernel(s) that we use daily.

                                                                                                                                                                            1. 15

                                                                                                                                                                              I certainly don’t think this is impossible, for sure. But the point ultimately still stands: Linux kernel devs don’t want a fork. They want Linux. These folks aren’t interested in competing, they’re interested in making the project they work on better. We’ll see if some others choose the fork route, but it’s still ultimately not the point of this project.

                                                                                                                                                                            2. 5

                                                                                                                                                                              Linux developers want to work on Linux, they don’t want to make a new OS.

                                                                                                                                                                              While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux. Most of the time I strive for portability, and so abstract myself from the OS whenever I can get away with it. And when I can’t, I have to say Linux’s API isn’t always that great compared to what the BSDs have to offer (epoll vs kqueue comes to mind). Most annoying though is the lack of documentation for the less used APIs: I’ve recently worked with Netlink sockets, and for the proc stuff the best documentation I’ve found so far is the freaking source code of a third-party monitoring program.

                                                                                                                                                                              I was shocked. Complete documentation of the public API is the minimum bar for a project as serious as the Linux kernel. I can live with an API I don’t like, but lack of documentation is a deal breaker.

                                                                                                                                                                              1. 10

                                                                                                                                                                                While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux.

                                                                                                                                                                                I think they mean that Linux kernel devs want to work on the Linux kernel. Most (all?) R4L devs are long time Linux kernel devs. Though, maybe some of the people resigning over LKML toxicity will go work on Redox or something…

                                                                                                                                                                                1. 5

                                                                                                                                                                                  I’m talking about the people who develop the Linux kernel, not people who write userland programs for Linux.

                                                                                                                                                                              2. 2

                                                                                                                                                                                Re-implementing the kernel ABI would be a ton of work for little gain if all they wanted was to upstream all the work on new hardware drivers that is already done - and then eventually start re-implementing bits that need to be revised anyway.

                                                                                                                                                                            3. 3

                                                                                                                                                                              If the singular required Rust toolchain didn’t feel like such a ridiculous-to-bootstrap, 500-ton LLVM clown car, I would agree with this statement without reservation.

                                                                                                                                                                                1. 4

                                                                                                                                                                                  Zig is easier to implement (and I personally like it as a language) but doesn’t have the same safety guarantees and strong type system that Rust does. It’s a give and take. I actually really like Rust and would like to see a proliferation of toolchain options, such as what’s in progress in GCC land. Overall, it would just be really nice to have an easily bootstrapped toolchain that a normal person can compile from scratch locally, although I don’t think it necessarily needs to be the default, or that using LLVM generally is an issue. However, it might be possible that no matter how you architect it, Rust is complicated enough that any sufficiently useful toolchain for the language could end up being a 500-ton clown car of some kind anyway.

                                                                                                                                                                                  1. 2

                                                                                                                                                                                    Depends on which parts of GP’s statement you care about: LLVM or bootstrap. Zig still depends on LLVM (for now), but it is no longer bootstrappable in a limited number of steps (because they switched from a bootstrap C++ implementation of the compiler to keeping a compressed WASM build of the compiler as a blob).

                                                                                                                                                                                    1. 2

                                                                                                                                                                                      Yep, although I would also add that it’s unfair to judge Zig on this matter now, given it’s such a young project that will clearly evolve a lot before the dust begins to settle (Rust is also young, but not nearly as young as Zig). In ten to twenty years, so long as we’re all still typing away on our keyboards, we might have a dozen Zig 1.0 implementations and a half dozen Zig 2.0 implementations!

                                                                                                                                                                              1. 6

                                                                                                                                                                                Yeah, the absurdly low code quality and toxic environment make me think that Linux is ripe for disruption. Not like anyone can produce a production kernel overnight, but maybe a few years of sustained work might see a functional, production-ready Rust kernel for some niche applications and from there it could be expanded gradually. While it would have a lot of catching up to do with respect to Linux, I would expect it to mature much faster because of Rust, because of a lack of cruft/backwards-compatibility promises, and most importantly because it could avoid the pointless drama and toxicity that burn people out and prevent people from contributing in the first place.

                                                                                                                                                                                1. 14

                                                                                                                                                                                  the absurdly low code quality

                                                                                                                                                                                  What is this, some kind of new meme? Where did you first hear it?

                                                                                                                                                                                  1. 22

                                                                                                                                                                                    From the thread in OP, if you expand the messages, there is wide agreement among the maintainers that all sorts of really badly designed and almost impossible to use (safely) APIs ended up in the kernel over the years because the developers were inexperienced and kind of learning kernel development as they went. In retrospect they would have designed many of the APIs very differently.

                                                                                                                                                                                    1. 4

                                                                                                                                                                                      Someone should compile everything to help future OS developers avoid those traps! There are a lot of existing non-POSIX experiments, though.

                                                                                                                                                                                    2. 14

                                                                                                                                                                                      It’s based on my forays into the Linux kernel source code. I don’t doubt there’s some quality code lurking around somewhere, but the stuff I’ve come across (largely filesystem and filesystem adjacent) is baffling.

                                                                                                                                                                                      1. 7

                                                                                                                                                                                        Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry has, if nothing else, taught me that online discussions are a huge game of Chinese whispers where most participants don’t have a clue what they are talking about.

                                                                                                                                                                                        1. 15

                                                                                                                                                                                          I doubt that maintainers are “only caring about their job security and keeping code bad”, but with all due respect: you’re also just pulling arguments out of thin air right now. What I do believe is what we have seen: pretty toxic responses from some people and a whole lot of issues trying to move forward.

                                                                                                                                                                                          1. 8

                                                                                                                                                                                            Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry

                                                                                                                                                                                            Huh, I’m not seeing any claim to this end from the GP, or did I not look hard enough? At face value, saying that something has an “absurdly low code quality” does not imply anything about nefarious motives.

                                                                                                                                                                                            1. 7

                                                                                                                                                                                                Indeed that remark wasn’t directly referring to GP’s comment, but rather to the range of confidently incorrect comments that I read in the previous episodes, and to the “gatekeeping greybeards” theme that can be seen elsewhere on this page. First occurrence, found just by searching for “old”: Linux is apparently “crippled by the old guard of C developers reluctant to adopt new tech”, to which GP replied in agreement in fact. Another one, maintainers don’t want to “do the hard work”.

                                                                                                                                                                                                Still, in GP’s case the Chinese whispers have reduced “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” to “absurdly low quality”. To which I ask: what is more likely? 1) That 30 million lines of code contain various levels of technical debt of which maintainers are aware, and that said maintainers are worried even about code where the technical debt is real but not causing substantial issues in practice? Or 2) that a piece of software gets to run on literally billions of devices of all sizes and prices just because it’s free and in spite of its “absurdly low quality”?

                                                                                                                                                                                                Linux is not perfect, neither technically nor socially. But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.

                                                                                                                                                                                                1. 11

                                                                                                                                                                                                  GP here: I probably should have said “shockingly” rather than “absurdly”. I didn’t really expect to get lawyered over that one word, but yeah, the idea was that for a software that runs on billions of devices, the code quality is shockingly low.

                                                                                                                                                                                                  Of course, this is plainly subjective. If your code quality standards are a lot lower than mine then you might disagree with my assessment.

                                                                                                                                                                                                  That said, I suspect adoption is a poor proxy for code quality. Internet Explorer was widely adopted and yet it’s broadly understood to have been poorly written.

                                                                                                                                                                                                  But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face

                                                                                                                                                                                                  I’m sure self-righteousness could get you to the same place, but in my case I arrived by way of experience. You can relax, I wasn’t attacking Linux—I like Linux—it just has a lot of opportunity for improvement.

                                                                                                                                                                                                  1. 5

                                                                                                                                                                                                    I guess I’ve seen the internals of too much proprietary software now to be shocked by anything about Linux per se. I might even argue that the quality of Linux is surprisingly good, considering its origins and development model.

                                                                                                                                                                                                    I think I’d lawyer you a tiny bit differently: some of the bugs in the kernel shock me when I consider how many devices run that code and fulfill their purposes despite those bugs.

                                                                                                                                                                                                    1. 7

                                                                                                                                                                                                      FWIW, I was not making a dig at open source software, and yes plenty of corporate software is worse. I guess my expectations for Linux are higher because of how often it is touted as exemplary in some form or another. I don’t even dislike Linux, I think it’s the best thing out there for a huge swath of use cases—I just see some pretty big opportunities for improvement.

                                                                                                                                                                                                  2. 4

                                                                                                                                                                                                    But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.

                                                                                                                                                                                                    Or actual benchmarks: the performance the Linux kernel leaves on the table in some cases is absurd. And sure it’s just one example, but I wouldn’t be surprised if it was representative of a good portion of the kernel.

                                                                                                                                                                                                    1. 3

                                                                                                                                                                                                      absurdly low quality

                                                                                                                                                                                                      Well, not quite, but still “considered broken beyond repair by many people related to life time management” - which is definitely worse than “hard to formalize” when “the way ever[y]body does it” seems to vary from user to user.

                                                                                                                                                                                                      1. 4

                                                                                                                                                                                                        I love Rust, but still, we’re talking about a language which (for good reasons!) considers doubly linked lists unsafe. Take an API that gets a 4 on Rusty Russell’s API design scale (“Follow common convention and you’ll get it right”), but which was designed for a completely different programming language if not paradigm, and it’s not surprising that it can’t easily be transformed into a 9 (“The compiler/linker won’t let you get it wrong”). But at the same time there are a dozen ways in which, according to the same scale, things could actually be worse!

                                                                                                                                                                                                        What I dislike is that people are seeing “awareness of complexity” and the message they spread is “absurdly low quality”.

                                                                                                                                                                                                        1. 13

                                                                                                                                                                                                          Note that doubly linked lists are not a special case at all in Rust. All the other common data structures like Vec, HashMap etc. also need unsafe code in their implementation.

                                                                                                                                                                                                          Implementing these data structures in Rust, and writing unsafe code in general, is indeed roughly a 4. But these are all already implemented in the standard library, with an API that actually is at a 9. And std::collections::LinkedList is constructive proof that you can have a safe Rust abstraction for doubly linked lists.

                                                                                                                                                                                                          Yes, the implementation could have bugs, thus making the abstraction leaky. But that’s the case for literally everything, down to the hardware that your code runs on.
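
                                                                                                                                                                                                          To make that split concrete: all of the pointer juggling is behind the API, so callers only ever see safe operations. For example:

                                                                                                                                                                                                              use std::collections::LinkedList;

                                                                                                                                                                                                              fn main() {
                                                                                                                                                                                                                  // LinkedList is implemented with unsafe code internally, but
                                                                                                                                                                                                                  // none of that leaks out: this usage cannot produce a dangling
                                                                                                                                                                                                                  // pointer no matter how the API is combined.
                                                                                                                                                                                                                  let mut list: LinkedList<u32> = LinkedList::new();
                                                                                                                                                                                                                  list.push_back(1);
                                                                                                                                                                                                                  list.push_front(0);
                                                                                                                                                                                                                  list.push_back(2);

                                                                                                                                                                                                                  assert_eq!(list.pop_front(), Some(0));
                                                                                                                                                                                                                  assert_eq!(list.iter().sum::<u32>(), 3);
                                                                                                                                                                                                              }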

                                                                                                                                                                                                          1. 4

                                                                                                                                                                                                            You’re absolutely right that you can build abstractions with enough effort.

                                                                                                                                                                                                            My point is that if a doubly linked list is (again, for good reasons) hard to make into a 9, a 20-year-old API may very well be even harder. In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition. That’s the conundrum that maintainers face and, if they realize that, it’s a good thing. I would be scared if maintainers handwaved that away.

                                                                                                                                                                                                            Yes, the implementation could have bugs, thus making the abstraction leaky.

                                                                                                                                                                                                            Bugs happen, but if the abstraction is downright wrong then that’s something I wouldn’t underestimate. A lot of the appeal of Rust in Linux lies exactly in documenting/formalizing these unwritten rules, and wrong documentation can be worse than no documentation (cue the negative parts of the API design scale!); even more so if your documentation is a formal model like a set of Rust types and functions.

                                                                                                                                                                                                            That said, the same thing can happen in a Rust-first kernel, which will also have a lot of unsafe code. And it would be much harder to fix in a Rust-first kernel than in Linux at a time when it’s just testing the waters.

                                                                                                                                                                                                            1. 7

                                                                                                                                                                                                              In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition.

                                                                                                                                                                                                              At the same time, it was included almost as half a joke, and nobody uses it, so there’s not a lot of pressure to actually finish the cursor API.

                                                                                                                                                                                                              It’s also not the kind of linked list the kernel would use, as they’d want an intrusive one.
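
                                                                                                                                                                                                              For anyone unfamiliar with the distinction: in an intrusive list the links are embedded in the element itself rather than in separately allocated nodes. A toy, ownership-friendly sketch of that embedding (singly linked for brevity; the kernel’s list_head is doubly linked, non-owning, and built on raw pointers, which is exactly where the unsafe discussion above bites):

                                                                                                                                                                                                                  // Toy intrusive-style list: the link field lives inside the element.
                                                                                                                                                                                                                  struct Job {
                                                                                                                                                                                                                      id: u32,
                                                                                                                                                                                                                      next: Option<Box<Job>>, // embedded link, no separate node allocation
                                                                                                                                                                                                                  }

                                                                                                                                                                                                                  struct JobList {
                                                                                                                                                                                                                      head: Option<Box<Job>>,
                                                                                                                                                                                                                  }

                                                                                                                                                                                                                  impl JobList {
                                                                                                                                                                                                                      fn push(&mut self, mut job: Box<Job>) {
                                                                                                                                                                                                                          job.next = self.head.take();
                                                                                                                                                                                                                          self.head = Some(job);
                                                                                                                                                                                                                      }

                                                                                                                                                                                                                      fn pop(&mut self) -> Option<Box<Job>> {
                                                                                                                                                                                                                          self.head.take().map(|mut job| {
                                                                                                                                                                                                                              self.head = job.next.take();
                                                                                                                                                                                                                              job
                                                                                                                                                                                                                          })
                                                                                                                                                                                                                      }
                                                                                                                                                                                                                  }

                                                                                                                                                                                                                  fn main() {
                                                                                                                                                                                                                      let mut list = JobList { head: None };
                                                                                                                                                                                                                      list.push(Box::new(Job { id: 1, next: None }));
                                                                                                                                                                                                                      list.push(Box::new(Job { id: 2, next: None }));
                                                                                                                                                                                                                      assert_eq!(list.pop().map(|j| j.id), Some(2));
                                                                                                                                                                                                                  }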

                                                                                                                                                                                                          2. 12

                                                                                                                                                                                                            And yet, safe-to-use doubly linked lists written in Rust exist. That the implementation needs unsafe is not a real problem. That’s how we should look at wrapping C code in safe Rust abstractions.
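
                                                                                                                                                                                                            The usual pattern looks something like this - the C functions here are hypothetical, declared only for illustration, so nothing will link without a C side - with the unsafe calls confined to one place and the type system enforcing the usage rules:

                                                                                                                                                                                                                use std::os::raw::c_int;

                                                                                                                                                                                                                // Hypothetical C API being wrapped (invented for this example).
                                                                                                                                                                                                                extern "C" {
                                                                                                                                                                                                                    fn device_open(id: c_int) -> c_int;
                                                                                                                                                                                                                    fn device_close(handle: c_int);
                                                                                                                                                                                                                }

                                                                                                                                                                                                                // Safe wrapper: a Device can only come from a successful open(),
                                                                                                                                                                                                                // and Drop guarantees close() runs exactly once.
                                                                                                                                                                                                                pub struct Device {
                                                                                                                                                                                                                    handle: c_int,
                                                                                                                                                                                                                }

                                                                                                                                                                                                                impl Device {
                                                                                                                                                                                                                    pub fn open(id: i32) -> Option<Device> {
                                                                                                                                                                                                                        // SAFETY: assumed callable with any id; negative means failure.
                                                                                                                                                                                                                        let handle = unsafe { device_open(id) };
                                                                                                                                                                                                                        if handle < 0 {
                                                                                                                                                                                                                            None
                                                                                                                                                                                                                        } else {
                                                                                                                                                                                                                            Some(Device { handle })
                                                                                                                                                                                                                        }
                                                                                                                                                                                                                    }
                                                                                                                                                                                                                }

                                                                                                                                                                                                                impl Drop for Device {
                                                                                                                                                                                                                    fn drop(&mut self) {
                                                                                                                                                                                                                        // SAFETY: handle came from a successful device_open.
                                                                                                                                                                                                                        unsafe { device_close(self.handle) }
                                                                                                                                                                                                                    }
                                                                                                                                                                                                                }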

                                                                                                                                                                                                            1. 3

                                                                                                                                                                                                              The whole comment you replied to, after the one sentence about linked lists, is about abstractions. And abstractions are rarely going to be easy, and are sometimes hardly possible.

                                                                                                                                                                                                              That’s just a fact. Confusing this fact for something as hyperbolic as “absurdly low quality” is a stunning example of the Dunning-Kruger effect, and frankly insulting as well.

                                                                                                                                                                                                              1. 9

                                                                                                                                                                                                                I personally would call Linux low quality because many parts of it are buggy as sin. My GPU stops working properly literally every other time I upgrade Linux.

                                                                                                                                                                                                                No one is saying that Linux is low quality because it’s hard or impossible to abstract some subsystems in Rust, they’re saying it’s low quality because a lot of it barely works! I would say that your “Chinese whispers” misrepresents the situation and what people here are actually saying. “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” doesn’t apply if no one can tell you how to use an API, and everyone does it differently.

                                                                                                                                                                                                                1. 3

                                                                                                                                                                                                                  I agree, Linux is the worst of all kernels.

                                                                                                                                                                                                                  Except for all the others.

                                                                                                                                                                                                                  1. 9

                                                                                                                                                                                                                    Actually, the NT kernel of all things seems to have a pretty good reputation, and I wouldn’t dismiss the BSD kernels out of hand. I don’t know which kernel is better, but it seems you do. If you could explain how you came to this conclusion that would be most helpful.

                                                                                                                                                                                                                    1. 10

                                                                                                                                                                                                                      NT gets a bad rap because of the OS on top of it, not because it’s actually bad. NT itself is a very well-designed kernel.

                                                                                                                                                                                                                      1. 3

                                                                                                                                                                                                                        *nod* I haven’t been a Windows person since shortly after the release of Windows XP (i.e. the first online activation DRM’d Windows) but, whenever I see glimpses of what’s going on inside the NT kernel in places like Project Zero: The Definitive Guide on Win32 to NT Path Conversion, it really makes me want to know more.

                                                                                                                                                                                                2. -1

                                                                                                                                                                                                  how low quality the Linux kernel code is

                                                                                                                                                                                                  Somewhere else it was mentioned that most developers in the kernel could just not be bothered with checking for basic things.

                                                                                                                                                                                                  how much burnout and misery is involved

                                                                                                                                                                                                  Nobody is forcing any of these people to do this.

                                                                                                                                                                                                3. 33

                                                                                                                                                                                                  I found the first reply on LKML to be very interesting.

                                                                                                                                                                                                  To quote:

                                                                                                                                                                                                  for lots of the in-kernel APIs, compile-time constraints enforcement to prevent misuse doesn’t matter, because those APIs don’t provide any way to be used safely. Looking at the two subsystems I know the best, V4L2 and DRM, handling the life time of objects safely in drivers isn’t just hard, it’s often plain impossible

                                                                                                                                                                                                  And

                                                                                                                                                                                                  in order to provide API that are possible to use correctly, we have many areas deep in kernel code that will require a complete redesign [..] I would be very surprised if I was working in the only area in the kernel that is considered broken beyond repair by many people related to life time management

                                                                                                                                                                                                  Which feels to me like there is a strong chicken-and-egg problem: to actually add any Rust bindings for certain kernel parts, you would first need to rewrite them, because there is apparently no actual defined way to call them safely.

                                                                                                                                                                                                  Which means it’s not about adding Rust; it’s about Rust being the reason to poke where it hurts. Potentially requiring a rewrite of hundreds of thousands of LOC to even start seeing any benefits. In a state like that, I wouldn’t blame any maintainer who told me they don’t actually know how that part of the code truly works.

                                                                                                                                                                                                  1. 29

                                                                                                                                                                                                    Yeah. Part of the drama has been the R4L folks trying to get subsystem maintainers in these areas to document the “right ways” to use the APIs so the Rust API can incorporate those rules, and some maintainers saying “just do it like that other filesystem and stop harassing us, you said you’d do all the work”. (At least that’s how they’re perceived.) But it’s not like they would let the R4L folks go in and rewrite that stuff, either.

                                                                                                                                                                                                    1. 43

                                                                                                                                                                                                      I recall Asahi Lina’s comments on drm_sched. Choice quotes:

                                                                                                                                                                                                      But the scheduler also keeps track of jobs, which reference their completion fences, so we have a lifetime loop. That loop is broken at certain points in the job lifecycle, but the fact it exists makes it very difficult to reason about the lifetimes of any of this stuff, and also makes it impossible to implement the requirements imposed by drm_sched via straight refcounting. If you try to refcount the scheduler and have the hw fence hold a reference to it, then the whole thing deadlocks, because the job completion fence might have its final reference dropped by the scheduler itself (when a job is cleaned up after completion), which would lead to trying to free the scheduler from the scheduler workqueue itself.

                                                                                                                                                                                                      So now your driver needs to implement some kind of deferred cleanup workqueue to free schedulers possibly forever in the future. And also your driver module might be blocked from unloading from the kernel forever, because if any buffers hold on to job completion fences, that means your driver can’t unload due to the dependency.

                                                                                                                                                                                                      I fixed it so that tearing down the scheduler gracefully aborts all jobs and detaches the hardware callbacks (it can’t abort the underlying hardware jobs, but it can decouple them from the scheduler side). In my driver’s case, that all works beautifully because my driver internals are basically reference counted everywhere, so while the scheduler and high-level queue can be destroyed, any currently running jobs continue to run to completion or failure and their underlying driver resources get cleaned up then, asynchronously.

                                                                                                                                                                                                      The maintainer rejected the patch, and said it was the driver’s job to ensure that the scheduler outlives job execution.

                                                                                                                                                                                                      But the scheduler owns the jobs lifetime-wise after you submit them, so how would that work? It doesn’t. If you try to introduce a job->scheduler reference, you’re creating a loop again, and the scheduler deadlocks when it frees a job and tries to tear itself down from within.

                                                                                                                                                                                                      So now we’re back at having to introduce an asynchronous cleanup workqueue or similar, just to deal with the DRM scheduler’s incredibly poor lifetime design choices.

                                                                                                                                                                                                      If I remember correctly, most C drivers that use drm_sched do not get this right, but it doesn’t come up much because most people aren’t trying to shut down their GPUs other than when they’re shutting off their computers, unless they’re using an eGPU (and eGPUs are notoriously semi-broken on Linux). Lina’s M1 GPU driver uses a scheduler per GPU context (/per application), hence schedulers are torn down whenever graphical applications are closed, so her driver couldn’t just ignore the complexity like most other drivers appear to do.
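
                                                                                                                                                                                                      The lifetime loop described above reduces to a familiar shape. A toy model in plain Rc/Weak terms (invented types, nothing to do with the actual drm_sched code): with strong references in both directions, neither the scheduler nor the jobs can ever be freed; making the back-reference weak - or detaching it at teardown, as the rejected patch did - breaks the cycle.

                                                                                                                                                                                                          use std::cell::RefCell;
                                                                                                                                                                                                          use std::rc::{Rc, Weak};

                                                                                                                                                                                                          // The scheduler owns its jobs...
                                                                                                                                                                                                          struct Scheduler {
                                                                                                                                                                                                              jobs: RefCell<Vec<Rc<Job>>>,
                                                                                                                                                                                                          }

                                                                                                                                                                                                          // ...and each job (via its completion fence) points back at the
                                                                                                                                                                                                          // scheduler. Weak breaks the cycle: the job can reach the scheduler
                                                                                                                                                                                                          // while it is alive, without keeping it alive forever.
                                                                                                                                                                                                          struct Job {
                                                                                                                                                                                                              scheduler: Weak<Scheduler>,
                                                                                                                                                                                                          }

                                                                                                                                                                                                          fn main() {
                                                                                                                                                                                                              let sched = Rc::new(Scheduler { jobs: RefCell::new(Vec::new()) });
                                                                                                                                                                                                              let job = Rc::new(Job { scheduler: Rc::downgrade(&sched) });
                                                                                                                                                                                                              sched.jobs.borrow_mut().push(job);

                                                                                                                                                                                                              // The back-reference works while the scheduler lives...
                                                                                                                                                                                                              assert!(sched.jobs.borrow()[0].scheduler.upgrade().is_some());

                                                                                                                                                                                                              // ...but the strong count stays at 1, so dropping `sched` frees
                                                                                                                                                                                                              // everything. Had Job held an Rc<Scheduler>, the count would be 2
                                                                                                                                                                                                              // and the whole structure would leak: the "lifetime loop".
                                                                                                                                                                                                              assert_eq!(Rc::strong_count(&sched), 1);
                                                                                                                                                                                                          }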

                                                                                                                                                                                                    2. 20

                                                                                                                                                                                                      Those statements just come across to me as “we built something unmaintainable and now I don’t want to maintain it”, i.e., a way to avoid doing the hard work.

                                                                                                                                                                                                      https://lore.kernel.org/rust-for-linux/Z7SwcnUzjZYfuJ4-@infradead.org/

                                                                                                                                                                                                      So we’ll have these bindings creep everywhere like a cancer and are very quickly moving from a software project that allows for and strives for global changes that improve the overall project to increasing compartmentalization [2].

                                                                                                                                                                                                      Because the cancer metaphor worked so well for Hellwig the last time he used it…

                                                                                                                                                                                                      1. 3

                                                                                                                                                                                                        I wouldn’t blame anyone for that. The road to hell is paved with good intentions. And most of the people maintaining it now probably didn’t start it.

                                                                                                                                                                                                        1. 16

                                                                                                                                                                                                          If they’re a paid maintainer, then it’s their job to do just that. Hellwig is a guy who has explicitly said he doesn’t want any other maintainers.

                                                                                                                                                                                                          1. 4

                                                                                                                                                                                                            I think you’re underestimating how many years it would take to replace some of this code, let alone verify it actually works on the real hardware without random crashes (as we’ve seen in other reports about new CPU architectures playing Heisenbug). Sure you would want to do that eventually - but I don’t want to be the one telling everyone I’m gonna freeze features until this is done, with potentially more bugs when it’s finished.

                                                                                                                                                                                                            1. 12

                                                                                                                                                                                                              It’s one thing to say “I don’t have the time to fix this”, it’s another to reject a proposed fix (see the drm_sched comment above) or to prevent other people from working on fixes elsewhere in the tree (Hellwig). You don’t have to freeze your feature work when other people are working on fixes and refactorings.

                                                                                                                                                                                                            2. 3

                                                                                                                                                                                                              How long are you willing to wait for an updated Linux kernel? It may not be “we are unwilling to do maintenance” so much as “this is a lot of major work where intermediate steps might not be usable.”

                                                                                                                                                                                                              1. 9

                                                                                                                                                                                                                There’s a reason it’s called technical debt. It only gets worse the longer you put it off.

                                                                                                                                                                                                                1. 2

                                                                                                                                                                                                                  So I ask again: how long are you willing to wait for an updated Linux kernel with less technical debt?

                                                                                                                                                                                                                  1. 25

                                                                                                                                                                                                                    You’re treating it as a false dichotomy and trying to paint me uncharitably. Stop that.

                                                                                                                                                                                                                    1. 8

                                                                                                                                                                                                                      If they’re a paid maintainer, then it’s their job to do just that.

For my personal projects, I can “pay myself” to address technical debt. And I have, because I’m the only user of my code and thus have final say in what it does and how it works. At my previous job, any attempt to address technical debt of the project (that I had been working on for over a decade, pretty much from the start of it) would have been shut down immediately as being too risky, despite the 17,000+ tests [1].

                                                                                                                                                                                                                      Where do the incentives come in to address technical debt in the Linux kernel? Is that a better way to ask the question?

[1] Thanks to new management. At one point, my new manager reverted the code I had rewritten to address some minor technical debt back to the original code plus the minimum needed to get it working, because the rewrite was deemed “too risky”.

                                                                                                                                                                                                                      1. 5

                                                                                                                                                                                                                        At my previous job, any attempt to address technical debt of the project (that I had been working on for over a decade, pretty much from the start of it) would have been shut down immediately as being too risky, despite the 17,000+ tests

                                                                                                                                                                                                                        Seems to confirm the point I made here:

                                                                                                                                                                                                                        But go explain to your boss who just saw a working prototype, that you need a couple more days to design an alternate implementation, that may or may not be included in the final product. That you still need a couple more automated tests just to make sure. That you’ll take this slow approach now and forever, pinkie promise that’s how we’ll ship sooner.

                                                                                                                                                                                                                        […] So unless I have cause to believe my boss understands those things, I just decide such menial considerations are below their pay grade, and tell them I am not done yet.

                                                                                                                                                                                                                        But if I can’t even get around the morons up top, I’m out pretty quick, one way or another.

                                                                                                                                                                                                                2. 4

The kernel has happily managed major API rewrites before, either merging the changes bit by bit or maintaining both versions in tree until the old one is ripe for deletion. And through the magic of git and community effort, none of that has to delay the release of new kernels.

                                                                                                                                                                                                                    1. 2

                                                                                                                                                                                                                      Ah, thanks. The meaning I take from that statement is not the same meaning I took from your comment.

                                                                                                                                                                                                                      I’m trying to see what you were getting at. What did you mean by “just that”?

                                                                                                                                                                                                                      1. 12

                                                                                                                                                                                                                        I’m trying to see what you were getting at. What did you mean by “just that”?

                                                                                                                                                                                                                        Doing the design work to create safe-to-use APIs with lifetimes considered is part of the work of the maintainer in my view because they should have the best perspective to do so. They got it into that state, they can get it out of that state. Whining that it’s hard work shouldn’t be acceptable as a reason to not do the work.

                                                                                                                                                                                                                        1. 2

                                                                                                                                                                                                                          Doing the design work to create safe-to-use APIs with lifetimes considered is part of the work of the maintainer in my view because they should have the best perspective to do so. They got it into that state, they can get it out of that state.

                                                                                                                                                                                                                          I’m not aware of any precedent for something like this so maybe there’s a way in which you’re right. But there seems to be a contradiction on whether you think we should defer to their judgement.

                                                                                                                                                                                                                          Whining that it’s hard work shouldn’t be acceptable as a reason to not do the work.

                                                                                                                                                                                                                          I don’t agree with that. I accept the RfL side’s refusal to build and test their own OS or Linux fork, for example.

                                                                                                                                                                                                                          1. 3

I think their judgment is separate from their capability. I don’t think any of these maintainers are fundamentally incompetent people. I’m not sure they need mentorship on building APIs with regard to lifetimes, because they should already be aware that memory has lifetimes everywhere, implicitly, in their code.

                                                                                                                                                                                                                              1. 1

Because if you trust them to define “lifetimes,” doesn’t that mean you trust them to estimate the amount of time before such a point when the costs of changing the API outweigh the benefits? Yet you don’t trust their estimation of the costs imposed by the practice and the amount of extra work it would take before it yields benefits?

                                                                                                                                                                                                              2. 18

                                                                                                                                                                                                                Which means it’s not about adding rust, it’s about rust being the reason to poke where it hurts.

                                                                                                                                                                                                                Good point! This is something I noticed in a previous job as well, where we introduced computer assistance to existing manual workflows. Apparently the real reason for resistance from the workers was that in the course of this computerization, their “traditional” workflows would be documented and maybe even evaluated before they could be encoded in a computer program. But IIRC this reason was never said out loud by anyone – some developers realized this reason on their own and adjusted their approach, but some didn’t realize this and wondered about the constant pushback.

And maybe to the managers of those workers the computerization was not even the real goal, but the real improvement was supposed to come from the “inventorization” of existing workflows. In a similar way, while the Rust devs want Rust to enter the kernel, maybe some progressive Linux devs see Rust “just” as a vehicle to make Linux internals more strict and more understandable, and introducing Rust is maybe just a happy side effect of this.

                                                                                                                                                                                                                1. 20

maybe some progressive Linux devs see Rust “just” as a vehicle to make Linux internals more strict and more understandable, and introducing Rust is maybe just a happy side effect of this.

                                                                                                                                                                                                                  This is the entire situation from the start. Like this is what Rust for Linux is.

                                                                                                                                                                                                                  1. 5

Absolutely, and also people do not recognize how unforgiving Rust is to even normal APIs that you would use in C (insert joke about linked lists), and what constraints the word “safe” carries when applied to Rust.

Not being able to devise a “safe” Rust abstraction doesn’t mean that the API must be a source of insecurity. Certainly it isn’t a great start, I will grant that, but in C you will generally find that most code converges on the same patterns that work. The maintainers, however, recognize that this is not the way to introduce a semi-formal definition of how the API operates, and are worried that it may not be possible at all. This is being aware of the environment and the complexity that comes from 20-30 years of development in C; it’s not wanting to “avoid doing the hard work”.

(For another example, https://lobste.rs/s/hdj2q4/greg_kroah_hartman_makes_compelling_case#c_f5pzow shows how an API can start causing problems when you use it differently, and how one might want to use it differently given the extra confidence a better programming language provides).

                                                                                                                                                                                                                    1. 10

                                                                                                                                                                                                                      That comment shows that the API is poorly designed and fixing it was rejected by a maintainer, though?

                                                                                                                                                                                                                      1. 4

                                                                                                                                                                                                                        The maintainer rejected the fix because (according to him) the API was not poorly designed, but simply not supposed to be used like that (literal quote: “this functionality here isn’t made for your use case”). Which makes sense and is consistent with what I wrote above: the maintainer is conscious of the limits of C and does not want the API to be used in ways that were not anticipated, whereas the Rust developer is more confident because of the more powerful compile-time checks.

                                                                                                                                                                                                                        Not knowing the code I cannot understand the tradeoffs involved in the fix. I can’t say whether the maintainer was too cautious, and obviously the failure mode (use after free) is anything but great. My point is that, as you look more in depth, you can see that people actually do put thought in their decisions, but clashes can and will happen if they evaluate the tradeoffs differently.

As an aside: drm_sched is utility code, not a core part of the graphics stack, so for now the solution to Lina’s issue is going to be a different scheduler written in (safe) Rust. Since it appears that there are going to be multiple Rust graphics drivers soon, they might be able to use Lina’s scheduler, and there will be more data points to compare “reuse C code as much as possible” vs “selectively duplicate infrastructure”; see also https://fosstodon.org/@airlied/113052975389174835. Remember that abstracting C to Rust is neither little code nor easy code, so it’s not unexpected that in some cases duplication will be easier.

                                                                                                                                                                                                                        1. 23

                                                                                                                                                                                                                          The maintainer rejected the fix because (according to him) the API was not poorly designed, but simply not supposed to be used like that (literal quote: “this functionality here isn’t made for your use case”). Which makes sense and is consistent with what I wrote above: the maintainer is conscious of the limits of C and does not want the API to be used in ways that were not anticipated, whereas the Rust developer is more confident because of the more powerful compile-time checks.

                                                                                                                                                                                                                          This is not a good summary of the situation, though to be fair, some details are buried on Reddit. First of all, what she was doing was the correct approach to dealing with that hardware, according to multiple other DRM maintainers. Lina’s patch actually would have made existing drivers written in C less buggy, because that maintainer was in fact not conscious of the limits of C. drm_sched has very annoying and complex lifetime requirements that are easy to mess up in C, and her patch would’ve simplified them.

Relevant excerpts from what Lina said on Reddit:

                                                                                                                                                                                                                          The only thing I proposed was making it valid to destroy a scheduler with jobs having not completed. I just added cleanup code to handle an additional case. It was impossible for that change to affect any existing driver that followed the existing implied undocumented rule that you have to wait for all jobs to complete before destroying the scheduler.

                                                                                                                                                                                                                          The scheduler is in charge of job lifetimes by design. So this change makes perfect sense. Enforcing that all jobs complete before scheduler destruction would require tracking job lifetimes in duplicate outside the scheduler, it makes no sense. And you can’t fix it by having a simple job->scheduler ref either, because then the scheduler deadlocks on the last job when it tries to free itself from within.

                                                                                                                                                                                                                          The only reason this doesn’t crash all the time for other GPU drivers is because they use a global scheduler, while mine uses a per-queue scheduler (because Apple’s GPU uses firmware scheduling, and this is the correct approach for that, as discussed with multiple DRM folks). A global scheduler only gets torn down when you unplug the GPU (ask eGPU users how often their systems crash when they do that… it’s a mess). A per-queue scheduler gets torn down any time a process using the GPU shuts down, so all the time. So I can’t afford that codepath to be broken.

                                                                                                                                                                                                                          Here it should be noted that it was not really a case of using something in a new way. The only difference is how many users are affected. Most people don’t unplug their GPU, so the fact that GPU unplugging is broken with many drivers is easy to sweep under the rug. But since it affects every user trying to use Lina’s driver, the problem can’t be swept under the rug and just be ignored.

                                                                                                                                                                                                                          And again, my scheduler change absolutely did not change the behavior for existing drivers at all (unless they were already broken, and then it could strictly improve things). That was provable.

                                                                                                                                                                                                                          I consulted with multiple DRM folks about how to design this, including actual video meetings, and was told this was the correct approach.

                                                                                                                                                                                                                2. 20

I am working on a project that is using SQLite on the server. A lot of these things are issues because of assumed scale, and that’s reasonable, but in my case, I know my app is an internal tool that will have at most 10 or 15 users total. It’s all just running on one VM. In this case, I’m choosing SQLite for sort of the inverse of a lot of these reasons: it meaningfully simplifies things compared to Postgres.

                                                                                                                                                                                                                  1. 2

The most compelling reason for me is how easy it is to test migrations when you don’t need a full DBMS to do any dev work.
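To make that concrete, a minimal sketch (migrations.sql is a stand-in for whatever your migration tooling produces):

    # apply the migrations to a throwaway in-memory database; nothing to install or start
    sqlite3 ':memory:' < migrations.sql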

                                                                                                                                                                                                                    1. 2

                                                                                                                                                                                                                      Can you elaborate on the simplification? I just started a prototype on a VM using Postgres and all I had to do was apt install postgresql and set up the user/role. I’m not very familiar with SQLite, but you still have to install it and then explicitly opt into the constraint enforcement stuff (at a minimum) so it seems to be a pretty comparable amount of work? And then if/when you need a database migration, Postgres has much better support for things like ALTER TABLE than SQLite. What am I missing?
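(For what it’s worth, the opt-in being referred to is, I believe, mainly foreign keys, which SQLite only enforces per connection; app.db is a made-up name:)

    # foreign-key enforcement is off by default and must be enabled on each connection
    sqlite3 app.db <<'SQL'
    PRAGMA foreign_keys = ON;
    -- ...statements on this connection now run with constraints enforced...
    SQL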

                                                                                                                                                                                                                      1. 2

Running Postgres and backing up its data is considerably more difficult. With SQLite, backing up is copying the file somewhere, and as for running it: there’s nothing to run.

At the scale mentioned above you also won’t be dealing with many database migrations once you’re stable, I think, so why bother?
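To make the backup half concrete, a rough sketch (database names are placeholders):

    # SQLite: one consistent snapshot of the database file
    sqlite3 app.db ".backup backup.db"

    # Postgres: go through the server rather than copying files directly
    pg_dump --format=custom --file=backup.dump myapp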

                                                                                                                                                                                                                        1. 1

Is backing up a directory rather than a file really “considerably more difficult”?

                                                                                                                                                                                                                          so why bother?

                                                                                                                                                                                                                          For me, because I know Postgres better and I dislike having to remember which options I need to turn on to make SQLite enforce constraints. It’s also painful when I do run into issues with migrations, or when I need some other feature it doesn’t have, or when that app that “will only ever run on one machine” needs to eventually run on multiple machines.

Even on a single machine, Postgres installs with one command, supports constraints by default, and backups are just copying the data directory. Maybe I’m missing something, but it seems easier at any scale?

                                                                                                                                                                                                                        2. 2

                                                                                                                                                                                                                          There’s no additional process running for the database. It’s just a file.

The ALTER TABLE thing is annoying, but I haven’t actually needed it yet. It also doesn’t affect every migration, just the ones where you want to change a column type. Not a super common thing for me.
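For anyone who hasn’t hit it: changing a column type in SQLite means rebuilding the table. A sketch with made-up table/column names:

    sqlite3 app.db <<'SQL'
    BEGIN;
    -- SQLite can't change a column's type in place, so recreate and copy
    CREATE TABLE users_new (id INTEGER PRIMARY KEY, age INTEGER);
    INSERT INTO users_new (id, age) SELECT id, CAST(age AS INTEGER) FROM users;
    DROP TABLE users;
    ALTER TABLE users_new RENAME TO users;
    COMMIT;
    SQL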

                                                                                                                                                                                                                      2. 2

I have been semi-seriously using jj for a couple of months (I tried it earlier last year and bounced off pretty hard, so this is my second attempt). I think where I’ve landed is that I like it a lot in theory, and the git integration is basically required if I’m collaborating with anybody else, but in practice I find it very difficult to use and often have to drop back into git to get something done.

                                                                                                                                                                                                                        I also recognize that it’s still an early project and I haven’t had as much time to put into learning it as would be ideal, so this shouldn’t be read as a critique of the project in any way. I really hope it continues to grow and gain popularity!

                                                                                                                                                                                                                        1. 3

                                                                                                                                                                                                                          What are the things that you’re dropping into git for, out of interest?

                                                                                                                                                                                                                          1. 3

                                                                                                                                                                                                                            Two things:

1. not knowing how to do something in jj and not being able to (or not having the time to) figure it out from the docs - I forget the exact details, but there was some change I made recently where I couldn’t get jj to do what I wanted, so I ended up just recloning the repo in a separate directory and making the change with git

2. I only have jj installed and set up in a handful of repos, so I’m constantly context-switching between which repo I’m in and how to make it work.

The second problem is hypothetically solvable by just switching everywhere, but the fact that I don’t understand it and have run into situations like (1) makes me nervous.

Edit: Oh actually I remember now: somehow I’d gotten jj into a state where it assigned the same stable id to two different commits, and it was very unhappy about that, and I couldn’t figure out how to fix it

                                                                                                                                                                                                                            1. 3

If you’re willing to use Discord, the jj Discord is very friendly to questions, even beginner ones.

                                                                                                                                                                                                                              The docs do need work. It’ll get there.

somehow I’d gotten jj into a state where it assigned the same stable id to two different commits, and it was very unhappy about that, and I couldn’t figure out how to fix it

                                                                                                                                                                                                                              This is called a “divergent change” https://jj-vcs.github.io/jj/latest/FAQ/#how-do-i-deal-with-divergent-changes-after-the-change-id and as the FAQ says, if you only want one of them, abandon the other, and if you want both, duplicate one so it gets a new change id, and then abandon the one with the duplicate change id. Hopefully that can help you for next time!
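Roughly, the two options from the FAQ look like this (commit ids are placeholders):

    jj log                     # the divergent copies show up sharing one change id
    jj abandon <commit-id>     # option 1: keep only the copy you want
    # option 2: keep both, by giving one a fresh change id first
    jj duplicate <commit-id>
    jj abandon <commit-id>     # then drop the original divergent copy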

                                                                                                                                                                                                                        2. 8

I’ve been having similar feelings: while git feels a bit like a bloated mess, as a Magit (/gitu) user I have not really seen where jj solves a significant problem for me.

                                                                                                                                                                                                                          I also like the same workflow of making a bunch of changes, then coalescing them into commits. I actually think the “decide on a change, then implement it” flow is nicer, almost Pomodoro-esque, but it’s not how I work in practice.

                                                                                                                                                                                                                          Otherwise I have years of accumulated git fixing experience that doesn’t help me, and commit signing is painfully slow which isn’t great for jj’s constant committing.

I do hope we get some progress in the space; Pijul seemed promising, but I just don’t see the value for me personally at this point.

                                                                                                                                                                                                                          1. 5

                                                                                                                                                                                                                            Otherwise I have years of accumulated git fixing experience that doesn’t help me, and commit signing is painfully slow which isn’t great for jj’s constant committing.

I’ve been using git practically since it was initially released; I have a 4-digit GitHub user id. I’m on my 3rd or 4th attempt to switch to JJ. Breaking habits that are so old is really painful. Also, there’s a bunch of nice things that JJ could have but still doesn’t, and one can tell right away that the project is still early. Still, I hope it will be worth it…

In git, managing a lot of commits is just too inconvenient, and despite all the deficiencies, a lot of people (me included) can immediately see the potential of the whole philosophy.

                                                                                                                                                                                                                            commit signing is painfully slow

                                                                                                                                                                                                                            With git I use gpg on yubikey, so I need to touch the yubikey for every signature (paranoia hurts), and I had to just disable it in JJ, because it’s unworkable.

I hope eventually JJ will have the ability to sign only when switching away from a change and/or when pushing changes. Similarly, I’m missing git pre-commit hooks. I hope JJ will soon have a hook system optimized for that flow.

                                                                                                                                                                                                                            I also really, really miss the git rebase -i.

                                                                                                                                                                                                                            Pijul seemed promising

                                                                                                                                                                                                                            Pijul seems great in theory, but JJ wins by being (at least initially) fully compatible with Git. One can’t ignore network effects as strong as Git has at this point.

                                                                                                                                                                                                                            1. 9

                                                                                                                                                                                                                              I hope eventually JJ will have ability to sign only when switching away from a change and/or when pushing changes.

                                                                                                                                                                                                                              Rejoice! This feature was released last week in v0.26.0: https://jj-vcs.github.io/jj/latest/config/#sign-commits-only-on-jj-git-push
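If I’m reading that page right, the setup is something like this (config keys per the linked docs; treat this as a sketch, not gospel):

    # don't sign while working locally...
    jj config set --user signing.behavior drop
    # ...but sign commits when they're pushed with `jj git push`
    jj config set --user git.sign-on-push true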

                                                                                                                                                                                                                              1. 2
                                                                                                                                                                                                                              2. 3

                                                                                                                                                                                                                                With git I use gpg on yubikey

                                                                                                                                                                                                                                Same here, but with a single unlock. I’ve been experimenting with signing through SSH on Yubikey instead, which seems to be somewhat faster. I guess GPG is also just waiting to get replaced by something that isn’t as user-hostile. I get a pit in my stomach every time I have to fix anything related to it.

                                                                                                                                                                                                                                One can’t ignore network effects as strong as Git has at this point.

This actually reminds me: I really like the fact that branches aren’t named, but in reality we’re all pushing to GitHub, which means you need to name your branches after all, and JJ even adds the extra step of moving your branch/bookmark before every push, which I thought was a bit of a drag.

                                                                                                                                                                                                                                1. 6

                                                                                                                                                                                                                                  JJ even adds the extra step of moving your branch/bookmark before every push, which I thought was a bit of a drag.

People are discussing big ideas like “topics” that would bring back some of the auto-forwarding functionality of branches. In the meantime, I found that making a few aliases is perfectly fine. This one in particular is great:

                                                                                                                                                                                                                                  tug = ["bookmark", "move", "--from", "heads(::@- & bookmarks())", "--to", "@-"]
                                                                                                                                                                                                                                  

                                                                                                                                                                                                                                  It finds the closest ancestor with a bookmark and moves it to the parent of your working-copy commit. For me, pushing a single branch is usually jj tug ; jj p

                                                                                                                                                                                                                                  I also have a more advanced alias that creates a new commit and places it at the tip of an arbitrary bookmark in a single step. This is great in combination with the megamerge workflow and stacked PRs:

                                                                                                                                                                                                                                  cob = ["util", "exec", "--", "bash", "-c", """
                                                                                                                                                                                                                                  #!/usr/bin/env bash
                                                                                                                                                                                                                                  set -euo pipefail
                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                  # usage: jj cob BOOKMARK_NAME [COMMIT_ARGS...]
                                                                                                                                                                                                                                  
if test -z "${1+x}"
                                                                                                                                                                                                                                  then
                                                                                                                                                                                                                                      echo "You need to specify a bookmark onto which to place the commit!"
                                                                                                                                                                                                                                      exit 1
                                                                                                                                                                                                                                  fi
                                                                                                                                                                                                                                  target_bookmark="$1" ; shift
                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                  jj commit "$@"
                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                  change_id="$(jj --ignore-working-copy log --revisions @- --no-graph --template change_id)"
                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                  jj --ignore-working-copy rebase --revisions "$change_id" --insert-after "$target_bookmark"
                                                                                                                                                                                                                                  jj --ignore-working-copy bookmark move "$target_bookmark" --to "$change_id"
                                                                                                                                                                                                                                  """, ""]
                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                  1. 2

                                                                                                                                                                                                                                    Oh, welcome, fellow gpg-sufferer.

                                                                                                                                                                                                                                    I’ve been experimenting with signing through SSH on Yubikey instead

I have not looked into it. Does it work? I would definitely consider it. Just not having to touch gpg is always a plus.

                                                                                                                                                                                                                                    extra step of moving your branch/bookmark before every push, which I thought was a bit of a drag.

Same. I’m looking forward to some automatically-moving bookmarks.

                                                                                                                                                                                                                                    1. 4

                                                                                                                                                                                                                                      I’ve been experimenting with signing through SSH on Yubikey instead

                                                                                                                                                                                                                                      I have not looked into it. Does it work? I would definitely consider. Just not having to touch gpg is always a plus.

                                                                                                                                                                                                                                      It generally does, and seems well enough supported (mainly GitHub). It does require registering the key as an allowed signing key in a separate file in ~/.ssh, but I can live with that I guess.

                                                                                                                                                                                                                                      I followed https://calebhearth.com/sign-git-with-ssh, and that worked just fine with my key from the Yubikey.
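For reference, what that article sets up boils down to roughly this (key paths and email are examples):

    # tell git to sign with an SSH key instead of GPG
    git config --global gpg.format ssh
    git config --global user.signingkey ~/.ssh/id_ed25519.pub
    git config --global commit.gpgsign true

    # verification needs an allowed-signers file listing the key
    echo "you@example.com $(cat ~/.ssh/id_ed25519.pub)" >> ~/.ssh/allowed_signers
    git config --global gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers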

                                                                                                                                                                                                                                  2. 3

                                                                                                                                                                                                                                    I also really, really miss the git rebase -i.

Note that you can still use git rebase -i in a colocated repo. Make sure to add the --colocate flag to jj git clone and jj git init. All my repos are always colocated; it makes all the git tooling work out-of-the-box.

                                                                                                                                                                                                                                    The next time you run jj after a git rebase, it will import those changes from the git repo, so everything just works. With one big exception: jj will not be able to correlate the old commit hashes with the new ones, so you lose the evolog of every commit that changed its hash. But then again, git doesn’t have an evolog in the first place, so you’re not losing anything compared to the baseline.

                                                                                                                                                                                                                                    It could be a fun side-project to add the git-like rebase UI to jj. A standalone binary that opens your editor with the same git rebase todo list, parses its content and submits equivalent jj rebase commands instead of letting git do it.
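A minimal sketch of that round trip (the branch name is a placeholder):

    # colocate so .git and .jj share the same working copy
    jj git init --colocate
    # ...use git tooling as usual...
    git rebase -i main
    # the next jj command (any command) imports the rewritten commits
    jj status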

                                                                                                                                                                                                                                    1. 1

                                                                                                                                                                                                                                      It could be a fun side-project to add the git-like rebase UI to jj. A standalone binary that opens your editor with the same git rebase todo list, parses its content and submits equivalent jj rebase commands instead of letting git do it.

                                                                                                                                                                                                                                      Any particular reason you suggest a separate tool, rather than something to contribute to jj upstream? I also wouldn’t mind a convenient tool to reorder changes, and I would rather have it integrated in jj rebase directly. (This could also be a TUI instead, whatever.)

There are other interactive modes that git supports, for example the usual -p workflows with a basic TUI to approve/reject patches in the terminal (as an alternative to jj’s split interface relying on a side-by-side diff tool), which I wouldn’t mind seeing supported in jj as well. (I don’t think it’s a matter of keeping the builtin feature set minimalistic, given that the tool already embeds a builtin pager, etc.)

                                                                                                                                                                                                                                      1. 1

                                                                                                                                                                                                                                        Any particular reason you suggest a separate tool, rather than something to contribute to jj upstream? I also wouldn’t mind a convenient tool to reorder changes, and I would rather have it integrated in jj rebase directly. (This could also be a TUI instead, whatever.)

                                                                                                                                                                                                                                        I think nobody has yet invented a text-based interface for general graph editing:

                                                                                                                                                                                                                                        • For example: It’s not obvious how to intuitively represent divergent branches in the git rebase -i interface.
• Ideally, there would be a simple text-oriented way to express the topology (via indentation, maybe? see the hypothetical sketch after this list), rather than relying on label/reset commands in git rebase -i.
                                                                                                                                                                                                                                          • (I don’t think git rebase -i can currently initiate edits which would affect multiple branches (with no common descendant), so you incidentally don’t encounter this situation that often in practice.)
                                                                                                                                                                                                                                        • For example: The “mega-merge” workflow, which involves continually rewriting a set of commits and re-merging them, seems particularly painful in the git rebase -i interface.
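To make the indentation idea concrete, here is a purely hypothetical todo format for two branches diverging from a common base (no tool implements this today; the rule sketched here is that consecutive lines form a chain, and a blank line starts a sibling branch under the same parent):

    pick aaa111 base commit

      pick bbb222 topic-one, first commit
      pick ccc333 topic-one, second commit

      pick ddd444 topic-two, only commit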

                                                                                                                                                                                                                                        A non-text-based TUI might work (perhaps based on GitUp’s control scheme?), but nobody has implemented that, either.

There are other interactive modes that git supports, for example the usual -p workflows with a basic TUI to approve/reject patches in the terminal (as an alternative to jj’s split interface relying on a side-by-side diff tool), which I wouldn’t mind seeing supported in jj as well.

                                                                                                                                                                                                                                        If you’re only relying on approving/rejecting patches (and not e.g. editing hunks), then you should already be able to do this with jj’s :builtin TUI (see https://jj-vcs.github.io/jj/v0.26.0/config/#editing-diffs).
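For reference, per those docs it’s a single config entry:

    # ~/.config/jj/config.toml
    [ui]
    diff-editor = ":builtin"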

                                                                                                                                                                                                                                        1. 2

A naive idea for a TUI would be to reuse the current terminal-oriented visualization of jj log or git log --oneline --graph: one change per line, with an ascii-art rendering on the left showing each commit’s position in the graph. In the TUI we could move the cursor to any commit and use simple commands to move it “up” or “down” within its own linear history (the default case), and other commands to move it to another branch displayed in parallel to the current one (or maybe to another parent or child of the current node). Of course, this visualization allows more than just moving commits: arbitrary interactive-rebase-style operations could be performed, or maybe just prepared, in this view.

                                                                                                                                                                                                                                          You should already be able to do this with jj’s :builtin TUI

                                                                                                                                                                                                                                          Woah, thanks! Turns out I read the doc before trying jj split, and I dutifully followed the recommendation to go with meld from the start, so I never got to try this.

                                                                                                                                                                                                                                          1. 2

                                                                                                                                                                                                                                            Replying to myself: it looks like jjui offers a workflow similar to the text-based graph editing I described above, see their demonstration video for Rebase.

                                                                                                                                                                                                                                        2. 1

                                                                                                                                                                                                                                          (I don’t think it’s a matter of keeping the builtin feature set minimalistic, given that the tool already embeds a builtin pager, etc.)

                                                                                                                                                                                                                                          I do think it would bloat the UI. And I have a hunch that the maintainers would see it the same way, but do feel free to open a feature request! It’s always good to talk these things through.

My opinion is that the rebase todo list workflow is not very good and doesn’t fit with the way jj does things. When you edit the todo file, you are still making several distinct operations (reorder, squash, drop, reword etc.). In jj those map to a single command each. By just using the CLI, you can always confirm that what happens is what you expected. If it isn’t, you can easily jj undo that single step. With the todo file, you just hope for the best and have to start all over if something didn’t go well.

The todo file also is not a great graphical representation of your commit tree. In git, it’s really optimized for a single branch. You can configure git so interactive rebase can be used in the context of a mega-merge-like situation… but the todo file will become very ugly and difficult to manage. On the other hand, the output of jj log is great! So I think offering the todo-file approach in jj would be inconsistent UI, in the sense that it discourages workflows that other parts of the UI intentionally encourage.
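(For context, the git features that make the mega-merge case possible at all are --rebase-merges and the rebase.updateRefs setting; a sketch, with <base> as a placeholder:)

    # carry merge commits and dependent branch refs through an interactive rebase
    git config --global rebase.updateRefs true
    git rebase -i --rebase-merges <base>   # the todo list gains label/reset/merge commands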

                                                                                                                                                                                                                                          Regarding the comparison with the pager, I don’t think the maintainers are too concerned about binary size or number of dependencies, but rather having a consistent UI. A pager doesn’t really intrude on that.

                                                                                                                                                                                                                                          1. 4

When you edit the todo file, you are still making several distinct operations (reorder, squash, drop, reword etc.). In jj those map to a single command each. By just using the CLI, you can always confirm that what happens is what you expected. If it isn’t, you can easily jj undo that single step.

I find this argument very unconvincing. When I operate on a patchset that I am preparing for external review, I have a global view of the patchset, and it is common to think of changes that affect several changes in the set at once. Reordering a line of commits, for example (let’s forget about squash, drop, and reword for this discussion), is best viewed as a global operation: instead of A-B-C-D I want to have C-B-A-D. The CLI forces me to sequentialize this multi-change operation into a sequence of operations on individual changes, and doing this (1) is unnatural, and (2) introduces needless choices. Exercise time: can you easily describe a series of jj rebase commands to do this transformation on commits in that order?

                                                                                                                                                                                                                                            The todo file also is not a great graphical representation of your commit tree.

I agree! But the command-line is even worse, as it offers no graphical representation at all. It would be nice to have a TUI or a keyboard-driven GUI that is good at displaying trees when we do more complex things, but the linear view of an edit buffer is still better than the no-view-at-all of the CLI when I want to operate on groups of changes as a whole.

                                                                                                                                                                                                                                            1. 3

                                                                                                                                                                                                                                              Exercise time: can you easily describe a series of jj rebase command to do this transformation on commits in that order?

                                                                                                                                                                                                                                              Yeah, that’s not hard with jj.

                                                                                                                                                                                                                                              jj rebase -r C -B A # insert C between A and its parent(s)
                                                                                                                                                                                                                                              jj rebase -r B -B A # insert B between A and its parent, which is now C
                                                                                                                                                                                                                                              

                                                                                                                                                                                                                                              And I would still insist that in a realistic scenario, these commits have semantic meaning so there are naturally going to be thoughts like “X should be before Y” which trivially translates to jj rebase -r X -B Y.

To make it clear though, I’m not saying your perspective is wrong. Just that I don’t think this workflow would be a good addition upstream. I’d be very happy if there were an external tool that implemented this workflow for you, and I don’t think the experience would be any worse than as a built-in option (apart from the one-time install step, I guess).

But the command-line is even worse, as it offers no graphical representation at all.

                                                                                                                                                                                                                                              What do you mean it’s not graphical? Have you seen the output of jj log? For example:

                                                                                                                                                                                                                                              @    zrxsx remo@buenzli.dev 2025-02-14 21:29:44 ea00f612
                                                                                                                                                                                                                                              ├─╮  (empty) merge foo and bar
                                                                                                                                                                                                                                              │ │
                                                                                                                                                                                                                                              │ ○  qvoum remo@buenzli.dev 2025-02-14 21:29:08 53a1147d
                                                                                                                                                                                                                                              │ │  (empty) commit foo
                                                                                                                                                                                                                                              │ │
                                                                                                                                                                                                                                              ○ │  msrkn remo@buenzli.dev 2025-02-14 21:29:22 git_head() 67515939
                                                                                                                                                                                                                                              ├─╯  (empty) commit bar
                                                                                                                                                                                                                                              │
                                                                                                                                                                                                                                              ○  przqo remo@buenzli.dev 2025-02-14 21:28:57 9e9d4e50
                                                                                                                                                                                                                                              │  (empty) initial commit
                                                                                                                                                                                                                                              │
                                                                                                                                                                                                                                              •  zzzzz root() 00000000
                                                                                                                                                                                                                                              

I’d say that’s quite graphical. jj even has a templating language that lets you customize this output in a very powerful and ergonomic way.

                                                                                                                                                                                                                                              You don’t get this visual tree structure in git rebase’s todo file.
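And customizing it is a one-liner. For example, a sketch using jj’s template language (change_id.short() and description.first_line() are standard template methods):

    jj log -T 'change_id.short() ++ " " ++ description.first_line() ++ "\n"'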

                                                                                                                                                                                                                                              1. 1

The CLI forces me to sequentialize this multi-change operation into a sequence of operations on individual changes, and doing this (1) is unnatural, and (2) introduces needless choices.

I hear what you’re saying, and I think it’s kinda funny: from a different perspective, git rebase forces you into a serial sequence of operations, whereas jj rebase never does. That doesn’t mean you’re wrong, of course; it just took me a moment to grok what you meant, given that I usually view it as the opposite!

(Another pain point with the C-B-A-D thing is that the last time I had to do this, it introduced a lot of conflicts, thanks to the intermediate state being, well, not what I wanted, and seeing all that red was stressful. The conflicts disappeared after moving another commit around, but in the moment, I was not psyched about it.)

                                                                                                                                                                                                                                          2. 1

Note that you can still use git rebase -i in a colocated repo. Make sure to add the --colocate flag to jj git clone and jj git init. All my repos are always colocated; it makes all the git tooling work out-of-the-box.

Oh, that’s good to know. Generally, I am afraid to do anything with git directly after enabling jj in a given repo: I’m afraid of confusing myself, and I’m afraid of confusing the tooling.

                                                                                                                                                                                                                                            It could be a fun side-project to add the git-like rebase UI to jj. A standalone binary that opens your editor with the same git rebase todo list, parses its content and submits equivalent jj rebase commands instead of letting git do it.

I’m looking forward to it being built-in, at least eventually. git has it, and jj already opens a commit message editor for jj desc, so it’s not some new type of UI.

                                                                                                                                                                                                                                          3. 2

                                                                                                                                                                                                                                            I also really, really miss the git rebase -i.

                                                                                                                                                                                                                                            Can you describe what you’re doing with interactive rebases that you can’t do (or can’t do as efficiently) with JJ? Is it specifically this interface to rebasing commits that you’re missing, or a particular feature that only works with git rebase -i?

                                                                                                                                                                                                                                            1. 6

Just the interface. Editing lines in a text editor is a perfect blend of (T)UI and CLI. Being able to reorder commits alone would be great; together with squash/fixup, that’s 99.9% of my usage of git rebase -i.
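For comparison, the jj one-liners covering those same operations (the revisions are placeholders):

    jj rebase -r <rev> -B <dest>   # reorder: move <rev> to just before <dest>
    jj squash -r <rev>             # squash/fixup: fold <rev> into its parent
    jj abandon <rev>               # drop
    jj describe <rev>              # reword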

                                                                                                                                                                                                                                              1. 2

                                                                                                                                                                                                                                                TUIs like jjui are really good for that.

                                                                                                                                                                                                                                          4. 4

                                                                                                                                                                                                                                            I have not really seen where it solves a significant problem.

                                                                                                                                                                                                                                            The main problem I encounter that it could solve is when I talk to someone who doesn’t already know git and have to kinda sheepishly say “welllll, yeah you can get the code you want with this one tool, but it suuuuuuucks; it’s so bad, I must apologize on behalf of all programmers everywhere except Richard Hipp”

                                                                                                                                                                                                                                            1. 4

I’m fully aware that I’m just Stockholm-syndromed into git. Having tried to explain how to use git to someone myself, I completely agree that it’s incredibly opaque and inconsistent. I do think that a lot of that only surfaces once you use git in non-trivial ways; clone-edit-stage-commit-push might not be optimal, but it’s fine.

                                                                                                                                                                                                                                              For casual users I feel like the biggest overall usability win would be if GitHub could find a way to let you contribute to a repository without having to fork it.

                                                                                                                                                                                                                                              1. 4

                                                                                                                                                                                                                                                For casual users I feel like the biggest overall usability win would be if GitHub could find a way to let you contribute to a repository without having to fork it.

This is one of the reasons that, as a serial drive-by contributor, I much prefer projects hosted on Codeberg (or random Forgejo instances, perhaps even SourceHut, though I’m not a fan of the email-based workflow): I can submit PRs without forking.

                                                                                                                                                                                                                                                1. 3

                                                                                                                                                                                                                                                  After your comment I actually went back and signed into Codeberg, but I’m not finding how you’re supposed to PR without forking. Even their documentation talks about the fork-based workflow. Am I missing something?

                                                                                                                                                                                                                                                  1. 7

                                                                                                                                                                                                                                                    It is the AGit workflow that lets you do this. It’s not advertised, because there’s no UI built around it (yet?), and is less familiar than the fork+PR model. But it’s there, even if slightly hidden.
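Concretely, the AGit-style push looks roughly like this (the branch and topic names are placeholders):

    # open a PR against main on a Forgejo/Codeberg repo, no fork needed
    git push origin HEAD:refs/for/main -o topic=my-feature
    # push options like -o title="..." and -o description="..." can pre-fill the PR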

                                                                                                                                                                                                                                                  2. 1

While I’m in general not a big fan of it, that’s one useful thing about GitHub’s gh command-line tool: a simple gh repo fork in a clone of the upstream will create a fork and register it as a remote. Now if only they added a way to automatically garbage-collect forks that have no open branches anymore…

                                                                                                                                                                                                                                                    1. 3

That’s still 1-2 commands more, and an additional tool, compared to a well-crafted git push.

Of course, I could build a small cron job that iterates over my GH repos, finds forks that have no open PRs and are older than a day or so, and nukes them. It can be automated. But with Codeberg, I don’t have to automate anything.
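A rough sketch of that job with the gh CLI (untested; it assumes gh’s JSON output and the parent/nameWithOwner fields behave as documented, and it skips the age check):

    # hypothetical: delete my forks whose upstream has no open PRs from me
    for repo in $(gh repo list --fork --json nameWithOwner -q '.[].nameWithOwner'); do
      parent=$(gh repo view "$repo" --json parent -q '.parent.owner.login + "/" + .parent.name')
      open=$(gh pr list --repo "$parent" --author "@me" --state open --json number -q 'length')
      [ "$open" -eq 0 ] && gh repo delete "$repo" --yes
    done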

                                                                                                                                                                                                                                                2. 2

Correct. Git is a tool I’d be embarrassed to show a new developer. Jujutsu is one I’d be proud of.

                                                                                                                                                                                                                                                3. 2

At this point, I have been screwed most of the times that jj ran into a conflict. Not sure how first class it is…

                                                                                                                                                                                                                                                  1. 2

I’m assuming “being screwed” means it was hard to fix the conflict; that sucks, and I’m sorry to hear that.

What “first class” means in this context is that jj stores the information that the conflict exists as part of the metadata of the commit itself. Git doesn’t really let you keep a conflict around: it detects conflicts and then has you resolve them immediately, whereas jj will happily let a change sit there conflicted for as long as you’d like. That’s different from being “good at making conflicts not happen in the first place,” which is a separate thing, and one that apparently hasn’t been true for you; I don’t know if, or how, jj differs from git in that regard.
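Concretely, that enables workflows like this (the revision is a placeholder):

    jj new <conflicted-rev>          # keep building on top of a conflicted commit
    # ... later, whenever convenient:
    jj resolve -r <conflicted-rev>   # step through the conflicted files with a merge tool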

                                                                                                                                                                                                                                                    1. 1

I had it again today and managed to work my way through it, but not because there’s good “first aid in case of conflict” documentation or anything. That’s the major issue: the tricks and skills we collectively built up to work with and around git aren’t there for jj yet.

But then, while resolving the conflict, gg showed my bookmark as having split into 3, which was mildly surprising, to say the least.