Threads for bert

  1. 19

    I found it kind of weird how a lot of this was like “hey, people were talking about modules 70 years ago!”. They were also talking about process oriented (or service oriented, essentially) architecture in the 50s, 60s, and 70s - e.g. the actor model, Simula 67, Erlang, etc. Microservices are more recent but microservice architecture is basically just SOA with some patterns and antipatterns defined.

    There’s a great section in Armstrong’s thesis, “Philosophy”, that has a bit to say on the topic of modules vs processes (which map to services, logically)[0].

    The essential problem that must be solved in making a fault-tolerant software system is therefore that of fault-isolation. Different programmers will write different modules, some modules will be correct, others will have errors. We do not want the errors in one module to adversely affect the behaviour of a module which does not have any errors.

    To provide fault-isolation we use the traditional operating system notion of a process. Processes provide protection domains, so that an error in one process cannot affect the operation of other processes. Different programmers write different applications which are run in different processes; errors in one application should not have a negative influence on the other applications running in the system.

    His thesis is rife with excellent citations about the process abstraction and its power.

    Much of this is to say that, yes, modules are great, but they do not go far enough. Modules do not allow for isolation - processes do. By all means, use modules, but understand their limitations.

    1. Modules do not isolate your address space. You may share references to data across modules that is mutated by those modules independently - this creates complex unsynchronized state.

    2. Modules do not isolate failures. If I pass a value to a function and that function blows up, how far do I have to roll back before that value can be considered sane? Is the value sane? Was it modified? Is there an exception? A fault? All sorts of issues there.
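
    A minimal Python sketch of both points, assuming a made-up `apply_discount` function standing in for "some other module" (the names and the order dict are hypothetical, not from any real codebase). The callee mutates a shared value and then fails, and the caller is left holding a half-updated object:

```python
def apply_discount(order: dict, percent: int) -> None:
    """Hypothetical 'module' function that mutates its argument in place."""
    order["total"] = order["total"] * (100 - percent) // 100
    if percent > 50:
        # The failure happens *after* the shared value was already modified.
        raise ValueError("discount too large")
    order["discounted"] = True


order = {"total": 200, "discounted": False}
try:
    apply_discount(order, 60)
except ValueError:
    pass

# The caller caught the exception, but the shared value is half-updated:
# the total changed while the 'discounted' flag did not.
print(order)
```

    Nothing in the module boundary tells the caller how far to roll back; with a separate process the caller would only ever see a reply or a failure, never the intermediate state.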

    Joe also talks a lot about the physics of computing, how violating causality is problematic, state explosions, etc. It all fits into the philosophy.

    Anyway, yes, splitting a service into two processes incurs overhead. This is factually the case. That’s also why Erlang services were in-memory, but whatever that’s obviously not microservice architecture, which very clearly is built around communication protocols that traverse a network. No question, you will lose performance. That’s why you don’t create a microservice for, say, parsing JSON - just do that in process. Hey, go nuts, even use a module! You… can do that. Or honestly, yeah, parse JSON in another process - might make sense if you’re dealing with untrusted input.
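
    For what "parse JSON in another process" could look like, here is a rough Python sketch using only the standard library. `parse_isolated` and the one-line child program are assumptions made up for illustration. It gives failure isolation (a crash, hang, or blown stack stays in the child), not a security sandbox, and the parent still decodes the child's reply:

```python
import json
import subprocess
import sys

# Child program: reads JSON from stdin and re-emits it on stdout.
CHILD = "import json, sys; json.dump(json.load(sys.stdin), sys.stdout)"


def parse_isolated(text: str, timeout: float = 5.0):
    """Parse JSON in a separate interpreter process.

    Returns (True, value) on success and (False, None) if the child
    failed or timed out, so a valid JSON null is not mistaken for an error.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", CHILD],
            input=text,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return (False, None)
    if result.returncode != 0:
        return (False, None)
    return (True, json.loads(result.stdout))
```

    The cost is exactly the overhead mentioned above: every call pays for a process spawn and two rounds of serialization.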

    I found it ironic that microservices are being shown as somehow mistaken with regard to the “fallacies of distributed computing” when their actual background is in fact dealing with unreliable systems. See ‘Why do computers stop and what can be done about it?’ - 1985 [1]. Notably, that entire paper boils down to “if you want to raise your MTBF you need a distributed system with these specific primitives” (cough processes cough cough).

    You can do that by embracing standalone processes hosted in Docker containers, or you can do that by embracing standalone modules in an application server that obey a standardized API convention, or a variety of other options

    Weird emphasis on or because all of these work well together. Please, by all means, use well defined and isolated modules in your microservices, everyone will be happier for it. Also don’t just split out random shit into services, especially if the overhead of the RPC is going to matter - again, there’s nothing “microservicy” that would make you do such a thing.

    Microservice architecture, as with all methodologies, has its benefits and downfalls.

    To be honest I didn’t get much out of this article. It basically is “modules good, microservices… something?”. The only criticism seems to be that distributed systems are complicated, which is not really interesting since “distributed systems” is a whole massive domain and “networks have latency” is about as tiny of a drop of water in that ocean as one can get.

    [0] https://erlang.org/download/armstrong_thesis_2003.pdf

    [1] https://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf

    1. 5

      This article is useful when your boss’s boss starts asking about microservices and you need to send them something that says “microservices have tradeoffs and here is an alternative.”

      If you already understand that, you’re well past needing to read this article yourself.

      It’s long and easy to digest, which is great when you’re sending it to someone removed from the day to day “it depends” of software engineering.

    1. 28

      This is an important lesson. Rust is aimed at being a safer systems language. There are a lot of safe applications languages with more mature ecosystems, better tooling, and so on. If you are in a situation where you don’t need fine-grained control over memory layout, where you can tolerate the memory and jitter of a global garbage collector, then Rust is as much the wrong choice as C would be. Rust is a good choice for situations where, if Rust didn’t exist, C would be a good choice. If you wouldn’t consider using C for a project, don’t use Rust. Use Erlang/Elixir, C#/F#, JavaScript/TypeScript, Swift, Pony, or anything else that targets application programming problems.

      1. 20

        There are a lot of safe applications languages with more mature ecosystems, better tooling, and so on

        I think this is sadly not entirely true. On balance, I would say Rust’s “quality of implementation” story is ahead of mature languages, and that’s exactly what creates this “let’s write CRUD in Rust” pressure.

        Let’s say I write something where I don’t need Rust, what are my choices?

        • Erlang/Elixir – dynamic types, the same, if not bigger, level of weirdness in Rust.
        • F# – last time I’ve checked, the build system was pretty horrible (specifying files in a specific order in an XML file)
        • C# – until recently, it had nullability problems, and was pretty Windows-specific. Maybe it is a viable option now. My two questions would be: a) how horrible is the build system? b) how hard is it to produce a static binary which I can copy from one Linux PC to another without installing any runtime libraries?
        • JavaScript/TypeScript – npm is a huge time sink. Once deno matures (or Rome conquers the world) it might become a reasonable alternative
        • Swift – doesn’t really exist outside of Mac? A lot of core things were in mac-specific libraries last time I’ve checked
        • Pony – super niche at this point?
        • Java – nullability, problems with static binaries, horrible build system
        • Kotlin – Java sans nullability
        • Go – if you value utmost simplicity more than sum types, that is a splendid choice. I think Go is the single language which is in the same class as Rust when it comes to “quality of implementation”, and, arguably, it’s even better than Rust in this respect. If only it had enums and null-checking…
        • OCaml – this one is pretty ideal when it comes to the language machinery, but two+ build systems, two standard libraries, etc, make it not a reasonable choice in comparison.

        Really, it seems that there’s “Go with enums / OCaml with channels and Cargo”-shaped hole in our language-space these days, which is pretty imperfectly covered by existing options. I really wish that such language existed, this would relieve a lot of design pressure from Rust and allow it to focus on systems use-case more.
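
        To make "enums and null-checking" concrete, a small sketch in Python (used here only as neutral notation; `lookup`, `USERS`, and the variant names are invented for the example). The result of a lookup is one of a closed set of variants rather than "a value or null":

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Found:
    user_id: int
    name: str


@dataclass(frozen=True)
class NotFound:
    user_id: int


@dataclass(frozen=True)
class Forbidden:
    reason: str


# A sum type: a lookup yields exactly one of these variants, never a bare None.
LookupResult = Union[Found, NotFound, Forbidden]

USERS = {1: "ada"}  # hypothetical in-memory table


def lookup(user_id: int, is_admin: bool) -> LookupResult:
    if not is_admin:
        return Forbidden("admin access required")
    if user_id not in USERS:
        return NotFound(user_id)
    return Found(user_id, USERS[user_id])


def describe(result: LookupResult) -> str:
    # Each variant carries only the fields that make sense for it.
    if isinstance(result, Found):
        return f"user {result.name}"
    if isinstance(result, NotFound):
        return f"no user {result.user_id}"
    return f"denied: {result.reason}"
```

        In a language with real sum types the compiler also checks that every variant is handled, which is the part this sketch cannot show.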

        1. 9

          JavaScript/TypeScript – npm is a huge time sink. Once deno matures (or Rome conquers the world) it might become a reasonable alternative

          I’d have a hard time finding a productive lead or senior engineer for Node.js complaining that npm is a timesink and the number one thing that makes the language not productive. The teams I work with are incredibly productive with Node (and would have a very hard time onboarding with rust), and most of the time we only have to touch package.json once every few weeks.

          1. 6

            OCaml 5 has channels along with the new multicore support. And algebraic effects.

            1. 4

              [… ] this would relieve a lot of design pressure from Rust and allow it to focus on systems use-case more.

              Considering that a lot of more serious “system use-cases” (the Linux kernel, Firefox) tend to either fight or ditch Cargo, could it be that its current design is optimized for the wrong use-cases?

              1. 3

                That’s a great question.

                The primary answer is indeed that’s not the use-case Cargo is optimized for, and that was the right thing to do. With the exception of the kernel, all folks who ditch Cargo still make use of the crates.io ecosystem. And producing re-usable libraries is the use-case Cargo is very heavily skewed towards. The main shtick of Cargo is the (informal, implementation-defined) specification of Cargo.toml file and .crate archive which provide a very specific interface for re-usable units of code. Because every library on crates.io is built in exactly the same inflexible way, re-using the libraries is easy.

                The secondary answer is that, while Cargo could be much better (defaults steer you towards poor compile times) for the case of developing a leaf artifact, rather than a library, it’s pretty decent. Stuff like ripgrep, cargo, rust-analyzer, wasmtime are pretty happy with Cargo. There’s a lot of “systems-enough” software which doesn’t need a lot of flexibility in the build process.

                But yeah, ultimately, if you build something BIG, you probably have a generic build-system building the thing, and the way to plug Rust there is by plugging rustc, not cargo. Though, yeah, Cargo could’ve been somewhat more transparent to make this easier.

                1. 1

                  The main shtick of Cargo is the (informal, implementation-defined) specification of Cargo.toml file […]

                  An even bigger shtick is build.rs, a rudimentary build system built with environment variables. For example, in build2 we could map all the concepts of Cargo.toml (even features), but dealing with build.rs is a bleak prospect.

                  1. 2

                    Yeah, build.rs is super-problematic. It’s a necessary evil escape hatch to be able to use C code, if you have to.

                    I think the most promising solution in this space is the metabuild proposal: rather than compiling C “by hand” in build.rs, you specify declarative dependencies on C libraries in Cargo.toml, and then a generic build.rs (eg, the same binary shared between all crates) reads that meta and builds C code. This would provide a generic hook for generic build systems to supply Cargo with C deps.

                    Sadly, that didn’t go anywhere:

                    • the design we arrived at required first-class support in Cargo, which didn’t materialize. I feel that an xtask-shaped polyfill would’ve worked better.
                    • we lacked a brave person to actually submit tonnes of PRs to actually move ecosystem towards declarative native deps.
                    1. 1

                      Yeah, in a parallel universe there is a standard build system and package manager for C/C++ with Rust (and other languages) that are built “on top” simply reusing that and getting all the C/C++ libraries for free.

              2. 3

                I think this overstates the importance of the output being a single static binary. Sometimes that’s a desirable goal, no question. But if you are writing code that will be deployed to a containerized environment, which is the case for a lot of CRUD web apps, installing runtime dependencies (libraries, interpreter, etc.) is a no-op from an operational point of view because the libraries are bundled in the container image just like they’d be bundled in a static binary.

                1. 3

                  But then you’re paying the cost of building container images - which is a huge part of deployment time costs if I understand people’s experience reports correctly.

                  1. 5

                    Even with static binaries you’re probably still looking at containerisation for a lot of intermediate things, or runtime requirements, if only because of the state of infrastructure right now. There isn’t really “k8s but without containers”.

                    1. 2

                      Not really. Building Docker images with proper caching is pretty quick, within about a couple of minutes in my experience. Far more time will be spent running tests (or in bad CI systems, waiting for an agent to become ready).

                      1. 1

                        👍 It only adds seconds on top of iterated builds if done well. Could take an extra minute or two on a fresh machine

                      2. 1

                        It’s pretty quick if you choose not to include a whole operating system

                      3. 2

                        Yeah, for specific “let’s build a CRUD app” use-case static linking is not that important.

                        But it often happens that you need to write many different smaller things for different use-cases. If you pick the best tool for the job for every small use-case, you’ll end up with a bloated toolbox. Picking a single language for everything has huge systemic savings.

                        This is the angle which makes me think that “small artifacts” is an important part of QoI.

                      4. 3

                        the same, if not bigger, level of weirdness in Rust.

                        In the case of Elixir, I’d argue that it actually writes a great deal like Python or Ruby, plus some minor practical guardrails (immutability) and an idiom around message-passing.

                        The really weird stuff you don’t hit until Advanced/Advanced+ BEAM wizard status.

                        1. 1

                          This is your regularly scheduled reminder that D exists. It recently added sumtypes to the standard library!

                          1. 6
                            $ cat main.d
                            import std.stdio;
                            
                            void main(){
                                auto r = f();
                                writeln("hello, ub");
                                writeln(*r);
                            }
                            
                            auto f() {
                                int x = 92;
                                return g(&x);
                            }
                            
                            auto g(int* x) {
                                return x;
                            }
                            
                            $ dmd main.d && ./main
                            
                            hello, ub
                            4722720
                            

                            If, by default, I can trivially trigger UB, the language loses a lot of points on QoI. I think there’s some safe pragma, or compilation flag, or something to fix this, but it really should be default if we think about language being applicable to CRUD domain.

                            1. 2

                              This is fixed by enabling the experimental DIP1000 feature via -dip1000:

                               11: Error: returning `g(& x)` escapes a reference to local variable `x`
                              

                              DIP1000 is in testing and will hopefully become standard soon.

                              That said, just don’t use pointers. You practically never need them. We have large backend services in production use that never use pointers at all.

                          2. 1

                            Scala. The horsepower and ecosystem of the JVM, plus the type system inspired by much more academic languages. In reality though, all of the above mainstream languages are plenty good enough for almost all line of business work. And where they have shortcomings, certainly Rust does have its own shortcomings as outlined in the OP. So it’s all about tradeoffs, as always.

                            1. 1

                              tbh, I think the only reason to use Scala over Kotlin is having an already existing large Scala code-base which went all-in on type-level programming (which I think is surprisingly the case for a lot of banks’ internal departments).

                              Scala deserves a lot of credit for normalizing functional programming in the industry, it was the harbinger of programming language renaissance, but, as a practical tool, I don’t think it is there.

                              1. 1

                                As far as I know Kotlin still doesn’t have a general pattern-matching implementation. This is so important for business logic correctness that I find it difficult to imagine managing a serious production codebase without it. I use it on an almost daily basis in my projects.

                          3. 6

                            Use Erlang/Elixir, C#/F#, JavaScript/TypeScript, Swift, Pony, or anything else that targets application programming problems.

                            Didn’t Wallaroo move from Pony to Rust around this time last year?

                            I remember the corecursive interview with Sean Allen in 2020 talking about his developer experience with Pony “All of us who worked on Wallaroo in the early days have a bit of scar tissue where even though none of us had hit a compiler bug in forever, we were still like, ‘Is that a bug in my code or is that a bug in the compiler?’”

                            Having worked at a startup where we used Haskell for an elaborate CRUD app, this sounds about as painful. They sacked the engineering director after I left and I believe they’ve moved to Typescript for everything now.

                            I wouldn’t put Pony on that list just like I wouldn’t put Haskell on that list.

                            I can also say that my buddy who’s a manager at a big-data company has had everyone under him switch over to Java from Scala.

                            So I’d probably put Java and Golang at the top of the list for a CRUD app.

                            1. 2

                              Great interview, enjoyed reading it.

                              Tbh I think the trade off here is how “business logicky” your app is, rather than crud or whatever. At a certain point something like scala really helps you find cascading implications of changes. If you have that kind of app, all else being equal, a rich static type system is going to improve productivity. The more you’re mostly just doing io with code that’s fairly self contained, the easier it is to crank out code with a golang or mypy like type system.

                              Similarly the more you need to spend time wrangling for throughput the better your life will be with systems that focus on safety and control of execution.

                              Twitter started out on rails, and that makes sense when you don’t need your application to actually be bug free or fast but you do need to just get stuff out there. We continue to have so many different programming systems because there are so many different points in the space that it’s worth optimizing for.

                              (For anyone who thinks I’m being dismissive- I spend my days writing golang with only api tests mostly to run sql, and it’s a solid choice for that kind of thing).

                          1. 21

                            Oh is it time to hype dsls again? That makes sense as we’re starting to all get a little embarrassed about the levels of hype for functional programming.

                            I guess next we’ll be hyping up memory safe object oriented programming.

                            1. 16

                              I’m just sitting here with my Java books waiting for the pendulum to swing back…

                              1. 9

                                I’m going to go long on eiffel books.

                                1. 6

                                  I think a language heavily inspired by Eiffel, while fixing all of its (many, many) dumb mistakes, could go really far.

                                  1. 2

                                    I’ve just started learning Eiffel and like what I’ve seen so far, just curious what do you consider its mistakes?

                                    1. 8
                                      1. CAT-calling
                                      2. Bertrand Meyer’s absolute refusal to use any standard terminology for anything in Eiffel. He calls nulls “voids”, lambdas “agents”, modules “clusters”, etc.
                                      3. Also his refusal to adopt any PL innovations past 1995, like all the contortions you have to do to get “void safety” (null safety) instead of just adding some dang sum types.
                                    2. 1

                                      Racket!

                                2. 14

                                  I, personally, very much doubt full on OOP will ever come back in the same way it did in the 90s and early 2000s. FP is overhyped by some, but “newer” languages I’ve seen incorporate ideas from FP and explicitly exclude core ideas of OOP (Go, Zig, Rust, etc.).

                                  1. 5

                                    I mean, all of those languages have a way to do dynamic dispatch (interfaces in Go, trait objects in Rust, vtables in Zig as of 0.10).
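
                                    As a sketch of dynamic dispatch through an interface with no inheritance involved, here is the idea in Python with `typing.Protocol` (Python dispatches dynamically anyway, so the protocol only documents the contract for a type checker; `Shape`, `Square`, and `Rect` are made-up names):

```python
from typing import Protocol


class Shape(Protocol):
    """Structural interface: anything with area() qualifies, no base class needed."""

    def area(self) -> float: ...


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class Rect:
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: list[Shape]) -> float:
    # The call s.area() is resolved at run time from each object's own type.
    return sum(s.area() for s in shapes)
```

                                    `Square` and `Rect` share no parent class; they only happen to satisfy the same interface, which is roughly the Go-interface flavour of the idea.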

                                    1. 13

                                      And? They also all support first-class functions from FP but nobody calls them FP languages. Inheritance is the biggest thing missing, and for good reason.

                                      1. 12

                                        This, basically. Single dynamic dispatch is one of the few things from Java-style OO worth keeping. Looking at other classic-OO concepts: inheritance is better off missing most of the time (some will disagree), classes as encapsulation are worse than structs and modules, methods don’t need to be attached to classes or defined all in one batch, everything is not an object inheriting from a root object… did I miss anything?

                                        Subtyping separate from inheritance is a useful concept, but from what I’ve seen the world seldom breaks down into such neat categories to make subtyping simple enough to use – unsigned integers are the easiest example. Plus, as far as I can tell it makes most current type system math explode. So, needs more theoretical work before it wiggles back into the mainstream.

                                        1. 8

                                          I’ve been thinking a lot about when inheritance is actually a good idea, and I think it comes down to two conditions:

                                          1. The codebase will instantiate both Parent and Child objects
                                          2. Anything that accepts a Parent will have indistinguishable behavior when passed a Child object (LSP).

                                          IE a good use of Inheritance is to subclass EventReader with ProfiledEventReader.
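
                                          A sketch of that example in Python. The comment only names the two classes, so the `read` method, the timing fields, and `drain` are assumptions made for illustration:

```python
import time


class EventReader:
    def __init__(self, events):
        self._events = list(events)

    def read(self):
        """Return the next event, or None when exhausted."""
        return self._events.pop(0) if self._events else None


class ProfiledEventReader(EventReader):
    """Same observable behaviour as EventReader, plus timing bookkeeping."""

    def __init__(self, events):
        super().__init__(events)
        self.calls = 0
        self.elapsed = 0.0

    def read(self):
        start = time.perf_counter()
        try:
            return super().read()
        finally:
            self.calls += 1
            self.elapsed += time.perf_counter() - start


def drain(reader: EventReader) -> list:
    # Works identically for the parent and the child (condition 2).
    out = []
    while (event := reader.read()) is not None:
        out.append(event)
    return out
```

                                          Both classes get instantiated (condition 1), and `drain` cannot tell them apart from the values it receives (condition 2); the subclass only adds side bookkeeping.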

                                          1. 10

                                            Take a cookie from a jar for using both LSP and LSP in a single discussion!

                                            1. 4

                                              Inheritance can be very useful when it’s decoupled from method dispatch.

                                              Emacs mode definitions are a great example. Nary a class nor a method in sight, but the fact that markdown-mode inherits from text-mode etc is fantastically useful!

                                              On the other hand, I think it’s fair to say that this is so different from OOP’s definition of inheritance that using the same word for it is just asking for confusion. (I disagree but it’s a reasonable argument.)

                                              1. 2

                                                Inheritance works wonderfully in object systems with multiple dispatch, although I’m not qualified to pinpoint what it is that makes them click together.

                                                1. 1

                                                  I’ve lately come across a case where inheritance is a Good Idea; if you’re plotting another of your fabulous blog posts on this, I’m happy to chat :)

                                                  1. 1

                                                    My impression is that inheritance is extremely useful for a peculiar kind of composition, namely open recursion. For example, you write some sort of visitor-like pattern in a virtual class, then inherit it, implement the visit method or what have you, and use this to recurse between the abstract behavior of traversing some structure, and your use-case-specific code. Without recursion you have to basically reimplement a vtable by hand and it sucks.

                                                    Well, that’s my only use of inheritance in OCaml. Most of the code is just functions, sum types, records, and modules.
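
                                                    A small sketch of open recursion, in Python rather than OCaml. The tuple encoding of expressions and the class names are invented for the example. The base class owns the traversal, the subclass overrides one case, and the two keep calling back into each other through `self`:

```python
class ExprVisitor:
    """Generic rebuild-the-tree traversal.

    Expressions are ("num", n), ("var", name) or ("add", left, right).
    """

    def visit(self, expr):
        # Open recursion: self.visit_* may be overridden by a subclass, and
        # the overrides call back into self.visit for the children.
        tag = expr[0]
        if tag == "num":
            return self.visit_num(expr[1])
        if tag == "var":
            return self.visit_var(expr[1])
        if tag == "add":
            return self.visit_add(expr[1], expr[2])
        raise ValueError(f"unknown expression tag: {tag!r}")

    def visit_num(self, n):
        return ("num", n)

    def visit_var(self, name):
        return ("var", name)

    def visit_add(self, left, right):
        return ("add", self.visit(left), self.visit(right))


class ConstantFolder(ExprVisitor):
    """Overrides one case; traversal of everything else is inherited."""

    def visit_add(self, left, right):
        l, r = self.visit(left), self.visit(right)
        if l[0] == "num" and r[0] == "num":
            return ("num", l[1] + r[1])
        return ("add", l, r)
```

                                                    Doing the same with plain functions means passing the "self" record of callbacks around explicitly, which is the hand-rolled vtable mentioned above.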

                                                    1. 1

                                                      Forest for the trees? When you want to create a framework that has default behaviour that can be changed, extended or overridden?

                                                    2. 4
                                                      • obj.method syntax for calling functions — a decent idea worth keeping.
                                                      • bundling behavior, mutable state, and identity into one package — not worth doing unless you are literally Erlang.
                                                      1. 3

                                                        IMO there is a fundamental difference between Erlang OO and Java OO to the point that bringing them up in the same conversation is rarely useful. Erlang actively discourages you from having pellets of mutable state scattered around your program: sure, threads are cheap, but that state clump is still a full-blown thread you need to care for. It needs rules on supervision, it needs an API of some kind to communicate, etc, etc. Erlang is at its best when you only use threads when you are at a concurrency boundary, and otherwise treat it as purely functional. Java, in contrast, encourages you to make all sorts of objects with mutable state all over the place in your program. I’d wager that MOST non-trivial methods in Java contain the “new” keyword. This results in a program with “marbled” state, which is difficult to reason about, debug, or apply any kind of static analysis to.

                                                      2. 2

                                                        In all honesty, you sound quite apologetic to what could be arguably considered objectively bad design.

                                                        Attaching methods to types essentially boils down to scattering data (state) all over the code and writing non pure functions. I honestly cannot understand how anyone would think this is a good idea. Other than being influenced by trends or cults or group thinking.

                                                        Almost the same could be said about inheritance. Why would fitting a data model in a unique universal tree be a good idea? Supposedly to implicitly import functionality from parent classes without repeating yourself. Quite a silly way to save a line of code. Especially considering the languages that do it are rather verbose.

                                                        1. 5

                                                          I honestly cannot understand how anyone would think this is a good idea. Other than being influenced by trends or cults or group thinking.

                                                          Here’s a pro tip that has served me well over many years. Whenever I see millions of otherwise reasonable people doing a thing that is obviously a terribly stupid idea, it is always a lack of understanding on my part about what’s going on. Either I am blind to all of the pros of what they are doing and only see the cons, or what they’re doing is bad at one level but good at a different level in a way that outbalances it, or they are operating under constraints that I don’t see or pretend can be ignored, or something else along those lines.

                                                          Billions of lines of successful shipped software have been written in object-oriented languages. Literally trillions of dollars of economic value have been generated by this software. Millions of software developers have spent decades of their careers doing this. The thought that they are all under some sort of collective masochistic delusion simply does not pass Hanlon’s Razor.

                                                          1. 1

                                                            To be honest, the more I study OOP (or rather, the hodgepodge of features and mechanisms that are claimed by various groups to be OOP), the less room I see for a genuine advantage.

                                                            Except one: instantiation.

                                                            Say you have a piece of state, composed of a number of things (say a couple integers, a boolean and a string), that represent some coherent whole (say the state of a lexer). The one weird trick is that instead of letting those be global variables, you put them in a struct. And now you can have several lexers running at the same time, isn’t that amazing?

                                                            Don’t laugh, before OOP was popular very prominent people thought it was a good idea to have global state in Lex, Yacc, or error handling (errno). So here’s my current guess: the success we attribute to OOP doesn’t really come from any of its overly hyped features. It comes from a couple very mundane, yet very good programming practices it adopted along the way. People attributed to the hyped stuff (such as inheritance) a success they have earned mostly by avoiding global variables.
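
                                                            The "one weird trick" in miniature, as a Python sketch (the whitespace-splitting lexer is a deliberately trivial stand-in, and the field names are assumptions): two integers, a boolean, and a string bundled per instance instead of living in globals, so two lexers can be interleaved freely.

```python
from dataclasses import dataclass


@dataclass
class Lexer:
    """All lexer state lives in the instance instead of module-level globals."""

    text: str
    pos: int = 0
    line: int = 1
    at_eof: bool = False

    def next_token(self):
        # Skip whitespace, tracking line numbers.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            if self.text[self.pos] == "\n":
                self.line += 1
            self.pos += 1
        if self.pos >= len(self.text):
            self.at_eof = True
            return None
        start = self.pos
        while self.pos < len(self.text) and not self.text[self.pos].isspace():
            self.pos += 1
        return self.text[start:self.pos]


# Two independent lexers advancing in lockstep; neither disturbs the other.
a = Lexer("let x")
b = Lexer("fn\nmain")
tokens = [a.next_token(), b.next_token(), a.next_token(), b.next_token()]
print(tokens, a.line, b.line)
```

                                                            With `pos`, `line`, and `at_eof` as globals, the second lexer would have trampled the first one's position on its very first call.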

                                                            Abstract data types are amazing, and used everywhere for decades, including good old C. The rest of OOP though? Contextual at best.

                                                            1. 1

                                                              It has been the opposite for me.

                                                              • casting everything to and from Object in early versions of Java
                                                              • EJB 2
                                                              • Bower package manager. Its creator wrote on Stack Overflow that he was confused when he created the project and that it was essentially useless.
                                                              • RubyGems security incident
                                                              • left-pad fiasco
                                                              • Apache web server htaccess configs

                                                              I could go on with more esoteric examples to an ever growing list.

                                                              All these had criticism screaming long before they happened: why?

                                                            2. 3

                                                              Many decisions are only clearly good or bad in retrospect.

                                                          2. 6

                                                            Inheritance is the biggest thing missing, and for good reason.

                                                            That reason being “inheritance was the very first mechanism for subtyping, ADTs, and code-reuse, and people using it got ideas for better mechanisms from it.” ;)

                                                            1. 1

                                                              Exactly!

                                                            2. 3

                                                              The first versions of Simula and Smalltalk didn’t have inheritance either. Self and other prototypal object-oriented languages don’t use traditional inheritance either. We still call all of them object-oriented.

                                                              Honestly, it’s well beyond time that we retire all programming language paradigm terms. Modern languages simply aren’t organized into paradigms the way older, simpler languages were.

                                                              It’s like we’re looking at a Honda Accord and arguing over whether it’s a penny farthing or a carriage. The taxonomy no longer makes sense.

                                                          3. 1

                                                            Ah yes and that’s why it’s ripe for a comeback. :)

                                                            Seriously though I expect that the next incarnation will be “oop without inheritance” or something. Probably combined with some large corporation “inventing” gc-less memory management.

                                                            1. 2

                                                              The good parts of OOP never really left. We already have that exact language: Rust. It has formal interfaces (Traits), encapsulation, polymorphism, and gc-less memory management.
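
                                                              A rough sketch of what that looks like in practice. `Shape`, `Circle` and `Rect` are hypothetical names, not from any library: a trait as the formal interface, private fields for encapsulation, a trait object for polymorphism, and `Box` values freed at scope exit with no garbage collector.

```rust
use std::f64::consts::PI;

// Hypothetical interface, purely for illustration.
trait Shape {
    fn area(&self) -> f64;
}

// Fields are private to the defining module: code elsewhere goes through
// `new` and the trait methods.
struct Circle {
    radius: f64,
}

struct Rect {
    width: f64,
    height: f64,
}

impl Circle {
    fn new(radius: f64) -> Self {
        Circle { radius }
    }
}

impl Rect {
    fn new(width: f64, height: f64) -> Self {
        Rect { width, height }
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        PI * self.radius * self.radius
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

// Dynamic dispatch through a trait object; no inheritance involved.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle::new(1.0)),
        Box::new(Rect::new(2.0, 3.0)),
    ];
    println!("total area: {}", total_area(&shapes));
    // `shapes` and every boxed value are freed here, at scope exit.
}
```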

                                                              1. 10

                                                                The main thing about OOP that needs to die is the idea that OOP is a coherent concept worth discussing on its own. Talk about the individual concepts as independent things! It’s much more productive.

                                                                1. 1

                                                                  Talk about the individual concepts as independent things!

                                                                  IMO OOP these days really means inheritance and an object lifecycle. All the other concepts aren’t really unique to OOP.

                                                                  1. 3

                                                                    I think “OOP” generally means “features of object-oriented languages that I don’t like” to a lot of people. The people using those languages don’t generally get into paradigm arguments.

                                                                    (Personally, I consider inheritance to be common in OOP languages but not a particularly interesting or salient part of them. Many early OOP languages didn’t have inheritance and prototypal ones have an entirely different code reuse model.)

                                                                    1. 1

                                                                      For some people “OOP” means “features of languages I do like”. For instance I’ve seen people include templates/generics/parametric polymorphism and unnamed functions as core parts of OOP… having learned Caml Light (OCaml without the “O”) in college, I confess I was quite astonished.

                                                                    2. 2

                                                                      You say that but it means different things to different people. I don’t disagree that your definition would be a good one if you could get people to agree on it, but I can’t assume that when other people say “OOP” that’s what they’re talking about.

                                                              2. 1

                                                                I think it will come back, rediscovered as something new by a new generation disillusioned with whatever has been the cool solves-everything paradigm of the previous half decade. Perhaps this time as originally envisaged with a “Scandinavian school” modeling approach.

                                                                Of course it never left as the first choice for one genre of software… the creation of frameworks featuring default behavior that can be overridden, extended or changed.

                                                                Those languages you mention (Go, Zig, Rust) are primarily languages solving problems in the computer and data sciences, computing infrastructure and technical capability spaces. Something is going to be needed to replace or update all those complex aging ignored line-of-business systems.

                                                              3. 11

                                                                There isn’t really any need to “hype” DSLs because they’re already widely used in all domains of programming:

                                                                • front end: HTML / CSS / JavaScript, and most JS web frameworks introduce a new DSL (multiple JSX-like languages, Svelte, etc.)
                                                                • back end: a bajillion SQL variants, a bazillion query languages like Redis
                                                                • builds: generating Ninja, generating Make (CMake, Meson, etc.)
                                                                  • there are at least 10 CI platforms with their own YAML DSLs, with vars, interpolation, control flow, etc.
                                                                • In games: little scripting languages for every popular game
                                                                • Graphics: scene description languages, shader languages
                                                                • Compilers: LLVM has its own TableGen language, languages for describing compiler optimizations and architecture (in the implementation of Go, a famously “not DSL” language), languages for describing VMs (Ruby)
                                                                • Machine Learning: PyTorch, TensorFlow, etc. (these are their own languages, on top of Python)
                                                                • Distributed computing: at least 10 MapReduce-derived frameworks/languages; there are internal DSLs in Scala for example, as well as external ones
                                                                • Mathematics and CS: Coq, Lem, etc.

                                                                All of these categories can be fractally expanded, e.g. I didn’t mention the dozens of languages here: https://github.com/oilshell/oil/wiki/Survey-of-Config-Languages – many of which are commonly used and featured on this site

                                                                If you think you don’t use DSLs, then you’re probably just working on a small part of a system, and ignoring the parts you’re not working on.

                                                                ALL real systems use tons of DSLs. I think the real issue is to mitigate the downsides

                                                                1. 1

                                                                  Oh yes but at the same time if you haven’t seen the hype for DSLs then you haven’t spent long enough in the industry to go through that part of the hype cycle. DSLs are what they are and it looks like we might be entering a hype cycle where people want to make them out to be much more.

                                                                  1. 3

                                                                    I don’t agree, I’ve been in the industry for 20+ years, there are plenty of things more hyped than DSLs (cloud, machine learning, etc.)

                                                                    DSLs are accepted standard practice, and widely used, but often poorly understood

                                                                    I’m not getting much light from your comments on the subject – you’ve made 2 claims of hype with no examples

                                                                    1. 2

                                                                      Here’s an example of recent hype https://www.codemag.com/Article/0607051/Introducing-Domain-Specific-Languages

                                                                      Here’s some hype from the year 2000 https://www.researchgate.net/publication/276951339_Domain-Specific_Languages

                                                                      Arguably the hype for 4GLs was the prior iteration of that specific hype.

                                                                      I’m not arguing that DSLs are bad - I’m saying that they’re one of the things on the roster of perfectly good things that periodically get trumpeted as the next big thing that will revolutionize computing. These hype cycles are characterized by attempts to make lots of DSLs when there isn’t a strong need for it or any real payoff to making a language rather than a library.

                                                                2. 4

                                                                  I know it might sound a bit controversial, but the way I see it we need to reach a new level of abstraction in order for large-scale software development to be sustainable. Some people might say AI is the way forward, or some other new programming technique. Either way I don’t think we’ll get there by incrementally improving on the paradigms we have—in order to reach the next level we’ll have to drop some baggage on the way up.

                                                                  1. 4

                                                                    I mean, humans aren’t getting better at grokking abstraction, so I don’t know that “new levels of abstraction” are the way forward. Personally, I suspect it means more rigor about the software development process–if you’re building a tall tower, maybe the base shouldn’t be built with a “move fast and break things” mentality.

                                                                    1. 3

                                                                      Grokking abstractions isn’t the problem; at the end of the day, abstractions are just making decisions for the users of an abstraction. Over-abstraction is the root of many maintainability woes IMO. The more a programmer knows about what’s actually going on underneath the better, but only to the degree that it’s relevant.

                                                                    2. 3

                                                                      I’ve heard it before. DSLs have their place, and some people love them while others hate them. This is one of a rotating cast of concepts that you’ll eventually see rehyped in 10 years.

                                                                  1. 9

                                                                    The thread on LKML about this work really doesn’t portray the Linux community in a good light. With a dozen or so new kernels being written in Rust, I wouldn’t be surprised if this team gives up dealing with Linus and goes to work on adding good Linux ABI compatibility to something else.

                                                                    1. 26

                                                                      I dunno, Linus’ arguments make a lot of sense to me. It sounds like he’s trying to hammer some realism into the idealists. The easter bunny and santa claus comment was a bit much, but otherwise he sounds quite reasonable.

                                                                      1. 19

                                                                        The disagreement is over whether “panic and stop” is appropriate for a kernel, and here I think Linus is just wrong. Debugging can be done by panic handlers; there is just no need to continue.

                                                                        Pierre Krieger said it much better, so I will quote:

                                                                        Part of the reasons why I wrote a kernel is to confirm by experience (as I couldn’t be sure before) that “panic and stop” is a completely valid way to handle Rust panics even in the kernel, and “warn and continue” is extremely harmful. I’m just so so tired of the defensive programming ideology: “we can’t prevent logic errors therefore the program must be able to continue even a logic error happens”. That’s why my Linux logs are full of stupid warnings that everyone ignores and that everything is buggy.

                                                                        One argument I can accept is that this should be a separate discussion, and that the Rust patch should follow the Linux rule as it stands, however stupid it may be.

                                                                        1. 7

                                                                          I think the disagreement is more about “should we have APIs that hide the kernel context from the programmer” (e.g. “am I in a critical region”).

                                                                          This message made some sense to me: https://lkml.org/lkml/2022/9/19/840

                                                                          Linus’ writing style has always been kind of hyperbolic/polemic and I don’t anticipate that changing :( But then again I’m amazed that Rust-in-Linux happened at all, so maybe I should allow for the possibility that Linus will surprise me.

                                                                          1. 1

                                                                            This is exactly what I still don’t understand in this discussion. Is there something about stack unwinding and catching the panic that is fundamentally problematic in, e.g., a driver?

                                                                            It actually seems like it would be so much better. It recovers some of the resiliency of a microkernel without giving up the performance benefits of a monolithic kernel.

                                                                            What if, on an irrecoverable error, the graphics driver just panicked, caught the panic at some near-top-level entry point, reset to some known good state and continued? Seems like such an improvement.
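
                                                                            To make the idea concrete, here is a userspace-only sketch using `std::panic::catch_unwind`. It says nothing about whether this is feasible inside Linux; `DriverState`, `draw_frame` and `handle_request` are hypothetical names. The entry point catches the panic, resets to a known good state, and keeps serving requests.

```rust
use std::panic::{self, AssertUnwindSafe};

// Hypothetical driver state; `Default` stands in for the "known good state".
#[derive(Debug, Default, PartialEq)]
struct DriverState {
    frames_drawn: u32,
}

// Hypothetical operation that hits an "irrecoverable" error on one input.
fn draw_frame(state: &mut DriverState, frame: u32) {
    state.frames_drawn += 1;
    if frame == 13 {
        panic!("driver hit an unexpected condition");
    }
}

// Near-top-level entry point: catch the panic, reset, and report failure
// to the caller instead of taking the whole program down.
fn handle_request(state: &mut DriverState, frame: u32) -> bool {
    // `&mut` captures are not UnwindSafe by default; AssertUnwindSafe is our
    // promise that we repair the state ourselves after a panic.
    let result = panic::catch_unwind(AssertUnwindSafe(|| draw_frame(state, frame)));
    if result.is_err() {
        *state = DriverState::default();
        return false;
    }
    true
}

fn main() {
    let mut state = DriverState::default();
    for frame in [1, 13, 2] {
        let ok = handle_request(&mut state, frame);
        println!("frame {frame}: ok={ok}, state={state:?}");
    }
}
```

                                                                            This only works when panics unwind (the default for userspace builds); with `panic = "abort"` there is nothing to catch.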

                                                                            1. 5

                                                                              I don’t believe the Linux kernel has a stack unwinder. I had an intern add one to the FreeBSD kernel a few years ago, but never upstreamed it (*NIX kernel programmers generally don’t want it). Kernel stack traces are generated by following frame-pointer chains and are best-effort debugging things, not required for correctness. The Windows kernel has full SEH support and uses it for all sorts of things (for example, if you try to access userspace memory and it faults, you get an exception, whereas in Linux or FreeBSD you use a copy-in or copy-out function to do the access and check the result).

                                                                              The risk with stack unwinding in a context like this is that the stack unwinder trusts the contents of the stack. If you’re hitting a bug because of stack corruption then the stack unwinder can propagate that corruption elsewhere.

                                                                              1. 1

                                                                                With the objtool/ORC stuff that went into Linux as part of the live-patching work a while back it does actually have a (reliable) stack unwinder: https://lwn.net/Articles/728339/

                                                                                1. 2

                                                                                  That’s fascinating. I’m not sure how it actually works for unwinding (rather than walking) the stack: It seems to discard the information about the location of registers other than the stack pointer, so I don’t see how it can restore callee-save registers that are spilled to the stack. This is necessary if you want to resume execution (unless you have a setjmp-like mechanism at the catch site, which adds a lot of overhead).

                                                                                  1. 2

                                                                                    Ah, a terminological misunderstanding then I think – I hadn’t realized you meant “unwinding” specifically as something sophisticated enough to allow resuming execution after popping some number of frames off the stack; I had assumed you just meant traversal of the active frames on the stack, and I think that’s how the linked article used the term as well (though re-reading your comment now I realize it makes more sense in the way you meant it).

                                                                                    Since AFAIK it’s just to guarantee accurate stack backtraces for determining livepatch safety I don’t think the objtool/ORC functionality in the Linux kernel supports unwinding in your sense – I don’t know of anything in Linux that would make use of it, aside from maybe userspace memory accesses (though those use a separate ‘extable’ mechanism for explicitly-marked points in the code that might generate exceptions, e.g. this).

                                                                                    1. 2

                                                                                      If I understand the userspace access things correctly, they look like the same mechanism as FreeBSD (no stack unwinding, just quick resumption to an error handler if you fault on the access).

                                                                                      I was quite surprised that the ORC[1] is bigger than DWARF. Usually DWARF debug info can get away with being large because it’s stored in separate pages in the binary from the file and so doesn’t consume any physical memory unless used. I guess speed does matter for things like DTrace / SystemTap probes, where you want to do a full stack trace quickly, but in the kernel you can’t easily lazily load the code.

                                                                                      The NT kernel has some really nice properties here. Almost all of the kernel’s memory (including the kernel’s code) is pageable. This means that the kernel’s unwind metadata can be swapped out if not in use, except for the small bits needed for the page-fault logic. In Windows, the metadata for paged-out pages is stored in PTEs and so you can even page out page-table pages, but you can then potentially need to page in every page in a page-table walk to handle a userspace fault. That extreme case probably mattered a lot more when 16 MiB of RAM was a lot for a workstation than it does now, but being able to page out rarely-used bits of kernel is quite useful.

                                                                                      In addition, the NT kernel has a complete SEH unwinder and so can easily throw exceptions. The SEH exception model is a lot nicer than the Itanium model for in-kernel use. The Itanium C++ ABI allocates exceptions and unwind state on the heap and then does a stack walk, popping frames off to get to handlers. The SEH model allocates them on the stack and then runs each cleanup frame, in turn, on the top of the stack then, at catch, runs some code on top of the stack before popping off all of the remaining frames[2]. This lets you use exceptions to handle out-of-memory conditions (though not out-of-stack-space conditions) reliably.

                                                                                      [1] Such a confusing acronym in this context, given that the modern LLVM JIT is also called ORC.

                                                                                      [2] There are some comments in the SEH code that suggest that it’s flexible enough to support the complete set of Common Lisp exception models, though I don’t know if anyone has ever taken advantage of this. The Itanium ABI can’t support resumable exceptions and needs some hoop jumping for restartable ones.

                                                                              2. 4

                                                                                What you are missing is that stack unwinding requires destructors, for example to unlock locks you locked. It does work fine for Rust kernels, but not for Linux.
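
                                                                                For anyone unfamiliar with the mechanism, a small userspace sketch of "destructors unlock your locks during unwinding". `Guard` and the `LOCKED` flag are hypothetical stand-ins for a real lock; the point is only that `Drop` runs while the panic unwinds.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for a lock: true while "held".
static LOCKED: AtomicBool = AtomicBool::new(false);

struct Guard;

impl Guard {
    fn acquire() -> Guard {
        LOCKED.store(true, Ordering::SeqCst);
        Guard
    }
}

// The destructor releases the "lock", whether the scope exits normally
// or because a panic is unwinding through it.
impl Drop for Guard {
    fn drop(&mut self) {
        LOCKED.store(false, Ordering::SeqCst);
    }
}

fn is_locked() -> bool {
    LOCKED.load(Ordering::SeqCst)
}

fn critical_section(fail: bool) {
    let _guard = Guard::acquire();
    if fail {
        panic!("something went wrong while holding the lock");
    }
}

// Returns true if the panic was caught and the lock was released by unwinding.
fn lock_released_after_panic() -> bool {
    let result = panic::catch_unwind(|| critical_section(true));
    result.is_err() && !is_locked()
}

fn main() {
    println!("released after panic: {}", lock_released_after_panic());
}
```

                                                                                C code has no equivalent of `Drop`, so a lock taken in a C frame has nothing that would release it if the stack above it were simply popped.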

                                                                            2. 7

                                                                              Does the kernel have unprotected memory and just roll with things like null pointer dereferences reading garbage data?

                                                                              For errors that are expected Rust uses Result, and in that case it’s easy to sprinkle the code with result.or(whoopsie_fallback) that does not panic.
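
                                                                              A small sketch of what I mean, with made-up names (`parse_timeout`, `DEFAULT_TIMEOUT_MS`). `Result::or` and `Result::unwrap_or` are real std methods; neither path panics on bad input.

```rust
use std::num::ParseIntError;

// Hypothetical fallback value.
const DEFAULT_TIMEOUT_MS: u32 = 1000;

// An expected failure is an ordinary return value, not a panic.
fn parse_timeout(raw: &str) -> Result<u32, ParseIntError> {
    raw.trim().parse::<u32>()
}

// Fall back to a default when parsing fails.
fn timeout_or_default(raw: &str) -> u32 {
    parse_timeout(raw).unwrap_or(DEFAULT_TIMEOUT_MS)
}

// Try a second source if the first fails. Note that `or` evaluates its
// argument eagerly; `or_else` takes a closure when that matters.
fn timeout_with_fallback(primary: &str, secondary: &str) -> Result<u32, ParseIntError> {
    parse_timeout(primary).or(parse_timeout(secondary))
}

fn main() {
    println!("{}", timeout_or_default("250"));
    println!("{}", timeout_or_default("oops"));
    println!("{:?}", timeout_with_fallback("oops", " 42 "));
}
```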

                                                                              1. 4

                                                                                As far as I understand, yeah, sometimes the kernel would prefer to roll with corrupted memory as far as possible:

                                                                                So BUG_ON() is basically ALWAYS 100% the wrong thing to do. The argument that “there could be memory corruption” is [not applicable in this context]. See above why.

                                                                                (from docs and linked mail).

                                                                                Null dereferences in particular, though, usually do what BUG_ON essentially does.

                                                                                And things like out-of-bounds accesses seem to end with null-dereference:

                                                                                https://github.com/torvalds/linux/blob/45b588d6e5cc172704bac0c998ce54873b149b22/lib/flex_array.c#L268-L269

                                                                                Though, notably, out-of-bounds access doesn’t immediately crash the thing.

                                                                                1. 8

                                                                                  As far as I understand, yeah, sometimes the kernel would prefer to roll with corrupted memory as far as possible:

                                                                                  That’s what I got from the thread and I don’t understand the attitude at all. Once you’ve detected memory corruption then there is nothing that a kernel can do safely and anything that it does risks propagating the corruption to persistent storage and destroying the user’s data.

                                                                                  Linus is also wrong that there’s nothing outside of a kernel that can handle this kind of failure. Modern hardware lets you make it very difficult to accidentally modify the kernel page tables. As I recall, XNU removes all of the pages containing kernel code from the direct map and protects the kernel’s page tables from modification, so that unrecoverable errors can take an interrupt vector to some immutable code that can then write crash dumps or telemetry and reboot. Windows does this from the Secure Kernel, which is effectively a separate VM that has access to all of the main VM’s memory but which is protected from it. On Android, Hafnium provides this kind of abstraction.

                                                                                  I read that entire thread as Linus asserting that the way that Linux does things is the only way that kernel programming can possibly work, ignoring the fact that other kernels use different idioms that are significantly better.

                                                                                  1. 5

                                                                                    Reading this thread is a little difficult because the discussion is evenly spread between the patch set being proposed, some hypothetical plans for further patch sets, and some existing bad blood between the Linux and Rust community.

                                                                                    The “roll with corrupted memory as far as possible” part is probably a case of the “bad blood” part. Linux is way more permissive with this than it ought to be but this is probably about something else.

                                                                                    The initial Rust support patch set failed very eagerly and panicked, including on cases where it really is legit not to panic, like when failing to allocate some memory in a driver initialization code. Obviously, the Linux idiom there isn’t “go on with whatever junk pointer kmalloc gives you there” – you (hopefully – and this is why we should really root for memory safety, because “hopefully” shouldn’t be a part of this!) bail out, that driver’s initialization fails but kernel execution obviously continues, as it probably does on just about every general-purpose kernel out there.

                                                                                    The patchset’s authors actually clarified immediately that the eager panics are actually just an artefact of the early development status – an alloc implementation (and some bits of std) that follows safe kernel idioms was needed, but it was a ton of work so it was scheduled for later, as it really wasn’t relevant for a first proof of concept – which was actually a very sane approach.
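
                                                                                    For illustration, the "bail out instead of panicking" idiom looks roughly like this in userspace Rust, using std’s `Vec::try_reserve_exact`. This is not the kernel’s allocation API, and `Driver` / `init_driver` are hypothetical names: allocation failure becomes an error the caller can handle, the driver fails to initialize, and everything else carries on.

```rust
use std::collections::TryReserveError;

// Hypothetical driver that needs one buffer at initialization time.
struct Driver {
    buffer: Vec<u8>,
}

// Fallible initialization: a failed allocation is reported to the caller
// instead of aborting the whole program.
fn init_driver(buffer_len: usize) -> Result<Driver, TryReserveError> {
    let mut buffer = Vec::new();
    buffer.try_reserve_exact(buffer_len)?;
    // Capacity is already reserved, so this resize does not allocate.
    buffer.resize(buffer_len, 0);
    Ok(Driver { buffer })
}

fn main() {
    // An absurd request stands in for "out of memory".
    match init_driver(usize::MAX) {
        Ok(_) => println!("driver initialized"),
        Err(e) => println!("driver init failed, carrying on: {}", e),
    }
    // A sane request still succeeds afterwards.
    if let Ok(driver) = init_driver(4096) {
        println!("driver initialized with {} bytes", driver.buffer.len());
    }
}
```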

                                                                                    However, that didn’t stop seemingly half the Rustaceans on Twitter from taking out their pitchforks, insisting that you should absolutely fail hard if memory allocation fails because what else are you going to do, and ranting about how Linux is unsafe and riddled with security bugs because it’s written by obsolete monkeys from the nineties whose approach to memory allocation failures is “well, what could go wrong?”. Which is really not the case, and it really does ignore how much work went into bolting the limited memory safety guarantees that Linux offers onto as many systems as it does, while continuing to run critical applications.

                                                                                    So when someone mentions Rust’s safety guarantees, even in hypothetical cases, there’s a knee-jerk reaction for some folks on the LKML to feel like this is gonna be one of those cases of someone shitting on their work.

                                                                                    I don’t want to defend it, it’s absolutely the wrong thing to do and I think experienced developers like Linus should realize there’s a difference between programmers actually trying to use Rust for real-world problems (like Linux), and Rust advocates for whom everything falls under either “Rust excels at this” or “this is an irrelevant niche case”. This is not a low-effort patch, lots of thinking went into it, and there’s bound to be some impedance mismatch between a safe language that tries to offer compile-time guarantees and a kernel historically built on overcoming compiler permissiveness through idioms and well-chosen runtime tradeoffs. I don’t think the Linux kernel folks are dealing with this the way they ought to be dealing with it, I just want to offer an interpretation key :-D.

                                                                                2. 1

                                                                                  No expert here, but I imagine the Linux kernel has methods of handling expected errors and null checks.

                                                                                3. 6

                                                                                  In an ideal world we could have panic and stop in the kernel. But what the kernel does now is what people expect. It’s very hard to make such a sweeping change.

                                                                                  1. 6

                                                                                    Sorry, this is a tangent, but your phrasing took me back to one of my favorite webcomics, A Miracle of Science, where mad scientists suffer from a “memetic disease” that causes them to e.g. monologue and explain their plans (and other cliches), but also allows them to make impossible scientific breakthroughs.

                                                                                    One sign that someone may be suffering from Science Related Memetic Disorder is the phrase “in a perfect world”. It’s never clearly stated exactly why mad scientists tend to say this, but I’d speculate it’s because in their pursuit of their utopian visions, they make compromises (ethical, ugly hacks to technology, etc.), that they wouldn’t have to make in “a perfect world”, and this annoys them. Perhaps it drives them to take over the world and make things “perfect”.

                                                                                    So I have to ask… are you a mad scientist?

                                                                                    1. 2

                                                                                      I aspire to be? bwahahaa

                                                                                      1. 2

                                                                                        Hah, thanks for introducing me to that comic! I ended up archive-bingeing it.

                                                                                      2. 2

                                                                                        What modern kernels use “panic and stop”? Is it a feature of the BSDs?

                                                                                        1. 8

                                                                                          Every kernel except Linux.

                                                                                          1. 2

                                                                                            I didn’t exactly mean BSD, and I can’t name one. But verified ones? Redox?

                                                                                            1. 1

                                                                                              I’m sorry if my question came off as curt or snide, I was asking out of genuine ignorance. I don’t know much about kernels at this level.

                                                                                              I was wondering how much of an outlier the Linux kernel is - @4ad’s comment suggests it is.

                                                                                              1. 2

                                                                                                No harm done

                                                                                        2. 4

                                                                                          I agree. I would be very worried if people writing the Linux kernel adopted the “if it compiles it works” mindset.

                                                                                          1. 2

                                                                                            Maybe I’m missing some context, but it looks like Linus is replying to “we don’t want to invoke undefined behavior” with “panicking is bad”, which makes it seem like irrelevant grandstanding.

                                                                                            1. 2

                                                                                              The part about debugging specifically makes sense in the “cultural” context of Linux, but it’s not a matter of realism. There were several attempts to get “real” in-kernel debugging support in Linux. None of them really gained much traction, because none of them really worked (as in, reliably, for enough people, and without involving ritual sacrifices), so people sort of begrudgingly settled for debugging by printf and logging unless you really can’t do it otherwise. Realistically, there are kernels that do “panic and stop” well and are very debuggable.

                                                                                              Also realistically, though: Linux is not one of those kernels, and it doesn’t quite have the right architecture for it, either, so backporting one of these approaches onto it is unlikely to be practical. Linus’ arguments are correct in this context, but only insofar as they apply to Linux; this isn’t a case of hammering realism into idealists. The idealists didn’t divine this thing in some programming class that only used pen, paper and algebra; they saw other operating systems doing it.

                                                                                              That being said, I do think people in the Rust advocacy circles really underestimate how difficult it is to get this working well for a production kernel. Implementing panic handling and a barebones in-kernel debugger that can nonetheless usefully handle 99% of the crashes in a tiny microkernel is something you can walk third-year students through. Implementing a useful in-kernel debugger that can reliably debug failures in any context, on NUMA hardware of various architectures, even on a tiny, elegant microkernel, is a whole other story. Pointing out that there are Rust kernels that do it well (Redshirt comes to mind) isn’t very productive. I suspect most people already know it’s possible, since e.g. Solaris did it well, years ago. But the kind of work that went into that, on every level of the kernel, not just the debugging end, is mind-blowing.

                                                                                              (Edit: I also suspect this is the usual Rust cultural barrier at work here. The Linux kernel community is absolutely bad at welcoming new contributors. New Rust contributors are also really bad at making themselves welcome. Entertaining the remote theoretical possibility that, unlikely though it might be, it is nonetheless in the realm of physical possibility that you may have to bend your technology around some problems, rather than bending the problems around your technology, or even, God forbid, that you might be wrong about something, can take you a very long way outside a fan bulletin board.)

                                                                                              1. 1

                                                                                                easter bunny and santa claus comment

                                                                                                Wow, Linus really has mellowed over the years ;)

                                                                                            1. 10
                                                                                              1. 17

                                                                                                Not quite the same. Jujutsu, Gitless and Stacked Git are UIs written on top of Git. The UX is better than Git, but they inherit all the issues of not having proper theoretical foundations. Mercurial is in that category too: better UX, same problems.

                                                                                                Pijul solves the most basic problem of version control: all these tools try to simulate patch commutation (using mixtures of merges and rebases), but would never say it like that, while Pijul has actual commutative patches.
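As a toy illustration of what "patches commute" means, here is a minimal sketch that assumes a patch is nothing more than a single-line insertion at a position. This is not Pijul's data structure or algorithm; the `Insert`, `apply`, and `commute` names are hypothetical and exist only to show the property: two patches can be reordered (with adjusted positions) and still produce the same file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Hypothetical toy patch: insert `text` as a new line at index `pos`."""
    pos: int
    text: str


def apply(lines, patch):
    """Return a new list of lines with the patch applied."""
    return lines[:patch.pos] + [patch.text] + lines[patch.pos:]


def commute(first, second):
    """Given `first` then `second`, return (second2, first2) such that
    applying second2 then first2 yields the same result."""
    if second.pos <= first.pos:
        # `second` lands at or before the line `first` added, so its position
        # is the same in the original file; `first` shifts down by one.
        return second, Insert(first.pos + 1, first.text)
    # `second` lands after the line `first` added, so in the original file
    # its position is one smaller; `first` is unaffected.
    return Insert(second.pos - 1, second.text), first


base = ["a", "b", "c"]
first, second = Insert(1, "X"), Insert(3, "Y")
one_order = apply(apply(base, first), second)
second2, first2 = commute(first, second)
other_order = apply(apply(base, second2), first2)
```

Real patches also delete lines and depend on each other, which is where the hard part lives; this sketch only covers independent insertions.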

                                                                                                1. 3

                                                                                                  Have you read the detailed conflicts document? https://github.com/martinvonz/jj/blob/main/docs/conflicts.md It also links to a more technical description: https://github.com/martinvonz/jj/blob/main/docs/technical/conflicts.md

                                                                                                  1. 2

                                                                                                    Yes. As someone who cares a lot about version control, I have read all I could about Jujutsu when it was first released. I actually find it nice and interesting.

                                                                                                    What I meant in my comment above was that while I acknowledge that the snapshot model is dominant and incremental improvements are a very positive thing, I also don’t think this is the right way to look at the problem:

                                                                                                    • All the algorithms modelling these problems in the context of concurrent datastructures use changes rather than snapshots: CRDTs, OTs… (I’m talking about the “latest” version being a CRDT, here, not the entire DAG of changes).

                                                                                                    • In all the Git UIs that I know of, the user is always shown commits as patches. Why represent it differently internally?

                                                                                                    1. 1

                                                                                                      What’s presented to the user does not need to be the internal representation. Users and most tools work on a snapshot of the source repo, yet you can represent that snapshot as a set of patches internally. That, however, does not necessarily mean that either snapshots or sets of patches are superior to the other. Besides, any practical VCS would have both representations available anyway.

                                                                                                      1. 1

                                                                                                        Good point, and actually Pijul’s internal representation is far from being as simple as just a list of patches.

                                                                                                        However, what I meant there wasn’t about the bytes themselves, but rather about the operations defined on the internal datastructure. When your internals model snapshots (regardless of what bytes are actually written), all your operations will be on snapshots – yet Git never graphically shows anything as snapshots; all UIs show patches. This has real consequences visible in the real world, for example the lack of associativity (bad merges), unintelligible conflicts (hello, git rerere), endless rebases…

                                                                                                        Also, my main interests in this project are mathematical (how to model things properly?) and technical (how to make the implementation really fast?). So, I do claim that patches can simulate snapshots at no performance penalty, whereas the converse isn’t true if you want to do a proper merge and deal with conflicts rigorously.

                                                                                                      2. 1

                                                                                                        Yeah, I do think that, like many people have commented, collaboration networks like GitHub are one thing any new DVCS will either need to produce, or need to somehow be compatible with. Even GitHub is famously unusable for the standard kernel workflow and it could be argued that it suffers due to that.

                                                                                                        I really like the Jujutsu compromise of allowing interop with all the common git social networks at the same time as allowing more advanced treatment of conflicts and ordering between operations.

                                                                                                        There isn’t a document yet on how the transition to the native backend in a git social network world would look.

                                                                                                        I also think that the operation log of Jujutsu not being shareable is a limitation. It would be nice to cram the log into some hidden data structures in the actual git repo, but then you have a chicken-and-egg problem of how to store that operation…

                                                                                                    2. 1

                                                                                                      So, it seems the phrase “the most basic” was unclear in my comment above: I meant that from a theory point of view, the most basic problem is “what is a version control system?”, and that is what we tried to tackle with Pijul.

                                                                                                  1. 11

                                                                                                    The graph that compares the various formats after gzip is kind of a killer - it seems that compression makes the differences between the various formats more-or-less irrelevant, at least from the perspective of size. For applications where size is the most important parameter and a binary format is acceptable, I think I might just tend to prefer gzipped JSON and feel happy that I probably won’t need any additional libraries to parse it. If I got concerned about speed of serialization and deserialization I’d probably just resort to dumping structs out of memory like a barbarian.
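A minimal sketch of the "gzipped JSON, no extra libraries" approach, using only Python's standard library. The `pack`/`unpack` helpers and the sample config are made up for illustration; actual size savings depend entirely on the data.

```python
import gzip
import json


def pack(obj) -> bytes:
    """Serialize to compact JSON, then gzip the UTF-8 bytes."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def unpack(blob: bytes):
    """Inverse of pack(): gunzip, then parse JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical, deliberately repetitive payload (repetition compresses well).
config = {
    "networks": [
        {"ssid": f"net-{i}", "dhcp": True, "channel": 6} for i in range(50)
    ]
}
blob = pack(config)
raw_size = len(json.dumps(config, separators=(",", ":")))
```

Comparing `len(blob)` against `raw_size` on your own payloads is a quick way to check whether a binary format would buy anything on size alone.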

                                                                                                    1. 6

                                                                                                      The main issue with gzipped documents is that you must unpack them before use. It will hurt badly if you need to run a query on a big document (or a set of documents). I recommend reading the BSON and UBJSON specs, which explain it in detail.

                                                                                                      1. 4

                                                                                                        The syntax you’ve chosen requires a linear scan of the document in order to query it, too, so it’d be a question of constant-factor rather than algorithmic improvement, I think?

                                                                                                        1. 2

                                                                                                          constant factors matter!

                                                                                                        2. 2

                                                                                                          That makes sense; I had been blinded to that use-case by my current context. I’ve been looking at these kinds of formats recently for the purpose of serializing a network configuration for an IoT device to be transmitted with ggwave. If I were working on big documents where I needed to pick something out mid-stream, I could definitely see wanting something like this.

                                                                                                          1. 2

                                                                                                            I’m currently doing a similar thing, but for OTA upgrades of IoT devices: packing multiple files into Muon and compressing the result using Heatshrink. It’s then unpacked and applied on the fly on the device end.

                                                                                                            1. 1

                                                                                                              I think I’ll also publish it in the following weeks

                                                                                                          2. 1

                                                                                                            You can stream gzip decompression just fine. Not all queries against all document structures can be made without holding the whole document in memory but for a lot of applications it’s fine.
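A rough sketch of streaming a query over gzipped data without inflating the whole thing first. It assumes the payload is newline-delimited JSON (one record per line), which is an assumption made for this example, not something from the thread; `first_match` is a hypothetical helper.

```python
import gzip
import io
import json


def first_match(stream, predicate):
    """Scan a gzip-compressed NDJSON stream and return the first matching
    record, decompressing incrementally as lines are read."""
    with gzip.open(stream, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            record = json.loads(line)
            if predicate(record):
                return record  # stop early; the rest is never decompressed
    return None


# Build an in-memory sample so the example is self-contained.
buf = io.BytesIO()
with gzip.open(buf, mode="wt", encoding="utf-8") as out:
    for i in range(1000):
        out.write(json.dumps({"id": i, "name": f"device-{i}"}) + "\n")
buf.seek(0)

hit = first_match(buf, lambda r: r["id"] == 42)
```

Only one record is held in memory at a time here. A single large JSON document would need an incremental JSON parser on top of this, and some queries would still need the whole structure.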

                                                                                                          3. 3

                                                                                                            Yeah, json+gzip was so effective at $job, we stopped optimizing there. Another downside not mentioned by the other replies, though: gzip can require a “lot” (64KB) of memory for its dictionary, so for example, you couldn’t use that on an arduino.
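If the window memory is the blocker, DEFLATE itself lets you trade compression ratio for a smaller window. Here is a sketch using Python's `zlib` module, where the window is 2**wbits bytes. Note that this produces a zlib-format stream rather than a gzip file, and whether a given microcontroller library supports small windows is something you would have to check; `pack_small_window` and `unpack_small_window` are hypothetical names.

```python
import zlib

WBITS = 9      # 2**9 = 512-byte window instead of the default 2**15
MEM_LEVEL = 1  # smallest internal compression state


def pack_small_window(data: bytes) -> bytes:
    """Compress with a deliberately tiny DEFLATE window."""
    comp = zlib.compressobj(9, zlib.DEFLATED, WBITS, MEM_LEVEL)
    return comp.compress(data) + comp.flush()


def unpack_small_window(blob: bytes) -> bytes:
    """Decompress; the decoder's window must be at least as large as the
    one the stream was compressed with."""
    decomp = zlib.decompressobj(WBITS)
    return decomp.decompress(blob) + decomp.flush()


payload = b'{"ssid":"example","dhcp":true}' * 40
small = pack_small_window(payload)
```

The smaller window means back-references cannot reach as far, so expect worse ratios on data whose repetition is spread out.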

                                                                                                            1. 2

                                                                                                              BTW, you can use Heatshrink compression on arduino (as I do currently)

                                                                                                            2. 2

                                                                                                              The main advantage of a binary format isn’t size, but speed and ability to store bytestrings

                                                                                                              1. 1

                                                                                                                I have encountered similar situations and thought the same, but one place this was relevant was ingesting data in WASM. Avoiding serde in Rust, or JSON handling in another language, makes for significantly smaller compiles. Something like CBOR has been a good fit since it is easy/small to parse, but this looks interesting as well.

                                                                                                                1. 9

                                                                                                                  It’s nice to see people use ‘no-code’ as it should be used - get to market quickly, move on when it no longer works for your use-case.

                                                                                                                  1. 4

                                                                                                                    This is cool - hearkens back to an era when Mozilla was being rewarded for betting on new technology.

                                                                                                                    1. 10

                                                                                                                      This is so great - Zig covers a crucial part of the design space that other languages haven’t executed as well or as seriously.

                                                                                                                      I have my hopes that some language will attempt to compete and add memory safety in the next five years, but I still think that Zig is adding tons of value to the landscape right now.

                                                                                                                      1. 7

                                                                                                                        I have, unfortunately, not had time to play with Zig yet, and it’s been on my list forever. I put Rust ahead of it because it appears to have some, erm, green pa$ture$ and you know how the$e thing$ are.

                                                                                                                        But what I see from the outside is extremely encouraging. I see non-trivial programs (interpreters, allocators, compilers) with decent failsafe mechanisms, efficiently written in a language much smaller than many of its alternatives in the systems space (C++, Rust, even D, I think, although I haven’t written D in practically forever…), without it having spawned content mills that explain the fifteenth right way to return error codes shortly before it, too, goes out of fashion, or how to get the compiler to not choke on a three-line program.

                                                                                                                        This tells me it’s a language (along with a standard library) that hits close to that sweet spot where it has enough features that you don’t have to reinvent too many wheels, but also has few enough features that keeping up with the language isn’t a full-time job that you have to take on top of your existing full-time job lest your code becomes irrelevant by next year. Also that it’s likely to age well, as programs that use the standard library will not be littered with failed experiments in architecture astronautics.

                                                                                                                        I dunno, I hope I’m right :-).

                                                                                                                        1. 4

                                                                                                                          I guess, but the part of the design space that they’ve chosen to cover is use-after-free vulnerabilities. Even C++ has better memory management through smart pointers.

                                                                                                                          This is critical: while, yes, out-of-bounds errors do happen, the majority of security bugs of the last decade have involved use-after-free vulnerabilities as their primary vector. Zig explicitly and deliberately (it’s a design goal) maintains the malloc/free model, which has accumulated decades of evidence that it results in bugs that at least cause data loss and are frequently exploitable - even indirectly: the notorious “goto fail” bug came about because manual memory management frequently requires handling said memory in error cases, and so requires things like fail: labels.

                                                                                                                          In terms of OoB errors modern C++ provides std::span, custom data types (see WebKit’s WTF::Vector, etc), etc; all can/do perform bounds checking and don’t require re-writing everything in a new language (that gets rid of smart pointers making things even worse).

                                                                                                                        1. 3

                                                                                                                          If anyone is interested, I have a Talos II that isn’t getting much use in my home - send me a message if you’d be interested in it.

                                                                                                                          1. 2

                                                                                                                            Interesting article. Do we have similar things in other languages?

                                                                                                                            The concept of promises I am relatively familiar with, and I have done quite a bit with observables.

                                                                                                                            1. 3

                                                                                                                              There are structured concurrency libraries for other languages (C, Python, Swift, maybe Kotlin are the mature implementations I know about).

                                                                                                                              The originator of the “structured concurrency” label summarized progress since their seminal post back in 2018, but I think it’s come a lot farther since then: https://250bpm.com/blog:137/

                                                                                                                              It’s linked from the article, but “Notes on structured concurrency” is probably the best summary of the idea yet written: https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/

                                                                                                                              1. 3

                                                                                                                                Do we have similar things in other languages?

                                                                                                                                This article and the Swift proposal are very, very close to how Ada’s built-ins handle concurrency.

                                                                                                                                You don’t deal with threads and promises; the language provides tasks, which are active and execute concurrently, and protected objects, which are passive, provide mutual exclusion, and allow complicated guard conditions on access to shared data. Tasks are like threads, but have procedure-like things (called “entries”) you can write and call which block until that task “accepts” them. You can implement the concurrency elements common in other languages (like promises) using these features, but you usually don’t need to.

                                                                                                                                Execution doesn’t proceed out of a block until all tasks declared in the scope are complete, unless those tasks are allocated on the heap. You can declare one-off tasks or reusable tasks even within functions that can share regular state. Tasks don’t just have to accept a single “entry”: queueing and selection of one of many entries is built in, and this select block also supports timeouts, delays and proceeding if no entry is available. For long-running tasks which might not complete on time, there’s also a feature called “asynchronous transfer of control” which aborts a task if a computation exceeds a time threshold. Standard library functions provide pinning of tasks to CPUs, prioritization, and controlling which CPUs a task runs on using “dispatching domains”.
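For readers who don't know Ada, the "execution doesn't leave the block until its tasks are done" rule can be approximated in Python with an executor used as a context manager. This is only an analogy to the scoping rule, not Ada's tasking semantics (no entries, no select); `worker` and `run_block` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def worker(name, delay, finished):
    """Pretend to do some work, then record completion."""
    time.sleep(delay)
    finished.append(name)  # list.append is safe across threads in CPython


def run_block():
    finished = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        pool.submit(worker, "slow", 0.05, finished)
        pool.submit(worker, "fast", 0.01, finished)
    # Leaving the `with` block waits for every submitted task, so both
    # workers have finished by the time this line runs.
    return sorted(finished)


done = run_block()
```

The point of the scoping is that no task can outlive the block that started it, so nothing is left running in the background after `run_block` returns.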

                                                                                                                                I’ve spent days of my life debugging async/await in other languages, but I feel like the Ada concurrency built-ins help describe the intent of what you’re trying to accomplish in a very natural way.

                                                                                                                              1. 5

                                                                                                                                Both powerpc64 (for Talos II workstations) and the M1 platform have support in this release. Feels like the future!

                                                                                                                                1. 2

                                                                                                                                  In case it’s useful, https://asahilinux.org/support/ links to both Patreon and GitHub Sponsors donation links for the project.

                                                                                                                                  1. 2

                                                                                                                                    Seems like the author cherry picked a few examples that didn’t pan out, but I don’t think the generalization works. IME, “developer tools” have improved tremendously in the last twenty years that I’ve been working as a developer. Even the languages themselves have been moving in directions that make dev lives easier.

                                                                                                                                    The biggest problem, IMO, is that by the time a language and its ecosystem have been around long enough to develop mature tooling, the trend followers have moved on to the next new thing. C++, Java, Lisp, and even Python have a ton of dev tools. The newest, most bleeding-edge Javascript dialect, probably not so much.

                                                                                                                                    1. 5

                                                                                                                                      Hmm, I’m not sure I agree.

                                                                                                                                      The author picked tools from one lineage (Java and the community of researchers who attempted to improve it).

                                                                                                                                      You could look at tools from another lineage (e.g. Smalltalk, Lisp) and note that big, useful ideas (e.g. the Smalltalk browser, Gemstone version control system, Common Lisp condition/restart exception handling, Interface Builder) tend to go by the wayside or emerge into popular practice hamstrung in crucial ways (Interface Builder’s interactivity subjugated to the edit-compile-run cycle).

                                                                                                                                      I’m more familiar with the Lisp stuff than the Smalltalk stuff so I’ve provided links for reference for those.

                                                                                                                                      1. 4

                                                                                                                                        What is the best reference to read about the Gemstone system you mention?

                                                                                                                                        1. 2

                                                                                                                                          I don’t have a reference handy, I’m sorry :(

                                                                                                                                          I learned about it from someone I met at bangbangcon a few years ago - they worked (many years ago) in finance on an all-Smalltalk system and we talked about it at length over lunch.

                                                                                                                                          1. 2

                                                                                                                                            I used ENVY for version control, but only knew of Gemstone as an object database. I think they later morphed into a distributed object system with persistence ability.

                                                                                                                                            1. 2

                                                                                                                                              NP. I will dig around. Smalltalk inspired quite a few neat projects.

                                                                                                                                              1. 1

                                                                                                                                                I likely conflated Gemstone with ENVY, see mtnygard’s sibling comment!

                                                                                                                                                I also found this collection of descriptions of ENVY and how people used it - from this description it doesn’t seem obvious to me what specific technical feature was awesome about it.

                                                                                                                                                Several of those interviewed talk about method-at-a-time versioning reducing conflicts and the subjective experience of the “liveness” of the whole codebase.

                                                                                                                                                https://paulhammant.com/2017/09/01/smalltalk-envy/

                                                                                                                                        2. 1

                                                                                                                                          C++, Java, Lisp, and even Python have a ton of dev tools. The newest, most bleeding edge Javascript dialect probably not so much.

                                                                                                                                          Let’s not ignore the advances that have been made relative to those languages, with regard to community, debugging, legibility, safety, ergonomics… etc. I admit, many people’s function for what to use is basically use?(thing, threshold) = (thing.hotness > threshold). But there are some legitimate reasons to use newer languages.

                                                                                                                                          If using your newfangled tool required using a language whose inception took place at least 25 years ago, wouldn’t you consider that a problem for adoption? Would the kind of person to use such a well-established tool really be interested in trying out your new, shiny invention?

                                                                                                                                        1. 10

                                                                                                                                          So to be clear, this means that there will be Rust in the next-ish Linux kernel?

                                                                                                                                          That’s amazing. Linus has been loath to put anything but C in there. This is an enormous step. How is that going to work for platforms where Linux is supported but not Rust? (I don’t know how much those two sets differ, maybe not at all.)

                                                                                                                                          1. 13

                                                                                                                                            From https://lwn.net/Articles/849849/:

                                                                                                                                            Appearance in linux-next generally implies readiness for the upcoming merge window, but it is not clear if that is the case here; this code has not seen a lot of wider review yet.

                                                                                                                                            Re: arch support: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/rust/arch-support.rst

                                                                                                                                            1. 1

                                                                                                                                              Yeah that’s going to limit what rust can be used for. At least at first.

                                                                                                                                              1. 1

                                                                                                                                                I think I saw in a mailing list somewhere that the plan was to “aim for” this merge window, but expect to miss it… with the idea that that would improve their odds of actually getting into the next merge window.

                                                                                                                                              2. 12

                                                                                                                                                It’s not clear from the title, but this is specifically support for writing device drivers in Rust.

                                                                                                                                                initial support for writing device drivers in the Rust language

                                                                                                                                                (from the same https://lwn.net/Articles/849849/)

                                                                                                                                                1. 8

                                                                                                                                                  The GCC project is working on a Rust front end. That would presumably support Rust on any platform that Linux cares about. https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/rust/arch-support.rst

                                                                                                                                                  1. 2

                                                                                                                                                    How is that going to work for platforms where Linux is supported but not Rust?

                                                                                                                                                    For the foreseeable future, it remains possible to build the kernel without rust. That means rust support is limited to things like device drivers (this patch set) or I suppose architecture specific code.

                                                                                                                                                  1. 7

                                                                                                                                                    I’m a bit jaded about talks but Mike Acton’s “Data-Oriented Design” from CPPCon in 2014 was probably the last one to markedly reframe my approach to software design:

                                                                                                                                                    https://www.youtube.com/watch?v=rX0ItVEVjHc

                                                                                                                                                    Pairs well with Graydon Hoare’s review of Computer Architecture: A Quantitative Approach (the Hennessy and Patterson book): https://graydon2.dreamwidth.org/233585.html

                                                                                                                                                      1. 16

                                                                                                                                                        Unfortunately, the comparison is written in such a clearly biased way that it probably makes fossil sound worse than it is (I mean you wouldn’t need to resort to weasel words and name-calling if fossil was a valid alternative whose benefits spoke for themselves .. right?). Why would anyone write like that if their aim is to actually promote fossil?

                                                                                                                                                        1. 5

                                                                                                                                                          The table at the top is distractingly polemic, but the actual body of the essay is reasonable and considers both social and technical factors.

                                                                                                                                                          My guess is that author is expecting the audience to nod along with the table before clicking through to the prose; it seems unlikely to be effective for anyone who doesn’t already believe the claims made in the table.

                                                                                                                                                          1. 4

                                                                                                                                                            This is what’s turned me off from even considering using it.

                                                                                                                                                          2. 12

                                                                                                                                                            “Sprawling, incoherent, and inefficient”

                                                                                                                                                            Not sure using a biased comparison from the tool author is useful. Even then, the least they could do is use factual language.

                                                                                                                                                            This is always something that irks me when reading the recurring fossil evangelism: git criticism is interesting and having a different view should give perspective, but the fossil author always uses this kind of language, which makes it useless. Git adapts to many kinds of teams and workflows. The only thing I take from his comparison is that he never learnt to use it and does not want to.

                                                                                                                                                            Now this is also a very valid criticism of git: it is not a turn-key solution, it needs polish, and another system needs to layer a specific work organization on top of it. That’s a choice for the project team to make. Fossil wants to impose its own method, which of course gives a more architected, polished finish, but makes it impossible to use in many teams and projects.

                                                                                                                                                            1. 2

                                                                                                                                                              Maybe they don’t care about widely promoting fossil and just created that page so people stop asking about a comparison?

                                                                                                                                                            2. 5

                                                                                                                                                              One of the main reasons for me for not using Fossil is point 2.7 on that list: “What you should have done vs. What you actually did”. Fossil doesn’t really support history rewrites, so no “rebase” which I use nearly daily.

                                                                                                                                                              1. 2

                                                                                                                                                                This is also a problem with Git. Like you, I use rebase daily to rewrite history, when that was never really my objective; I just want to present a palatable change log before my changes are merged. Whatever happens before that shouldn’t require something as dangerous as a rebase (and force push).

                                                                                                                                                                1. 4

                                                                                                                                                                  I don’t think it makes any sense to describe rebases as ‘dangerous’, nor to say that you want to present a palatable change log without rewriting history unless you’re saying you want the VCS to help you write nicer history in the first place?

                                                                                                                                                                  1. 2

                                                                                                                                                                    Rebase is not dangerous. You have the reflog to get back to any past state if needed, you can rewrite as much as you need without losing anything.

                                                                                                                                                                    Now, I see only two ways of presenting a palatable change log: either you are able to write it perfectly the first time, or you are able to correct it. I don’t see how any VCS would allow you to do the first one. If you use a machine to try to present it properly (like it seems fossil strives to do), you will undoubtedly hit limitations, forcing the dev to work around those limitations to write something readable and meaningful to the rest of the team. I very much prefer direct control over what I want to communicate.

                                                                                                                                                                    1. 2

                                                                                                                                                                      I think whether rebase is dangerous depends on the interface you are using Git with. The best UI for Git is, in my opinion, Magit. And when doing a commit you can choose from a variety of options, one of them being “Instant Fixup”.

                                                                                                                                                                      I often use this when I discover that I forgot to check in a new file with a commit or something like that. It basically adds a commit, does an interactive rebase, reorders the commits so that the fixup commit is next to the one being fixed, and executes the rebase pipeline.

                                                                                                                                                                      There are other similar options for committing and Magit makes this straight-forward. So much, indeed, that I have to look up how to do it manually when using the Git CLI.
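For anyone who hasn’t seen it, the reordering step described above can be modelled in a few lines. This is only a toy sketch in Python, not Magit’s or Git’s actual code: the `Commit` class, the `fixup_of` field, and `autosquash_order` are names made up for illustration, and it assumes fixups only ever target regular commits. On the plain CLI the rough equivalent, as far as I know, is `git commit --fixup=<commit>` followed by `git rebase -i --autosquash`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Commit:
    sha: str
    message: str
    # Hypothetical field: sha of the commit this one is a fixup for, if any.
    fixup_of: Optional[str] = None


def autosquash_order(commits: List[Commit]) -> List[Commit]:
    """Move each fixup commit directly after the commit it fixes.

    Regular commits keep their relative order. A fixup whose target is not
    in the list is left where it is.
    """
    targets = {c.sha for c in commits if c.fixup_of is None}
    ordered: List[Commit] = []
    for c in commits:
        if c.fixup_of in targets:
            continue  # emitted together with its target below
        ordered.append(c)
        if c.fixup_of is None:
            # Pull in every fixup aimed at this commit, oldest first.
            ordered.extend(f for f in commits if f.fixup_of == c.sha)
    return ordered


history = [
    Commit("a1", "Add parser"),
    Commit("b2", "Add docs"),
    Commit("c3", "fixup! Add parser", fixup_of="a1"),  # the forgotten file
]
print([c.sha for c in autosquash_order(history)])  # ['a1', 'c3', 'b2']
```

The real rebase then folds `c3` into `a1`; the sketch stops at the reorder because that is the part that is fiddly to do by hand.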

                                                                                                                                                                      1. 4

                                                                                                                                                                        I prefer to work offline. Prior to Git I used SVK as frontend for SVN since it allowed offline use. However, once Git was released I quickly jumped ship because of its benefits, i.e. real offline copy of all data, better functionality (for me).

                                                                                                                                                                        In your linked document it states “Never use rebase on public branches” and goes on to list how to use rebase locally. So, yes, using rebase on public branches and force-pushing them is obviously only a last resort when things went wrong (e.g. inadvertently added secrets).

                                                                                                                                                                        Since I work offline, often piling up many commits before pushing them to a repo on the web, I use rebase in cases when unpushed commits need further changes. In my other comment I mentioned forgotten files as an example. It doesn’t really make sense to add another commit “Oops, forgotten to add file…” when I can just as easily fix up the wrong commit.

                                                                                                                                                                        So the main reason for using rebase for me is correcting unpushed commits which I can often do because I prefer to work offline, pushing the latest commits only when necessary.

                                                                                                                                                                        1. 2

                                                                                                                                                                          In addition to what @gettalong said, keep in mind the original use-case of git is to make submitting patches on mailing lists easier. When creating a patch series, it’s very common to receive feedback and need to make changes. The only way to do that is to rebase.

                                                                                                                                                                    1. 1

                                                                                                                                                                      Thinking out loud:

                                                                                                                                                                      Distributing your work across {fnlam}pipes kills productivity.

                                                                                                                                                                      Corollary: one process programming might net you productivity wins.

                                                                                                                                                                      Computers are fast; you can run StackOverflow on a handful of servers.

                                                                                                                                                                      Maybe you shouldn’t switch to a mainframe, but you might want to consider whether better utilizing your hardware might shrink your problem to a “one process” problem.

                                                                                                                                                                      From the mainframe essay linked above:

                                                                                                                                                                      Yet another problem is that many companies don’t want to switch [from mainframes], because it saves them money to run a small, expensive machine instead of a small, expensive cluster. In The Economist article above, Eurocontrol reported saving 50% on software costs. The NYT article reports that Radixx switched from a cluster to an IBM mainframe and saved 50%.

                                                                                                                                                                      1. 9

                                                                                                                                                                        Something that I’ve been thinking about a lot is that the way that most software is distributed is really hostile to modifications - this post touches on that, but doesn’t go very deep into it. A question I’ve been asking recently is - what would a package manager that’s designed with end-users patching software as a first-class concern look like? And a somewhat more challenging version for some systems - what would it look like to allow end-users to patch their kernels as a first-class concern?

                                                                                                                                                                        NixOS has the start of an answer for this (fork the nixpkgs repo, make whatever changes you like), but it still doesn’t seem ideal. I guess maybe gentoo is a sort of answer to this as well, but it doesn’t seem like gentoo answers the question of “how do you keep track of the changes that you’ve made”, which is I think a really important part of systems like this (and the thing that’s missing in most current package managers’ support for patching).

                                                                                                                                                                        1. 13

                                                                                                                                                                          Note you don’t need to fork nixpkgs to edit individual packages, you can just add them to an overlay with an override eg I was trying to debug sway recently so I have:

                                                                                                                                                                          (sway.overrideDerivation (oldAttrs: {
                                                                                                                                                                            src = fetchFromGitHub {
                                                                                                                                                                              owner = "swaywm";
                                                                                                                                                                              repo = "sway";
                                                                                                                                                                              rev = "master";
                                                                                                                                                                              sha256 = "00sf8fnbj8x2fwgfby6kcrxjxsc3m6w1yr63kpq6hv94k3p510nz";
                                                                                                                                                                            };
                                                                                                                                                                          }))
                                                                                                                                                                          

                                                                                                                                                                          One of the guix devs gave a talk called Practical Software Freedom (slides) where the core point was that it’s not enough to have foss licensing if editing code is so much of a pain that no one does it. It looks like guix has a pretty nice workflow for editing installed packages, and since the tooling is all scriptable I bet you could streamline it even more - e.g. add a command for “download the source, fork it to my personal repo, add the fork to my packages list”.

                                                                                                                                                                          1. 6

                                                                                                                                                                            I only very briefly used guix, but its kernel configuration system is also just scheme. It’s so much nicer than using any of the kernel configurators that come with the kernel, and I actually played around with different configurations rather than opting to play it safe which is what I normally do since it’s such a faff if I accidentally compile out support for something important.

                                                                                                                                                                            1. 2

                                                                                                                                                                              How often do you find your overlays breaking something in strange ways? My impression is that most packages don’t come with tests or other guardrails to give early warning if I break something. Is that accurate?

                                                                                                                                                                              1. 3

                                                                                                                                                                                I haven’t had any problems so far. I guess if you upgrade a package and it changes the api then you might be in trouble, but for the most part it seems to just work.

                                                                                                                                                                            2. 6

                                                                                                                                                                              One thing this brought to mind was @akkartik’s “layers” script: http://akkartik.name/post/wart-layers

                                                                                                                                                                              The post is short, but the money quote is here:

                                                                                                                                                                              We aren’t just reordering bits here. There’s a new constraint that has no counterpart in current programming practice — remove a feature, and everything before it should build and pass its tests

                                                                                                                                                                              If we then built software such that reordering commits was less likely to cause merge conflicts, you could imagine users having a much easier time checking out a simpler version of the system and adding their functionality to that if the final version of the system was too complex to modify.

                                                                                                                                                                              1. 4

                                                                                                                                                                                I’ve gotten close with rpm and COPRs. I can fairly quickly download a source RPM, use rpmbuild -bp to get a prepped source tree where I can build a patch, track it, and add it back to the spec, then push it to a COPR which will build it for me and give me a yum repository I can add to any machines I want. Those pick up my changed version of the package instead of the upstream with minimal config fiddling.

                                                                                                                                                                                It’s not quite “end users patching as a first class concern” but it is really nice and in that ballpark.

                                                                                                                                                                                1. 3

                                                                                                                                                                                  At that point the package system would just be a distributed version control system, right?

                                                                                                                                                                                  1. 1

                                                                                                                                                                                    Interesting point! And I guess that’s kinda the approach that Go took from the start, with imports directly from GitHub. Except, ideally, you’d have a mutable layer on top. So something like IPFS, where you have the mutable IPNS namespace that points to the immutable IPFS content-addressed distributed filesystem.

                                                                                                                                                                                    Still, unlike direct links to someone elses GitHub repo, you would want to be able to pin versions. So you would want to point to your own namespace, and then you could choose how and when to sync your namespace with another person’s.
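A minimal sketch of that split, in Python. This is not the IPFS or IPNS API; `Store`, `Namespace`, `publish`, and `sync_from` are made-up names, and everything lives in memory. The idea it illustrates is the one above: content is immutable and addressed by its hash, names are a mutable layer on top, and you only move your own names when you decide to sync.

```python
import hashlib
from typing import Dict


class Store:
    """Immutable blobs keyed by the SHA-256 of their content."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self._blobs[digest] = content
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


class Namespace:
    """Mutable name -> digest mapping owned by one person."""

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}

    def publish(self, name: str, digest: str) -> None:
        self.names[name] = digest

    def sync_from(self, other: "Namespace", name: str) -> None:
        # Explicit, per-name sync: nothing moves until the owner asks.
        self.names[name] = other.names[name]


store = Store()
upstream, mine = Namespace(), Namespace()

upstream.publish("lib", store.put(b"lib v1"))
mine.sync_from(upstream, "lib")  # pin whatever upstream has right now

upstream.publish("lib", store.put(b"lib v2"))  # upstream moves on
pinned = store.get(mine.names["lib"])  # still the v1 bytes

mine.sync_from(upstream, "lib")  # upgrade when I choose to
upgraded = store.get(mine.names["lib"])  # now the v2 bytes
```

Pinning falls out for free here: a name in your namespace is just a hash, and a hash can never start pointing at different bytes.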

                                                                                                                                                                                    1. 3

                                                                                                                                                                                      This is how https://www.unisonweb.org/ works. Functions are content-addressed ie defined by the hash of their contents, with the name as useful metadata for the programmer. The compiler has a bunch of builtin refactoring tools to help you splice changes into an existing graph of functions.
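A toy analogue in Python, to show what “defined by the hash of their contents” buys you. This is not how Unison is implemented; `content_hash` and the normalizer are invented for illustration, and the sketch only canonicalizes the function name and its parameters (not local variables or references to other functions).

```python
import ast
import hashlib
from typing import Dict


class _Normalize(ast.NodeTransformer):
    """Replace the function name and parameter names with positional ones."""

    def __init__(self) -> None:
        self.names: Dict[str, str] = {}

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        node.name = "_"  # the name is metadata, not part of the identity
        self.generic_visit(node)  # arguments are visited before the body
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.names.setdefault(node.arg, f"v{len(self.names)}")
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        if node.id in self.names:
            node.id = self.names[node.id]
        return node


def content_hash(source: str) -> str:
    """Hash the normalized syntax tree of a single function definition."""
    tree = _Normalize().visit(ast.parse(source))
    # ast.dump omits line and column info by default, so layout is ignored.
    return hashlib.sha256(ast.dump(tree).encode()).hexdigest()


add = "def add(x, y):\n    return x + y\n"
plus = "def plus(left, right):\n    return left + right\n"
sub = "def sub(x, y):\n    return x - y\n"

# Names live in a separate, editable table that points at hashes.
codebase = {"add": content_hash(add), "plus": content_hash(plus)}

same = codebase["add"] == codebase["plus"]  # True: a rename changes nothing
different = content_hash(add) != content_hash(sub)  # True: the body differs
```

With this shape, renaming a function is an edit to the name table only, and two people who wrote the same function under different names end up with one definition.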

                                                                                                                                                                                      1. 2

                                                                                                                                                                                        Just watched the 2019 strangeloop talk. Absolutely brilliant. Only drawback I see is that it doesn’t help with code written in other languages. So dependency hell is still a thing. But at least you’re not adding levels to that hell as you write more code (unless you add more outside dependencies).

                                                                                                                                                                                  2. 2

                                                                                                                                                                                    what would a package manager that’s designed with end-users patching software as a first-class concern look like?

                                                                                                                                                                                    If you just want to patch your instance of the software and run it locally, it is very straightforward in Debian and related distributions, using apt-src. I do it often, for minor things and pet peeves. Never tried to package these changes and share them with others, though.

                                                                                                                                                                                    1. 1

                                                                                                                                                                                      All the stuff package managers do doesn’t seem to help (and mostly introduces new obstacles for) the #1 issue in modifying software: “where are the sources?” It should be dead easy to figure out where the sources are for any program on my system, then to edit them, then to run their tests, then to install them. Maybe src as the analogue to which?

                                                                                                                                                                                      I appreciate npm, and to a lesser extent OpenBSD, for this. Does Gentoo also have a standard place for sources?

                                                                                                                                                                                      1. 1

                                                                                                                                                                                        I believe Debian is trying to streamline this with Salsa. That’s the first place I look when I’m looking for source code of any package on my system.

                                                                                                                                                                                      2. 1

                                                                                                                                                                                        what would a package manger that’s designed with end-users patching software as a first-class concern look like?

                                                                                                                                                                                        A DVCS. (It’s Christmas, “Away in a package manger, no .dpkg for a bed…”)

                                                                                                                                                                                        We drop vendor branches of 3rd-party libraries into our code as needed.

                                                                                                                                                                                        We hack and slash’em as needed.

                                                                                                                                                                                        We upstream patches that we believe upstream might want.

                                                                                                                                                                                        We upgrade when we want, merging upstream into our branch, and the DVCS (in our case Mercurial) tells us what changed upstream vs what changed locally. Some of the stuff that changed upstream is our patches, or improvements on them, so 99% of the time we take the upstream changes, 0.9% of the time we take our changes, and 0.1% of the time we have to think hard. (PS: 99% of stats, including this one, are made up (from gut feel) on the spot. What makes these stats special is that I admit it.)

                                                                                                                                                                                        A related topic is the plague of configuration knobs.

                                                                                                                                                                                        Every time any programming asks, “What should these configuration value be? He says, dunno, make it an item (at best) in the .conf file or at worst, in the UI.”

                                                                                                                                                                                        The ‘net effect is the probability of the exact configuration ever having been tested by anyone else on the planet is very very very low and a UI that you’d either need a Phd to drive (or more likely is full of wild arsed guesses)

                                                                                                                                                                                        A good habit is, unless there is a screaming need, by default put config items in a hidden (to the user) file and that is not part of the user documentation.

                                                                                                                                                                                        If anybody starts howling that they need to alter that knob, you might let them into the secret for that knob, and consider moving it into a more public place.
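
                                                                                                                                                                                        As a rough sketch of that habit in Python (the knob names, values, and the `effective_config` helper are all hypothetical, invented for illustration): every knob has an undocumented default, and only the ones somebody actually howled about are accepted from the user-facing config.

```python
# Hypothetical knob names and values, for illustration only.
HIDDEN_DEFAULTS = {"retry_limit": 3, "cache_size_mb": 64, "poll_interval_s": 5}

# Knobs that have been promoted to the documented, user-facing config.
PUBLIC_KNOBS = {"cache_size_mb"}


def effective_config(user_conf):
    """Merge user settings over the defaults, accepting only public knobs."""
    unknown = set(user_conf) - PUBLIC_KNOBS
    if unknown:
        raise KeyError(f"not a supported setting: {sorted(unknown)}")
    return {**HIDDEN_DEFAULTS, **user_conf}


config = effective_config({"cache_size_mb": 128})
```

                                                                                                                                                                                        Promoting a knob is then a one-line change to `PUBLIC_KNOBS`, and everything else stays at the one configuration that has actually been tested.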