Threads for cgenschwap

  1. 11

    Delightful isn’t the word I would use for it, and as much as I want to hate on Javascript and its ecosystem, I do think there is something really awesome about it. The sheer amount of options and choice, the number of new frameworks and new approaches, the variety in builders, transpilers, formatters, linters; it’s all so insane but also incredible. People are just building stuff, new stuff, non-stop.

    There is clearly something to JavaScript’s “anything goes” mentality. It’s far more like the wild west than languages like Haskell or Rust, and because of this I think it lends itself to a creative freedom that many developers love. No compiler to tell you you’re doing it wrong, just a JS engine that will try its best to execute what you wrote, and will do something even if what you wrote makes no sense.

    JavaScript is not the language for me, and I wish there were alternatives to frontend development that weren’t works-in-progress. The WASM dream is still far away if it comes at all. But there is something delightful about JS, in a way that is more akin to punk than anything. I think it’s easy to judge the ecosystem as buggy and incomplete, but what other ecosystem offers the same sheer amount of variety? Not even Python comes close. There is something really cool about that.

    1. 1

      there is something delightful about JS, in a way that is more akin to punk than anything.

      Yes…. punk just about sums it up!

    1. 3

      Given that TLA+ supports detection of liveness issues, and Raft has been modeled in TLA+, was this an issue the modeling showed, or only discovered after?

      This covers shortcomings in the initial Raft paper, and how they’ve been solved, but it also relies on human reasoning rather than an automated proof system pointing out the issues (and when it’s been solved). How do we know no liveness issues remain?

      1. 4

        From the Raft paper (page 116):

        We leave specifying and proving cluster membership, log compaction, and client interaction to future work, along with liveness and availability properties of the basic Raft algorithm.

        So these basic liveness properties were never verified, just the basic safety properties. If you look at the official TLA+ spec there aren’t any safety or liveness properties specified.

        1. 1

          Thanks for the clarification! It is a bit surprising that no liveness properties are specified in the official spec.

          1. 1

            I was real surprised too!

      1. 34

        I think that’s a harsh conclusion from the outcome of a single team out of many at Amazon.

        1. 18

          I don’t speak for Amazon, but my experience has been that this kind of analysis and architectural refitting is essentially constant and that’s a good thing. As volume and scope change, different approaches are needed. Monoliths are great for some problems and for some length of time; same with SOA, microservices. The lifetime of the problem usually sees one or all of those approaches as conditions change.

          1. 4

            Completely agree with you. It’s all about the trade offs, and sometimes a problem is not understood well enough at the start to properly analyze those trade offs.

            Given how flexible and malleable software is, it always amazes me the reluctance to refactor at scale, especially architecture. Electrical and mechanical (let alone civil!) all have massive up-front cost in manufacturing and existing stock needs to be used/trashed, yet revisions are common. Software has none of that overhead and everyone overreacts to revisions…

            1. 1

              Software does have an up front cost in testing that the replacement is equivalent. If you can’t fully simulate production without risking production, is it any surprise it’s hard to get rework through?

              1. 1

                If we’re using other engineering disciplines as our comparison point, the testing requirements are the same (and much higher for electrical/mechanical/civil). We can get far closer to simulating a true production use-case with software; it’s an unfortunate part of our industry that integration testing is mostly an afterthought.

          2. 4

            For video processing nonetheless.

          1. 5

            Nice summary.

            What seems to be better at resolving SSRF than an allow-list in the software, is to implement that allow list in a web proxy and require the software to always go through there.

            P.S.: the whatwg URL standard strips newlines in the middle of URLs because <a href="… may break at anytime according to html (and how people used to write this in 80 column-wide terminals).

            1. 4

              P.S.: the whatwg URL standard strips newlines in the middle of URLs because <a href=”… may break at anytime

              Similarly, IETF RFC 3986 appendix C says:

              In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may have to be added to break a long URI across lines. The whitespace should be ignored when the URI is extracted.

              For robustness, software that accepts user-typed URI should attempt to recognize and strip both delimiters and embedded whitespace.
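A minimal sketch of the kind of extraction RFC 3986 appendix C describes, which also shows why two parsers can disagree about the "same" URL. The function name `extract_uri` and the choice to strip only `<...>` and `"..."` delimiters are assumptions made for illustration; this is not a complete implementation of the appendix.

```rust
/// Hypothetical helper: pulls a URI out of user-typed text by removing one
/// layer of wrapping delimiters and all embedded whitespace.
fn extract_uri(input: &str) -> String {
    let trimmed = input.trim();
    // Strip a matching pair of angle brackets, or else a pair of double quotes.
    let inner = trimmed
        .strip_prefix('<')
        .and_then(|s| s.strip_suffix('>'))
        .or_else(|| trimmed.strip_prefix('"').and_then(|s| s.strip_suffix('"')))
        .unwrap_or(trimmed);
    // Drop whitespace that may have been added to break the URI across lines.
    inner.chars().filter(|c| !c.is_whitespace()).collect()
}
```

A validator that checks the raw string and a client that normalizes it this way will see different URLs, which is one reason filtering at the egress proxy is attractive.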

              1. 2

                I had no idea. I thought this was an HTML specialty. Huh. Thanks!

              2. 3

                What seems to be better at resolving SSRF than an allow-list in the software, is to implement that allow list in a web proxy and require the software to always go through there.

                At my last company the security team built this out and we transitioned to it. It’s a clever way of preventing SSRF. Does this idea have a name?

                1. 1

                  What seems to be better at resolving SSRF than an allow-list in the software, is to implement that allow list in a web proxy and require the software to always go through there.

                  It’s just outbound filtering/ egress firewalling. It’s awesome and it’s crazy how often it doesn’t get deployed - it absolutely fucks attackers up.

              1. 6

                This is a topic I’ve become interested in investigating lately, after Austral made me look at linear types again and ask “why would you want to use these anyway?” I still haven’t particularly found a use case for them and was hoping that this would provide some, but most of the ones listed involve async, which I am slowly becoming convinced really fits best into a language either built around it or at least one with automatic memory management.

                The one non-async-related use case they list is basically “more powerful destructors”. Which… now that I think about it, is a useful thought. Currently in Rust destructors are used for all sorts of nice things, via guard objects. They unlock mutexes, they free hardware resources like OpenGL textures, I’ve used them to push/pop stacks of contextual data so they never get out of sync with the things using them, etc. But not being able to call them with arguments is a limitation, especially for APIs like Vulkan or explicit memory allocators that need a context object passed to the destructor function. Right now the easiest workaround is basically to stuff such context objects into globals, but it would be nice if an object could be marked “must be destroyed explicitly” so that you can pass arguments to the destructor function. A compile-time-checked defer statement, essentially. There’s no way for the compiler to do this for you, as it does with Drop calls, because it doesn’t know what arguments you want to pass to it.

                …which is exactly what this kind of linear types does for you. Hmm. Hmmmmmm!
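A minimal sketch of the guard-object pattern mentioned above, using the push/pop stack case. `ContextStack`, `FrameGuard`, `enter` and `depth` are hypothetical names, and `RefCell` is used only so that nested guards can share the stack. Note that `drop` takes no arguments, which is exactly the limitation being discussed.

```rust
use std::cell::RefCell;

/// Hypothetical stack of contextual labels.
struct ContextStack {
    frames: RefCell<Vec<String>>,
}

/// Guard whose destructor pops the frame it pushed.
struct FrameGuard<'a> {
    stack: &'a ContextStack,
}

impl ContextStack {
    fn new() -> Self {
        ContextStack { frames: RefCell::new(Vec::new()) }
    }

    /// Pushes a frame and returns a guard that pops it when dropped.
    fn enter(&self, label: &str) -> FrameGuard<'_> {
        self.frames.borrow_mut().push(label.to_string());
        FrameGuard { stack: self }
    }

    fn depth(&self) -> usize {
        self.frames.borrow().len()
    }
}

impl Drop for FrameGuard<'_> {
    fn drop(&mut self) {
        // Runs automatically at end of scope, so push and pop stay paired.
        self.stack.frames.borrow_mut().pop();
    }
}
```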

                1. 8

                  For my money the biggest issue linear types solve is fallible destructors, and there are lots of those. Although whether you can recover from a failing destructor tends to be a complicated question, knowing that a destructor can fail, and that you will be told, can be absolutely critical (see the fsync mess from 2019 for instance).

                  1. 3

                    The idea of chaining together fallible destructors is also interesting; you can make a function call that you are required to call multiple times in case of failure (just return the original object back on an error), or which runs some alternative-destruction (enforce a safety mechanism).

                    I think this would be handy in many embedded contexts, where you need to clean up registers in various orders.
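A rough sketch of the "return the original object back on an error" idea, written in today's Rust. `Device`, `FlushError` and `close_with_retry` are hypothetical, and the failure is simulated with a counter. Since Rust's types are affine rather than linear, nothing here forces the caller to call `close` at all; the sketch only shows the shape of the API.

```rust
#[derive(Debug)]
struct FlushError;

/// Hypothetical resource whose cleanup can fail.
#[derive(Debug)]
struct Device {
    failures_left: u32, // simulates transient cleanup failures
}

impl Device {
    /// Consumes the device. On failure the device is handed back so the
    /// caller can retry or fall back to some other cleanup.
    fn close(mut self) -> Result<(), (Device, FlushError)> {
        if self.failures_left > 0 {
            self.failures_left -= 1;
            return Err((self, FlushError));
        }
        Ok(())
    }
}

/// Retries `close` up to `max_attempts` times. Returns the attempt number
/// that succeeded, or gives the still-open device back to the caller.
fn close_with_retry(mut dev: Device, max_attempts: u32) -> Result<u32, Device> {
    for attempt in 1..=max_attempts {
        match dev.close() {
            Ok(()) => return Ok(attempt),
            Err((returned, _err)) => dev = returned,
        }
    }
    Err(dev)
}
```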

                  2. 2

                    One compelling use-case is Zig-style Unmanaged collections, where the allocator is passed in to specific methods which require it (including drop). If you do that, then suddenly a whole bunch of code needs destructors with arguments.

                    1. 1

                      Rust doesn’t have any ability to ensure that the same allocator instance is passed for every use. I presume this is going to be a hard tradeoff for safety vs cost of requiring each allocator to defend itself from possibility of receiving some other allocator’s address.

                    2. 1

                      not being able to call them with arguments is a limitation

                      The workaround I know of is to have a “teardown” or “close” method that does what the destructor does but takes parameters, and leaves the object in a state where the actual destructor will be a no-op.

                      You can’t create a compile-time guarantee that users of the type call the teardown method, but I haven’t found that to be a problem; if it isn’t called, the destructor just does whatever it would have done with default arguments. Or else it can panic.
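A minimal sketch of that workaround. `Context`, `Texture` and `destroy` are hypothetical names; the `Option` field is one common way to leave the object in a state where the real destructor becomes a no-op.

```rust
/// Hypothetical context object that cleanup needs (an allocator, a device...).
struct Context {
    freed: Vec<u32>,
}

/// Hypothetical resource identified by a handle.
struct Texture {
    handle: Option<u32>, // None once the resource has been released
}

impl Texture {
    fn new(handle: u32) -> Self {
        Texture { handle: Some(handle) }
    }

    /// Explicit teardown that takes the argument `Drop` cannot receive.
    fn destroy(mut self, ctx: &mut Context) {
        if let Some(h) = self.handle.take() {
            ctx.freed.push(h);
        }
        // `self` is dropped here; `drop` sees `None` and does nothing.
    }
}

impl Drop for Texture {
    fn drop(&mut self) {
        if let Some(h) = self.handle.take() {
            // Fallback path: no context is available, so just report it.
            eprintln!("texture {} dropped without destroy()", h);
        }
    }
}
```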

                      1. 1

                        Or else it can panic.

                        Maybe not for long. There’s a proposal to completely remove ability to panic in drop.

                    1. 1

                      There is an interesting balance to be found in type system complexity. On one hand, the idea of proper linear types in Rust to enforce more invariants at compile time sounds awesome. Enforcing invariants in the type system leads to more correct code and fewer tests that have to be written. As an advanced user of Rust, I would love a feature like this.

                      On the other hand, ?Drop types will make Rust more complicated and difficult to learn. A newcomer to Rust would not like a feature like this. And unfortunately I can already imagine the Rust detractors’ comments on the matter if this were added to the language. The mindset that Rust is complicated and hard to learn actually makes it harder for people to learn Rust because they can assume complexity where there might not be any.

                      This is such a fundamental problem in modern language design, and I think it’s unfortunate. We should expect tooling for professionals to allow for advanced functionality. It seems only in programming languages is there this push for the opposite, and it makes me sad.

                      And I get it. I have a friend who loves Go, and his reasoning is that he doesn’t have to wade through crazy abstractions that some coworker wrote that didn’t quite work out for what it was intended to do. And I’ve had that frustration myself. Advanced tools in many professions are used improperly all the time, and can even cause real bodily harm, but it shouldn’t stop us from using them.

                      Back to the article, I think the proposal for the ?Drop trait looks good, especially if it can be done in a way that doesn’t “infect” other code (such as lifetime annotations can). Unfortunately I don’t feel confident this proposal will actually go anywhere, as the use-case is too niche and the complexity too high. So few languages have affine/linear types at all, and to expect users of Rust to understand the subtle differences between them is going to be a very hard thing to sell.

                      1. 4

                        The mindset that Rust is complicated and hard to learn actually makes it harder for people to learn Rust because they can assume complexity where there might not be any.

                        I don’t know, I think the popular conception of how difficult it is to learn rust is pretty well-calibrated. Attempting to learn it via searching stackoverflow for your error will fail 100% of the time, whereas that approach succeeds with nearly every other non-functional language. I didn’t understand rust well enough to be productive until I went through every one of the 90ish rustlings course exercises. Even after that I didn’t really feel I understood it until spending a few weeks seeing how far I could push the type system encoding program constraints at compile time. Actually, the sheer quantity of people who have learned rust is incredibly impressive. It’s the closest thing our profession has had to a mass-consciousness-raising moment in recent history.

                        On the other hand I learned rust just before the release of chatGPT, which understands rust fairly well. Maybe it’s become a lot easier to learn recently.

                        Regarding the topic of the post itself, one issue I’ve run into when programming in this sort of pattern is you often will be returning tuples of a calculated value along with the moved item, then destructuring that on the caller side. It looks a bit messy on both ends.
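A small sketch of the tuple-returning pattern described above, assuming a hypothetical `Counter` type whose operations consume it and hand it back alongside the computed value.

```rust
/// Hypothetical linear-style resource: each operation consumes it.
struct Counter {
    total: u64,
}

impl Counter {
    /// Consumes the counter and returns the new total together with the
    /// counter itself, so the caller can keep using it.
    fn add(mut self, n: u64) -> (u64, Counter) {
        self.total += n;
        (self.total, self)
    }
}

/// Caller side: every call needs a destructuring `let` to rebind the value.
fn add_twice(counter: Counter) -> (u64, u64, Counter) {
    let (first, counter) = counter.add(2);
    let (second, counter) = counter.add(3);
    (first, second, counter)
}
```

Shadowing the binding on each call keeps it workable, but the threading is visible at both the definition and the call site.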

                      1. 7

                        This article captures my experience in big tech quite accurately. Especially the section about trading productivity for predictability. In my current role, not finishing tasks in a sprint is a Big Deal (we have a deadline to hit!), and so task bloating is made crazy because of it. And have to love not being able to bring additional work to a sprint. Of course this is far better than my previous role at BigCo where there were so many people there quite literally wasn’t enough work at a given time for everyone…

                        Startups and small companies are different, and one of the main reasons I’m looking to transition back to them. The author talks about productive work, and what it looks like, and it closely resembles what I’ve seen as well. Autonomy and purpose are huge motivators, and the secret to so many overworked early-stage startup employees is that the work is really fun. Many of us got into this field because we enjoy it, so it’s no wonder when given the autonomy we make the most of it.

                        Two things that I’ve also found which weren’t mentioned in the article are (1) promotion-oriented developers and (2) promotion-oriented managers. At BigCo these two are far more prevalent, and some big reasons for complexity bloat (slowing things down) and people bloat (slowing things down). If a developer is promoted based on the complexity of projects, this is to be expected, and the same with a manager who is promoted based on number of reports.

                        The whole system is absurd, but I view it as a generally good thing. Small companies will be able to out-compete big companies (aside from anti-competitive practices), which provides for general progress. Of course, the biggest issue right now is those anti-competitive practices which don’t seem to be getting regulated any time soon.

                        1. 3

                          I think “how many historical bugs would this have prevented” is a really good way of judging a programming language or feature

                          I think this is really not a good way to judge a programming language or feature. You just don’t pour a little of the new language/feature over the existing project and bugs simply disappear. You change the structure of the project on a fundamental level, or actually you just create a whole new project. This might avoid the bugs that have appeared in the past, but it will tell you nothing about the complexity and corresponding bugs that get introduced by the new language or feature.

                          1. 6

                            You just don’t pour a little of the new language/feature over the existing project and bugs simply disappear.

                            This can sometimes literally be the case. Introducing Kotlin to a large Java app has shown me exactly this: NullPointerException practically disappears in the Kotlin code.

                            1. 4

                              If one of the aspects you value in a language is preventing bugs, this seems like a great model to evaluate a language with. No model is perfect, but I think evaluating whether it would prevent historical bugs is one of the best we have.

                              In the case of Rust and C, they are both imperative. Idiomatic Rust can look different from C, simply because Rust provides more tools for expressive power, but you can write Rust which reads very closely to C (minus the memory unsafety). Of course, it won’t be a 1-1 translation, but it also isn’t a fundamental structural rewrite (unless there are fundamental memory errors).

                            1. 3

                              I mostly agree, though I don’t think OO and FP are very far apart. Related comment here:

                              Good OO is also principled about state and I/O, just like FP.


                              1. 1

                                Can I further your argument and say that I don’t think any of them are far apart? I think any good design is principled about state and I/O, and my hope was to show that each programming philosophy just provides different ways of approaching it.

                                I think that because FP pushes so hard on state reduction, it feels like any state-minimization thinking is a functional viewpoint. But being principled about state is necessary for properly using any development style; FP is just the most direct about it.

                              1. 5

                                I think saying that every paradigm is about state is taking things a bit too far. Sure state management is a cross-cutting concern that can interact with a lot of things, but that doesn’t mean these things are defined purely in their interaction with states.

                                OO and FP are indeed mostly about states, but OO is also about inheritance, which in turn can be about computation rather than state. FP is also about higher-order functions, which again show up in computations more frequently than in state management.

                                Declarative vs imperative can also be about computation.

                                Services can also be about computations. In fact, in a lot of microservice-based architectures, you will have a large number of almost stateless services [1] that talk to each other, with one or a few databases at the very bottom that manages all the states. In other words microservices are more often about distributing computation rather than distributing states.

                                [1] Cache is a notable exception, but cache is arguably a performance optimization and does not change the stateless nature

                                1. 7

                                  Moreover I feel the description of OO vs FP and declarative vs imperative seems to imply that the amount of states to manage is given, and the different paradigms are just different ways to manage them. While it’s true that every system has inherent states, there can also be a lot of accidental states and different paradigms can have very different amount of accidental states.

                                  1. 2

                                    there can also be a lot of accidental states and different paradigms can have very different amount of accidental states

                                    Completely agree! This was really just a short article showing a different way you can view programming philosophies, and maybe evaluate them based on the trade offs they make about state management. I didn’t really dive into how certain approaches may introduce additional state, and whether that tradeoff is worth it, but it’s an important thing to consider when designing systems.

                                  2. 7

                                    but OO is also about inheritance, which in turn can be about computation rather than state

                                    Alan Kay, who coined the term, would disagree. More modern OO languages are increasingly moving away from the idea that subtyping should be coupled to implementation inheritance, because it turned out to be a terrible idea in practice. Even in OO languages that support it, the mantra of ‘prefer composition over inheritance’ is common.

                                    1. 3

                                      I think more important than Kay, is what the inventors of object-orientation would say - Nygaard and Dahl. They have sadly passed, but I imagine what they would say (particularly Nygaard) is that there is a huge problem space to which some aspects of a language might be applied, and others not. To lump everything together as the same for all is problematic. Not all software is the same. For example, implementation inheritance has been very successful for a type of problem - the production of development and runtime frameworks that provide default behavior that can be overridden or modified.

                                      1. 1

                                        Alan Kay doesn’t get to decide what words mean for everybody else, especially when he’s changed his mind in the 40+ years since he originally coined the term.

                                        1. 1

                                          Since Hillel is being too circumspect, I’ll cite the piece:

                                      2. 1

                                        It’s more of a conceptual model that can be applied to the various programming philosophies, and shows how they differ but also how they are similar. It’s focusing on the trade offs which is important, rather than a “one true way.”

                                        I’m slightly confused by what you mean by “computation.” Inheritance and higher-order functions both are ways of structuring code with the goal to get it more correct.

                                        I did leave out the topic of performance from this article, and I think that is a very important axis in which the philosophies have varying takes as well. It felt like it would have made the article too unwieldy if I tried to tackle that as well.

                                      1. 5

                                        Are about state? This fallacy of logic is akin to making the observation that humans are made up of atoms, therefore humans are about atoms.

                                        Observing that a collection all have a common characteristic or attribute does not mean that is what they are ‘about’.


                                        1. 6

                                          Read the article, rather than just the headline. The thesis is that the difference between the philosophies can be boiled down to how they work with state. Seems right to me.

                                          1. 3

                                            Author here, I used the phrase “about state” pretty loosely. Wasn’t particularly thinking about it.

                                            But I stand by what I wrote: as dtgriscom wrote, this article is about boiling down the different philosophies into how they manage state, and that really is the primary difference between them.

                                            It’s not a perfect fit (it does leave out the axis of computation performance) but I think it’s a useful conceptual model.

                                          1. 12

                                            Minimize state! Use functional programming practices!

                                            But seriously, my work with systems recently has been all about how to reduce the state that exists on a server. If you think this is solved with containers, I welcome you to the Jenkins container guidance of “put all that stuff in a volume mount”.

                                            1. 8

                                              There is a better way! A middle path, that captures the best of both worlds!

                                              If a function is the sole owner of an object, and mutates it before returning it, that is indistinguishable from it never mutating anything at all, and instead returning a brand new object with the modifications.

                                              If you pass a mutable reference to some data into a function, and only that function has the rights to modify that data, this is the same as never using mutation at all and instead having that function return the updated data, and using that updated data in place of the original data after the function call.

                                                  fn with_mutation(arg: &mut T);
                                                  with_mutation(&mut my_var);
                                              === fn without_mutation(arg: &T) -> T;
                                                  my_var = without_mutation(&my_var);

                                              You can only do this safely and clearly with a type system that distinguishes sole ownership from shared references, and ensures that for every piece of data at every time, there is at most one function with permission to mutate that data. This is what Rust gives, and it’s what Val gives more clearly, though Val is still a prototype. And Haskell gives it too. Mutation without the spooky action at a distance.
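A small runnable version of the equivalence sketched above. `Account`, `deposit_in_place` and `deposit_pure` are hypothetical names chosen for illustration.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Account {
    balance: i64,
}

/// Mutating style: the caller lends exclusive access.
fn deposit_in_place(acct: &mut Account, amount: i64) {
    acct.balance += amount;
}

/// Non-mutating style: the caller gets a fresh value back.
fn deposit_pure(acct: &Account, amount: i64) -> Account {
    Account { balance: acct.balance + amount }
}

/// Runs both styles from the same starting value and returns both results.
fn both_styles(start: Account, amount: i64) -> (Account, Account) {
    let mut a = start.clone();
    deposit_in_place(&mut a, amount);
    let b = deposit_pure(&start, amount);
    (a, b)
}
```

Because `&mut` is exclusive, no other code can observe `a` between the call and the return, so the two results are indistinguishable to the caller.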

                                              1. 3

                                                If you pass a mutable reference to some data into a function, and only that function has the rights to modify that data, this is the same as never using mutation at all and instead having that function return the updated data, and using that updated data in place of the original data after the function call.

                                                One issue is that some operations may not be idempotent. If you had a function call that took a mutable reference, mutated the object, and then performed an operation that could fail, you might not be able to retry the whole function safely, unless the mutation was idempotent itself.

                                                For example, if you took a mutable reference to a dictionary, popped a value out of the dictionary, used that value to make an http request and that request failed, you wouldn’t be able to rerun the whole function with the original reference.

                                                You would be able to do that with immutable objects.
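A sketch of the retry problem described above. The function `send` is a stand-in that always fails and makes no real HTTP request; `pop_and_send` and `send_then_remove` are hypothetical names.

```rust
use std::collections::HashMap;

/// Stand-in for an HTTP request; always fails, no network involved.
fn send(_payload: &str) -> Result<(), String> {
    Err("connection refused".to_string())
}

/// Mutating version: the entry is removed before the fallible step,
/// so a failed call cannot simply be rerun.
fn pop_and_send(queue: &mut HashMap<String, String>, key: &str) -> Result<(), String> {
    let payload = queue
        .remove(key)
        .ok_or_else(|| format!("no entry for {}", key))?;
    send(&payload)
}

/// Non-mutating version: the input is untouched, and an updated map is
/// returned only if the request succeeded.
fn send_then_remove(
    queue: &HashMap<String, String>,
    key: &str,
) -> Result<HashMap<String, String>, String> {
    let payload = queue
        .get(key)
        .ok_or_else(|| format!("no entry for {}", key))?;
    send(payload)?;
    let mut next = queue.clone();
    next.remove(key);
    Ok(next)
}
```

The mutating version could also be made retry-safe by reordering it (look up, send, then remove), so this is about where the mutation sits relative to the fallible step rather than about `&mut` as such.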

                                                1. 2

                                                  Would linear typing also qualify as a means to reach this middle path? I learned about it recently, and some of the solutions you mentioned reminded me of the whole “each value can only be used once” restriction.

                                                  1. 2

                                                    Rust does use linear typing. It has four different rules for different kinds of values:

                                                    • Small types like i32 and bool and friends are so cheap to copy it would be silly not to. These implement Copy, and are exempt from the rules of the next bullet point because the compiler will implicitly copy them when needed.

                                                    • T is an owned value of type T (unless T is a small type like i32 and implements Copy). E.g. String is an owned heap-allocated string. This uses linear typing to enforce sole ownership. If you have a variable x: String, it’s a type error to say let y = x; let z = x;. (Pedants will point out that Rust uses affine types. All that that means is that you’re allowed to mention x zero times, while that’s technically disallowed by linear types.)

                                                    • &T is a shared reference to T, there can be many simultaneous copies of it but it’s immutable.

                                                    • &mut T is a mutable reference to T, there can be only one copy of it.

                                                    Owned types use linear types. The two reference types use Rust’s borrow checker. It’s the combination of the two that makes the magic.

                                                    Haskell gets to a similar endpoint (it allows mutation, but controls it well) by completely different means, using state monads.

                                                    Most other languages pass around shared, mutable references, either explicitly, or under the hood. Which leads to confusion and bugs, and is the reason that people get up in arms about state. My point is that if you remove the sharing, then the state becomes harmless.
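The four cases in the list above can be sketched in a few lines. The function names are hypothetical, and the commented-out line shows the kind of use the compiler rejects.

```rust
fn takes_owned(s: String) -> usize {
    s.len()
}

fn reads(s: &str) -> usize {
    s.len()
}

fn appends(s: &mut String) {
    s.push('!');
}

fn four_kinds() -> (i32, usize, usize) {
    // Copy type: `a` stays usable after being assigned to `b`.
    let a: i32 = 1;
    let b = a;
    let copied = a + b;

    // Shared references: several may coexist, none may mutate.
    let mut text = String::from("hi");
    let (r1, r2) = (&text, &text);
    let shared_len = reads(r1) + reads(r2);

    // Mutable reference: exclusive for as long as it is in use.
    appends(&mut text);

    // Owned value: passing it by value moves it out of `text`.
    let owned_len = takes_owned(text);
    // let again = text; // would not compile: error[E0382], use of moved value

    (copied, shared_len, owned_len)
}
```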

                                                    1. 3

                                                      Pedants will point out that Rust uses affine types.

                                                      Yes, we shall. Rust’s affine types are useful, but they are not linear types, the lack of which prevents encoding in the type system that, e.g., some object given out by one’s module must be passed back to that module and not just dropped on the floor, and leads to kludges like types that prevent their values from being discarded unused by unconditionally panicking (throwing an exception or aborting the process) in their destructors.
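A minimal sketch of the kludge described above, sometimes called a drop bomb. `MustFinish` and `finish` are hypothetical names. The check happens at run time, which is the gap a true linear type would close at compile time.

```rust
/// Hypothetical token that is supposed to be handed back via `finish`.
struct MustFinish {
    finished: bool,
}

impl MustFinish {
    fn new() -> Self {
        MustFinish { finished: false }
    }

    /// The intended way to dispose of the token.
    fn finish(mut self) {
        self.finished = true;
        // `self` is dropped here with the flag set, so `drop` stays quiet.
    }
}

impl Drop for MustFinish {
    fn drop(&mut self) {
        // Skip the check while already unwinding: a second panic would abort.
        if !self.finished && !std::thread::panicking() {
            panic!("MustFinish dropped without calling finish()");
        }
    }
}
```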

                                                  2. 1

                                                    Not familiar with Rust…. is there a fundamental conceptual difference between this and the OO pattern of private data that can only be mutated by a public method?

                                                    1. 3

                                                      The main difference with Rust is that in the typical OO language you can have two aliases to the same object, that can both call the public method, so there’s still room for spooky action at a distance.

                                                    2. 1

                                                      Clojure has a mechanism for this too. You can make a variable mutable, and it will error out if you pass it to a function that expects non-mutable data. This allows mutation for speed and going back to being immutable when passing data around.

                                                      1. 3

                                                        Not the same! The property I’m referring to is global, not local. It’s not the sort of thing you can bolt on to an existing type system.

                                                        E.g. say that you have the object graph:

                                                            A
                                                          /   \
                                                         B     C
                                                          \   /
                                                            D

                                                        and A passes in two mutable references to a function: a mutable reference to B and one to C. Bam, invariant has been broken, spooky action at a distance may occur! If the function modifies B.D, then spookily it has also modified C.D.
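A sketch of that diamond in Rust, where the sharing has to be opted into explicitly with `Rc<RefCell<_>>`; it is used here only to reproduce the effect. `Node`, `bump_first` and `diamond` are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stands in for B and C; each holds a handle to the shared node D.
struct Node {
    d: Rc<RefCell<i32>>,
}

/// Receives two exclusive references, yet a write through `b` is visible
/// through `c` because both point at the same D.
fn bump_first(b: &mut Node, _c: &mut Node) {
    *b.d.borrow_mut() += 1;
}

/// Builds the diamond and returns D as seen through B and through C.
fn diamond() -> (i32, i32) {
    let d = Rc::new(RefCell::new(0));
    let mut b = Node { d: Rc::clone(&d) };
    let mut c = Node { d: Rc::clone(&d) };
    bump_first(&mut b, &mut c);
    let via_b = *b.d.borrow();
    let via_c = *c.d.borrow();
    (via_b, via_c)
}
```

With plain owned fields instead of `Rc<RefCell<_>>`, B and C could not both own D, which is the global property being described.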

                                                      2. 1

                                                        Good article and comment about that recently:



                                                        i.e. borrow checking can be seamless for stateless apps, but is more complex for stateful apps

                                                        Stateless programs like command line tools or low-abstraction domains like embedded programming will have less friction with the borrow checker than domains with a lot of interconnected state like apps, stateful programs, or complex turn-based games

                                                        1. 1

                                                          If you pass a mutable reference to some data into a function, and only that function has the rights to modify that data, […]

                                                          You can only do this safely and clearly with a type system that […] ensures there for every piece of data at every time, there is at most one function with permission to mutate that data. This is what Rust gives, […]

                                                          Rust’s &mut exclusive references mean that “only that function has the rights to modify” and read that data — letting one function mutate the data without barring other functions from reading the data at the time suggests a race condition.

                                                          1. 1

                                                            Another way to put this is that only the combination of shared mutable data is bad. If you remove / control sharing, mutation can be totally safe.

                                                            I love this model, and am very interested in Val for that reason.

                                                            1. 1

                                                              (D calls this “weakly pure”)

                                                            2. 2

                                                              Completely agree! Minimizing state is paramount. I’ve found functional programming pushes the idea of state reduction, but it can apply to any philosophy and is the first step to solving a problem. I’ve found over time my feedback on design reviews has just been concerned with the design of state and how it’s being handled.

                                                            1. 4

                                                              This is an exciting project, and one that I think has an immense amount of potential. I read through sections 3, 4 and 5, and will take a look at the rest at some point, but overall the language and compiler seem feature-full given how new this is! It already supports a good portion of what many people use Latex for (text, alignment, equations and code).

                                                              I haven’t played around with it yet, but the language looks well designed and you can definitely see the Rust influence. I think the biggest hurdle to replacing Latex is going to be library support, and the emphasis on making a real language vs. macros will hopefully help jumpstart things. I haven’t seen what the errors look like, but can’t get worse than latex.

                                                              I hope this project takes off, a user-friendly alternative to Latex with the same power is needed. It will definitely take a long time to get to feature-parity with Latex, but this seems like a fantastic start!

                                                              1. 3

                                                                Hear, hear! My only concern is this wording in the about page:

                                                                We will publish Typst’s compiler source code as soon as our beta phase starts. From then on, it will form the Open Core of Typst, and we will develop and maintain it in cooperation with the community.

                                                                I sure hope they plan to publish something

                                                                • at least as usable as latex (i.e. all the parts needed to compile a paper to common output formats are there)
                                                                • under a license that allows inclusion in mainstream distributions

                                                                People who write papers, articles and books care a lot about not losing their work (e.g. if the startup goes under). I don’t write papers any more but I’m sure I wouldn’t have considered typst before those conditions were met.

                                                                Thanks for taking this on!

                                                                1. 5

                                                                  That is the plan! The open source part will cover the whole compiler and CLI and the license will probably be a permissive one (e.g. MIT/Apache-2). We’re only keeping the web app proprietary, but don’t want to lock anybody into it, so you will always be able to download your projects and compile them locally.

                                                                  1. 1

                                                                    How simple is the compiler to re-implement? What’s the ETD (estimated development time)?

                                                              1. 69

                                                                I read this and felt weird about it initially but couldn’t quite put my finger on it. From my experience, using Rust has led to faster developer velocity (even for CRUD apps) simply because the strong type system allows you to encode invariants to be checked at compile time, and the tooling and libraries are amazing. This article seems to be about a project that was being developed right around async/await release, which was a rough time for using libraries. Most of the sharp edges are gone now, but I don’t doubt that the state of the ecosystem at that point affected this person’s experience.

                                                                However, I do think there is a situation where this advice holds true (even for someone as heavily biased as I am), which is for a very specific kind of startup: a hyper-growth startup with 100%+ YoY engineer hiring. The issue with Rust is not that it’s slower to develop in (I don’t think that is true); it’s that in order to develop quickly in Rust you have to program in Rust. And frankly, most new developers to Rust have no idea how to program in Rust because so many languages do not feature strong type systems. And the problem is that if your influx of new developers who need to learn Rust is too large, you won’t be able to properly onboard them. Trying to write Java using Rust is horrible (I’ve worked with a number of colleagues who I’ve had to gently steer away from OO design patterns that they were used to, simply because they make for really difficult Rust code, and are largely obsoleted by the type system).

                                                                It isn’t even lifetimes or borrowing that are necessarily tricky. In my experience issues with lifetimes are fairly rare for people and they almost always immediately seek out an experienced Rust dev for guidance (you only need a handful to deal with all questions on lifetimes; my current team only has me and it’s been a non-issue). The bigger problems are around how to structure code. Type-driven development is not something most people have experience with, so they tend to stick to really simple structs and enums, to their detriment.

                                                                For instance, I commonly see new Rust developers doing something like this:

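                                                                // Anti-pattern: whether the arguments make sense is only checked at run time.
                                                                fn double_or_multiply(x: i32, y: Option<i32>, double: bool) -> Result<i32, &'static str> {
                                                                    if double {
                                                                        if y.is_some() {
                                                                            return Err("Y should not be set");
                                                                        }
                                                                        Ok(x * 2)
                                                                    } else {
                                                                        if y.is_none() {
                                                                            return Err("Y should be set");
                                                                        }
                                                                        Ok(x * y.unwrap())
                                                                    }
                                                                }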

                                                                Yes, I know it’s a completely contrived example, but I’m sure you’re familiar with that kind of pattern in code. The issue is that this is using the shallow aspects of Rust’s type system – you end up paying for all of Rust but only reaping the benefits of 10% of it. Compare that to what you could do by leveraging the type system fully:

                                                                enum OpKind {
                                                                    Double(i32),
                                                                    Multiply(i32, i32),
                                                                }

                                                                fn double_or_multiply(input: OpKind) -> i32 {
                                                                    match input {
                                                                        OpKind::Double(x) => x * 2,
                                                                        OpKind::Multiply(x, y) => x * y,
                                                                    }
                                                                }

                                                                Note how the error has disappeared as there is no way to call this function improperly. That means fewer tests to write, less code to maintain, and APIs that can’t be used improperly. One of the most interesting questions I get commonly when I promote this style of code is “how do I test that it fails then?”; it’s always fun to explain that it can’t fail[1] and there is no need to write a test for the failure. The developer efficiency benefits from this style of thinking everywhere are massive and more than pay for the cost of using Rust.

                                                                But developers from other languages take time to get to this point, and it does take time and effort from experienced Rust developers to get everyone on the same page. Too many new people and I can see more and more code leaking in as the first example, which means you get minimal benefits of Rust with all the cost.

                                                                I can’t argue with this person’s experience, as much as I love Rust and think it has features that make it an incredibly productive language, but I think the core issue is that the majority of developers do not have experience with strong type-system thinking. As more languages start adding types I’m hopeful this problem becomes less prevalent, because the productivity difference between developers who understand type-driven development and those who don’t is large (in languages with strong type systems).

                                                                [1] Technically it can panic, which I do explain, but for the majority of cases that is a non-issue. Generally if there is a panic-situation that you think you might have to handle you use a different method/function which returns a Result and bubble that up. Panics are largely unhandled (and for good reason; they aren’t exceptions and should be considered major bugs).
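
                                                                As a sketch of that last point: the multiplication can overflow, which panics in debug builds. If that is a case you need to handle, a sibling function can return a `Result` instead. The name `try_double_or_multiply` and the `Overflow` error type are hypothetical, and `OpKind` is repeated so the snippet stands alone.

                                                                ```rust
                                                                #[derive(Debug, PartialEq)]
                                                                enum OpKind {
                                                                    Double(i32),
                                                                    Multiply(i32, i32),
                                                                }

                                                                /// Hypothetical error for the one way this operation can still go wrong.
                                                                #[derive(Debug, PartialEq)]
                                                                struct Overflow;

                                                                /// Fallible variant: reports overflow to the caller instead of panicking.
                                                                fn try_double_or_multiply(input: OpKind) -> Result<i32, Overflow> {
                                                                    let product = match input {
                                                                        OpKind::Double(x) => x.checked_mul(2),
                                                                        OpKind::Multiply(x, y) => x.checked_mul(y),
                                                                    };
                                                                    product.ok_or(Overflow)
                                                                }
                                                                ```

                                                                The input type is unchanged; only the one remaining failure mode is surfaced in the return type, so callers that care can bubble it up with `?`.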

                                                                1. 11

                                                                  FWIW Matt Welsh is a very experienced programmer and former computer science professor at Harvard:


                                                                  (I know of him from his research on SEDA, an influential web server architecture, as well as having a good blog.)

                                                                  So this comment strikes me as a bit out in left field … I don’t doubt it’s your experience, but it doesn’t seem relevant to the article

                                                                  1. 9

                                                                    I’m not familiar with who Matt Welsh is, but I found his post to be well written and accurate to his experience. My comment was simply a reflection of my own experience, which I think differs.

                                                                    I don’t see how my comment isn’t relevant to the article, but I am open to feedback if there is something specific you felt made my comment out of left field!

                                                                    1. 7

                                                                      Not GP, and I hope this doesn’t come off as too negative, but your comment is pretty dismissive of the idea that Matt Welsh could have substantive issues with the design of Rust. You seem to imply that the problems he ran into stem from a lack of experience:

                                                                      I think the core issue is that the majority of developers do not have experience with strong type-system thinking.

                                                                      Your example about double_or_multiply is great, but IMO it’s a pretty elementary example for most readers here, as well as Matt Welsh.

                                                                      The general vibe is like this: someone complains that a particular proof in graduate-level mathematics is hard to read, and then you respond with a tutorial on pre-calculus. I like your comment, and it is relevant, but it doesn’t feel like it responds to the article, or takes the author seriously.

                                                                      1. 5

                                                                        Thanks for the explanation, I do appreciate it. I didn’t intend my comment to be dismissive of the article; I meant it as a refinement of which aspect it was talking about. I think the fundamental issue is that I completely disagree with Matt Welsh that programming in Rust lowers developer efficiency, and so my comment was exploring a reason for why he may feel that way.

                                                                        The example was simple, and more just for having something to ground my argument in (ie. Rust makes you faster because you write less code overall), as well as having something for developers who are unfamiliar with type-driven development to see. I didn’t mean to imply that Matt Welsh doesn’t know that, but rather that type-driven development is a completely different style of programming, and onboarding devs to a new language is easy but to a new style is hard.

                                                                        Clearly my point wasn’t made as clear as I had hoped, and thank you for pointing out where it felt like I was off-base. I do think it’s important to disagree, and I’ve never been one to defer to another just because of their accomplishments or prestige.

                                                                        I’m thinking it might make sense for me to write my own post on my thoughts on developing quickly in Rust, and hopefully I can take these ideas and your feedback and make something that pushes the conversation forward on what makes a productive language without coming across as dismissive of others’ experience :)

                                                                        1. 1


                                                                          …a hyper-growth startup with 100%+ YoY engineer hiring… in order to develop quickly in Rust you have to program in Rust. And frankly, most new developers to Rust have no idea how to program in Rust because so many languages do not feature strong type systems. And the problem is that if your influx of new developers who need to learn Rust is too large, you won’t be able to properly onboard them.


                                                                          We hired a ton of people during my time at this company, but only about two or three of the 60+ people that joined the engineering team had previous experience with Rust.

                                                                          This was in two years, which he claims was 10x growth in headcount, so from ~6 people to 60 in two years, with only 3 people who knew Rust. Basically, well above your 100% YoY hiring threshold for being able to onboard Rust engineers.


                                                                          I completely disagree with Matt Welsh that programming in Rust lowers developer efficiency

                                                                          I don’t think you do disagree with him :)

                                                                          My interpretation of his claim is that in a rapidly growing organisation that needs to ship product rapidly, and where you don’t have a lot of Rust expertise already, Rust may not be a good fit. What you are claiming is that given enough experience with Rust, it can accelerate your day to day development. But that to do so you need to give developers sufficient time to onboard to Rust such that they can become familiar and efficient with it (which of course is true for any language, but Rust onboarding likely takes a lot longer than, say, Go, or Python, or C#).

                                                                          Both of these claims can be true because they are discussing different aspects of software engineering at different scales.

                                                                          Related, I’d be interested in hearing what your experience has been on how long it takes to onboard a complete Rust novice, to the point they are at close to 100% productivity.

                                                                          1. 3

                                                                            My interpretation of his claim is that in a rapidly growing organisation that needs to ship product rapidly, and where you don’t have a lot of Rust expertise already, Rust may not be a good fit.

                                                                            You’re right, I do agree with this claim. My understanding of Matt’s article is that he argues this is a function of Rust as a language, whereas I argue this is a function of a general lack of understanding of type-driven programming.

                                                                            It is effectively semantics, but important semantics, because I don’t think the solution is to avoid using Rust/Haskell/OCaml/etc except in niche situations, it should be to educate developers in type-driven development. But of course that is a much bigger problem, and not easy, so I can see where one might think I’m just arguing a point entirely separate from what the article is talking about.

                                                                            I think you’re right, the point I am making is mostly tangential, and I could have made that more clear. Thanks for explaining your view!

                                                                            Related, I’d be interested in hearing what your experience has been on how long it takes to onboard a complete Rust novice, to the point they are at close to 100% productivity.

                                                                            I’ll take the easy way out and say it depends :P but it is hard to put an exact number on it simply because I don’t have a ton of data to really say. Rust is still fairly niche, even at companies using it (ie. Rust for core services, but most devs work on services in other langs talking to the Rust core), and usually devs wanting to work on the Rust stuff already know Rust to an extent. My current team is getting ~6 or 7 devs without Rust experience next quarter (who have no idea they’ll be working in Rust) so hopefully I’ll have much better data soon! (Absurd bureaucratic policies maybe have good side effects)

                                                                            From my limited experience onboarding people with no Rust experience, I’ve seen devs without type-driven design experience reach a productive level in two to three months (including new-grads), but take a lot longer to get comfortable with using the type system to its full power (maybe a year to get mostly comfortable, but still have gaps). However, they’ve largely been passionate and qualified devs I’ve worked with (even the ones that were new-grads), so I think my opinion is biased here.

                                                                            I’d guess that a developer with strong type system experience can reach near-100% productivity in a few weeks, but I have yet to onboard a dev with this background who doesn’t already know Rust, so this is largely from my own experience learning Rust and talking to others.

                                                                            I really wish I had experience with a hyper growth startup using Rust, and seeing first-hand the failure modes onboarding that many people to Rust. But I agree with Matt’s assessment that it’s the wrong language for that situation given the average software developer’s experience, and I have my doubts about the efficacy of hyper growth startups in the first place.

                                                                  2. 8

                                                                    On a related note, I’m very curious to see what happens with the recent Twitter situation. If Twitter manages to thrive, I think many companies are going to take notice and cut back on developers. The easy pay-day for software engineers could be at an end, and fewer developers will have to handle larger and larger systems. In that case, I’d imagine building things which are robust will outweigh building things quickly. If you have 10% the number of engineers you want to minimize the amount of incident response you are doing (1 out of 100 devs dealing with oncall every week is very different from 1 out of 10 devs dealing with oncall every week in terms of productivity; now the buggy systems have a 10% hit on productivity rather than 1%).

                                                                    I’m both worried (large scale cutbacks to mimic Twitter would not be fun) but also somewhat optimistic that it would lead to more reliable systems overall. Counter-intuitively, I think Rust/Haskell/Ocaml and friends would thrive in that world, as choosing a language for quick onboarding of hundreds of devs is no longer a constraint.

                                                                    1. 19

                                                                      I draw the exact opposite conclusion:

                                                                      Tighter pursestrings mean less resources allocated to engineers screwing around playing with new languages and overengineering solutions to stave off boredom.

                                                                      There will probably be cases where a business truly uses Rust or something else to differentiate itself technically in performance or reliability, but the majority of folks will go with cheaper tooling with easier-to-replace developers.

                                                                      1. 11

                                                                        I agree. People will go where the libraries are. If you have 1/10 the number of people you aren’t going to spend your time reimplementing the AWS SDK. You are going to stick to the beaten path.

                                                                        1. 2

                                                                          I’m sure you meant that as more of a general example than a specific one, but:

                                                                          1. 1

                                                                            Yeah, I meant it generically. More niche languages are missing a lot of libraries. If you have fewer people you probably want to spend less time reinventing the wheel.

                                                                            I know for any one language people will probably come out of the woodwork and say “I don’t run into any issues”, but it’s more about perception in the large.

                                                                        2. 2

                                                                          You make a really good point, and I’ve been mulling on it. My logic was based on the idea that if the software industry suddenly shrank to 10% of its size, the engineers maintaining buggy systems would burn out, while those maintaining robust systems would not. Sort of a forced evolution-by-burnout.

                                                                          But I think you’re right, tighter purse strings means less experimentation, so the tried-and-true would benefit. So who knows! Hopefully it’s not something we will ever learn the answer to :)

                                                                          1. 2

                                                                            The department I run, after over 50% casualty rate this year, has made it a major focus to consolidate, simplify, and emphasize better-documented and smaller systems specifically to handle this. :)

                                                                            I hope it works out, but these are going to be interesting times whatever happens. I just personally wish engineers in tech as a culture hadn’t overplayed their hand.

                                                                        3. 5

                                                                          I can probably set your mind at ease about Twitter (but not the other tech companies having layoffs, nor the new management there who is utterly clueless). Since at least 2009, Twitter’s implicit/unspoken policy was that engineers are cheaper than machines. In other words, it’s more cost-effective to hire a team to optimize the bejeezus out of some service or another, if they can end up cutting server load by 10%. If their policy was based on any real financial data (I have no idea), good dev and devops people will continue to be in high demand, if only to reduce overall machine costs.

                                                                        4. 3

                                                                          Any recommended way to learn about that from your experience (other than being lucky enough to have an experienced Rust programmer to help you out)?

                                                                          Maybe something like

                                                                          1. 4

                                                                            I’m a fan of trial-by-fire, and if you really want to understand type-driven development then learning and using Haskell is what I’d recommend. Rust is a weird language because it seems really novel, but it really only has the ownership model as unique (and even then, Ada + spark had it first). Everything else is just the best bits borrowed from other languages.

                                                                            Haskell’s type system is more powerful, and the docs for libraries heavily lean into the types-as-documentation. I’m not good enough at Haskell to write production software in it, but getting to that “aha!” moment with the language has paid dividends in using Rust effectively.

                                                                            1. 3

                                                                              Rust is a weird language because it seems really novel, but it really only has the ownership model as unique (and even then, Ada + spark had it first)

                                                                              Ada/SPARK did not have Rust’s affine types ownership model.

                                                                            2. 4

                                                                              Just wanted to mention that it is actually for others’ sake. Didn’t want to let that go unnoticed as it is a wonderful platform for learning!

                                                                              1. 2

                                                                                Elm! If you want a beginner-friendly way to epiphany, work through the Elm tutorial. That was my first, visceral, experience of the joy of sum types / zero runtime errors / a compiler that always has my back.

                                                                                Why via Elm? Because it’s a small and simple language that is user-friendly in every fibre of its being. This leaves you with lots of brain cycles to play with sum types.

                                                                                • Friendly error messages that always give you hints on what to do next. (Elm’s error messages were so spectacularly good, and that goodness was so novel, that for a while there was a whole buzz in all sorts of language communities saying “our error messages should be more like Elm’s”. Rust may be the most prominent success.)
                                                                                • You’re building a web page, something you can see and interact with.
                                                                                • Reliable refactoring, if your refactor is incomplete the compiler will tell you.
                                                                              2. 2

                                                                                fn double_or_multiply(x: i32, y: Option<i32>, double: bool)

                                                                                my 2c: I know it’s a contrived example, but even outside of Rust it’s generally (not always) a bad idea (i.e. it’s a code smell) to have a function that does different things based on a boolean.

                                                                                Also, a good linter/code review should help with the kind of issue you’re pointing to.

                                                                                1. 2

                                                                                  In hopes it’s instructive, your code samples are an instance of parse don’t validate where you push all your error checking logic to one place in the code.
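
                                                                                  To make that concrete, a minimal sketch of the parse step for the example above: the loosely typed arguments are checked once at the boundary and turned into an `OpKind`, after which nothing downstream can fail on them. `parse_op` is a hypothetical helper name, and the `OpKind` definition is assumed to match the one in the parent comment.

                                                                                  ```rust
                                                                                  #[derive(Debug, PartialEq)]
                                                                                  enum OpKind {
                                                                                      Double(i32),
                                                                                      Multiply(i32, i32),
                                                                                  }

                                                                                  /// Hypothetical boundary function: all validation lives here, once.
                                                                                  fn parse_op(x: i32, y: Option<i32>, double: bool) -> Result<OpKind, &'static str> {
                                                                                      match (double, y) {
                                                                                          (true, None) => Ok(OpKind::Double(x)),
                                                                                          (true, Some(_)) => Err("Y should not be set"),
                                                                                          (false, Some(y)) => Ok(OpKind::Multiply(x, y)),
                                                                                          (false, None) => Err("Y should be set"),
                                                                                      }
                                                                                  }

                                                                                  /// Past the boundary, every input is valid by construction.
                                                                                  fn double_or_multiply(input: OpKind) -> i32 {
                                                                                      match input {
                                                                                          OpKind::Double(x) => x * 2,
                                                                                          OpKind::Multiply(x, y) => x * y,
                                                                                      }
                                                                                  }
                                                                                  ```

                                                                                  The error strings from the first version survive, but they now live in exactly one function instead of being re-checked wherever the values are used.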

                                                                                  1. 3

                                                                                    Yes it is :) I’m a huge fan of that article, though I’ve found it can sometimes be difficult for someone who isn’t familiar with strong types already. Thank you for sharing the link, I think it’s a great resource for anyone interested in reading more!

                                                                                  2. 1

                                                                                    I’m new to Rust, could you provide an example of calling your second function? I’ve only just passed the enum chapter of the book and that is the exact chapter that made me excited about working with Rust.

                                                                                    1. 7

                                                                                      Of course! You would call it like so:

                                                                                      double_or_multiply(OpKind::Multiply(33, 22));

                                                                                      It’s good to hear your excitement about enums in Rust, as I think they are an under-appreciated aspect of the language. Combining structs + enums is super powerful for removing invalid inputs, especially when nesting them into each other. The way I think about designing any API is: how can I structure the input the user supplies such that they can’t pass in something incorrect?

                                                                                      I wish I could find a source for how to design APIs, as there is some place out there which lists the different levels of quality of an API:

                                                                                      • low: the obvious way to use the API is incorrect, hard to use correctly
                                                                                      • medium: can be used incorrectly or correctly
                                                                                      • high: the obvious way to use the API is correct, hard to use incorrectly
                                                                                      • best: no way to incorrectly use the API, easy to use correctly
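
                                                                                      For readers following along, here is a minimal sketch of what the enum-based version might look like. The OpKind definition and the function body are assumptions reconstructed from the call shown above, not the original sample; the point is only that a caller cannot build a nonsensical combination, such as a “double” request that carries a second operand.

```rust
/// Hypothetical reconstruction of the enum-based input type discussed above.
/// Each variant carries exactly the operands that operation needs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpKind {
    /// Double a single value.
    Double(i32),
    /// Multiply two values.
    Multiply(i32, i32),
}

/// Every `OpKind` value is a meaningful request, so there is no
/// "flag says double but a second operand was supplied" case to validate.
/// Note: plain `*` can still panic on overflow when overflow checks are on.
fn double_or_multiply(op: OpKind) -> i32 {
    match op {
        OpKind::Double(x) => x * 2,
        OpKind::Multiply(x, y) => x * y,
    }
}
```

                                                                                      With this shape, double_or_multiply(OpKind::Double(2)) gives 4 and double_or_multiply(OpKind::Multiply(3, 7)) gives 21; there is no boolean flag or Option that can get out of sync with the operands.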
                                                                                      1. 6

                                                                                        You may be thinking of Rusty Russell’s API design levels.

                                                                                        1. 1

                                                                                          Yes! That was exactly what I was looking for, thank you!

                                                                                      2. 3
                                                                                        double_or_multiply(Double(2)) // = 2*x = 4
                                                                                        // Or
                                                                                        double_or_multiply(Multiply(3,7)) // = 3 * 7 = 21
                                                                                      3. 1

                                                                                        Can you explain how it could panic?

                                                                                        1. 2

                                                                                          Multiplication overflow, which actually would only happen in debug mode (or release with overflow checks enabled). So in practice it likely couldn’t panic (usually nobody turns on overflow checks) (see below)

                                                                                          1. 6

                                                                                            It’s not that uncommon. Overflow checks are generally off because of perceived bad performance, but companies interested in correctness favor a crash over wrapping. Example: Google Android…


                                                                                            Overflow checking is on by default in Android for Rust, which requires overflow operations to be explicit.

                                                                                            1. 2

                                                                                              I stand corrected! I’m curious what the performance impact is, especially in hot loops. Though I imagine LLVM trickery eliminates a lot of overflow checks even with them enabled.

                                                                                              1. 2

                                                                                                I remember numbers flying around on Twitter; most of what I hear is that the impact is in negligible ranges. In particular, if it becomes a problem, there’s an API for actually doing wrapping ops.

                                                                                                Sadly, as so often, I can’t find a structured document that outlines this, even after a bit of searching. Sorry, I’d love to have more.
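
                                                                                                As a small illustration of the explicit operations mentioned here, the following sketch uses the standard library’s checked_mul, wrapping_mul, and saturating_mul on i32. The wrapper function names are hypothetical and exist only to label each behavior; none of them panics, regardless of whether overflow checks are enabled.

```rust
/// Returns `None` instead of panicking or wrapping when the product overflows.
fn multiply_checked(x: i32, y: i32) -> Option<i32> {
    x.checked_mul(y)
}

/// Always wraps around on overflow (two's complement), in every build profile.
fn multiply_wrapping(x: i32, y: i32) -> i32 {
    x.wrapping_mul(y)
}

/// Clamps the result to `i32::MIN..=i32::MAX` on overflow.
fn multiply_saturating(x: i32, y: i32) -> i32 {
    x.saturating_mul(y)
}
```

                                                                                                For example, multiplying i32::MAX by 2 yields None from the checked version, -2 from the wrapping version, and i32::MAX from the saturating version.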

                                                                                            2. 1

                                                                                              So, it’s specific for this example, if the enum was over structs with different types and the function did something else, it wouldn’t necessarily panic, right?

                                                                                              Is there a way to make this design panic-proof?

                                                                                              1. 5

                                                                                                Yes, the panicking is specific to the example. And you can make it panic-proof if none of the function calls within can panic. IIRC it’s still an open design problem how to mark functions as “no panic” in Rust so the compiler checks it [1][2]. There are some libraries that do some amount of panic-proofing at compile time [3], but I haven’t used them. I thought there was a larger RFC ticket for the no-panic attribute but I can’t find it right now.

                                                                                                [1] [2] [3]
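
                                                                                                One way to sketch a panic-free variant of the example, assuming a hypothetical OpKind enum like the one called earlier in the thread, is to use checked arithmetic and report overflow in the return type. This does not give a compiler-enforced “no panic” guarantee; it only removes the one panic source discussed here.

```rust
/// Hypothetical input type; an assumption based on the calls shown in the thread.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpKind {
    Double(i32),
    Multiply(i32, i32),
}

/// Returns `None` on overflow instead of panicking, so the caller
/// has to decide what an overflowing request means.
fn double_or_multiply_checked(op: OpKind) -> Option<i32> {
    match op {
        OpKind::Double(x) => x.checked_mul(2),
        OpKind::Multiply(x, y) => x.checked_mul(y),
    }
}
```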

                                                                                        1. 24

                                                                                          Hm, from the comments here, it seems that people generally think of this as a dichotomy:

                                                                                          • Either you provide a single command line the user just blindly copy-pastes into the terminal
                                                                                          • Or you teach the user how the thing actually works, so that the required spell becomes obvious to them

                                                                                          I advocate for neither of those things. Rather, I want two command lines I can blindly copy-paste into the terminal, where the first one prints yay! or nay :( depending on whether the second command is needed. The workflow I want:

                                                                                          • paste the first command, see nay :(
                                                                                          • paste the second command
                                                                                          • paste the first command again, see yay!
                                                                                          1. 5

                                                                                            So your point is that you want some type of verification step included with a “fix” that is posted?

                                                                                            I’m not sure how useful that would be during the “troubleshooting” step, since you may get false negatives due to some complex chain of dependencies/situations and end up applying the wrong “fixes” for the actual problem. For example, there are an endless number of reasons why you might lose connectivity to the internet, and a check for “is DNS configured” might fail because packets aren’t leaving your NIC, you might have the wrong subnet configured, there may not be any configured route to the desired DNS server, your firewall might be blocking it, and so on. The “fix” for resolving any of those is likely to be very different. So if someone who doesn’t have the ability to troubleshoot the problem much deeper than running some “yay or nay” script to confirm a problem (that may not really be the underlying cause of the problem they are having…) applies a fix for the wrong problem, the “yay or nay” check may still not pass afterwards, and they now have an additional problem on their hands (reverting the previous fix successfully).

                                                                                            Anyways, I think you still need at least two brain cells to rub together in order to troubleshoot any system, Linux or otherwise. While it would be nice to have a test case accompany every single fix or workaround, it’s not very plausible. A human is still necessary to provide a reasonable amount of confidence that you’re about to try to fix the thing that might actually be your problem. Scripting that out for all possible situations, and having fellow humans select the right “yay or nay” to run, would be hard.

                                                                                            A verification like that is useful after you’ve determined the root cause of the problem you are facing and have applied a fix, but if you’re capable of doing that then it’s almost always trivial to come up with your own “test” for that. E.g. in the DNS example, if the root cause was systemd-resolved had crashed because of some malformed configuration file, you can make sure the service is running after you fix the broken config, and use dig or nslookup to test that name resolution is working again.

                                                                                            1. 2

                                                                                              Rather, I want two command lines I can blindly copy-paste into the terminal, where the first one prints yay! or nay :( depending on whether the second command is needed.

                                                                                              I think that’s partly because what you want can only work in a rather narrow set of situations. Like others, I think I just extrapolated from that.

                                                                                              The example in your post is pretty straightforward, as in, it’s pretty easy to check if a device with that particular dev id was detected, and that the associated module either hasn’t been probed, or that the kernel command line doesn’t have the right incantation to make the module play ball with that device. And even in this case, I think you can’t come up with a comprehensive enough command that prints yay! or nay. For example, there are other mechanisms to force a device ID match (I don’t know if this is specific to the i915 driver though – maybe they’re not relevant?) besides the kernel command line.

                                                                                              But there are plenty of cases where that’s just not gonna work, not even remotely. If this weren’t Intel but an ARM board without a “real” device probing interface, where it’s all done via a user-supplied device tree, there are thousands of reasons why a device wouldn’t take off. Like, I think there are tens of thousands of dollars billed every day around the world for nothing other than figuring out why that happens.

                                                                                              “Teach the user how the thing actually works so that the spell is obvious” is obviously extreme. But “provide enough context information so that the user can understand what the problem might be, how to check if they have it, and how the spell would fix it” is doable, and while the first two parts might be easy to automate at times, in my experience, that’s rarely the case.

                                                                                              1. 1

                                                                                                Oops! I’m definitely to be partially blamed here. I guess we read what we want to read 🙂

                                                                                                I see what you’re saying now, and I think it makes sense. My apologies for derailing the conversation.

                                                                                                1. 2

                                                                                                  Yeah! To be fair, what you are saying also makes sense: as the last paragraph of TFA says, after all these years I still have zero skills in debugging these sorts of issues from first principles, and that is potentially also a problem, just a different one!

                                                                                                  1. 2

                                                                                                    I wonder if there is a larger point to be made about the spectrum of learning.

                                                                                                    • Do X if you are seeing Y.
                                                                                                    • Do X if you are seeing Y and it is due to Z. You can verify you have Z by checking A.
                                                                                                    • Do X if you are seeing Y and it is due to Z. You can verify you have Z by checking A. The reason this is the case blah blah blah.

                                                                                                    Essentially, applying a fix blindly is the lowest form of learning. Applying a fix after verifying the exact problem is slightly better. But understanding the problem and why it occurs before applying the fix is the best. But of course it is the inverse for the amount of effort required on behalf of the answer provider. The second bit provides more learning, and a better debugging experience, at minimal effort to the answer provider, which is a good argument for trying to apply that everywhere.

                                                                                                    It’s also interesting that it almost seems exponential in effort for the answer provider.

                                                                                              1. 18

                                                                                                [Edit] I realize I read what I wanted from the article rather than what was actually written — I’m leaving this here but recognize that it’s not quite what the article is about.

                                                                                                I really like this, and I think this advice extends beyond just Linux troubleshooting. It’s really advice on how to teach people and how people learn. Answers are 20% of the learning process, 80% is understanding how to get to the answer, and it’s so critical for developing skills. I could rant about the US education system teaching-to-the-test which is focusing on that 20% and how terrible it is.

                                                                                                One of my roles at my current job is helping people learn Rust, and when someone comes to me with a confusing type error I always make an effort to explain how to read the error message, why the error message occurs, and the various ways to fix it. If I instead just provided the fix, there would be no learning, no growth and no development of self-sufficiency. It takes longer, and sometimes people just want an answer, but I stay firm on explaining what is going on (partially because I don’t want to be fixing everyone’s basic type errors). I wonder if part of the issue with Linux troubleshooting advice is that it doesn’t have that same feedback mechanism — someone not learning doesn’t affect the author of the advice in any way so there is no real push for building self-sufficiency.

                                                                                                Anyway, I think this post was really short and to the point, and I completely agree with the message, but I also think it’s interesting just how much it extends beyond Linux diagnostics and into learning everywhere.

                                                                                                1. 3

                                                                                                  I agree, it does work as “How to write good troubleshooting advice” in general (which IMHO would be a better title anyway).

                                                                                                  1. 1

                                                                                                    Dave Jones (EEVBlog) does an excellent job of this in the world of electronics, a playlist of his troubleshooting videos and his approach:

                                                                                                  1. 83

                                                                                                    I feel like this entire post reinforces just how difficult Python dependency management is. I’m a huge fan of Python, despite all of its flaws, but the dependency management is horrible compared to other languages. Nobody should have to learn the intricacies of their build tools in order to build a system, nor should we have to memorize a ton of flags just for the tools to work right. And this isn’t even going into the issue where building a Python package just doesn’t work, even if you follow the directions in a README, simply because of how much is going on. It is incredibly hard to debug, and that is for just getting started on a project (and who knows what subtle versioning mistakes exist once it does build).

                                                                                                    I think Cargo/Rust really showed just how simple dependency management can be. There are no special flags, it just works, and there are two tools (Cargo and rustup), each with one or two commands you have to remember. I have yet to find a Rust project I can’t build first try with cargo build. Poetry is definitely going down the right path, but until Python gets to that point, its reputation for terrible dependency management is well deserved.

                                                                                                    1. 20

                                                                                                      Completely agree. I’ve been writing Python for 15 years professionally, and it’s a form of psychological abuse to keep telling people that their problems are imaginary and solved by switching to yet another new dependency manager, which merely has a different set of hidden pitfalls (that one only uncovers after spending considerable time and energy exploring).

                                                                                                      Every colleague I’ve worked with in the Python space is skeptical of anyone who promises some tool or technology can make life better, because they’ve been so jaded by this kind of thing in Python (not just dependency management, but false promises about how “just rewrite the slow bits in C/Numpy/multiprocessing/etc” will improve performance and other such things). They often really can’t believe that other languages (e.g., Go, Rust, etc) don’t have their own considerable pitfalls. Programmers who work exclusively in Python kind of seem to have trust issues, and understandably so.

                                                                                                      1. 13

                                                                                                        The problem is that no matter how good Poetry gets, it still has to deal with deficiencies that exist in the ecosystem. For example, lockfiles are great, but they don’t help you if the packages themselves specify poor/incorrect package version bounds when you come to refresh your lockfiles (and this is something I’ve been bitten by personally).

                                                                                                        1. 11

                                                                                                          That’s not a python-specific issue though. It’s not even python-like issue. You’ll have the same problem with autoconf / go.mod / cargo / any other system where people have to define version bounds.

                                                                                                          1. 21

                                                                                                            If I create a go.mod in my repo and you clone that repo and run “go build”, you will use the exact same dependencies I used, and you cannot bypass that. I cannot forget to add dependencies, I cannot forget to lock them, and you cannot accidentally pick up dependencies that are already present on your system.

                                                                                                            1. 15

                                                                                                              Keep in mind that Go and Rust get to basically ignore the difficulty here by being static-linking-only. So they can download an isolated set of dependencies at compile time, and then never need them again. Python’s import statement is effectively dynamic linking, and thus requires the dependencies to exist and be resolvable at runtime. And because it’s a Unix-y language from the 90s, it historically defaulted to a single system-wide shared location for that, which opens the way for installation of one project’s dependencies to conflict with installation of another’s.

                                                                                                              Python’s venv is an attempt to emulate the isolation that statically-linked languages get for free.

                                                                                                              1. 4

                                                                                                                I described the situation for Go during build time, not during runtime.

                                                                                                                1. 3

                                                                                                                  And my point is that a lot of the things people complain about are not build-time issues, and that Go gets to sidestep them by being statically linked and not having to continue to resolve dependencies at runtime.

                                                                                                                  1. 2

                                                                                                                    I don’t get the importance of distinguishing when linking happens. Are there things possible at build time that are not possible at runtime?

                                                                                                                    1. 7

                                                                                                                      Isolation at build time is extremely easy – it can be as simple as just downloading everything into a subdirectory of wherever a project’s build is running. And then you can throw all that stuff away as soon as the build is done, and never have to worry about it again.

                                                                                                                      Isolation at runtime is far from trivial. Do you give each project its own permanent isolated location to put copies of its runtime dependencies? Do you try to create a shared location which will be accessed by multiple projects (and thus may break if their dependencies conflict with each other)?

                                                                                                                      So with runtime dynamic linking you could, to take one of your original examples, “accidentally pick up” things that were already on the system, if the system uses a shared location for the runtime dynamically-linked dependencies. This is not somehow a unique-to-Python problem – it’s the exact same problem as “DLL hell”, “JAR hell”, etc.

                                                                                                                      1. 4

                                                                                                                        Isolation at runtime is far from trivial. Do you give each project its own permanent isolated location to put copies of its runtime dependencies? Do you try to create a shared location which will be accessed by multiple projects (and thus may break if their dependencies conflict with each other)?

                                                                                                                        But the same issues exist with managing the source of dependencies during build time.

                                                                                                                        1. 4

                                                                                                                          Yeah, I’m not seeing anything different here. The problem is hard, but foisting it on users is worse.

                                                                                                                          The project-specific sandbox vs disk space usage recurs in compiled langs, and is endemic to any dependency management system that does not make strong guarantees about versioning.

                                                                                                                          1. 3

                                                                                                                            No, because at build time you only are dealing with one project’s dependencies. You can download them into an isolated directory, use them for the build, then delete them, and you’re good.

                                                                                                                            At runtime you may have dozens of different projects each wanting to dynamically load their own set of dependencies, and there may not be a single solvable set of dependencies that can satisfy all of them simultaneously.

                                                                                                                            1. 1

                                                                                                                              You can put them into an isolated directory at runtime, that’s literally what virtualenv, Bundler’s deployment mode or NPM do.

                                                                                                                              And at build time you don’t have to keep them in an isolated directory, that’s what Bundler’s standard mode and Go modules do. There’s just some lookup logic that loads the right things from the shared directories.

                                                                                                                              1. 2

                                                                                                                                The point is that any runtime dynamic linking system has to think about this stuff in ways that compile-time static linking can just ignore by downloading into a local subdirectory.

                                                                                                                                Isolated runtime directories like a Python venv or a node_modules also don’t come for free – they proliferate multiple copies of dependencies throughout different locations on the filesystem, and make things like upgrades (especially for security issues) more difficult, since now you have to track down every single copy of the outdated library.

                                                                                                              2. 8

                                                                                                                It might be possible to have this issue in other languages and ecosystems, but most of them avoid it because their communities have developed good conventions and best practices around both package versioning (and the contracts around versioning) and dependency version bound specification, whereas a lot of the Python packages predate there being much community consensus in this area. In practice I see very little of it comparatively in, say, npm and Cargo. Though obviously this is just anecdotal.

                                                                                                                1. 1

                                                                                                                  Pretty sure it’s not possible to have this issue in either of your two examples; npm because all dependencies have their transitive dependencies isolated from other dependencies’ transitive dependencies, and it just creates a whole tree of dependencies in the filesystem (which comes with its own problems), and Cargo because, as @mxey pointed out (after your comment), dependencies are statically linked into their dependents, which are statically linked into their dependents, all the way up.

                                                                                                                  This has been a big problem in the Haskell ecosystem (known as Cabal hell), although it’s been heavily attacked with Stack (a package set that is known to all work together) and the cabal v2-* commands (which build all the dependencies for a given project in an isolated directory), but I don’t think that solves it completely transitively.

                                                                                                                  1. 2

                                                                                                                    @mxey pointed out (after your comment), dependencies are statically linked into their dependents, which are statically linked into their dependents, all the way up.

                                                                                                                    That’s not true for Go. Everything that is part of the same build has its requirements combined, across modules. See for the process. In summary: if two modules are part of the same build and they require the same dependency, then the higher of the two specified versions will be used (different major versions are handled as different modules). My point was only that it’s completely reproducible irrespective of the system state or the state of the world outside the go.mod files.
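
                                                                                                                    To make the summarized rule concrete, here is a deliberately simplified sketch. It is not Go’s actual algorithm or code (real resolution walks the module graph recorded in the go.mod files); the function name select_versions and the tuple representation of versions are assumptions for illustration only. It keeps, for each dependency and major version, the highest version that any module in the build asked for.

```rust
use std::collections::BTreeMap;

/// A semantic version as (major, minor, patch); tuples compare lexicographically.
type Version = (u32, u32, u32);

/// Given every (dependency name, required version) pair seen in one build,
/// keep one version per (name, major version): the highest one required.
/// Different major versions are tracked as if they were different modules.
fn select_versions(requirements: &[(&str, Version)]) -> BTreeMap<(String, u32), Version> {
    let mut selected: BTreeMap<(String, u32), Version> = BTreeMap::new();
    for &(name, version) in requirements {
        let key = (name.to_string(), version.0);
        selected
            .entry(key)
            .and_modify(|current| {
                // Replace the recorded version only if this requirement is higher.
                if version > *current {
                    *current = version;
                }
            })
            .or_insert(version);
    }
    selected
}
```

                                                                                                                    The result depends only on the listed requirements, which is the reproducibility property being described: nothing installed on the machine can change the outcome.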

                                                                                                                    1. 1

                                                                                                                      Ah, I misunderstood your comment and misinterpreted @ubernostrum’s response to your comment. Thanks for clarifying. Apologies for my lack of clarity and misleading wording.

                                                                                                                    2. 1

                                                                                                                      To be clear, I’m not talking about transitive dependencies being shared inappropriately, but the much simpler and higher level problem of just having inappropriate dependency versioning, which causes the packages to pick up versions with breaking API changes.

                                                                                                                      1. 1

                                                                                                                        Ah, I reread your original comment:

                                                                                                                        For example, having lockfiles are great, but they don’t help you if the packages themselves specify poor/incorrect package version bounds when you come to refresh your lockfiles (and this is something I’ve been bitten by personally).

                                                                                                                        Are you talking about transitive dependencies being upgraded with a major version despite the parent dependency only being upgraded by a minor or patch version because of the parent dependency being too loose in their version constraints? Are you saying this is a much more endemic problem in the Python community?

                                                                                                                        1. 2

                                                                                                                          Well, it fits into one of two problem areas:

                                                                                                                          • As you say, incorrect version specification in dependencies allowing major version upgrades when not appropriate - this is something I rarely if ever see outside Python.

                                                                                                                          • A failure of common understanding of the contracts around versioning, either by a maintainer who doesn’t make semver-like guarantees but downstream consumers who assume they do, or the accidental release of breaking changes when not intended. This happens everywhere but I (anecdotally) encounter it more often with Python packages.

                                                                                                                      2. 1

                                                                                                                        npm because all dependencies have their transitive dependencies isolated from other dependencies’ transitive dependencies

                                                                                                                        npm has had dedupe and yarn has had --flat for years now.

                                                                                                                        Go handles it by enforcing that you can have multiples of major versions but not minor or patch (so having both dep v1.2.3 and v2.3.4 is okay, but you can’t have both v1.2.3 and v1.4.5).

                                                                                                                        1. 1

                                                                                                                          npm has had dedupe and yarn has had --flat for years now.

                                                                                                                          I was unaware of that, but is it required or optional? If it’s optional, then by default, you wouldn’t have this problem of sharing possibly conflicting (for any reason) dependencies, right? What were the reasons for adding this?

                                                                                                                2. 11

                                                                                                                  I have mixed feelings about Poetry. I started using it when I didn’t know any better and it seemed like the way to go, but as time goes on it’s becoming evident that it’s probably not even necessary for my use case and I’m better served with a vanilla pip workflow. I’m especially bothered by the slow update and install times, how it doesn’t always do what I expected (just update a single package), and how it seems to be so very over-engineered. Anthony Sottile of anthonywritescode (great channel, check it out) has made a video highlighting why he will never use Poetry that’s also worth a watch.

                                                                                                                  1. 5

                                                                                                                    If you have an article that summarizes the Poetry flaws I’d appreciate it (I’m not a big video person). I’ll defer to your opinion here since I’m not as active in Python development as I was a few years ago, so I haven’t worked with a lot of the newer tooling extensively.

                                                                                                                    But I think that further complicates the whole Python dependency management story if Poetry is heavily flawed. I do remember using it a few years back and it was weirdly tricky to get working, but I had hoped those issues were fixed. Disappointing to hear Poetry is not holding up to expectations, though I will say proper dependency management is a gritty hard problem, especially retrofitting it into an ecosystem that has not had it before.

                                                                                                                    1. 15

                                                                                                                      Sure, here’s what he laid out in the video from his point of view:

                                                                                                                      • he ran into 3 bugs in the first 5 minutes when using it for the first time back in 2020, which didn’t bode well
                                                                                                                      • it pulls in quite a few dependencies (45 at the time of writing this, which includes transitive dependencies)
                                                                                                                        • create virtual environment
                                                                                                                        • pip install poetry
                                                                                                                        • pip freeze --all | wc -l
                                                                                                                      • by default it adds dependencies to your project with version constraints that automatically allow updates up to the next major or minor version, depending on the initial version
                                                                                                                        • for example python = "^3.8", which is equivalent to >= 3.8, <4
                                                                                                                        • this causes conflicts with dependencies of libraries that are often updated and with those that aren’t
                                                                                                                          • he mentions requests specifically
                                                                                                                      • pip already has a dependency resolver and a way to freeze requirements and their very specific versions
                                                                                                                        • i.e. use == and not use caret or tilde versioning
                                                                                                                        • he also shouts out ‘pip-tools’ here, which I haven’t used myself for the sake of keeping things simple
                                                                                                                      • the maintainers of Poetry have done something weird with how they wanted to deprecate an installer, which has eroded trust (for him)
                                                                                                                        • they essentially introduced a 5% chance that any CI job using their old way of installing Poetry would fail, to get people to move away from that script; if you weren’t in CI, the script would just fail
                                                                                                                        • this is terrible because it introduces unnecessary flakiness in CI systems and does not give people time to actually migrate away in their own time, but rather forces it upon them
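For anyone unsure what the caret constraint in the list above actually permits, here is a small Python sketch of how a caret spec expands into a range. It follows the caret rules as I understand them from Poetry's documentation (bump the leftmost non-zero component). The `caret_bounds` and `allows` helpers are made up for illustration and are not part of Poetry or pip.

```python
from typing import List, Tuple


def _parse(version: str) -> List[int]:
    return [int(part) for part in version.split(".")]


def _pad(parts: List[int], width: int = 3) -> Tuple[int, ...]:
    """Pad with zeros so '3.8' and '3.8.0' compare as equal."""
    return tuple(parts) + (0,) * (width - len(parts))


def caret_bounds(spec: str) -> Tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for a caret spec."""
    parts = _parse(spec.lstrip("^"))
    # Bump the leftmost non-zero component; if all are zero, bump the last one given.
    bump = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[:bump] + [parts[bump] + 1]
    return ".".join(map(str, parts)), ".".join(map(str, upper))


def allows(spec: str, version: str) -> bool:
    lower, upper = caret_bounds(spec)
    candidate = _pad(_parse(version))
    return _pad(_parse(lower)) <= candidate < _pad(_parse(upper))


print(caret_bounds("^3.8"))      # ('3.8', '4')
print(allows("^3.8", "3.12.1"))  # True
print(allows("^3.8", "4.0.0"))   # False
```

The pinned alternative from the list is simply an exact match with ==, which leaves no range to resolve at all.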
                                                                                                                      1. 6

                                                                                                                        I have used pip-tools and it is my favorite way of doing dependency management in Python, but it’s also part of the problem because I have a solution for me, so it doesn’t matter that the core tools are user hostile. The Python core team should really be taking ownership of this problem instead of letting it dissolve into a million different little solutions.

                                                                                                                        1. 5

                                                                                                                          the maintainers of Poetry have done something weird with how they wanted to deprecate an installer, which has eroded trust (for him)

                                                                                                                          I don’t wish to ascribe malice to people, but it comes off as contemptuous of users.

                                                                                                                          Infrastructure should be as invisible as possible. Poetry deprecating something is Poetry’s problem. Pushing it on all users presumes that they care, can act on it, and have time/money/energy to deal with it. Ridiculous.

                                                                                                                          1. 1

                                                                                                                            Absolutely, very unprofessional. Is the tool deprecated? Just drop the damn tool, don’t bring down my CI! You don’t want future versions? Don’t release any!

                                                                                                                      2. 1

                                                                                                                        I wanted to just settle on Poetry. I was willing to overlook so many flaws.

                                                                                                                        I have simply never gotten it to work on Windows. Oh well.

                                                                                                                      3. 5

                                                                                                                        Poetry is here though and is ready to use. There are good reasons to not make things included and frozen in upstream distribution. For example rubygems is separate from ruby. Cargo is separate from the rust compiler. The Python project itself doesn’t have to do anything here. It would be nice if they said: this is the blessed solution, but it doesn’t stop anyone now.

                                                                                                                        1. 9

                                                                                                                          Another commenter posted about the issues with Poetry, which I take as it not being quite ready to use everywhere. I think not having a blessed solution is a big mistake, and one that the JS ecosystem is also making (it’s now npm, yarn, and some other thing) — it complicates things for no discernible reason to the end user.

                                                                                                                          While Cargo and rubygems may be separate from the compiler/interpreter, they are also closely linked and developed in sync (at least I know this is the case for Cargo). One of the best decisions the Rust team made was realizing that a language was its ecosystem, and investing heavily in the tooling that was best in class. Without a blessed solution from the Python team I feel as though the dependency management situation will continue as-is.

                                                                                                                          1. 4

                                                                                                                            There was a time in the beforefore, when we didn’t have bundler, and ruby dependency management was kind of brutal as well. I guess there is still hope for python if they decide to adopt something as a first-class citizen and take on these problems with an “official” answer.

                                                                                                                        2. 2

                                                                                                                          I tried to add advice about dependency and packaging tooling to my code style guide for Python. My best attempt exploded the size of the style guide by 2x the word count, so I abandoned the effort. I recently wrote about this here:


                                                                                                                          I’d really like to understand Rust and Cargo a little better, but I’m not a Rust programmer at the moment. Any recommendations to read about the cargo and crate architecture?

                                                                                                                        1. 1

                                                                                                                          This seems to be a modern alternative to TLA+ (unless I’m misunderstanding something). Is anyone familiar with the pros/cons between the two? I was hoping to see a comparison page on the site but didn’t see one.

                                                                                                                          1. 2

                                                                                                                            I don’t know enough about the two technologies yet to comment. Although DS is an area I’m passionate about, I am still learning about the FM approaches to these sorts of problems.

                                                                                                                            1. 1

                                                                                                                              TLA+ checks the entire state space of an algorithm. P explores the state space of a program guided by some random strategy. TLA+ can be used to prove an algorithm is correct, while P is just a test tool. Recently I’ve been working to verify the design of a storage system at work. I tried to specify a simple version of the design in PlusCal, which transpiles to TLA+, but was blocked by the state explosion problem (the TLA+ checker ran out of memory and was too slow). Then I switched to P. P allows me to put more details of the system into the specification. The state space is even larger than the PlusCal version, but P does not exhaust all the states during a test. Usually I run a few hundred thousand test iterations to ensure there is no big issue in the design. P also looks much more familiar to an average programmer.
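A toy Python sketch of the difference, for anyone who hasn't used either tool. This is not TLA+ or P code. It is a made-up model (two processes doing a non-atomic increment on a shared counter) checked two ways: exhaustively, and with random schedules. All names are invented for the example.

```python
import random
from collections import deque

# State: (program counters, per-process temporaries, shared counter).
INIT = ((0, 0), (0, 0), 0)


def successors(state):
    """Yield every state reachable by letting one process take one step."""
    pcs, tmps, counter = state
    for i in (0, 1):
        if pcs[i] == 0:    # step 1: read the shared counter into a local
            yield (pcs[:i] + (1,) + pcs[i + 1:],
                   tmps[:i] + (counter,) + tmps[i + 1:],
                   counter)
        elif pcs[i] == 1:  # step 2: write local + 1 back
            yield (pcs[:i] + (2,) + pcs[i + 1:], tmps, tmps[i] + 1)


def invariant(state):
    """Once both processes are done, the counter should be 2."""
    pcs, _, counter = state
    return pcs != (2, 2) or counter == 2


def check_exhaustive():
    """Visit every reachable state once (the model-checker approach)."""
    seen, queue, violations = {INIT}, deque([INIT]), []
    while queue:
        state = queue.popleft()
        if not invariant(state):
            violations.append(state)
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen), violations


def check_random(iterations, seed=0):
    """Run random schedules to completion (the randomized-testing approach)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(iterations):
        state = INIT
        while True:
            options = list(successors(state))
            if not options:
                break
            state = rng.choice(options)
        if not invariant(state):
            failures += 1
    return failures


states, violations = check_exhaustive()
print(states, violations)
print(check_random(1000))
```

The exhaustive check is guaranteed to find the lost update but has to hold every state in memory. The random version uses constant memory per run and scales to much bigger models, at the cost of only being as good as the schedules it happens to sample.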

                                                                                                                            1. 4

                                                                                                                              Having just read the recent SQLite paper about past, present and future, as well as the SQLite/HE paper, I read this post with my mind primed. It seems like modern Databases are reaching a point of efficiency where the only way to improve is by having specific functionality and algorithms for specific data access patterns.

                                                                                                                              For instance, SQLite/HE is a separate SQLite query execution engine and data store specifically for queries it identifies as OLAP queries, which brings impressive performance gains to that specific subset of potential queries. And with roaring bitmaps, which I haven’t checked, but I imagine SQLite is using as well, it’s the same concept. Specific containers and join-algorithms depending on the data present, for impressive space efficiency.
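As a rough illustration of "specific containers depending on the data present", here is a simplified Python sketch of the roaring bitmap idea: values are split into 16-bit chunks, and each chunk is stored either as a sorted array (sparse) or as a fixed 8 KiB bitset (dense). The 4096 cut-off matches the roaring format as I understand it. Run containers are left out, the helper names are made up, and none of this says anything about what SQLite does internally.

```python
import bisect

ARRAY_LIMIT = 4096  # above this many values, a 65536-bit bitset is smaller


def build_containers(values):
    """Group 32-bit integers by their high 16 bits and pick a container per chunk."""
    chunks = {}
    for value in values:
        chunks.setdefault(value >> 16, set()).add(value & 0xFFFF)

    containers = {}
    for high, lows in chunks.items():
        if len(lows) <= ARRAY_LIMIT:
            # Sparse chunk: a sorted list of 16-bit values.
            containers[high] = ("array", sorted(lows))
        else:
            # Dense chunk: one bit per possible low value (65536 bits = 8192 bytes).
            bits = bytearray(8192)
            for low in lows:
                bits[low >> 3] |= 1 << (low & 7)
            containers[high] = ("bitmap", bytes(bits))
    return containers


def contains(containers, value):
    entry = containers.get(value >> 16)
    if entry is None:
        return False
    kind, data = entry
    low = value & 0xFFFF
    if kind == "array":
        i = bisect.bisect_left(data, low)
        return i < len(data) and data[i] == low
    return bool(data[low >> 3] & (1 << (low & 7)))


# Ten values in the first chunk (sparse), five thousand in the second (dense).
sample = list(range(10)) + list(range(65536, 65536 + 5000))
built = build_containers(sample)
print({high: kind for high, (kind, _) in built.items()})
```

Lookups go through the same `contains` call either way, so the caller never needs to know which representation was picked for a given chunk.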

                                                                                                                              I don’t really have much of a point to any of this, other than to be consistently amazed by all of the technology behind databases. There is so much at work to make my poor, inefficient SQL statements return quickly :)

                                                                                                                              1. 2

                                                                                                                                I think that’s been one reason for the success of SQL: as a declarative language, it leaves the execution strategy up to the implementation. That allows for a wide variety of strategies focused on different scales and domains, which can all be hidden away from the SQL programmer.
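A quick way to see this with Python's built-in sqlite3 module: the same query text gets a different plan once an index exists, and the results don't change. The table and index names are made up for the example, and the exact plan wording depends on the SQLite version, so treat the printed plans as illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users (name, age) VALUES (?, ?)",
    [("ada", 30), ("bob", 25), ("cy", 30), ("dee", 41)],
)

QUERY = "SELECT name FROM users WHERE age = 30 ORDER BY name"


def plan(connection, query):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in connection.execute("EXPLAIN QUERY PLAN " + query)]


before_rows = conn.execute(QUERY).fetchall()
before_plan = plan(conn, QUERY)

# The query text stays the same; only the physical design changes.
conn.execute("CREATE INDEX idx_users_age ON users(age)")

after_rows = conn.execute(QUERY).fetchall()
after_plan = plan(conn, QUERY)

print(before_plan)
print(after_plan)
print(before_rows == after_rows)
```

The SQL never mentions the index, which is exactly the separation being described: the engine is free to swap strategies underneath an unchanged statement.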

                                                                                                                              1. 5

                                                                                                                                This is a good list of gotchas in Rust, but for anyone unfamiliar with Rust, these are very different from gotchas in other languages. These are gotchas that the compiler will refuse to compile, and will throw a relevant error message with a link to why it won’t compile.

                                                                                                                                I think it’s important to note these are not gotchas similar to JavaScript or C++ gotchas, which are not caught by the compiler/interpreter ahead of time and cause unexpected behavior.

                                                                                                                                Just wanted to put a disclaimer for anyone who is interested in learning Rust, but is put off by this list. It’s real, but it’s also not something you have to keep in your mind while working and the compiler will check it for you :)