Threads for sdboyer

  1. 8

    I’m largely in agreement with what Peter has pointed out in a few other places in this thread - talking about MVS often isn’t that helpful, so much as how some of its requirements, and the poor assumptions they’re based on about how software gets written, amplify out into tooling. I won’t rehash those points, but will instead try to focus more narrowly on MVS itself.

    MVS has turned out pretty much as i expected. Its best feature is its predictability, as others have pointed to in this thread. It’s not the first version selection algorithm to be predictable/avoid NP-hard search - Maven’s had already been around for years before MVS, and there may be other, older ones in this family of which i’m unaware. Maven makes a different tradeoff - “rootmost declaration wins,” vs. “largest version number wins” in MVS.
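
    To make the two tradeoffs concrete, here is a minimal sketch of the tie-breaking rules only, not of either tool's real resolver. The `Requirement` type, the integer versions, and the `Depth` field are assumptions made up for illustration; both functions assume a non-empty list.

```go
package main

import "fmt"

// Requirement is a hypothetical record of one module asking for a dependency.
type Requirement struct {
	Version int // simplified: one integer stands in for a full version number
	Depth   int // distance from the root module (0 = declared by the root)
}

// selectLargest mimics the MVS-style rule: the largest required version wins.
func selectLargest(reqs []Requirement) int {
	best := reqs[0].Version
	for _, r := range reqs[1:] {
		if r.Version > best {
			best = r.Version
		}
	}
	return best
}

// selectRootmost mimics the Maven-style rule: the declaration closest to the
// root wins; on equal depth the earlier declaration is kept.
func selectRootmost(reqs []Requirement) int {
	best := reqs[0]
	for _, r := range reqs[1:] {
		if r.Depth < best.Depth {
			best = r
		}
	}
	return best.Version
}

func main() {
	reqs := []Requirement{
		{Version: 3, Depth: 2},
		{Version: 1, Depth: 1},
		{Version: 2, Depth: 3},
	}
	fmt.Println(selectLargest(reqs))  // 3
	fmt.Println(selectRootmost(reqs)) // 1
}
```

    Both rules are predictable in the sense that neither needs a search; they simply disagree about whose declaration takes priority.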

    The scenario i was most concerned about was a social one: that the baked-in premise of “maintainers are expected to promptly adapt to the inevitable, unavoidable changes-that-break-them in their dependencies in order to keep MVS’ assumptions true” would become justification to pile onto maintainers even more and demand free labor from them. That hasn’t happened AFAICT, and in retrospect, it seems a bit silly to have been so concerned about it. (I was feeling quite momma-bear, wanting to protect the community from the same kinds of “use you for your labor then discard you” feels i was in.)

    The biggest problem with MVS is ambiguity: it exits 0 - false positive - whether there’s an obvious, reasonably-knowable, or non-obvious problem in the set of dependency versions it’s picked. Existing algorithms generally in the bundler/cargo/dep/pub/npm7 family (NP-hard search; preferring the latest version and avoiding known incompatibilities) clearly can do better at avoiding some such problems, at the cost of predictability. And, of course, they’re also still ambiguous - they’re not omniscient, and thus can’t avoid all false positives.

    (Aside, a pet peeve: “predictability” != “determinism.” i am not aware of any nondeterministic version selection algorithms in production tooling.)

    The price of this ambiguity is visible in the sort of issues Peter’s mentioned. It’s felt more in larger projects with deeper dependency graphs - e.g., k8s and anything depending on it. When things seem to work “well enough” with that exit 0 from MVS, you accept it and push ahead, because who’s gonna spend an hour spelunking a huge dep graph when everything at least seems fine? The result is what i called contagion failure in slow motion, manifesting like this. And once the seal’s broken, does doing it more really matter?

    Ultimately, my view is that dealing with the information and ambiguity problems is the main path forward in this problem space. In that light, MVS is a local optimum, and the ideal version selection algorithms are on the NP-hard side, assuming we can also get predictability under control at the same time as ambiguity. (Clearly, i think that’s feasible.) Once we have those, the only sound basis for choosing MVS anymore will be as an easy-to-implement stepping stone. My guess is that Maven’s algorithm and encoding (the POM equivalent to require statements) will be preferable for that purpose.

    1. 2

      I’m very glad to see Lockfiles in the design docs, but please consider alternatives to SemVer!

      See Rich Hickey’s Spec-ulation talk for problems and my alternative scheme proposal for potential solutions.

      In short, SemVer provides clients (limited) warning of breaking changes. Alternatively, consider a scheme that provides guidance for upgrading without breakage. Concretely, this would mean replacing name@major.minor.rev with name@major-timestamp and integrating deprecation warnings into go vet.

      More verbosely, “breaking changes” should never happen in stable libraries without at least one major version’s worth of deprecation warnings for smooth migration. Adding new functionality should never cause breakage (Go has some unique problems here). SemVer formalizes bad library behavior and adds the unnecessary burden of fiddling with minor and revision numbers for every little change.
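
      As a rough illustration of the name@major-timestamp idea, the sketch below parses such identifiers and checks whether one is an in-place upgrade of another. The timestamp layout (`YYYYMMDDhhmmss`), the `Version` type, and the `ParseVersion` and `SafeUpgrade` helpers are all assumptions for this example; the proposal itself does not fix a format.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// stampLayout is an assumed timestamp format (YYYYMMDDhhmmss).
const stampLayout = "20060102150405"

// Version is a hypothetical parsed form of name@major-timestamp.
type Version struct {
	Name  string
	Major int
	Stamp time.Time
}

// ParseVersion splits "name@major-timestamp" into its three parts.
func ParseVersion(s string) (Version, error) {
	at := strings.LastIndex(s, "@")
	if at < 0 {
		return Version{}, fmt.Errorf("missing @ in %q", s)
	}
	parts := strings.SplitN(s[at+1:], "-", 2)
	if len(parts) != 2 {
		return Version{}, fmt.Errorf("want major-timestamp in %q", s)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return Version{}, fmt.Errorf("bad major in %q: %v", s, err)
	}
	stamp, err := time.Parse(stampLayout, parts[1])
	if err != nil {
		return Version{}, fmt.Errorf("bad timestamp in %q: %v", s, err)
	}
	return Version{Name: s[:at], Major: major, Stamp: stamp}, nil
}

// SafeUpgrade reports whether moving from -> to stays within one major
// version and does not go backwards in time.
func SafeUpgrade(from, to Version) bool {
	return from.Name == to.Name && from.Major == to.Major && !to.Stamp.Before(from.Stamp)
}

func main() {
	older, err := ParseVersion("example.com/lib@2-20170101000000")
	if err != nil {
		panic(err)
	}
	newer, err := ParseVersion("example.com/lib@2-20170301120000")
	if err != nil {
		panic(err)
	}
	fmt.Println(SafeUpgrade(older, newer)) // true
	fmt.Println(SafeUpgrade(newer, older)) // false
}
```

      Under this scheme only the major number carries a compatibility claim; the timestamp just orders releases within it.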

      1. 2

        (one of the dep team members, here)

        It’s very unlikely that we use something other than semver. I’ve spent plenty of time worshipping at the Altar of Hickey, but this is one case where I’d say his arguments don’t add a lot new to the discussion that we didn’t already know.

        I’ve been hoping to find some time to write this up in greater detail, but it’s been a busy couple months. Simplest, shortest version I can conjure: Hickey, like most folks, seems to be expecting more out of fundamentally meaningless numbers than is really wise.

        1. 1

          Hickey, like most folks, seems to be expecting more out of fundamentally meaningless numbers than is really wise

          That’s a strange conclusion. He literally uses the words “not meaningful” when talking about version numbers. Half of his argument against SemVer is that it codifies non-useful information into two thirds of its values.

          The bulk of my proposal is that you simply replace minor and revision numbers with a timestamp. People who want to annoy their users with breaking changes can still bump the major version to do that. They just no longer have to fiddle with minor version numbers every time they make non-breaking changes.

          1. 1

            That’s a strange conclusion. He literally uses the words “not meaningful” when talking about version numbers. Half of his argument against SemVer is that it codifies non-useful information into two thirds of its values.

            Indeed, but acknowledging that doesn’t preclude expecting unhelpful things from them. But like I said - shortest possible explanation. Not necessarily one that makes prima facie sense :) Maybe I’ll have time for a fuller writeup, sometime.

            The bulk of my proposal is that you simply replace minor and revision numbers with a timestamp. People who want to annoy their users with breaking changes can still bump the major version to do that. They just no longer have to fiddle with minor version numbers every time they make non-breaking changes.

            It might be better. Certainly, that’s encoding less subjective information. The crucial question is whether the benefit is large enough to merit the social disruption that would come from displacing a widely used standard. That’s where I’m dubious.

            1. 1

              The crucial question is whether the benefit is large enough to merit the social disruption that would come from displacing a widely used standard.

              To me, this is the only point worth discussing. I’m firmly in the camp that SemVer is actively harmful and trivially eliminated in any community that does not yet have a standardized package manager. What risks do you perceive?

      1. 0

        I hope the prototype gets a better, more searchable name. So far, Go has done a great job of being terrible for Googling. Name of the most popular web toolkit? Gorilla. For that reason, any time I refer to Go, I instead make a point to call it Golang.

        1. 1

          FWIW: this isn’t a troll, but a legitimate concern from someone who uses Go a lot. go dep makes sense as a name, but I hope that the package itself on GitHub does not end up being named dep. As one of the early adopters of Golang in production, I found it a real problem in the early days to help the other programmers I worked with find the right repos when we talked about them.

          1. 1

            We went back and forth over naming a fair bit internally, with the poor searchability of “dep” being a major driver. Ultimately, we couldn’t settle on an alternative name to the rather generic “dep,” so it’s sufficing for now.

            On the plus side, assuming all goes well and the proposal is accepted, the planned trajectory for the tool is to integrate it seamlessly into the go toolchain - the name will go away. So, the name we use for now hopefully won’t matter too much.

        1. 8

          I’m generally a bit suspicious of both humans and machines modifying the same files, although comment-preserving parsers and writers are something I have seen become more popular lately.

          For myself, I think it just adds complexity that is unnecessary and probably supporting a less-good decision made earlier.

          1. 5

            I’m generally a bit suspicious of both humans and machines modifying the same files

            Go is no stranger to this, though. See: gofmt.

            1. 7

              Gofmt is a trivial example; go fix and the various Go refactoring tools, like the ones from x/tools/refactor and grind, are more complex examples. Go programmers are used to tools that modify user-writable code.
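
              For a sense of what machine rewriting of hand-written code looks like, here is a small sketch using the standard library's go/format package, which applies gofmt-style formatting while keeping comments. The `reformat` and `commentSurvives` helpers and the sample source are made up for illustration.

```go
package main

import (
	"fmt"
	"go/format"
	"strings"
)

// reformat runs gofmt-style formatting over Go source text.
func reformat(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// commentSurvives reports whether a hand-written comment is still present
// after the machine rewrite.
func commentSurvives(src, comment string) bool {
	out, err := reformat(src)
	return err == nil && strings.Contains(out, comment)
}

func main() {
	// Deliberately sloppy spacing, plus a comment a human wrote.
	src := "package main\n// hand-written note\nfunc  main( ) {}\n"
	out, err := reformat(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
	fmt.Println(commentSurvives(src, "// hand-written note")) // true
}
```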

            2. 2

              I agree on the whole – though upon seeing the headline, I was thinking it was probably because they were up to something goofier, like retroactively redefining what a comment is to include things with actual semantic significance (as they’ve done before). At least it isn’t more of that.

              1. 1

                I also think that’s a great general rule. In this particular case, I think - hope? - we’ve defended against it adequately.

              1. 9

                The worst part for me here is the people who simply ignore all points made by the author, simply saying “Let’s just stick with JSON” or “I don’t think comments are necessary”. It feels like they just saw the headline and commented without reading the text.

                1. 1

                  As the author, I appreciate that…though I probably should have expected as much :)