1. 34

Inspired proximally by last week’s thread on Linux not breaking userland, particularly @JordiGH’s comment which introduced me to Steve Losh.

  2. 14

    It seems the entire premise of this post rests on the fact that some package managers always use the latest version by default. The Go dep tool you mentioned in the footnotes will, in my experience, use the newest version but then pin to it. Additionally, the Go community relies on several tools like gopherpit or gopkg.in to pin major versions via branches.

    Semantic Versioning isn’t broke, it’s misused, yes, but if applied correctly it’s a good methodology to manage breakage in machine-visible APIs.

    Using digest hashes or renaming when behaviour changes is not a method that can be easily understood by both humans and machines; digest hashes have no meaning to a human and renaming a method will require pulling up the changelog and/or documentation.

    I think it’s a bit unfair to only observe default behaviour when proper usage can have much more power. If I bother to pin versions in RubyGems or NPM, then the entire argument kinda collapses. That includes the part where you merge the minor and revision numbers because “[…] the distinction between minor versions and patch levels is moot”.

    1. 4

      I think it’s a bit unfair to only observe default behaviour when proper usage can have much more power. If I bother to pin versions in RubyGems or NPM, then the entire argument kinda collapses.

      That’s not what I’m saying at all. We are in agreement about what “proper usage” is in today’s world. However, this “proper usage” of everyone narrowly just protecting themselves leads to the issues observed by Steve Losh and Rich Hickey. Rich Hickey and Steve Losh are observing that trying to provide short-term flexibility is polluting the well for everyone. I’m just trying to flesh out a couple of details on how an alternative might work.

      1. 1

        Agree. Package managers should use the exact versions specified in the build, otherwise they are not package managers but “random bits from the internet downloaders”.

        Something that is often overlooked is that there are a lot of things library developers can do to provide smooth migration paths:

        Deprecation

        Deprecations should not only be versioned to tell people when things will change or be removed, but also precisely describe what is going to be deprecated.

        Migration

        There is no reason why library authors shouldn’t be able to ship a description about how existing code needs to be changed that can be read by tools which apply those changes.

        Taking these two things together, you end up with something like

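        import scala.annotation.StaticAnnotation
        trait Version; trait Tree // placeholders for the tool's version and syntax-tree types
        sealed trait Change
        object Change {
          case object Removal extends Change
          case object NoInheritance extends Change
          case object NoOverriding extends Change
          case object Behavior extends Change // ...
        }
        class willChange(what: Change, when: Version, why: String, how: Tree => Tree) extends StaticAnnotation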
        

        This will not work in every case, but even just automating away 60% of the changes will have a profound effect on how people deal with dependency updates and changes.
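To make the "readable by tools" part concrete, here is a minimal Python sketch of the same idea. Everything in it is hypothetical and only for illustration: the `will_change` decorator, the `REGISTRY` dict, the `rename` helper, and the example function `parse` with its replacement `parse_strict`. It assumes Python 3.9+ (for `ast.unparse`) and a deliberately naive rewrite that renames every matching identifier.

```python
import ast
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class PlannedChange:
    what: str                            # kind of change, e.g. "removal"
    when: str                            # version in which the change lands
    why: str                             # human-readable reason
    how: Callable[[ast.AST], ast.AST]    # rewrite to apply to caller code


# Hypothetical registry a migration tool could read after importing the library.
REGISTRY: Dict[str, PlannedChange] = {}


def will_change(what, when, why, how):
    """Attach machine-readable change metadata to a function."""
    def decorate(func):
        REGISTRY[func.__name__] = PlannedChange(what, when, why, how)
        return func
    return decorate


class _RenameName(ast.NodeTransformer):
    """Naive rewrite: replaces every identifier `old` with `new`."""
    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            return ast.copy_location(ast.Name(id=self.new, ctx=node.ctx), node)
        return node


def rename(old, new):
    return lambda tree: _RenameName(old, new).visit(tree)


@will_change(what="removal", when="2.0", why="superseded by parse_strict",
             how=rename("parse", "parse_strict"))
def parse(text):
    return text.strip()


def migrate(source: str) -> str:
    """Apply every registered rewrite to a piece of caller source code."""
    tree = ast.parse(source)
    for change in REGISTRY.values():
        tree = change.how(tree)
    return ast.unparse(ast.fix_missing_locations(tree))


print(migrate("x = parse(data)"))  # x = parse_strict(data)
```

A real tool would need scope analysis so it does not rename unrelated variables that happen to share the name, which is part of why this can only cover some of the changes.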

      2. 9

        Leiningen for Clojure once again defaults to the latest version.

        Leiningen doesn’t default to any latest version as far as I know. Leiningen:

        1. does reproducible dependency resolution: for the same dependency list, you always get the same dependencies ¹
        2. does not “default” to anything, as specifying dependencies and their versions is the user’s responsibility

        Versioning/pinning is not only about having an API-compliant library though, it’s also about being sure that you can build the exact same version of your program later on. Hyrum’s Law states that any code change may effectively be a breaking one for your consumers. For example:

        • Fixed a bug in your library? Someone will depend on the buggy behaviour, or attempt to fix the bug downstream while it’s still an issue. If you forget to quote apostrophes, for example, fixing that in a bugfix release may cause some tools to double escape them.
        • Fixed an edge case/security issue? You’ve most likely introduced some checks which will have some overhead. If your library is used in a hot spot for a consumer, then it may lead to performance degradation of the program they’ve made.
        • User complains that an old version of your software is buggy/breaks, but you cannot reproduce on HEAD and you want to know what fixed it? That’s hard if you cannot use old dependencies. If git-bisect doesn’t test a commit with the dependencies you used at the commit time, you’re not testing your software as it was at that point in time. And if the bug is upstream, it’s hard to figure out what dependency caused it and how it was fixed.
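The first bullet can be sketched in a few lines of Python. All names here are made up for illustration: `quote_v1_0` and `quote_v1_0_1` stand in for two releases of an imaginary quoting library, and `consumer_render` is a downstream tool that worked around the original bug by escaping apostrophes itself.

```python
def quote_v1_0(s: str) -> str:
    """Imaginary release 1.0.0: wraps in quotes but forgets to escape apostrophes."""
    return "'" + s + "'"


def quote_v1_0_1(s: str) -> str:
    """Imaginary release 1.0.1: the bugfix now escapes apostrophes by doubling them."""
    return "'" + s.replace("'", "''") + "'"


def consumer_render(name: str, quote) -> str:
    """Downstream workaround written against 1.0.0: escape before calling the library."""
    return quote(name.replace("'", "''"))


print(consumer_render("O'Brien", quote_v1_0))    # 'O''Brien'
print(consumer_render("O'Brien", quote_v1_0_1))  # 'O''''Brien'
```

The consumer's output is correct with the buggy release and double-escaped with the fixed one, so a "patch" release is a breaking change for that consumer.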

        Of course, pinning is not a panacea: we usually want to apply security fixes and bugfixes immediately. But for the most part, there’s no way we can know a priori whether new releases will be backwards compatible with our software. Pinning gives you the option to vet dependency updates and defer them if they require changes to your system.

        1: Unless you use version ranges or dependencies that use them. But that happens so infrequently, and is so strongly advised against, that I don’t think I’ve ever experienced it in the wild.

        1. 3

          Hyrum’s Law

          FYI, Hyrum finally made http://www.hyrumslaw.com/ with the full observation. Useful for linking. :)

          1. 2

            Hmm, perhaps I misunderstood the doc I read. I’m having trouble finding it at the moment. I’m not a Clojure user. Could you point me at a good link? Do library users always have to provide some sort of version predicate for each dependency?

            Your point about reproducing builds is a good one, but it can coexist with my proposal. Imagine a parallel universe where Bundler works just like it does here and maintains a Gemfile.lock recording precise versions in use for all dependencies, but we’ve just all been consistently including the major version in gem names and not foisting incompatibilities on our users. Push security fixes and bugfixes, pull API changes.

            Edit: based on other comments I think I’ve failed to articulate that I am concerned with the upgrade process rather than the deployment process. Version numbers in Gemfile.lock are totally fine. Version numbers in Gemfile are a smell.
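A rough sketch of that parallel universe, in Python rather than Ruby and with entirely hypothetical names: `INDEX` is a toy package index, `widget1` and `widget2` are an imaginary library whose major version lives in its name, and `resolve` and `install` are simplified stand-ins for what a Bundler-like tool would do.

```python
from typing import Dict, List, Optional

# Toy index: the major version is part of the package name, and each name
# maps to a growing list of releases that are meant to stay compatible.
INDEX: Dict[str, List[int]] = {
    "widget1": [1, 2, 3],
    "widget2": [1, 2],
}


def resolve(manifest: List[str], index: Dict[str, List[int]] = INDEX) -> Dict[str, int]:
    """The manifest lists names only; the result is the lock with exact releases."""
    return {name: max(index[name]) for name in manifest}


def install(manifest: List[str], lock: Optional[Dict[str, int]] = None,
            index: Dict[str, List[int]] = INDEX) -> Dict[str, int]:
    """Deploys reuse the lock if there is one; upgrades re-resolve."""
    return dict(lock) if lock is not None else resolve(manifest, index)


manifest = ["widget1"]               # no version numbers here
lock = resolve(manifest)             # {'widget1': 3}

INDEX["widget1"].append(4)           # upstream pushes a bugfix release
print(install(manifest, lock))       # {'widget1': 3}  (reproducible deploy)
print(resolve(manifest))             # {'widget1': 4}  (explicit upgrade)
```

Moving to `widget2` is an edit to the manifest, which is the "pull API changes" half: incompatible changes only arrive when the user asks for them by name.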

            1. 3

              Oh, yes, sorry for not being clear: I strongly agree that version “numbers” might as well be serial numbers, checksums or the timestamp it was deployed. And I think major versions should be in the library name itself, instead of in the version “number”.


              In Leiningen, library users always have to provide some sort of version predicate for each dependency; see https://github.com/technomancy/leiningen/blob/master/doc/TUTORIAL.md#dependencies. There is some specific stuff related to snapshot versions and checkout dependencies, but if you try to build + deploy a project with those, you’ll get an error unless you set up some environment variable. This also applies to boot afaik; the functionality is equivalent to how Java’s Maven works.

              1. 2

                Thanks! I’ve added a correction to OP.

                1. 1

                  Hmm, I’ve been digging more into Leiningen, and growing increasingly confused. What’s the right way to say, “give me the latest 2.0 version of this library”? It seems horrible that the standard tutorial recommends using exact versions.

                  1. 3

                    There’s no way to do that. The Maven/JVM dependency land always uses exact versions. This ensures stability.

            2. 7

              I also agree with major version belonging in the name. For version 4 and 5 of SBJson I renamed all classes, enums, and identifiers so that you can install version 3.x.x, 4.x.x and 5.x.x in the same app without conflicts. I did this because I wanted to ease the upgrade path for people. If they use SBJson in different parts of their app (which is likely, in big apps) this allows them to upgrade parts of their app at a time, rather than be forced to upgrade all uses in one go. More importantly though: it also allows people to upgrade their own usage in their app, even as dependencies they rely on have not yet upgraded their usage of the library.

              1. 5

                The Apache Commons Java libraries practise your method and I think it’s fantastic for precisely the reasons you mention. Guava does not and that last sentence of yours is a huge ticket in Hadoop.

                1. 2

                  That sounds more like a workaround to avoid the issues of runtimes not being able to handle versions and the lack of reasonable migration tooling.

                  1. 2

                    I disagree somewhat. Renaming was simple to do, and is simple to understand & deal with for users and machines alike. There’s no special case migration tooling or runtime support required at all. One could argue that requiring a runtime that is able to handle versions of libraries and requiring migration tooling is a workaround for poor versioning on behalf of library authors. However, I’ll admit renaming has its problems too. It would make backporting fixes across major versions much more annoying, but luckily my project is mature and small enough that it has not been a problem.

                2. 6

                  I’ve been using semantic versioning for my own projects now for a few years, and in my experience with it, it’s great (if used correctly as tscs37 mentions) for libraries (or modules) but less so for applications.

                  My gripe with semantic versioning is that it allows pre-release and build metadata as part of the version string. I think this is wrong as it’s ill-defined and complicates dependency tracking (I’m also not a fan of pre-releases—why prolong the release? It won’t get much additional testing because of a general reluctance of using pre-releases). Just use X.Y.Z.

                  Also, for an extended meditation on backwards compatibility, read The Old New Thing, which explains the lengths that Microsoft goes to to maintain backwards compatibility (and the problems that can cause).

                  1. 4

                    Build metadata SHOULD be ignored when determining version precedence.

                    Pre-release versions have a lower precedence than the associated normal version.

                    So only pre-release versions affect precedence, and most implementations I know of only select a pre-release version if you explicitly ask for it.
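For reference, the precedence rules quoted above fit in a short function. This is a sketch of the ordering described in the SemVer 2.0.0 spec, not a validating parser: `precedence_key` is a made-up helper name, and it assumes its input is already a well-formed version string.

```python
def precedence_key(version: str):
    """Sort key implementing SemVer 2.0.0 precedence for well-formed versions."""
    version = version.split("+", 1)[0]          # build metadata is ignored
    core, _, pre = version.partition("-")       # pre-release follows the first hyphen
    major, minor, patch = (int(part) for part in core.split("."))
    if not pre:
        # A normal version ranks above any pre-release of the same X.Y.Z.
        return (major, minor, patch, 1, ())
    identifiers = tuple(
        # Numeric identifiers compare as numbers and rank below alphanumeric ones.
        (0, int(ident), "") if ident.isdigit() else (1, 0, ident)
        for ident in pre.split(".")
    )
    # Tuple comparison makes a longer identifier list win when the prefix is equal.
    return (major, minor, patch, 0, identifiers)


versions = ["1.0.0", "1.0.0-rc.1", "1.0.0-alpha", "1.0.0-beta.11",
            "1.0.0-beta.2", "1.0.0-alpha.1"]
print(sorted(versions, key=precedence_key))
# ['1.0.0-alpha', '1.0.0-alpha.1', '1.0.0-beta.2', '1.0.0-beta.11', '1.0.0-rc.1', '1.0.0']
```

Two versions that differ only in build metadata get the same key, so the extra syntax costs a dependency resolver very little.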

                    1. 1

                      I’m also not a fan of pre-releases—why prolong the release? It won’t get much additional testing because of a general reluctance of using pre-releases).

                      Our customers extensively test our release candidates. Linux kernel release candidates are also pretty thoroughly tested. And no, I don’t think there is a reasonable alternative to RCs.

                    2. 4

                      I like the idea of making things easier for users, but this approach seems like it would come at the cost of preventing users from being able to recover from developer mistakes. One example I can think of is when Thoughtbot renamed FactoryGirl to FactoryBot. They thought it was a proper minor change, but it actually broke backwards compatibility. The publisher has no way of knowing that a “backwards-compatible change” is backwards compatible for everyone. For the minority that it breaks stuff for, they need a way to pin.

                      1. 3

                        I completely agree that the major version belongs in the name.

                        You still need pinning in case a package accidentally breaks backwards compatibility.