I feel that SemVer has two major problems, neither of which is really the fault of the spec:
In the case the author gives, where a tiny breaking change fixes undocumented (wrong) behavior, I wonder whether it shouldn’t intentionally be kept as a minor patch, with maybe a follow-up patch to alleviate the issue (either accepting the wrong behavior as official, or planning another change down the road). Because if every major version bump is likely to not affect most users, since it’s triggered by edge-case breaking changes, you will end up with very slow adoption, as each major version demands careful consideration from end-users.
On the other side you have data science libraries like pandas that regularly ship major breaking changes (intentionally or not) in minor versions, and that’s not better either. So a strict interpretation of major versions is good for stability, bad for adoption.
I think the problem is that most OSS is in fact on version 0 but no one wants to admit it. If you don’t have an API that can be kept stable for years and multiple maintainers with commit access, your project is at v0. But people use v1 to mean “production ready” instead which is different. It’s production ready if it can solve a problem reliably in production, but that’s not the same as v1.
I should write a blog post about this somewhere so I can cite it and stop repeating it, but the core problem with SemVer is that it is used to version implementations, not interfaces. You cannot do graceful deprecation with SemVer. In a project with a good support cycle, you have three states for interfaces within an implementation:

- supported: the interface is stable and you can rely on it
- deprecated: the interface still works, but using it emits warnings and it will be removed in a future release
- removed: the interface is gone
Each release will cycle interfaces through this little state machine. You cannot express this if you’re using SemVer for the implementation. If your library supports an interface Foo, you have three versions in SemVer:

- 1.0: provides Foo
- 1.1: provides Foo and its replacement Bar
- 2.0: provides only Bar
1.1 to 2.0 is not a breaking change for anyone who moved from Foo to Bar, but there’s no way, if you are using SemVer for implementations, to indicate this. You may even have more complicated things such as:

- 1.0: provides Foo
- 1.1: provides Foo and Bar
- 1.2: provides Bar and a deprecated Foo (using Foo emits warnings)
- 2.0: provides only Bar
Now moving from 1.1 to 2.0 is a breaking change for everyone, but moving from 1.2 to 2.0 is not for anyone who is heeding their deprecation warnings. The thing that you want is to use SemVer for interfaces, where each version of the implementation has a tuple of interface versions. Now the flow is easy:

- first implementation release: interface 1.0
- second release: interface 1.1
- third release: interfaces 1.1 and 2.0
- fourth release: interface 2.0 only
Now, if your dependency resolution first says ‘I need 1.x’ then it will match the first three versions. When you get to the third, it will say ‘by the way, there’s a newer thing you might want to migrate to’. Then you update it to say 2.0 and it still works with the third one, but will allow you to move to the fourth.
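That resolution flow can be sketched in a few lines of Python. The release names and the resolver are made up for illustration; nothing here is an existing tool:

```python
# Hypothetical sketch: each implementation release carries the set of
# interface versions it supports, and resolution matches on interfaces,
# not on the implementation's own version number.

RELEASES = [
    # (release name, supported interface versions)
    ("app-2019.1", {"1.0"}),
    ("app-2019.2", {"1.1"}),
    ("app-2020.1", {"1.1", "2.0"}),  # overlap release: old and new interface
    ("app-2020.2", {"2.0"}),
]

def resolve(requirement: str):
    """Return releases supporting the requested interface major version,
    plus any newer interfaces each release advertises as migration targets."""
    major = requirement.split(".")[0]
    matches = []
    for name, interfaces in RELEASES:
        supported = {v for v in interfaces if v.split(".")[0] == major}
        if supported:
            newer = sorted(interfaces - supported)  # "consider migrating"
            matches.append((name, newer))
    return matches

# 'I need 1.x' matches the first three releases; the third one also
# advertises interface 2.0 as a migration target.
print(resolve("1.x"))
```

Once the client updates its requirement to 2.x, it still resolves to the third release, and the fourth becomes available too.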
There are more subtle problems that relate to how richer type systems interact with the guarantees in SemVer. For example, anything that does pattern matching on structural types makes adding or removing a feature a breaking change.
I’ve mentioned it before, but I think Django’s approach – which is not semver – is a good one.
Django does three feature releases per major version: X.0, X.1, X.2. So over the past few years there’s been Django 3.0, 3.1, 3.2, then 4.0, 4.1, 4.2, and now 5.0 is approaching release.
The Django API compatibility policy is that every third feature release (the X.2) is an LTS, and the nice upgrade path is LTS-to-LTS. If your app is currently running on an LTS and emits no deprecation warnings, the same codebase will run unmodified on the next LTS. So if you had an app running on 3.2 LTS, you could clear any deprecation warnings it emits and then jump directly to 4.2 LTS.
It’s not semver because the major version number does not tell you anything about breaking changes; the rule is that a piece of API that’s going to go away will emit deprecation warnings for two releases, and then it’s gone, and that happens in every feature release, not just major version bumps.
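The warning mechanics of such a two-release deprecation cycle can be sketched with Python’s standard warnings module. The helper names are hypothetical, and this is not actual Django code (Django uses its own warning subclasses for this):

```python
import warnings

def new_helper():
    return "result"

def old_helper():
    # During the deprecation window the old name keeps working but warns;
    # after two feature releases it would be deleted outright.
    warnings.warn(
        "old_helper() is deprecated and will be removed in a future "
        "release; use new_helper() instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_helper()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert old_helper() == "result"   # still works during the cycle
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

A codebase that runs warning-free on one LTS has, by construction, stopped using everything scheduled for removal before the next one.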
Interesting. In terms of web APIs, my thinking is that it’s good to version each endpoint separately, as opposed to having a single /api/v1/… and /api/v2/… for the whole API, because that way you can handle the lifecycle for each endpoint individually.
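A sketch of what that per-endpoint scheme looks like, with a plain dict standing in for a router and hypothetical endpoints (no framework assumed):

```python
# Each endpoint advances through versions on its own schedule; an old
# version can stay mounted while clients migrate, then be retired alone.

def get_user_v1(user_id):
    return {"id": user_id, "name": "Ada"}

def get_user_v2(user_id):
    # v2 splits the name field; v1 stays available during migration.
    return {"id": user_id, "first_name": "Ada", "last_name": "Lovelace"}

def list_orders_v1(user_id):
    return [{"order": 1}]

ROUTES = {
    "GET /api/users/v1/<id>": get_user_v1,
    "GET /api/users/v2/<id>": get_user_v2,
    "GET /api/orders/v1/<id>": list_orders_v1,  # orders never needed a v2
}
```

With a global /api/v2/ prefix, shipping the new users endpoint would have forced a "v2" of orders that changed nothing.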
Yup. That’s precisely what good versioning looks like and it works because you’re doing SemVer on interfaces, not on implementations.
Very good point.
On top of that, for type-safe languages, I’d prefer another nuance: a break in the form of a compile error due to a breaking change is annoying, but a break in the form of a runtime error is way worse. It would be nice if I could tell which one to expect by looking at how the version changed.
I wrote this a couple years back which might help framing things:
And here is a good follow-up response to the idea: https://rys.io/en/156.html
I like this approach, as well as what django does (per @ubernostrum’s reply), in theory. But it requires careful up-front planning, and my experience is that a lot of projects, whether private professional work or OSS, are driven by demand for performance or features.
And often those features cannot “wait” for multiple versions while you pave a smooth upgrade path with deprecation warnings, which leaves you with two choices: strict SemVer (lots and lots of major versions, often) or a looser approach to versioning where you say “okay, we are introducing a new feature and tweaking some bits” and call it a minor version.
Both django and python itself, as well as other packages, have versioning that is not classic SemVer, but their version numbers still carry a strict, well-defined meaning.
The silver bullet would be a system that caters to people who want to do right by their users, but do not have carefully planned interfaces or feature roadmaps. I am not sure if such a system logically can exist or makes sense, but it would be nice.
The question is: what is a breaking change anyways? Even fixing a bug can break a client that relied on that bug. (mandatory semi-related xkcd)
On the one hand, it’s a continuum. If you have some public method but it turns out that no one in the world was using it, you’re not actually breaking anyone by removing it, so it’s not a “breaking change”; and vice versa, if you change some undocumented internal algorithm and it breaks people, it is a “breaking change.” But more realistically, I think it’s about setting appropriate boundaries and expectations: if you do X, Y, and Z, and not A, B, or C, we promise we will try to make sure our updates don’t break your software for as long as possible. Most languages have a certain amount of culturally defined boundaries: you can use public methods, but not underscore-prefixed methods or transitive dependencies, or whatever.
I like the intent of the author in this post. I notice many projects that I follow adding features in their patch versions on a regular basis. Could this be a solution? Unfortunately, I can’t seem to find the author using ComVer currently.
One of the many things that brought me around to the value of static typing was Elm’s enforced semantic versioning. Unison has an even more novel approach to this problem with content-addressable code.
Thinking is hard.
A while back I daydreamed a bit about trying to make versioning ~automatic by driving it with changes to the test suite. The semantics would obviously be different, but I wondered whether making whatever signal the version carries more reliable could net out positively.
The devil’s obviously in the tooling. I picked at a git-based approach a little to explore the idea, but IIRC I felt like the idea would be tough to employ well without syntax-aware implementations.
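For what it’s worth, the core heuristic of that daydream could be sketched like this. It is an assumed rule of thumb, not an existing tool, and as noted above real tooling would need to understand test semantics syntax-aware, not just compare names:

```python
# Classify a release by how its test suite changed: a removed or changed
# test suggests an old guarantee no longer holds (major); a new test
# suggests new guaranteed behavior (minor); no test changes, patch.

def suggest_bump(old_tests: set, new_tests: set) -> str:
    removed_or_changed = old_tests - new_tests
    added = new_tests - old_tests
    if removed_or_changed:
        return "major"
    if added:
        return "minor"
    return "patch"

assert suggest_bump({"test_add"}, {"test_add", "test_sub"}) == "minor"
assert suggest_bump({"test_add"}, {"test_sub"}) == "major"
assert suggest_bump({"test_add"}, {"test_add"}) == "patch"
```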
Fundamentally, the attempt to cram a ton of information about APIs, implementations, commit hashes, and compatibility into a small set of alphanumeric characters is misguided. A “version” will never properly support all these use cases.
We should instead be asking ourselves, what communication tool(s) would better serve these purposes?
Some of this could be handled automatically, like detecting API and documentation changes; there’s just no mechanism yet to communicate the relevant parts to users.