
  2. 10

    It doesn’t recognise that software breaks are common and not all equal.

    I disagree with both parts.

    Not all breakage is equal in the sense that library developers should be even more reluctant to introduce a change that breaks, say, 50% of existing projects than a change that only breaks 0.5%.

    However, for downstream users, all breaking changes are equal. If it’s my code that breaks after updating from libfoo 1.0.0 to 2.0.0, I don’t care how many other projects have the same problem since it doesn’t reduce the need to adapt my code.

    If you define breaking change severity as the number of affected API functions, the picture doesn’t really change. A small number of affected functions doesn’t always mean a simple fix.

    I agree that software breakage is common, but the real solution is to stop making breaking changes without a really good reason, not to invent versioning schemes that make breaking changes less visible in the major version number.

    1. 8

      As a consumer of many libraries in dynamically typed languages “this easily greppable thing has a different call signature, same results” is qualitatively different from “this API disappeared/is now semantically entirely different”.

      Sure don’t break things without good reasons. But that’s true in any respect!

      1. 5

        However, for downstream users, all breaking changes are equal.

        Disagree on that. If a breaking change fixes a typo in a function name, renames something, or even just deprecates something with a clear and easy replacement path, I don’t care as a user. It’s just grooming.

        The distinction between major and minor breaking changes fits my expectations both as a user and as a developer.

        1. 3

          However, for downstream users, all breaking changes are equal.

          A breaking change that doesn’t break my code is facially less bad than one that does.

          This emerging notion that breaking changes are an Ur-failure of package authors which must be avoided at all costs biases so strongly toward consumers that it’s actually harmful for software development in the large. Software needs to have breaking changes over time, to evolve, in order to be healthy. Nobody gets it right the first time, and setting that as the baseline expectation is unrealistic and makes everyone’s experience net worse.

          1. 4

            That notion is biased towards the ecosystem as a whole. We all are both producers and consumers.

            If there are two libraries with comparable functionality and one makes breaking changes often while the other doesn’t, it’s the latter that brings more benefit to the ecosystem by saving developer time. In essence, compatibility is a feature.

            I’m not saying that people should never make breaking changes, only that it requires a really good justification. Fixing unfortunate function names, for example, doesn’t require a breaking change—keeping the old name as an alias and marking it deprecated is all that’s needed, and it adds too few bytes to the library to consider that “bloat”.
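
            A minimal Go sketch of that alias-plus-deprecation approach (all names here are hypothetical):

            ```go
            package mathutil // hypothetical library

            // SumSquares returns the sum of the squares of xs.
            // This is the corrected, preferred name.
            func SumSquares(xs []float64) float64 {
                var total float64
                for _, x := range xs {
                    total += x * x
                }
                return total
            }

            // SumSqaures is the original, misspelled name, kept as a
            // thin alias so existing callers keep compiling.
            //
            // Deprecated: use SumSquares instead.
            func SumSqaures(xs []float64) float64 {
                return SumSquares(xs)
            }
            ```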

            1. 2

              I’m not saying that people should never make breaking changes, only that it requires a really good justification. Fixing unfortunate function names, for example, doesn’t require a breaking change—keeping the old name as an alias and marking it deprecated is all that’s needed, and it adds too few bytes to the library to consider that “bloat”.

              The amount of justification required to make a breaking change isn’t constant, it’s a function of many project-specific variables. It can easily be that the cost of keeping that deprecated function around — not just in terms of size but also in API coherence, future maintainability, etc. — outweighs the benefit of avoiding a breaking change.

              1. 1

                A lot of the time the “maintaining API coherence” argument is a euphemism for making it look as if the design mistake was never made. Except it was, and now it’s my responsibility as the maintainer to ensure minimal impact on the people who chose to trust my code.

                I completely agree that those arguments can be valid, but the breaking changes I see in the wild tend to be of the avoidable variety, where a small bit of effort on the maintainer’s side was all that was needed to fix it for future users without affecting existing ones.

                Breaking changes unannounced and perhaps even unnoticed by the authors are even worse.

                1. 2

                  design mistake

                  I just can’t get on board with this framing. “Design mistakes” are an essential and unavoidable part of healthy software projects. In fact they’re not mistakes at all, they’re simply stages in the evolution of an API. Our tools and practices have to reflect this truth.

                  1. 2

                    Both in this thread and every other time I’ve seen you talk about this topic on this website you’re always on the rather extreme side of “it’s impossible not to have completely unstable software”.

                    There are many examples of very stable software out there, and there’s plenty of people who are very appreciative of very stable software. There are very stable software distributions out there too (Debian), and there are plenty of people who are very appreciative of the fact that if they leave auto-updates on for 4 years and leave things alone, there’s a very small chance that they will ever have to fix any breakages, and that when they eventually have to update, things will also be very well documented and they likely won’t have to deal with a subtle and unnoticed breakage.

                    Yes, it is a reality that new and bleeding edge concepts need to go through a maturation stage where they experiment with ideas before they can be solidified into a design, but pretending like this is true for every software project in existence and that therefore instability must just be accepted is just flat out wrong.

                    We have had pretty decent stability for quite some time and plenty of people have been very happy with the tradeoff between stability and the bleeding edge.

                    What I’m saying here is really that nobody is asking you to take your extremely immature project, for which you don’t (yet) have the experience or foresight to stabilise, and stabilise it prematurely. Of course there will always be projects like that, and like anything they’re going to be bleeding-edge avant-garde ordeals which may become extremely successful and useful projects that everyone loves, or end up in a dumpster.

                    What I am saying here is that it’s okay to have an unstable project, but it’s not okay to mislead people about its stability, and it’s not okay to pretend that nobody should ever expect stability from any project. If your project makes no effort to stop breaking changes and is nowhere near the point where the design mistakes have been figured out, then it should make that clear in its documentation. If your project looks like it uses a 3-part versioning scheme, then it should make it clear that it’s not semver.

                    And most importantly, just because your project can’t stabilise yet doesn’t mean that nobody’s project can stabilise.

                    1. 1

                      there are plenty of people who are very appreciative of the fact that if they leave auto-updates on for 4 years and leave things alone, there’s a very small chance that they will ever have to fix any breakages

                      Let me distinguish software delivered to system end users via package managers like apt, from software delivered to programmers via language tooling like cargo. I’m not concerned with the former, I’m only speaking about the latter.

                      With that said…

                      you’re always on the rather extreme side of “it’s impossible not to have completely unstable software”

                      I don’t think that’s a fair summary of my position. I completely agree that it’s possible to have stable software.

                      Let’s first define stable. Assuming the authors follow semver, I guess the definition you’re advancing is that it [almost] never increments the major version, because all changes over time don’t break API compatibility. (If that’s not your working definition, please correct me!)

                      I’m going to make some claims now which I hope are noncontroversial. First, software that exhibits this property of stability is net beneficial to existing consumers of that software, because they can continue to use it, and automatically upgrade it, without fear of breakage in their own applications. Second, it is a net cost to the authors of that software, because maintaining API compatibility is, generally, more work than making breaking changes when the need arises. Third, it is, over time, a net cost to new consumers of that software, because avoiding breaking changes necessarily produces an API surface area which is less coherent as a unified whole than it otherwise would be. Fourth, that existing consumers, potential/new consumers, and authors/maintainers are each stakeholders in a software project, and further that their relative needs are not always the same but can change depending on the scope and usage and reach of the project.

                      If you buy these claims then I hope it is not much of a leap to get to the notion that the cost of a breaking change is not constant. And, further, that it is at least possible that a breaking change delivers, net, more benefit than cost, versus avoiding that change to maintain API compatibility. One common way this can be true is if the software is not consumed by very many people. Another common way is if the consumers of that software don’t have the expectation that they should be able to make blind upgrades, especially across major versions, without concomitant code changes on their side.

                      If you buy that, then what we’re down to is figuring out where the line is. And my claim is that the overwhelming majority of software which is actively worked-on by human beings is in this category where breaking changes are not a big deal.

                      First and foremost, because the overwhelming majority of software is private, written in market-driven organizations, and must respond to business requirements which are always changing by their very nature. If you bind the authors of that software to stability guarantees as we’ve defined them, you make it unreasonably difficult for them to respond to the needs of their business stakeholders. It’s non-viable. And it’s unnecessary! Software in this category rarely has a high consumer-to-producer ratio. The cost of a breaking change is facially less than that of, say, the AWS SDK.

                      In my experience this describes something like 80-90% of software produced and maintained in the world. The GNU greps and the AWS SDKs and that class of widely-consumed things are a superminority of software overall. Important! But non-representative. You can’t define protocols and practices and ecosystem expectations for general-purpose programming languages using them as exemplars.

                      Second, because consumers shouldn’t have the expectation that they can make blind upgrades across major versions without code changes on their side. It implies a responsibility that authors have over their consumers which is at best difficult in closed software ecosystems like I described above, and actually literally impossible in open software ecosystems like the OSS space. As an OSS author I simply don’t have any way to know how many people are using my software, it’s potentially infinite; and I simply can’t own any of the risk they incur by using it, it doesn’t scale. I should of course make good faith effort toward making their lives easier, but that can’t be a mandate, or even an expectation, of the ecosystem or its tooling.

                      And, thirdly, because stability as we’ve defined it is simply an unreasonable standard for anything produced by humans, especially humans who aren’t being supported by an organization and remunerated for their efforts. Let me give you an example. Let’s say I come up with a flag parsing library that provides real value versus the status quo, by cleanly supporting flag input from multiple sources: commandline flags, environment variables, and config files. Upon initial release I support a single config file format, and allow users to specify it with an option called ConfigParser. It’s a hit! It delivers real value to the community. Shortly, a PR comes in to support JSON and YAML parsers. This expands the cardinality of my set of parsers from 1 to N, which means that ConfigParser is now insufficiently precise to describe what it’s controlling. If I had included support for multiple config file formats from the start, I would have qualified the option names: PlainConfigParser, JSONConfigParser, YAMLConfigParser, and so on. But now I’m faced with the question: do I avoid a breaking change, leave ConfigParser as it is, and just add JSONConfigParser and YAMLConfigParser in addition? Or do I rename ConfigParser to PlainConfigParser when I add the other two, in order to keep the options symmetric?
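
                      A rough Go sketch of the dilemma, in a functional-options style; every name below is hypothetical, just mirroring the story above:

                      ```go
                      package flagconf // hypothetical flag parsing library

                      import "io"

                      // Parser turns a config file into key/value flag pairs.
                      type Parser func(r io.Reader) (map[string]string, error)

                      // Option configures config-file handling.
                      type Option func(*settings)

                      type settings struct{ parse Parser }

                      func plainParse(r io.Reader) (map[string]string, error) { return nil, nil } // stub
                      func jsonParse(r io.Reader) (map[string]string, error)  { return nil, nil } // stub
                      func yamlParse(r io.Reader) (map[string]string, error)  { return nil, nil } // stub

                      // ConfigParser: named when plain was the only supported format.
                      func ConfigParser() Option { return func(s *settings) { s.parse = plainParse } }

                      // Choice A (non-breaking, asymmetric): bolt qualified siblings
                      // onto the existing name, which silently keeps meaning "plain".
                      func JSONConfigParser() Option { return func(s *settings) { s.parse = jsonParse } }
                      func YAMLConfigParser() Option { return func(s *settings) { s.parse = yamlParse } }

                      // Choice B (symmetric, breaking): rename ConfigParser to
                      // PlainConfigParser next to the two above, and bump the major.
                      func PlainConfigParser() Option { return func(s *settings) { s.parse = plainParse } }
                      ```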

                      My initial decision to call the option ConfigParser was not a design mistake. The software and its capabilities evolved over time, and it’s not reasonable to expect me to predict all possible future capabilities on day one. And the non-breaking change option is not necessarily the right one! If I have 100 users now, but will have 100,000 users in a year’s time, then breaking my current users — while bumping the major version, to be clear! — is the better choice, in order to provide a more coherent API to everyone in the future.

                      Some people would say that this class of software should stay at major version 0 until it reaches a point of stability. This, too, isn’t reasonable. First, because stability is undefinable, and any judgment I make will be subjective and wrong in someone’s eyes. Second, because it strips me of the power to signal breaking vs. non-breaking changes to my users. Semver stipulates the semantics of each of its integers, but nothing beyond that — it doesn’t stipulate a mandatory rate of change, or anything like that. It leaves stability to be defined by the authors of the software. To be clear, more stable software is easier to consume and should be preferred when possible. But it can’t be the be-all end-all metric that software must optimize for.

                      1. 2

                        So let’s get some things out of the way.

                        I don’t care about private spaces, and I don’t care about what people do in them.

                        I also don’t care about de jure enforcement of stability through limitations or tooling or whatever. My interest is specifically in de facto stability expectations.

                        For ease of my writing, let’s define the term interface to mean something which you could reasonably write a standard to describe. This means things like protocols, APIs or programming languages. Let’s define producer to be someone who produces an interface. Let’s define consumer to be someone who uses an interface.

                        I am of the opinion that in the programming world (encompassing both consumers and producers of interfaces) the following expectations should be de facto:

                        • If you are the author of an interface which you publish for users, it is your responsibility to be honest about your stability promises with regards to that interface.

                        • People should expect stability from commonly used interfaces.

                        Just like people enjoy the benefits of the stability of Debian, there are plenty of people who enjoy the benefits of the stability of a language like C. C programs I wrote 5 years ago still compile and run today; C programs I write today will still compile and run in 5 years. I obviously have to follow the standard correctly and ensure my program is written correctly, but if for some reason my program cannot be compiled and cannot be run in 5 years, I have nobody to blame but myself. This is an incredibly useful quality.

                        The fact that I can’t expect the same of Rust completely discounts Rust as a viable choice of software for me.

                        There is no reason why I should have to be excluded from new features of an interface just to avoid breaking changes.

                        Moreover, in the modern world where security of software is an ever-growing concern, telling consumers to choose between interface stability and security is an awful idea.

                        Now with that out of the way, let’s go over your points (I have summarised them; if you think my summary is wrong then we may be talking past each other, so feel free to try to address my misunderstanding).

                        Interface stability is beneficial for existing consumers.

                        Yes.

                        Interface stability is a cost to interface producers.

                        Yes.

                        Interface stability is a cost for new consumers.

                        No.

                        Keeping your interface stable does not prevent you from expanding it in a non-breaking way to improve it. It does not prevent you from then changing the documentation of the interface (or expanding on its standard) to guide new consumers towards taking the improved view of the interface. It does not prevent you from stopping further non-essential development on the old portions of the interface. It does not stop you from creating a new major version of the interface with only the new parts of the interface and only maintaining the old interface insofar as it requires essential fixes.
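
                        Go’s module system, for instance, supports exactly that last strategy through semantic import versioning. A minimal sketch, assuming a hypothetical module example.com/flagconf whose v2/ directory declares module example.com/flagconf/v2 (the API names are made up too):

                        ```go
                        package main // sketch: one codebase consuming two major versions side by side

                        import (
                            flagconf "example.com/flagconf"     // v1: frozen interface, essential fixes only
                            flagconf2 "example.com/flagconf/v2" // v2: the reworked interface
                        )

                        func main() {
                            // Migrated call sites use v2; untouched ones keep using v1.
                            _ = flagconf.ConfigParser()
                            _ = flagconf2.WithConfigParser(flagconf2.PlainParser)
                        }
                        ```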

                        Interface consumers and producers are both interface stakeholders but have different needs.

                        Sure.

                        The cost of a breaking change is not constant.

                        Sure.

                        It is possible that a breaking change delivers more benefit than cost versus avoiding the breaking change.

                        I think this is disputable because of what I said earlier about being able to improve an interface without breaking it.

                        But I can see this being the case when a critical security flaw makes consumers of your interface vulnerable through normal use of the interface. This is a rare corner case, though, and doesn’t really make the whole idea or goal of interface stability any less worthwhile.

                        This can happen if your interface is not used by many consumers.

                        Yes, you obviously should feel less obliged to keep things stable in this case, but you should still be obliged to make consumers well aware of just how stable your interface is.

                        This can happen if consumers do not have high stability expectations.

                        Which you can ensure by making them well aware of what stability expectations they should have.

                        But I would argue that in general, consumers should be allowed and encouraged to have high stability expectations.

                        especially across major versions, without concomitant code changes on their side

                        I don’t see how this is relevant given that you already defined stability a certain way.

                        Of course, if your interface follows semver and you increment the major version, then you’ve made your stability expectations clear, and if the consumers do not understand this then they can only blame themselves.

                        The majority of interfaces belong in a category which excludes them from high stability expectations.

                        Yes, as interfaces follow a Pareto distribution, very few of them have a significant number of consumers where breaking changes will have a significant impact.

                        The majority of interfaces are private

                        Actually this is irrelevant, you can just categorise this as “tiny userbase”.

                        That being said, there’s two sides to this coin.

                        Google Mail’s internal APIs may have a “tiny userbase” (although Google is a big company, so maybe not); Google Mail’s user-facing interface, on the other hand, has a massive userbase.

                        If you force producers of those interfaces to guarantee stability as defined above then you make it unreasonably difficult to respond to the needs of their business stakeholders.

                        Nobody is forcing or suggesting forcing anyone to do anything.

                        You can’t use interfaces with enormous amounts of consumers as baselines for how people should handle interface stability of their interfaces with fewer users.

                        I’d say this is irrelevant.

                        Your hello world script doesn’t need interface stability. That is one extreme. The C standard needs extreme stability; that is the other extreme. If you consider the majority of the things I am actually thinking of when we talk about “things being horribly unstable”, then they definitely don’t fall one percentage point away from hello world. I’m talking about things like Rust, Nim, major libraries within those languages, etc.

                        Those things have significant numbers of consumers, yet their interfaces seem to come with the same stability expectations as the hello world program.

                        Yes, stability is a scale, it’s not binary, but the current situation is that a lot of people are treating it as completely optional in situations where it certainly shouldn’t be. I don’t think it’s an unreasonable expectation to expect more stability from some of these projects.

                        Because consumers shouldn’t have high interface stability expectations across major versions, this implies producers have a responsibility which is at best difficult in closed software ecosystems as described above.

                        I’m not actually sure how to interpret this but it seems to talk about things irrelevant to the point. Feel free to clarify.

                        As the producer of an open source interface I don’t have any way to know how many consumers I have and therefore I cannot own any risk incurred by the consumers consuming my interface.

                        Nobody is asking anyone to own any risk (especially in the open source ABSOLUTELY NO WARRANTY case). I am simply saying that if your project is popular and you break stability expectations when you have made it clear that people should have them, you should expect people to distrust your interface, stop using it and potentially distrust future interfaces you produce. I think this is only fair, and really the most I can ask of people.

                        Moreover, you do have many ways of estimating how many people are using your software. If enough people are using your software for interface stability to matter then you will know about it (or you’re some expert hermit who manages to successfully maintain a key project while never knowing anything about the outside world).

                        Stability as defined above is an unreasonable and unreachable standard for human producers of interfaces.

                        No, I don’t think it is; or rather, I think you have produced a false dichotomy. Correct me if I’m wrong, but you seem to think that because perfect stability is impossible, imperfect stability is no longer worthwhile. I think this is just a poor way of looking at things; imperfect stability is achievable to a high level and is worthwhile in many cases.

                        … especially humans who aren’t being supported by an organization and remunerated for their efforts.

                        Let’s put it this way: just because you decided to volunteer to do something doesn’t mean you have no responsibility. You take on responsibility by creating something which people might come to rely on and use. If you do not wish to take on this responsibility, then it is at the very least your responsibility to make THAT clear.

                        I would say it is childish to suggest that just because you are doing something for free that therefore you have no responsibility for the consequences of what happens when people come to rely on that free activity. There are a million and one real world examples of how this does not pan out.

                        Let me give you an example. Let’s say I come up with a flag parsing library that provides real value versus the status quo, by cleanly supporting flag input from multiple sources: commandline flags, environment variables, and config files. Upon initial release I support a single config file format, and allow users to specify it with an option called ConfigParser. It’s a hit! It delivers real value to the community. Shortly, a PR comes in to support JSON and YAML parsers. This expands the cardinality of my set of parsers from 1 to N, which means that ConfigParser is now insufficiently precise to describe what it’s controlling. If I had included support for multiple config file formats from the start, I would have qualified the option names: PlainConfigParser, JSONConfigParser, YAMLConfigParser, and so on. But now I’m faced with the question: do I avoid a breaking change, leave ConfigParser as it is, and just add JSONConfigParser and YAMLConfigParser in addition? Or do I rename ConfigParser to PlainConfigParser when I add the other two, in order to keep the options symmetric?

                        I would say that it’s actually worthwhile to avoid breaking API here. If your bar for breaking API is this low then you’re going to be breaking API every day of the week. Moreover, you won’t learn anything about API design from your project.

                        There is real value in living with your mistakes for a while and letting things like this accumulate for a while. Simply accepting every breaking change as it comes is going to teach you less about which parts of your API are actually good or not than waiting for a good point to create config_parser_of_the_snake_case_variety a few years down the road with all the things you’ve learned incorporated. The end result will be a far better design than whatever you could cobble together with a weekly breaking change.

                        You’re also ignoring the possibility of just making a new API (also, surely the correct design is not a name per format, but rather a single option that takes the parser as an argument) and wiring the old API to just forward to the new one. You can begin to think about future-proofing your API while keeping the old one around (to everyone’s great benefit).
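
                        A Go sketch of that wiring, reusing the hypothetical Parser/Option/settings types from the flagconf sketch further up the thread:

                        ```go
                        // (continuing the flagconf sketch above, which imports "io")

                        // PlainParser parses the plain key=value format (stub).
                        var PlainParser Parser = func(r io.Reader) (map[string]string, error) { return nil, nil }

                        // WithConfigParser is the new, format-agnostic API: the parser
                        // travels as an argument, so supporting a new format never
                        // changes the option's name.
                        func WithConfigParser(p Parser) Option {
                            return func(s *settings) { s.parse = p }
                        }

                        // ConfigParser is the old v1 name, wired to forward to the
                        // new API so existing callers are untouched.
                        //
                        // Deprecated: use WithConfigParser(PlainParser) instead.
                        func ConfigParser() Option {
                            return WithConfigParser(PlainParser)
                        }
                        ```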

                        Finally, if you’ve just written a library, why is it 1.0.0 already? Why would you do that to yourself? Spend a few months at least with it in a pre-1.0.0 state. Stabilize it once you iron out all the issues.

                        My initial decision to call the option ConfigParser was not a design mistake.

                        Of course not, the mistake was pretending that JSON and YAML are config formats. (This is a joke.)

                        The software and its capabilities evolved over time, and it’s not reasonable to expect me to predict all possible future capabilities on day one.

                        Of course not. But I think you’re being unreasonably narrow-minded with regards to possible options for how to evolve a project while keeping the interface stable.

                        And the non-breaking change option is not necessarily the right one! If I have 100 users now, but will have 100,000 users in a year’s time, then breaking my current users — while bumping the major version, to be clear! — is the better choice, in order to provide a more coherent API to everyone in the future.

                        It’s also extremely unlikely that you will be able to predict the requirements of your library one year in the future. Which is possibly why your library shouldn’t be a library yet, and should wait through a few years of use cases before it solidifies into one.

                        There are too many tiny libraries out there which do half a thing poorly; at least part of this stability discussion should be around whether some interfaces should exist to begin with.

                        Some say this class of interface should stay at major version 0.

                        Yes, I think this is synonymous with my position that it shouldn’t be a library to begin with.

                        This is unreasonable because stability is undefinable.

                        No, you already defined it a couple of times in a way I would agree with. A stable interface is an interface where a consumer should not expect things to break if they update the software to the latest version (of that particular major version).

                        Any stability judgement I make will be questioned by someone.

                        Why leave the house then? You might die. This comes back again to the all-or-nothing mentality I mentioned earlier where you seem to suggest that just because perfect stability is impossible, imperfect stability is not worthwhile.

                        Second, because it strips me of the power to signal breaking vs. non-breaking changes to my users.

                        It doesn’t do that. You have literally all the tools at your disposal including your changelog and documentation. Moreover, if your project is at 0.x.y then you are completely free to tell people that if x increases then the interface will have changed significantly and if y increases then the interface probably hasn’t changed. Semver does not specify how x and y are to be interpreted for 0.x.y, only for M.x.y where M > 0.

                        Semver explains what the numbers mean but nothing else. It does not specify the rate of change. It leaves stability to be defined by the authors of the software.

                        Yes, these are true statements.

                        Stability should not be the most important goal for all software.

                        Again, a true statement. My point is and always has been that the issue is not that all interfaces aren’t perfectly stable, it’s that there’s a lot of interfaces which are far less stable than they reasonably should be and that normalising this means that software is going to get a lot more confusing, a lot more insecure and a lot more broken as time goes on.

                        Overall I think we’re in agreement on most parts, except for how easy stability is to achieve and how important it is, relatively speaking.

                        1. 1

                          There is real value in living with your mistakes for a while and letting things like [ConfigParserX] accumulate for a while. Simply accepting every breaking change as it comes is going to teach you less about which parts of your API are actually good or not than waiting for a good point to create config_parser_of_the_snake_case_variety a few years down the road with all the things you’ve learned incorporated. The end result will be a far better design than whatever you could cobble together with a weekly breaking change.

                          I don’t know how better to say this: the decisions in this example are not mistakes. They are made by a human being at a particular point in the evolution of a software project.

                          Finally, if you’ve just written a library, why is it 1.0.0 already? Why would you do that to yourself? Spend a few months at least with it in a pre-1.0.0 state. Stabilize it once you iron out all the issues.

                          1.x.y in the semver nomenclature represents a notion of stability which I as the author get to define. It isn’t a universal notion of stability, it’s nothing more than what I define it to be. There is no single or objective denomination of “when I iron out all of the issues” which I can meet in order to qualify a 1.x.y version identifier. It’s all subjective!

                          just because you decided to volunteer to do something doesn’t mean you have no responsibility. You take on responsibility by creating something which people might come to rely on and use. If you do not wish to take on this responsibility, then it is at the very least your responsibility to make THAT clear.

                          Semver provides all of the mechanics I need to signal what you demand. If I make a breaking change, I increment the major version number. That’s it! There’s nothing more to it.

                          1. 2

                            I don’t know how better to say this: the decisions in this example are not mistakes. They are made by a human being at a particular point in the evolution of a software project.

                            If it was reasonable to assume that the API would need that feature eventually then yes it’s a mistake. If it was not reasonable to assume that the API would need that feature (I would err on this side, since neither YAML nor JSON are actually meant for configuration), then there are two possibilities: (a) the design change is unnecessary and belongs in an unrelated project, or (b) the design change should be made with no need to break the existing interface.

                            1.x.y in the semver nomenclature represents a notion of stability which I as the author get to define. It isn’t a universal notion of stability, it’s nothing more than what I define it to be. There is no single or objective denomination of “when I iron out all of the issues” which I can meet in order to qualify a 1.x.y version identifier. It’s all subjective!

                            The fact that it’s subjective and you can’t with perfect accuracy determine when it’s appropriate to make a 1.x.y release does not mean that lots of projects haven’t already made good-enough guesses on when this is appropriate and does not mean that a good enough guess is not an adequate substitute for perfect. Once again, you seem to suggest that just because it’s impossible to do something perfectly that it’s not worth doing it at all.

                            Semver provides all of the mechanics I need to signal what you demand. If I make a breaking change, I increment the major version number. That’s it! There’s nothing more to it.

                            Yeah, so do that. I don’t get where the disagreement lies here.

                            *As a completely unrelated side note, the fact that firefox binds ^W to close tab is a hideous design choice.

                            1. 1

                              you seem to suggest that just because it’s impossible to do something perfectly that it’s not worth doing it at all.

                              I am 100% for releasing 1.x.y versions of software projects. What I’m saying is merely that a subsequent release of 2.z.a, and afterwards 3.b.c, and then 4.d.e, is normal and good and not indicative of failure in any sense.

                              If it was reasonable to assume that the API would need that feature eventually then yes it’s a mistake.

                              It is incoherent to claim that this represents a mistake. If you can’t agree to that then there’s no possibility of progress in this conversation; if this is a mistake then you’re mandating perfect omniscient prescience in API design, which is plainly impossible.

                              1. 1

                                I am 100% for releasing 1.x.y versions of software projects. What I’m saying is merely that a subsequent release of 2.z.a, and afterwards 3.b.c, and then 4.d.e, is normal and good and not indicative of failure in any sense.

                                I would say that any interface which has need to change that often is too vaguely defined to be a real interface. Fundamental to the idea of an interface is being able to re-use it in multiple places, and the fundamental benefit of re-using something in multiple places is that adjustments to that thing benefit all users (otherwise why not just copy-paste the code). As such, if your interface changes constantly, it does not provide any benefit as an interface and should not be one. Put another way, if every time I try to update your interface I have to change my code, what benefit is there at all in me even calling it a proper interface and not just maintaining my own version (either by periodically merging in new changes from the upstream version or just cherry-picking new features I like)?

                                It is incoherent to claim that this represents a mistake.

                                How so? Are you suggesting it is impossible to have design oversights?

                                If you can’t agree to that then there’s no possibility of progress in this conversation; if this is a mistake then you’re mandating perfect omniscient prescience in API design, which is plainly impossible.

                                Just because foresight is impossible, doesn’t mean you’ve not made a mistake. If foresight is impossible, maybe the mistake is attempting to foresee in the first place? In either case, I don’t understand why you think my stance on whether design mistakes are a real thing or not matters in a discussion about whether it’s appropriate for interfaces to be extremely unstable.

                                Moreover, given the vagueness of your example config parser case, I never actually said if the scenario you presented constituted a design mistake, I merely outlined a set of possibilities, one of which was that there was a design mistake.

                                So far (when it comes to your config parser example), I think that you’re being quite narrow-minded in terms of possible approaches to non-breaking interface changes which would avoid the problem entirely without sacrifices. I also think that the example you provided is extremely contrived and the initial design sounds like a bad idea to begin with (although, once again, I don’t think that’s why you believe a breaking change is needed; I think you believe one is needed only because you haven’t explored all the ways a non-breaking change could be implemented, but the example lacks a lot of detail, so it’s difficult to understand your reasoning).

                                Maybe if you picked a more realistic design which didn’t have obvious problems to begin with and demonstrated a sensible design change you think would necessitate a breaking change (to avoid compromising on API quality) we could discuss how I would solve that and you can point out the disadvantages, according to you, of my approach to solving the breakage?

                                1. 1

                                  if every time I try to update your interface I have to change my code, what benefit is there at all in me even calling it a proper interface

                                  The benefit that my library provides to you as a consumer is mostly about the capabilities it offers for you, and only a little bit about the stability of its API over time. Updating the version of my library which you use is also an opt-in decision that you as a consumer make for yourself.

                                  I also think that the example you provided is extremely contrived

                                  It is adapted directly from a decision I had to make in my actual flag parsing library and chosen because I feel it is particularly exemplary of the kinds of design evolution I experience every day as a software author.

                                  you can point out the disadvantages, according to you, of my approach to solving the breakage?

                                  I don’t think we need an example. My position is that as long as you abide by semver and bump the major version there is almost no net disadvantage to breaking changes for the vast majority of software produced today. And that avoiding breaking changes is a cost to coherence that rarely makes sense, except for a superminority of software modules. I have explained this as well as I think I can in a previous comment so if that’s not convincing then I think we can agree to disagree.

                                  1. 1

                                    The benefit that my library provides to you as a consumer is mostly about the capabilities it offers for you, and only a little bit about the stability of its API over time. Updating the version of my library which you use is also an opt-in decision that you as a consumer make for yourself.

                                    You can call it a library or an interface all day long; I dispute you calling it that more than I dispute its usefulness. It might function and be useful, but it does not function as an interface. It is really not an interface if it does not end up shared among components, and if it is not shared among components then there is obviously no real problem (at least in terms of stability, since there’s an enormous number of problems in terms of general code complexity, over-abstraction, security issues and auditability issues).

                                    It is adapted directly from a decision I had to make in my actual flag parsing library and chosen because I feel it is particularly exemplary of the kinds of design evolution I experience every day as a software author.

                                    Can you point me to the breaking change?

                                    My position is that as long as you abide by semver and bump the major version there is almost no net disadvantage to breaking changes for the vast majority of software produced today.

                                    And like I already explained, talking about “most software” is pointless since almost nobody uses it.

                                    The discussion is about the small fraction of software which people actually use, where there is a clear benefit to avoiding random breaking changes.

                                    And that avoiding breaking changes is a cost to coherence that rarely makes sense, except for a superminority of software modules.

                                    A superminority which anything worth talking about is already part of.

                                    I have explained this as well as I think I can in a previous comment so if that’s not convincing then I think we can agree to disagree.

                                    There is nothing left to explain, it is not an issue of explanation, you are simply making a bunch of assertions including the assertion that, given an interface which people actually use, most normal design changes which change the interface are better* than design changes which preserve the interface. You have yet to provide any evidence of this (although hopefully if you send me the commit for your project we will finally have something concrete to discuss).

                                    *better being defined as “the disadvantages of breaking the interface do not outweigh the disadvantages of the clarity lost by keeping and expanding the interface”

                                    1. 1

                                      I don’t care about “interfaces” — that’s terminology which you introduced. We’re talking about software modules, or packages, or libraries, with versioned APIs that are subject to the rules of semver.

                                      And like I already explained, talking about “most software” is pointless since almost nobody uses it . . . The discussion is about the small fraction of software which people actually use

                                      The discussion I’m having is about what software authors should be doing. So I’m concerned with the body of software produced irrespective of how many consumers a given piece of software has. If you’re only concerned with software that has enormously more consumers than producers then we are having entirely different discussions. I acknowledge this body of software exists but I call it almost irrelevant in the context of what programming language ecosystems (and their tools) should be concerned with.

                                      1. 1

                                        I don’t care about “interfaces” — that’s terminology which you introduced. We’re talking about software modules, or packages, or libraries, with versioned APIs that are subject to the rules of semver.

                                        If you had an issue with my definition of interface then you should have raised it a bit earlier. That being said, I don’t think that you have a problem with the definition (since it seems completely compatible) but maybe the issue lies somewhere else.

                                        The discussion I’m having is about what software authors should be doing. So I’m concerned with the body of software produced irrespective of how many consumers a given piece of software has. If you’re only concerned with software that has enormously more consumers than producers then we are having entirely different discussions. I acknowledge this body of software exists but I call it almost irrelevant in the context of what programming language ecosystems (and their tools) should be concerned with.

                                        Programming ecosystems are interfaces with large numbers of users, their package handling tools exist to solely serve the packages which have a significant number of users. Making package handling tools work for the insignificant packages is a complete waste of time and benefits basically nobody.

                                        Your package, like it or not, belongs in the group of packages I’m talking about.

                                        Discussions about semver and interfaces and stability and versioning are obviously going to be completely irrelevant if nobody or almost nobody uses your package. More importantly, it’s very unproductive to use this majority of unused or barely used packages to inform design decisions of software tooling or to inform people how they should go about maintaining packages people actually use.

                                        1. 1

                                          [ecosystem] package handling tools exist to solely serve the packages which have a significant number of users

                                          No, they exist to serve the needs of the entire ecosystem.

                                          Your package, like it or not, [is insignificant]

                                          I guess we’re done here.

                                          1. 1

                                            No, they exist to serve the needs of the entire ecosystem.

                                            Focusing on the needs of wannabe libraries which nobody uses is nonsensical. Ecosystems focus on the needs of packages which people use and the people who use those packages.

                                            Your package, like it or not, [is insignificant]

                                            I guess we’re done here.

                                            Did you intentionally misrepresent me because you were bored of the discussion? Because I literally said the opposite.

                                            1. 1

                                              I apologize if you were saying that my package represents something with a large number of users. I parsed your sentence as the opposite meaning.

                                              Nevertheless, we’re clearly at an impasse. Your position is that

                                              package handling tools exist to solely serve the packages which have a significant number of users

                                              which is basically the antipode of my considered belief, and it doesn’t appear like you’re open to reconsideration. So I’m not sure there’s much point to continuing this already very long thread :)

                                              edit: I can maybe speculate at the underlying issue here, which is that you seem to be measuring relevant software by the number of consumers it has, whereas I’m measuring relevant software by its mere existence. So for you maybe an ecosystem with 1000 packages with 2 consumers each and 2 packages with 1000 consumers each is actually just 2 packages big, so to speak, and those 2 packages’ needs dictate the requirements of the tool; and for me it’s 1002 packages big and weighted accordingly. Does this make sense?

                                              1. 1

                                                I get your point, I don’t think there’s any misunderstanding there. I just can’t understand why, when it comes to tools which are designed to facilitate interoperation, you care about the packages for which (because almost nobody uses them) interoperation is not as important as it is for the packages which lots of people use. I think the other problem is maybe that you’ve got a skewed idea of the relationship between package producers and users. I would say that it’s extremely likely (and I don’t have numbers to back this up, but there’s no real reason why this doesn’t follow the Pareto distribution) that 20% of packages constitute 80% of package usage. These tools are designed to facilitate package use above package creation (since far fewer people will be creating packages rather than using them), therefore focusing on the 20% of packages which constitute 80% of the usage would surely make sense?

                                                1. 1

                                                  far fewer people will be creating packages rather than using them

                                                  Just to make this explicit, I’m speaking from a context which includes open-source software (maybe 10-20% of all software produced in the world) and closed-source software written and maintained at market-driven organizations (maybe 80% of all software).

                                                  In this context it’s my experience that essentially every programmer is both a consumer and producer of packages. Even if a package has only a single consumer it is still a member of the package ecosystem, and the needs of its (singular) author and consumer are relevant! So I don’t agree that far fewer people are creating packages than consuming them. In fact I struggle to summon a single example of someone who consumes but doesn’t produce packages.

                                                  20% of packages constitute 80% of package usage

                                                  I agree that the packages-vs-consumers curve is pretty exponential. I agree that tooling can and should support the needs of those relatively few packages with vastly more consumers than producers. But I don’t think it’s as extreme as you’re describing here. I think the “long tail” of packages consumed by a single-digit number of consumers represents 80% of overall package consumption: 80% of the area under the curve.

                          2. 1

                            I want to revisit two things here…

                            I don’t care about private spaces.

                          That’s fine! If you’re not interested in this category of software personally then that’s totally groovy. But “private spaces” hold the overwhelming majority of software produced and consumed in the world. Package management tooling for a general-purpose programming language which doesn’t prioritize the needs of these users doesn’t serve its purpose.

                            if your project is popular and you break stability expectations

                            When you say “stability expectations” do you mean a consumer’s expectation that I won’t violate the rules of semver? Or an expectation that I won’t hardly ever increment my major version number? Or something else?

                            As an example, go-github is currently on major version 41 — is this a problem?

            2. 2

              Bollocks. A minor breaking change is “we reversed the order of parameters on this method to match the other methods”. A major breaking change is “switched from callbacks to promises”.

              1. 2

                A switch from callbacks to promises that comes without a compatibility interface should be reflected in the library name, not just the version. It’s not even the same library anymore if literally every line of code that is using the old version must be rewritten to use it again.

                1. 1

                  I agree, but we’re in the minority, despite Rich Hickey’s efforts.

            3. 8

              One of the weird outcomes that happens with semver is that the projects with stratospherically high major version numbers are frequently the least hassle to blindly upgrade. The fact that the developers are bumping the major version so often is a result of the fact that they are paying close attention to backwards compatibility, I suspect.

              1. 6

                https://pvp.haskell.org/ uses two major version components and predates even semver. It works pretty well.

                1. 5

                  A lot of this seems to rely on the premise:

                  Net effect of these issues: most people don’t actually use SemVer consistently[1]

                  …but I don’t think that’s been true in my experience in npm-land; any project that claims to follow semver but intentionally publishes breaking changes outside of a major version would not have many users.

                  [1] If people were using SemVer as per the spec, you’d be seeing a lot more code at version 37.y.z.

                  I do see a fair amount of semver projects at the double-digit major versions. Also, it’s common to try to bundle releasing breaking changes, with the idea being that it’s easier for users to consume several breaking changes all at once, instead of having to digest a major version for every breaking change.

                  Semver is pretty sweet: when used correctly, you have a good idea of what is safe to upgrade and when you really should go read the changelog, just from the version number. Plus, it discourages the useless sentimentality about version numbers, something users oughtta be trained out of. “Break Versioning” solves a problem that I haven’t really encountered.

                  1. 6

                    One thing I don’t like about the way people use semver is the number of projects that stay at 0.x even when they have thousands of users and compatibility is obviously a major concern.

                    I wish such people took responsibility for their API and started versioning it correctly. But it’s not a semver problem, and the proposal doesn’t really solve it: those people are intentionally not putting effort into marking breaking changes.

                  2. 3

                    Another reason to reconsider semver that was not mentioned explicitly: the more effort a developer puts into not making breaking changes, the lower their major version number will be, if they follow the convention strictly. I suspect that this is partially the reason that some of the highest quality software sticks to a 0.x version indefinitely. However, especially for end-user software, a lower version number is often perceived by the market as less advanced or less developed than a higher version number.

                    I’m a big fan of decoupling breakage from features in versioning schemes, or in other words decoupling the indication of a bump in commercial value from technical compatibility in some way. I’m not sure that the proposed solution covers that enough though, I’d be very interested in alternative ideas.

                    1. 3

                      Another reason to reconsider semver that was not mentioned explicitly: the more effort a developer puts into not making breaking changes, the lower their major version number will be

                      I guess the problem here is that, as a potential user, if you see a low version number there is no way to tell at a glance whether it’s low because the author is very careful and serious about not making breaking changes, or because they’re careless and make breaking changes all the time but don’t identify them correctly.

                    2. 2

                      My big problem with SemVer is that it doesn’t allow for graceful deprecation. SemVer works well as a versioning scheme for interfaces, but it’s used as a versioning scheme for implementations. If a library tries to maintain good compatibility, then each version will support multiple different interfaces and give consumers time to move to the new ones before they are removed, but SemVer can express only breaking changes relative to the previous version, inductively.

                      If you have three versions of a library, A, B, and C, where B deprecates some functions and adds replacements and C removes them, SemVer says they should be 1.0, 1.1, and 2.0, but if you’ve moved to the new interfaces in B then C isn’t a breaking change for you. Practically, you have A with interface 1.0, B with interfaces 1.0 and 2.0, and C with interface 2.0 only, but there’s no good way of mapping that to a single SemVer for the library.
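
                      For concreteness, a Go sketch of version B under this scheme (all names hypothetical): both interfaces ship side by side, so a consumer who migrates while on B experiences the jump to C as a non-event, even though SemVer must label C as 2.0.

                      ```go
                      package mylib // hypothetical library at version B (SemVer 1.1)

                      type Options struct{ Addr string }
                      type Client struct{}

                      // Connect is the interface-2.0 entry point, added in version B.
                      func Connect(opts Options) (*Client, error) {
                          return &Client{}, nil
                      }

                      // Dial is the interface-1.0 entry point, kept alive in B so
                      // consumers can migrate on their own schedule.
                      //
                      // Deprecated: use Connect. Dial is removed in version C, which
                      // SemVer forces to be 2.0 even for consumers who already
                      // migrated while on B.
                      func Dial(addr string) (*Client, error) {
                          return Connect(Options{Addr: addr})
                      }
                      ```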