1. 1

    I did one of these in Python for a collection of amusing solutions to phone-screen problems.

    Others in that repo include a Fibonacci generator with no integer literals and no arithmetic operators, and an is_square() that’s only one line (and that I finally got around to updating today because of this post; I’ve had people pose it with “0 is a square” and with “0 is not a square”, and decided the published version should correctly assert that 0 is a square).

    Here it is in all its glory:

    from itertools import accumulate, count, takewhile

    # The running sums of the odds are 1, 4, 9, 16, …; the sum of the first k
    # odd numbers is k**2, so a positive n is a square iff it appears there.
    is_square = lambda n: n == 0 or n > 0 and n in takewhile(lambda x: x <= n, accumulate(filter(lambda k: k & 1, count())))
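
    A quick sanity check (my addition, not from the repo):

    assert is_square(0) and is_square(1) and is_square(144)
    assert not any(map(is_square, (-4, 2, 143, 145)))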
    
    1. 46

      I’m so tired of rehashing this. Pointing out that SemVer is not a 100% infallible guarantee, or that major versions don’t always cause major breakage, adds nothing new.

      Lots of projects have a changelog file where they document major changes, but nobody argues that reading changelogs would hurt you because they may not list every tiny change, or because they might mention changes that discourage people from upgrading, leaving them on insecure versions forever, etc.

      SemVer is just a machine-readable version of documentation of breaking changes.
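
      For instance, that machine-readability is exactly what version specifiers in a requirements file consume; the package name and bounds below are just an illustration:

      cryptography>=3.3,<4  # accept bugfix/minor releases, refuse the next major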

      1. 23

        Yes, and the article tries to succinctly sum up what value can be derived from that and what fallacies await. I’d be lying if I claimed to have ever seen it summed up through that lens in one place.

        I’m sorry it’s too derivative for your taste, but when the cryptography fire was raging, I was wishing for that article to exist so I could just paste it instead of writing extensive elaborations in the comments section.

        1. 11

          I thought the same thing initially, but it could also be coming from the perspective of using Rust frequently, which is strongly and statically typed. (I don’t actually know how frequently you use it; just an assumption.)

          A static/strong type system gives programmers a nice boundary for enforcing SemVer. You mostly just have to look at function signatures and make sure your project still builds. That’s the basic promise of the type system. If it builds, you’re likely using it as intended.

          As the author said, with something like Python, the boundary is fuzzier. Imagine you write a function in Python intended to work on lists, and somebody passes in a NumPy array. There’s a good chance it will work. Until one day you decide to add a little extra functionality that still works on lists, but unintentionally (and silently) breaks the function for arrays.
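
          A hedged sketch of that failure mode (the names and the “harmless” tweak are invented for illustration):

          import numpy as np

          def with_sentinel(values):
              """Return the values with a trailing -1 sentinel appended."""
              out = list(values)
              out.append(-1)
              return out  # works for lists *and* numpy arrays

          def with_sentinel_v2(values):
              # "Harmless" simplification in a later release; still fine for lists...
              return values + [-1]

          with_sentinel_v2([1, 2, 3])            # [1, 2, 3, -1]
          with_sentinel_v2(np.array([1, 2, 3]))  # array([0, 1, 2]): -1 is broadcast
                                                 # and added elementwise, not appended!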

          That’s a super normal Python problem to have. And it would break SemVer. And it probably happens all the time (though I don’t know this).

          So maybe for weakly/dynamically typed languages, SemVer could do more harm than good if it really is unintentionally broken frequently.

          1. 8

            That’s all very true!

            Additionally, what I’m trying to convey (not very successfully, it seems) is that relying on that property is bad – even in Rust! Any release can break your code just by introducing a bug, no matter what the version number says. Thus you have to treat all versions as potentially breaking. Given the discussions around pyca/cryptography, this is clearly not common knowledge.

            The fact that this is much more common in dynamic languages as you’ve outlined is just the topping.

            I really don’t know what I’ve done wrong to warrant that OP comment + upvotes except probably hitting some sore point/over-satiation with these topics in the cryptography fallout. That’s a bummer but I guess nothing I can do about it. 🧘

            1. 7

              Car analogy time: You should treat cars as dangerous all the time. You can’t rely on seatbelts and airbags to save you. Should cars get rid of seatbelts?

              The fact that SemVer isn’t 100% right all the time is not a reason for switching to YOLO versioning.

              1. 3

                Except that SemVer is not a seatbelt, but – as I try to explain in the post – a sign saying “drive carefully”. It’s a valuable thing to be told, but you still have to take further measures to ensure safety and plan for the case when there’s a sign saying “drive recklessly”. That’s all that post is saying and nothing more.

                1. 2

                  Seatbelts reduce the chance of death. Reading a changelog reduces the chance of a bad patch. Trusting semver does not reduce the chance of an incompatible break.

                  1. 6

                    I really don’t get why there’s so much resistance to documenting known-breaking changes.

                    1. 3

                      I really don’t get why there’s so much resistance to documenting known-breaking changes.

                      I mean you could just…like…read the article instead of guessing what’s inside. Since the beginning you’ve been pretending the article says what it absolutely doesn’t, killing one straw man after another, causing people to skip reading because they think it’s another screech of the same old.

                      I’m trying really hard not to attribute any bad faith to it, but it’s getting harder and harder, so I’m giving up.

                      Don’t bother responding, I’m done with you. Have a good life.

                      1. -1

                        mean you could just…like…read the article instead

                        So where in that article do you say why people don’t want to document known breaking changes?

                        Offtopic: That was really hard to read. Too much bold text and too many quotes with some links in between. It just destroyed my reading flow.

                        I also think the title “will not save you” is obviously telling everything about why people are just not reading it. It starts right off with a big “it doesn’t work”, so why should I expect the article to be in favor of it?

                        1. 4

                          So where in that article do you say why people don’t want to document known breaking changes?

                          Well, the pyca/cryptography team documented that they were rewriting in Rust far in advance of actually shipping it, and initially shipped it as optional. People who relied on the package, including distro package maintainers, just flat-out ignored it right up until it broke their builds because they weren’t set up to handle the Rust part.

                          So there’s no need for anyone else to cover that with respect to the cryptography fight. The change was documented and communicated, and the people who later decided to throw a fit over it were just flat-out not paying attention.

                          And nothing in SemVer would require incrementing major for the Rust rewrite, because it didn’t change the public API of the module. Which the article does point out:

                          Funny enough, a change in the build system that doesn’t affect the public interface wouldn’t warrant a major bump in SemVer – particularly if it breaks platforms that were never supported by the authors – but let’s leave that aside.

                          Hopefully the above, which contains three paragraphs written by me, and only two short quotes, was not too awful for you to read.

                          1.  

                            Thanks, your summary makes a good point, and yes, the original blog post was hard to read; I did not intend this to be a troll.

                            And nothing in SemVer would require incrementing major for the Rust rewrite

                            Technically yes; practically, I know that many Rust crates do not increment the minimum required Rust compiler version until a major version. So fair enough, SemVer at its core isn’t enough.

                2. 3

                  AFAIU, I think the OP comment may be trying to say that they agree with and in fact embrace the following sentence from your article:

                  Because that’s all SemVer is: a TL;DR of the changelog.

                  In particular, as far as I can remember, trying to find and browse a changelog was basically the only sensible thing one could do when upgrading a dependency before SemVer became popular (plus keeping fingers crossed and running the tests). The main time waster was trying to even locate and make sense of the changelog, with basically every project keeping it somewhere different, if anywhere at all. (Actually, I seem to remember that finding any kind of changelog at all was already a big plus for a project’s impression of quality.) As such, having a hugely popular semi-standard convention for a tl;dr of the changelog is something I believe many people find super valuable. They know enough to never fully trust it, just as they know to never fully trust a changelog. Having enough experience with changelogs and/or SemVer, they do however now see substantial value in SemVer as a huge time saver, especially compared to what they had to do before.

                  Interestingly, there’s a bot called “dependabot” on GitHub. I’ve seen it used by a team, and what it does is track version changes in dependencies and generate a summary changelog of commits since the last version. Which seems to more or less support what I wrote above, IMO.

                  (Please note that personally I still found your article super interesting, and nicely naming some phenomena that I only vaguely felt before. Including the one I expressed in this post.)

                  1. 2

                    I think there is something a bit wrong about the blanket statement that others shouldn’t rely on semver. I suspect that for many projects, trying one’s best to use the API as envisioned by the author, and relying on semver, will in practice provide you with bugfixes and performance improvements for free, while never causing any major problems.

                    I like the parts of this blog post that are pointing out the problems here, but I think it goes way too far in saying that I “need to” follow your prescribed steps. Some of my projects are done for my own enjoyment and offered for free, and it really rubs me the wrong way when anyone tells me how I “should” do them.

                    [edited to add: I didn’t upvote the top level comment, but I did feel frustrated by reading your post]

                    1. 1

                      I’m not sure how to respond to that. The premise of the article is that people are making demands, claiming it will have a certain effect. My clearly stated goal is to dissect those claims so that people stop making those demands. Your use case is obviously very different, so I have no interest in telling you to do anything. Why am I frustrating you, and how could I have avoided it?

                      1. 3

                        My negative reaction was mostly to the section “Taking Responsibility”, which felt to me like it veered a bit into moralizing (especially the sentence “In practice that means that you need to be pro-active, regardless of the version schemes of your dependencies:”). On rereading it more carefully/charitably, I don’t think you intended to say that everyone must do it this way regardless of the tradeoffs, but that is how I read it the first time through.

                  2. 9

                    Type systems simply don’t do this. Here’s a list of examples where Haskell’s type system fails and I’m sure that you can produce a similar list for Rust.

                    By using words like “likely” and “mostly”, you are sketching a sort of pragmatic argument, where type systems work well enough to substitute for informal measures, like semantic versioning, that we might rely on the type system entirely. However, type systems are formal objects and cannot admit such fuzzy properties as “it mostly works” without clarification. Further, we usually expect type-checking algorithms to not be heuristics; we expect them to always work, and for any caveats to be enumerated as explicit preconditions.

                    1. 2

                      Also, there were crate releases where a breaking change wasn’t caught because no tests verified that FooBar stayed Sync/Send.

                      1. 1

                        All I meant is that languages with strong type systems make it easier to correctly enforce semver than languages without them. It’s all a matter of degree. I’m not saying that languages like Rust and Haskell can guarantee semver correctness.

                        But the type system does make it easier to stay compliant because the public API of a library falls under the consideration of semver, and a large part of a public API is the types it can accept and the type it returns.

                        I’m definitely not claiming that type systems prevent all bugs and that we can “rely entirely on the type system”. I’m also not claiming that type systems can even guarantee that we’re using a public API as intended.

                        But they can at least make sure we’re passing the right types, which is a major source of bugs in dynamically typed languages. And those bugs are a prominent example of why OP argues that SemVer doesn’t work—accidental changes in the public API due to accepting subtly different types.

                  1. 2

                    You want to claim that version 3.2 is compatible with version 3.1 somehow, but how do you know that? You know the software basically “works” because of your unit tests, but surely you changed the tests between 3.1 and 3.2 if there were any intentional changes in behavior. How can you be sure that you didn’t remove or change any functions that someone might be calling?

                    Semantic versioning states that a minor release such as 3.2 should only add backwards compatible changes.

                    So all your existing unit tests from 3.1 should still be in place, untouched. You should have new unit tests, for the functionality added in 3.2.

                    I stopped reading after this, because the argument seems to boil down to either not understanding Semantic versioning, or not having full unit test coverage.

                    1. 20

                      I stopped reading after this

                      If you stopped reading at 10% of the article, you should probably also have stopped yourself from commenting.

                      not understanding Semantic versioning

                      The fallacy you’re committing here is very well documented.

                      1. 1

                        If you are questioning whether the function you removed/changed is used by anyone when deciding the next version increment, you are not using semantic versioning correctly (unless you always increase the major, regardless of how many people used the feature you modified). As the parent said, if you need to edit 3.1 tests, you broke something, and the semver website is quite clear about what to do on breaking changes.

                        1. 7

                          If you test more than just the public API, it’s entirely possible for even a bugfix release to require changes to tests.

                          More importantly, my point about “no true Scotsman” was that saying “SemVer is great if and only if you follow some brittle manual process to the dot” proves the blog post’s narrative. SemVer is wishful thinking. You can have ambitions to adhere to it, you can claim your projects follow it, but you shouldn’t ever blindly rely on others doing it right.

                          1. 5

                            The question then becomes: why does nobody do it, then? Do you truly believe that in a world where it’s super rare for a major version to exceed “5”, nobody ever had to change their tests because some low-level implementation detail changed?

                            We’re talking about real packages that have more than one layer. Not a bunch of pure functions. You build abstractions over implementation details and in non-trivial software, you can’t always test the full functionality without relying on the knowledge of said implementation details.

                            Maybe the answer is: “that’s why everybody stays on ZeroVer”, which is another way of saying that SemVer is impractical.

                        2. 6

                          The original fight about the PyCA cryptography package repeatedly suggested SemVer had been broken, and that if the team behind the package had adopted SemVer, there would have been far less drama.

                          Everyone who suggested this overlooked the fact that the change in question (from an extension module being built in C, to being built in Rust) did not change public API of the deliverable artifact in a backwards-incompatible way, and thus SemVer would not have been broken by doing that (i.e., if you ran pip install cryptography before and after, the module that ended up installed on your system exposed a public API that was compatible after with what you got before).

                          Unless you want to argue that SemVer requires a version bump for any change that any third-party observer might notice. In which case A) you’ve deviated from what people generally say SemVer is about (see the original thread here, for example, where many people waffled between “only about documented API” and “but cryptography should’ve bumped major for this”) and B) you’ve basically decreed that every commit increments major, because every commit potentially produces an observable change.

                          But if you’d like to commit to a single definition of SemVer and make an argument that adoption of it by the cryptography package would’ve prevented the recent dramatic arguments, feel free to state that definition and I’ll see what kind of counterargument fits against it.

                          1. 1

                            Everyone who suggested this overlooked the fact that the change in question (from an extension module being built in C, to being built in Rust) did not change public API of the deliverable artifact in a backwards-incompatible way

                            I think you’re overlooking this little tidbit:

                            Since the Gentoo Portage package manager indirectly depends on cryptography, “we will probably have to entirely drop support for architectures that are not supported by Rust”. He listed five architectures that are not supported by upstream Rust (alpha, hppa, ia64, m68k, and s390) and an additional five that are supported but do not have Gentoo Rust packages (mips, 32-bit ppc, sparc, s390x, and riscv).

                            I’m not sure many people would consider “suddenly unavailable on 10 CPU architectures” to be “backwards compatible”.

                            But if you’d like to commit to a single definition of SemVer and make an argument that adoption of it by the cryptography package would’ve prevented the recent dramatic arguments, feel free to state that definition and I’ll see what kind of counterargument fits against it.

                            If you can tell me how making a change in a minor release, that causes the package to suddenly be unavailable on 10 CPU architectures that it previously was available on, is not considered a breaking change, I will give you $20.

                            1. 8

                              Let’s take a simplified example.

                              Suppose I write a package called add_positive_under_ten. It exposes exactly one public function, with this signature:

                              def add_positive_under_ten(x: int, y: int) -> int
                              

                              The documented contract of this function is that x and y must be of type int and must each be greater than 0 and less than 10, and that the return value is an int which is the sum of x and y. If the requirements regarding the types of x and y are not met, TypeError will be raised. If the requirements regarding their values are not met, ValueError will be raised. The package also includes an automated test suite which exhaustively checks behavior and correctness for all valid inputs, and verifies that the aforementioned exceptions are raised on sample invalid inputs.
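
                              For concreteness, a pure-Python body satisfying that contract might look like this (the implementation below is my illustration; only the signature and contract are specified above):

                              def add_positive_under_ten(x: int, y: int) -> int:
                                  # Enforce the documented contract before computing the sum.
                                  for name, value in (("x", x), ("y", y)):
                                      if not isinstance(value, int):
                                          raise TypeError(f"{name} must be an int")
                                      if not 0 < value < 10:
                                          raise ValueError(f"{name} must be greater than 0 and less than 10")
                                  return x + y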

                              In the first release of this package, it is pure Python. In a later, second release, I rewrite it in C as a compiled extension. In yet a later, third release, I rewrite the compiled C extension as a compiled Rust extension. From the perspective of a consumer of the package, the public API of the package has not changed. The documented behavior of the functions (in this case, single function) exposed publicly has not changed, as verified by the test suite.

                              Since Semantic Versioning as defined by semver.org applies to declared public API and nothing else whatsoever, Semantic Versioning would not require that I increment the major version with each of those releases.

                              Similarly, Semantic Versioning would not require that the pyca/cryptography package increment major for switching a compiled extension from C to Rust unless that switch also changed declared public API of the package in a backwards-incompatible way. The package does not adhere to Semantic Versioning, but even if it did there would be no obligation to increment major for this, under Semantic Versioning’s rules.

                              If you would instead like to argue that Semantic Versioning ought to apply to things beyond the declared public API, such as “any change a downstream consumer might notice requires incrementing major”, then I will point out that this is indistinguishable in practice from “every commit must increment major”.

                              1. 1

                                We don’t need a simplified, synthetic example.

                                We have the real-world example. Do you believe that making a change which effectively drops support for ten CPU architectures is a breaking change, or not? If not, why not? How is “does not work at all” not a breaking change?

                                1. 9

                                  The specific claim at issue is whether Semantic Versioning would have caused this to go differently.

                                  Although it doesn’t actually use SemVer, the pyca/cryptography package did not do anything that Semantic Versioning forbids. Because, again, the only thing Semantic Versioning forbids is incompatibility in the package’s declared public API. If the set of public classes/methods/functions/constants/etc. exposed by the package stays compatible as the underlying implementation is rewritten, Semantic Versioning is satisfied. Just as it would be if, for example, a function were rewritten to be more time- or memory-efficient than before while preserving the behavior.

                                  And although Gentoo (to take an example) seemed to be upset about losing support for architectures Gentoo chooses to support, they are not architectures that Python (the language) supported upstream, nor as far as I can tell did the pyca/cryptography team ever make any public declaration that they were committed to supporting those architectures. If someone gets their software, or my software, or your software, running on a platform that the software never committed to supporting, that creates zero obligation on their (or my, or your) part to maintain compatibility for that platform. But at any rate, Semantic Versioning has nothing whatsoever to say about this, because what happened here would not be a violation of Semantic Versioning.

                              2. 7

                                If you can tell me how making a change in a minor release, that causes the package to suddenly be unavailable on 10 CPU architectures that it previously was available on, is not considered a breaking change, I will give you $20.

                                None of those architectures were maintained or promised by the maintainers; they were added by third parties. No matter what your opinion on SemVer is, the activities of third parties, whose existence you possibly didn’t even know about, are not part of it.

                                Keep your $20 but try to be a little more charitable and open-minded instead. We all have yet much to learn.

                                1. 0

                                  Keep your $20 but try to be a little more charitable and open-minded instead. We all have yet much to learn.

                                  If you think your argument somehow shows that breaking support for 10 CPU architectures isn’t a breaking change, then yes, we all have much to learn.

                                  1. 8

                                    You still haven’t explained why you think Semantic Versioning requires this. Or why you think the maintainers had any obligation to users they had never made any promises to in the first place.

                                    But I believe I’ve demonstrated clearly that Semantic Versioning does not consider this to be a change that requires incrementing major, so if you’re still offering that $20…

                                    1. 0

                                      Part of what they ship is code that’s compiled, and literally the first two sentences of the project readme are:

                                      cryptography is a package which provides cryptographic recipes and primitives to Python developers. Our goal is for it to be your “cryptographic standard library”.

                                      If your self-stated goal is to be the “standard library” for something and you’re shipping code that is compiled (as opposed to interpreted code, e.g. Python), I would expect you not to break things relating to the compiled part of the library in a minor release.

                                      Regardless of whether they directly support those other platforms or not, they ship code that is compiled, and their change to that compiled code, broke compatibility on those platforms.

                                      1. 8

                                        Regardless of whether they directly support those other platforms or not, they ship code that is compiled, and their change to that compiled code, broke compatibility on those platforms.

                                        There are many types of agreements – some formal, some less so – between developers of software and users of software regarding support and compatibility. Developers declare openly which parts of the software they consider to be supported with a compatibility promise, and consumers of the software declare openly that they will not expect support or compatibility promises for parts of the software which are not covered by that declaration.

                                        Semantic Versioning is a mildly-formal way of doing this. But it is focused on only one specific part: the public API of the software. It is not concerned with anything else, at all, ever, for any reason, under any circumstances. No matter how many times you pound the table and loudly demand that something else – like the build toolchain – be covered by a compatibility guarantee, Semantic Versioning will not budge on it.

                                        The cryptography change did not violate Semantic Versioning. The public API of the module after the rewrite was backwards-compatible with the public API before the rewrite. This is literally the one, only, exclusive thing that Semantic Versioning cares about, and it was not broken.

                                        Meanwhile, you appear to believe that by releasing a piece of software, the author takes on an unbreakable obligation to maintain compatibility for every possible way the software might ever be used, by anyone, on any platform, in any logically-possible universe, forever. Even if the author never promised anything resembling that. I honestly do not know what the basis of such an obligation would be, nor what chain of reasoning would support its existence.

                                        What I do know is that the topic of this thread was Semantic Versioning. Although the cryptography library does not use Semantic Versioning, the rewrite of the extension module in Rust did not violate Semantic Versioning. And I know that nothing gives you the right to make an enforceable demand of the developers that they maintain support and compatibility for building and running on architectures that they never committed to supporting in the first place, and nothing creates any obligation on their part to maintain such support and compatibility. The code is under an open-source license. If you depended on it in a way that was not supported by the developers’ commitments, your remedy is to maintain your own fork of it, as with any other upstream decision you dislike.

                            2. 4

                              “Should” is the key word here, because I haven’t ever contributed to an open source project that has that as part of its policy, nor have I observed its wide application, given the state of third-party packages.

                              The article specifically speaks about the divergence between aspiration and reality and what conclusions can be drawn from that.

                              1. 3

                                Unfortunately the aspiration is broken too.

                                1. 2

                                  Baby steps 😇

                              2. 3

                                It sounds like you’re proposing to use unit tests to prove that a minor release doesn’t introduce backwards-incompatible changes. However, tests cannot substitute for proofs; there are plenty of infinite behaviors which we want to write down in code but cannot exhaustively test.

                                All of these same problems happen in e.g. Haskell’s ecosystem. It turns out that simply stating that minor releases should only add backwards-compatible changes is just an opinion and not actually a theorem about code.

                                1. 1

                                  No I think they have a valid point. “Surely” implies that it’s normal to “change” unittests between minor versions, but the term “change” here mixes “adding new” and “modifying existing” in a misleading way. Existing unittests should not change between minor versions, as they validate the contract. Of course, they may change anyway, for instance if they were not functional at all, or tested something wrong, but it should certainly not be common.

                                  edit: I am mixing up unittests and system tests, my apologies. Unit tests can of course change freely, but they also have no relation to SemVer; the debate only applies to tests of the user-facing API.

                                  1. 2

                                    I know people use different terminology for the same things, but if the thing being tested is a software library, I would definitely consider any of the tests that aren’t reliant on something external (e.g. if you’re testing a string manipulation method) to be unit tests.

                                    1. 1

                                      Take any function from the natural numbers to the natural numbers. How do you unit-test it in a way that ensures that its behavior cannot change between semantic versions? Even property tests can only generate a finite number of test cases.
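
                                      A contrived sketch of why (both functions are invented for illustration): two implementations can agree on every sampled input and still differ somewhere.

                                      def f(n: int) -> int:
                                          return 2 * n

                                      def g(n: int) -> int:
                                          # Differs from f at exactly one input no sampler is likely to hit.
                                          return 2 * n if n != 10**18 + 7 else 0

                                      # Finitely many generated cases pass for both, yet f != g:
                                      for n in range(10_000):
                                          assert f(n) == g(n)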

                                      1. 2

                                        I think the adage “code is written for humans to read, and only incidentally for computers to execute” applies to tests especially. Of course you can’t test every case, but intention does count.

                                    2. 1

                                      Aside:

                                      I just recently added a test that exercises the full API of a Rust library of mine, doing so in such a way that any backwards-incompatible changes would error if introduced. (The particular case was that I’d add a member to a config struct, and so anyone constructing that struct without including a ..StructName::default() at the end would suddenly have a compile error because they were missing a field.) This seemed to do the trick nicely and would remind me to bump the appropriate part of semver when making a release.

                                      I work on the library (and in the Rust ecosystem) infrequently so it’s not at the front of my mind. More recently I accepted a PR, and made a new release including it after. Then I got the warning, again, that I’d broken semver. Of course, the failing test was seen by the contributor and fixed up before they submitted the PR, so I never saw the alarm bells ringing.

                                  1. 2

                                    Nobody Has Suggested That Semantic Versioning Will Save Anyone

                                    1. 3

                                      Hey Peter, big fan here! Sadly, there have been plenty of people suggesting exactly that in that particular fiasco. Repeatedly. Even right now on Twitter in my mentions.

                                      There’s still a lot of assumptions about what SemVer can do for someone. I needed to write down the explanation why that’s not the case so I don’t have to repeat myself.

                                      1. 2

                                        Can you link to one of these examples? As I said below, my experience has consistently been that most, or almost all, people in semver’s demographic understand it is an approximate tool and not a panacea.

                                        1. 2

                                          I have to admit that “maintainer of a popular package thinks ‘almost all users’ have a realistic expectation from SemVer” was not on my bingo card!

                                          I suspect the kicker is

                                          people in semver’s demographic

                                          And that your demographic is simply different from mine. Maybe Python vs Go is all that it takes. Who knows. One of the main drivers of my writing is to avoid repeating myself, and I assure you I wouldn’t have taken the time to write it if I didn’t expect to save time in the future.

                                          Can you link to one of these examples?

                                          I don’t want to call out people in public and if in your lived reality this isn’t a problem that’s fair enough.

                                          I state my premise in the first paragraph, and if it doesn’t apply to you or your users, it’s fair to skip it. Not sure the sardonic dunk without reading it was necessary, though.

                                          1. 3

                                            What makes you think I didn’t read the article?

                                            Like others, I think you’re “dunking” on semver unnecessarily. I generally agree with your description of it as a tl;dr of the changelog — but that’s incredibly valuable! 99% of the time I can trust it’s accurate and that’s a huge boon to my productivity. I understand it’s not necessarily accurate — and I haven’t yet encountered anyone who doesn’t understand it’s not necessarily accurate — but that’s fine, when it fails it’s detected with tests and that’s just an inconvenience more than anything.

                                          2. 2

                                            The entire Haskell ecosystem overly depends on semantic versioning. As a result, there are over 8000 Haskell packages in nixpkgs which are broken:

                                            $ git grep ' broken = true;' pkgs/development/haskell-modules/ | wc -l
                                            8835
                                            
                                        2. 3

                                          Here on this site, several discussions in the original thread about the pyca/cryptography change brought up SemVer and certainly appeared to my eyes to be suggesting that it would have prevented or mitigated the drama.

                                          While it is possible that you personally have never made such claims about SemVer (I have not bothered to check), it is an easily-demonstrated fact that others have, and the OP here reads to me as an argument against those claims as made by those people.

                                          1. 2

                                            Hmm. I remember that thread, and re-skimmed it now. I didn’t find anyone saying semver would have prevented the situation. It certainly would have mitigated it somewhat, though. And I don’t agree that “it is an easily-demonstrated fact” that any significant group of people believe that semver in itself is going to solve any problems. My experience has consistently been that most, or almost all, people in semver’s demographic understand it is an approximate tool and not a panacea.

                                        1. 6

                                          Spoiler: implementation used <ctype.h>, and was cursed by the C locale.

                                          1. 16

                                             It’s more subtle than that. ctype.h includes an isascii, which returns true for things < 128. The FreeBSD libc version actually exposes this as a macro: (((c) & ~0x7F) == 0) (true if all of the bits above the low 7 are true). The regex library was using its own isascii equivalent implemented as isprint() || iscntrl(). These are locale-dependent, and isprint will return true for a lot of non-ASCII characters (most of Unicode, in a Unicode locale). This is not C locales’ fault; it is incorrect API usage.

                                             The fact that the C locale APIs are an abomination is a tangentially related matter.

                                            1. 2

                                              true if all of the bits above the low 7 are true

                                              Nit: it’s true if any bits above the low 7 are true.

                                              1. 2

                                                (true if all of the bits above the low 7 are true)

                                                false if any of the bits above the low 7 are true

                                              2. 3

                                                To be fair, this is a kind of bug you can run into without needing to hit C locales.

                                                For example, a few times I’ve had to re-teach people regexes in Python, because the behavior today isn’t what it was once upon a time.

                                                To take the most common example I personally see (because of the stuff I work on/with), Django used to only support regexes as the way to specify its URL routing. You write a regex that matches the URL you expect, tell Django to map it to a particular view, and any captured groups in the regex become arguments (keyword arguments for named captures, positional otherwise) used to call the view. Now there’s a simpler alternative syntax that covers a lot of common cases, but regexes are still supported when you need the kind of fine-grained/complex matching rules they provide.

                                                Anyway, suppose you want to build a blog, and you want to have the year, month, and day in the URL. Like /weblog/2021/02/26/post-title. Easy enough to do with a regex. Except… most of the examples and tutorials floating around from days of yore are from a Python 2 world, where you could match things like the four-digit year with \d{4}. That only worked because Python 2 was an ASCII world, and \d was equivalent to [0-9]. In Python 3, the world is Unicode, and \d matches anything that Unicode considers to be a digit, which is a larger set than just the ten numerals of ASCII. So every once in a while someone pops up with “why is my regex URL pattern matching this weird stuff” and gets to learn that in a Python 3/Unicode world, if all you really want is [0-9], then [0-9] is what you have to write.
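
                                                A quick demonstration (the second string is “2021” written with ARABIC-INDIC digits, Unicode category Nd):

                                                import re

                                                year = re.compile(r'^\d{4}$')
                                                print(bool(year.match('2021')))    # True
                                                print(bool(year.match('٢٠٢١')))    # True in Python 3: \d is any Unicode digit

                                                strict = re.compile(r'^[0-9]{4}$')
                                                print(bool(strict.match('٢٠٢١')))  # False: [0-9] is ASCII-only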

                                                This seems to have been an instance of the same problem, where the decision was deferred to some other system that might have its own ideas about things like “printable”, instead of just writing it correctly from the start to only match what it intended to match.

                                                1. 2

                                                  In Python 3, the world is Unicode, and \d matches anything that Unicode considers to be a digit, which is a larger set than just the nine numerals of ASCII

                                                  TIL!

                                                  1. 1

                                                    Have the Python core devs ever articulated what kind of use cases motivate “\d matches anything that Unicode considers to be a digit”?

                                                    1. 2

                                                      UTS#18 gives a set of recommendations for how regex metacharacters should behave, and its “Standard”-level recommendation (“applications should use this definition wherever possible”) is that \d match anything with Unicode general category Nd. This is what Python 3 does by default.

                                                      So I would presume there’s no need to articulate “use cases” for simply following the recommendation of the Unicode standards.

                                                      If you dislike this and want or absolutely need to use \d as a synonym for [0-9], you can explicitly switch the behavior to UTS#18’s “Posix Compatible” fallback by passing the re.ASCII flag to your Python regex (just as in Python 2 you could opt in to Unicode-recommended behavior by passing re.UNICODE). You also can avoid it altogether by not using str instances; the regex behavior on bytes instances is the ASCII behavior.
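
                                                      Both opt-outs in action (same Arabic-Indic digits as above):

                                                      import re

                                                      print(bool(re.match(r'\d{4}', '٢٠٢١')))            # True: Unicode default
                                                      print(bool(re.match(r'\d{4}', '٢٠٢١', re.ASCII)))  # False: \d becomes [0-9]
                                                      print(bool(re.match(rb'\d{4}', b'2021')))          # True: bytes patterns are ASCII-only anyway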

                                                1. 2

                                                  I appreciate most of the arguments, but the counter-point around security misses the mark. For distributions, it is far easier to apply a patch to a single package. Rebuilding or not is not really the difficulty. Now, if many applications are bundling/pinning specific versions, distributions need to patch each version. Some of these versions may be very old, and the patch may be more difficult to apply. This is a lot of work. Distributions cannot just bump the dependency, as that goes against the stability promise and introduces bugs and changes. Distributions have to support what they ship for around 5 years (because many users use distributions for this exact purpose), while developers usually like to support things for a few months.

                                                  Unfortunately, neither side wants to move an inch. When packaging for Debian, I would appreciate being able to bundle dependencies instead of packaging every single dependency, but there must be some way to guarantee we are not just multiplying the amount of work we will need to provide in the future. However, this is not new. Even with C, many devs do not like distributions freezing their software for 5 years.

                                                  1. 11

                                                    The real “issue” from the distro perspective is that they’re now trying to package ecosystems that work completely differently than the stuff they’re used to packaging, and specifically ecosystems where the build process is tied tightly to the language’s own tooling, rather than the distro’s tooling.

                                                    This is why people keep talking about distros being stuck on twenty-years-ago’s way of building software. Or, really, stuck on C’s way of building software. C doesn’t come with a compiler, or a build configuration tool, or a standard way to specify dependencies and make sure they’re present and available either during build or at runtime. C is more or less just a spec for what the code ought to do when it’s run. So distros, and everybody else doing development in C, have come up with their own implementations for all of that, and grown used to that way of doing things.

                                                    More recently-developed languages, though, treat a compiler and build tool and dependencies/packaging as a basic requirement, and tightly integrate to their standard tooling. Which then means that the distro’s existing and allegedly language-agnostic tooling doesn’t work, or at least doesn’t work as well, and may not have been as language-agnostic as they hoped.

                                                    Which is why so many of the arguments in these threads have been red herrings. It’s not that “what dependencies does this have” is some mysterious unanswerable question in Rust, it’s that the answer to the question is available in a toolchain that isn’t the one the distro wants to use. It’s not that “rebuild the stuff that had the vulnerable dependency” is some nightmare of tracking down impossible-to-know information and hoping you caught and patched everything, it’s that it’s meant to be done using a toolchain and a build approach that isn’t the one the distro wants to use.

                                                    And there’s not really a distro-friendly thing the upstream developers can do, because each distro has its own separate preferred way of doing this stuff, so that’s basically pushing the combinatorial nightmare upstream and saying “take the information you already provide in your language’s standard toolchain, and also provide and maintain one additional copy of it for each distro, in that distro’s preferred format”. The only solution is for the distros to evolve their tooling to be able to handle these languages, because the build approach used in Rust, Go, etc. isn’t going away anytime soon, and in fact is likely to become more popular over time.

                                                    1. 5

                                                      The only solution is for the distros to evolve their tooling to be able to handle these languages

                                                      The nixpkgs community has been doing this a lot. Their response to the existence of other build tools has been to write things like bundix, cabal2nix and cargo2nix. IIRC people (used to) use cabal2nix to make the whole of hackage usable in nixpkgs?

                                                      From the outside it looks like the nix community’s culture emphasizes a strategy of enforcing policy by making automations whose outputs follow it.

                                                      1. 4

                                                        Or, really, stuck on C’s way of building software.

                                                        I think it’s at least slightly more nuanced than that. Most Linux distributions, in particular, have been handling Perl modules since their earliest days. Debian/Ubuntu use them fairly extensively even in base system software. Perl has its own language ecosystem for building modules, distributing them in CPAN, etc., yet distros have generally been able to bundle Perl modules and their dependencies into their own package system. End users are of course free to use Perl’s own CPAN tooling, but if you apt-get install something on Debian that uses Perl, it doesn’t go that route, and instead pulls in various libxxx-perl packages. I don’t know enough of the details to know why Rust is proving more intractable than Perl though.

                                                        1. 6

                                                          I don’t know enough of the details to know why Rust is proving more intractable than Perl though

                                                          There is a big difference between C, Perl, Python on the one side and Rust on the other.

                                                          The former have a concept of “search path”: there’s a global namespace where all libraries live. That’s the include path for C, PYTHONPATH for Python and @INC (?) for Perl. To install a library, you put it into some blessed directory on the file system, and it becomes globally available. The corollary here is that everyone is using the same version of a library. If you try to install two different versions, you’ll get a name conflict.

                                                          Rust doesn’t have a global search path / global namespace. “Installing a Rust library” is not a thing. Instead, when you build a piece of Rust software, you need to explicitly specify the path for every dependency. Naturally, doing this “by hand” is hard, so the build system (Cargo) has a lot of machinery for wiring a set of interdependent crates together.
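
                                                          To make the first half concrete, Python’s global search path is directly visible and mutable (the directory below is made up):

                                                          import sys

                                                          print(sys.path)  # one global, ordered list; a single version of a library wins

                                                          # Prepending a directory changes which copy every later import sees:
                                                          sys.path.insert(0, "/opt/vendored-libs")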

                                                          1. 2

                                                            there’s a global namespace where all libraries live

                                                            Yes, this is one of the biggest differences. Python, Perl, etc. come out of the Unix-y C-based tradition of not having a concept of an “application” you run or a “project” you work on, but instead only of a library search path that’s assumed to be shared by all programs in that language, or at best per-user unique so that one user’s set of libraries doesn’t pollute everyone else’s.

                                                            Python has trended away from this and toward isolating each application/project – that’s the point of virtual environments – but does so by just creating a per-virtualenv search path.

                                                            More recently-developed languages like Rust have avoided ever using the shared-search-path approach in the first place, and instead isolate everything by default, with its own project-local copies of all dependencies.

                                                            (the amount of code generation/specialization that happens at compile time in Rust for things like generics is a separate issue, but one that distros – with their ability to already handle C++ – should in theory not have trouble with)

                                                        2. 4

                                                          This is why people keep talking about distros being stuck on twenty-years-ago’s way of building software. Or, really, stuck on C’s way of building software. C doesn’t come with a compiler, or a build configuration tool, or a standard way to specify dependencies and make sure they’re present and available either during build or at runtime. C is more or less just a spec for what the code ought to do when it’s run. So distros, and everybody else doing development in C, have come up with their own implementations for all of that, and grown used to that way of doing things.

                                                          More than that, I’d say they’re language-specific package managers around autotools+C.

                                                      1. 6

                                                        If you check the PyPI files for Fil, you’ll see there are manylinux2010 wheels, and no source packages at all; because building from source is a little tricky, I only distribute compiled packages.

                                                        A side note, but this bothers me a bit. What if the user is on FreeBSD? Or they are on Linux and prefer to compile from source? This is taking the Docker route, where people download opaque binary blobs and hope they do what it says on the label. Except that the binary is not even sandboxed.

                                                        1. 4

                                                          I explicitly don’t support FreeBSD, or rather, I explicitly only support Linux with glibc and macOS, for two reasons:

                                                          1. It’s… a tricky project, involving LD_PRELOAD (or macOS equivalent), bugs can be like “clean up of thread locals hits wrong code path, leading to segfault”, or “C thread locals on this platform allocate, so I have to use pthread thread locals” instead. Given limited time, I want to focus on fixing issues in platforms people use.
                                                          2. The audience, people doing data science, data engineering, and scientific computing, are most likely to use Windows, macOS, and Linux. Insofar as I was going to spend more time on a third platform, it would be Windows, and maybe musl for Alpine Linux if someone made a compelling case.

                                                          My other projects have source distribution as you’d expect, and should work anywhere.

                                                          1. 1

                                                            What if the user is on FreeBSD?

                                                            First, keep in mind this only applies to packages that include extensions written in non-Python languages. But if it’s not a platform that the wheel format can support, then the answer is they install from a source package (.tar.gz) instead of a pre-compiled .whl, and the installation will include a build step and require a compiler toolchain and any necessary libraries to link against. Pure Python packages don’t have this issue.

                                                            The base problem really is ABI compatibility and stability. The various iterations of the manylinux platform tags have been based around distros which all shipped glibc (which enables built artifacts that can dynamically link against multiple versions), and an extremely conservative set of other shared libraries.

                                                            Or they are on Linux and prefer to compile from source.

                                                            They can install from a source package instead of pre-compiled .whl. In fact this is what Alpine users have to do, since Alpine is not a glibc distro and thus not ABI-compatible with manylinux packages. You can force the use of source package with the --no-binary command-line flag to pip.
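
                                                            For example (the package name is a placeholder):

                                                            $ pip install --no-binary :all: somepackage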

                                                            This is taking the docker route where people download opaque binary blobs and hope it does what it says on the label.

                                                            The alternative is to download source you know you’re never going to manually audit, compile it, and hope for the best. And your first complaint was “this platform can’t get pre-built binaries”; now your complaint seems to be that pre-built binaries are bad anyway.

                                                            (you can do integrity checking of packages at download time, incidentally, and the Python Package Index also supports attaching cryptographic signatures to uploaded packages, but as someone who for many years produced most of Django’s packages and signed every one of them, I can also count on one hand the number of people who I know actually made use of those signatures)

                                                          1. 8

                                                            As a quick refresher, because it isn’t fully explained in the article: the precise restriction of Python’s GIL is that only one thread can be executing Python bytecode and/or accessing the interpreter’s C API at any given time.

                                                            This is a historical tradeoff dating to an era when most things people wanted threading for were I/O-bound and ran on single-processor/single-core hardware, which did not foresee the CPU-bound tasks and multi-processor/multi-core machines that are common now. The popular numeric/scientific libraries for Python get around this by being Python wrappers around fast implementations of the mathematical stuff written in C/C++/Fortran/etc., and by releasing the GIL whenever they have a bunch of number-crunching to do that doesn’t require access to bytecode or the interpreter API, which means they can run multithreaded CPU-bound workloads more efficiently.
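
                                                            A minimal sketch of the CPU-bound half of that story; the absolute timings are machine-dependent, but on CPython the threaded run won’t beat the sequential one:

                                                            import threading, time

                                                            def count_down(n):
                                                                while n:          # pure-Python bytecode: holds the GIL while it runs
                                                                    n -= 1

                                                            N = 20_000_000

                                                            start = time.perf_counter()
                                                            count_down(N); count_down(N)
                                                            print('sequential:', time.perf_counter() - start)

                                                            start = time.perf_counter()
                                                            threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
                                                            for t in threads: t.start()
                                                            for t in threads: t.join()
                                                            print('threaded:  ', time.perf_counter() - start)  # no speedup: GIL contention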

                                                            1. 8

                                                              A lot of this just seems like going against the grain of the distro when using Docker and wondering why that’s not good.

                                                              1. 19

                                                                Isn’t that how everything goes? People are not satisfied with this distro, so they create another. People can’t be bothered with their distro, so they create npm, pip, …. People realize language-specific ones are not enough, they create conda. People think conda is bloated, they create miniconda. People can’t be bothered with installing anything, they use Docker. People still need to install things inside Docker, they choose a distro inside Docker. Ad infinitum.

                                                                1. 5

                                                                  People realize language-specific ones are not enough, they create conda.

                                                                  It is reasonable to want to manage language-specific packages together with other libraries. Many language packages rely on various C libraries.

                                                                  I think this is mostly a failing of traditional distribution package management. If you still insist on writing a .spec or rules file for e.g. every Rust crate that a developer might use from crates.io [1], you are never going to keep up. Additionally, traditional package managers cannot deal well with installing different versions of a package in parallel. Sure, you can hack around both problems, e.g. by automatically generating .spec files and sticking versions into package names, so that ndarray 0.13.0 is not seen as an upgrade of ndarray 0.12.0. But it’s going to be an ugly hack. And you still cannot pin system libraries to particular versions.

                                                                  So, you either have to accept that your package manager cannot deal with language package ecosystems and project-specific version pinning. Or you have to change your package manager so that it is possible to programmatically generate packages and permit multiple parallel versions. While they may not be the final solution, Nix and Guix do not have this problem and can just generate packages from e.g. Cargo metadata and deal with multiple semver-incompatible crate versions without any issues.

                                                                  [1] Deliberately not using Python as an example here, because Python packaging has many issues of its own, such as not permitting the use of multiple versions of a package within a single Python interpreter.
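
                                                                  (To make [1] concrete: CPython caches modules by name in sys.modules, so a second version of a package can never be loaded alongside the first. A minimal sketch using a stdlib module:)

                                                                  import sys
                                                                  import json                  # first import: loaded and cached by name

                                                                  assert sys.modules["json"] is json

                                                                  import json as json_again    # later imports just return the cached module
                                                                  assert json_again is json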

                                                                  1. 2

                                                                    This is mostly a failing of the Python ecosystem, or of our software ecosystem as a whole. The very idea of wanting a distro that “packages the precise set of things you want” (borrowing from another commenter here) is absurd. Users should pin the blame on the developers who release packages without any sort of backward compatibility. If 10 packages release backward-incompatible “updates”, users get 2^10 = 1024 combinations to choose from. Somehow, people still think it’s a mere package-management problem. No.

                                                                    1. 8

                                                                      Users should pin their blames to those developers releasing packages without any sort of backward compatibility

                                                                      No, this is a solved problem. Cargo and npm do work with lots of messy dependencies. It’s the old inflexible package managers that aren’t keeping up. Traditional package managers blame the world for having software not fitting their inflexible model, instead of changing their model to fit how software is actually released.

                                                                      Cargo/npm have solved this by:

                                                                      1. Using Semver. It’s not a guarantee, but it works 99% of the time, and that is way better than letting deps break every time (see the sketch below for how tooling builds on those version numbers).
                                                                      2. Allowing multiple incompatible versions of the same package to coexist. In large dependency graphs the probability of a conflict approaches 1. Package managers that can’t work around conflicts are not scalable, and end up hating people for using packages too much.
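
                                                                      A rough illustration of point 1, using Python’s packaging library (an assumption here, though pip itself vendors it) and PEP 440’s compatible-release operator, roughly the analogue of a SemVer caret range:

                                                                      from packaging.specifiers import SpecifierSet
                                                                      from packaging.version import Version

                                                                      spec = SpecifierSet("~=1.4")     # "compatible release": >= 1.4, < 2.0

                                                                      print(Version("1.9.2") in spec)  # True:  minor/patch bumps are accepted
                                                                      print(Version("2.0.0") in spec)  # False: a major bump is not
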
                                                                      1. 1

                                                                        Right. Rust and Node solved everything. The rest of the world really can’t keep up. Why don’t we just rewrite everything in Rust and Node? You can have your package link with libA and libB, while libA links with libD.so.0 and libB links with libD.so.1. Wait, the system still has an ancient libc. Right. Those .so files are just a relic from the past and it’s a mystery we are still using them. So inflexible.

                                                                        Cargo/npm have solved this

                                                                        It truly made my day. Thanks. I needed this for the weekend.

                                                                        1. 2

                                                                          .so files are just a relic from the past

                                                                          Yes, they are. Soname version numbers alone don’t help, because package managers also need a whole requires/provides abstraction layer and renamed packages with version numbers in the name, and evidently they rarely bother with this. Lack of namespacing in C and global header include paths complicate it further.

                                                                          To this day I’m suffering distros that can’t upgrade libpng, because it made slightly incompatible change 6 years ago. Meanwhile Cargo and npm go brrrrr.

                                                                  2. 1

                                                                    The problem is at least partially social, not technical. It’s far easier to start a new project than to join an existing one, because you have to prove yourself in a new project. For some projects (say, Debian) there’s also significant bureaucracy involved.

                                                                  3. 15

                                                                    Docker is irrelevant to the message of the story. They just didn’t have a box with Ubuntu 18.04 around to demonstrate the problems and resorted to a Docker container. A VM or bare metal with 18.04 would have told the same story.

                                                                    1. 5

                                                                      Could you expand? The general issue is the mismatch between distro release cycles and software development cycles: you need to bootstrap a newer toolchain in most languages. Python has some specific problems others don’t (e.g. no transitive pinning by default, unlike say Cargo, so new package releases are more likely to break things), but every unhappy family etc.

                                                                      1. 2

                                                                        If you want specific versions of a tool, then perhaps it would be better to switch to a distro that offers that, or remove the distro from the equation.

                                                                        1. 8

                                                                          What if there is no distro in existence that packages the precise set of things you want? The whole point of language packaging ecosystems is to solve that problem; otherwise the options are “find a way to stick to only what your distro decided to package”, or “make your own distro that system-packages the things you want”, or “make a distro that system-packages every version of everything”.

                                                                          And that’s without getting into the fact that distros historically change popular languages and their packages, sometimes in ways that are not compatible with upstream. For example, Django finally gave in and switched the name of its main management command from django-admin.py to django-admin, in part because some distros had absolute hard-line policies on renaming it to django-admin when they packaged Django, leading to a mismatch between what Django’s documentation said and what the distro had installed.

                                                                          And it’s especially without getting into the fact that many Linux distros ship languages like Python because some of the distro’s own tooling is written in those languages, and you want clean isolation between that and your own application code. Which you can’t get if you go the system-package-only route.

                                                                          So yes, even in Docker, you should be using language packages and the language’s package manager. It’s not that hard to add a pip upgrade line to your Dockerfile and just have it run on image build.

                                                                          1. 2

                                                                            What if there is no distro in existence that packages the precise set of things you want?

                                                                            I feel like this is covered by the second half of the statement:

                                                                            or remove the distro from the equation.

                                                                            That is done - for instance - by using language packages.

                                                                        2. 1

                                                                          no transitive pinning by default

                                                                          This is why I decided to go straight to Poetry. The ride has been bumpy (python is a chaotic world when it comes to package metadata), but at least now we have a lock file. Rust really hit a home run with Cargo.

                                                                        3. 3

                                                                          Why does it have to be one or the other?

                                                                          I want to be more up to date on Python and its packages than on other things like libc, the kernel version, …

                                                                        1. 12

                                                                          I recall hearing about “framework fatigue” 10 years ago, and no doubt it was already a thing 15 years ago (if perhaps in a different way – these frameworks were developed in order to prevent people from needing to learn the already unnecessarily complex details of dynamic web programming in raw javascript, all the browser-dependent CSS behaviors, etc.)

                                                                          It’s trivially true that we can just “choose not to” use frameworks. It’s also practically false, for most professional developers! Because I started off as a backend developer, it’s tolerated when, during my few forays into frontend development, I throw up my hands and say “no frameworks, no jquery – we’re doing this in plain vanilla javascript” – but, just as in my backend work I work with multi-million-line legacy java codebases organized around stupid design trends and relying on insecure and bloated third party libraries that were hyped decades ago & can’t get away with just throwing them out and rewriting them from scratch, my peers who do frontend work full time are stuck in the same situation with frontend trends. (By the time our multi-year migration from GWT to Angular was done, we were expected to move to React, or whatever is hip now. This corporate-mandating hype-chasing prevented the actual frontend people from fixing real bugs and performance problems, which led to shit like backend devs like me swooping in to patch in big ugly blobs of plain javascript.)

                                                                          When we are developing for ourselves, we can use whatever tech stack we like. We can adhere to our own aesthetics. (My personal website uses plain static HTML, generated by a shell script, and only one page has any javascript at all.) But for fully a third of our lives, we are required to share tech stacks with our coworkers and obey the dictates of our employers – in other words, we are pushed in particular directions by the junior devs below us (who are prone to framework and platform hype because they haven’t been around long enough to know better, and who outnumber us) and technical management above us (who are prone to framework and platform hype due to distance from the realities of development, and who have power and authority over us). It’s foolish to dismiss those forces – even as we can sometimes, with effort, work against them.

                                                                          1. 5

                                                                            This corporate-mandating hype-chasing prevented the actual frontend people from fixing real bugs and performance problems

                                                                            I bring this up a lot, but I feel like it’s worth pointing out that, often, someone else who’s pushing the rewrite-in-new-thing is doing it precisely because they know that a lot of what should be regular maintenance work is disallowed as not directly contributing to “velocity”, so selling a ground-up rewrite with shiny bells and whistles (which can be translated into “velocity”) is actually easier and then they hope to clean up some of the old codebase’s issues during the port.

                                                                            It doesn’t always work, of course, but in some organizations a rewrite in a new language or framework is the only way to get accumulated maintenance done (and even then it’s more just throwing out accumulated cruft and starting over, hoping to avoid the same mistakes as last time and instead make exciting new mistakes this time around).

                                                                            1. 2

                                                                              By the time our multi-year migration from GWT to Angular was done, we were expected to move to React, or whatever is hip now. This corporate-mandating hype-chasing prevented the actual frontend people from fixing real bugs and performance problems, which led to shit like backend devs like me swooping in to patch in big ugly blobs of plain javascript.

                                                                              There’s a real tension here for hireability. At one of my first jobs, I was also working on a massive Java monolithic codebase that served a (sadly famous) website. The framework was custom built atop the Servlet API and relied on abstractions that predated MVC, but used the terms Model-View-Controller in ways subtly but importantly different from the currently widely accepted MVC architecture. It made development for me a nightmare, and every time we onboarded a new junior engineer, we had to reintroduce them to patterns that were basically only relevant to our codebase. Eventually we switched to a more “modern” framework, not because the abstractions were leaky, but simply because it was too costly to spend months teaching our juniors about these abstractions. Moreover, a lot of engineers were fairly angry that they were learning a set of abstractions that were not transferable out of our organization in any way.

                                                                              But for fully a third of our lives, we are required to share tech stacks with our coworkers and obey the dictates of our employers – in other words, we are pushed in particular directions by the junior devs below us (who are prone to framework and platform hype because they haven’t been around long enough to know better, and who outnumber us) and technical management above us (who are prone to framework and platform hype due to distance from the realities of development, and who have power and authority over us). It’s foolish to dismiss those forces – even as we can sometimes, with effort, work against them.

                                                                              I want to emphasize that different folks have very different aesthetic tastes. I’ve worked with folks who love and swear by dynamic languages, I’ve met folks that want to rewrite everything they touch into monad transformer stacks, and I’ve worked with people all over in between. There’s folks that love classes, there’s folks that only want classes to act as dumb structs, there’s people that like to code in only functions, it all runs the gamut. Working in a team in any endeavor is often an exercise in compromise of opinions with the benefit of having multiple people to assist in execution. There’s no “right” and “wrong” here.

                                                                              1. 4

                                                                                I want to emphasize that different folks have very different aesthetic tastes. […] There’s no “right” and “wrong” here.

                                                                                There are absolutely different tastes.

                                                                                On the other hand, there are design practices that are fast and don’t scale (or vice versa), in some objective sense. For instance, I don’t think anybody here would say that whether or not to use revision control is purely an aesthetic matter with no effect on maintainability.

                                                                                Very often, novice programmers are attracted to techniques that make rapid iteration easier without developing the associated habits that make the resulting code maintainable, and at the same time, as developers gain experience we tend to develop strong personal preferences based on that experience that begin to outweigh hype and the expressed preferences of our peers. Our experience allows us to reject claims about improved productivity or ease of use that seem sensible to a naive outsider but that (predictably) don’t hold up when executed – because we have seen similar claims and tried them in the past, or because we have a better intuition for the mechanics of software development at scale than the people who invented the new schemes. So, overblown hype specifically targets the top and the bottom.

                                                                            1. 20

                                                                              Python package maintainers rarely use semantic versioning and often break backwards compatibility in minor releases. One of several reasons that dependency management is a nightmare in Python world.

                                                                              1. 18

                                                                                I generally consider semantic versioning to be a well-intentioned falsehood. I don’t think that package vendors can have effective insight into which of their changes break compatibility when they can’t have a full bottom-up consumer graph for everyone who uses it.

                                                                                I don’t think that Python gets this any worse than any other language.

                                                                                1. 20

                                                                                  I’ve heard this opinion expressed before… I find it to be either dangerously naive or outright dishonest. There’s a world of difference between a) the rare bug fix release or nominally-orthogonal-feature-add release that unintentionally breaks downstream code and b) intentionally changing and deprecating API’s in “minor” releases.

                                                                                  In my view, adopting SemVer is a statement of values and intention. It communicates that you value backwards compatibility and intend to maintain it as much as is reasonably possible, and that you will only knowingly break backwards compatibility on major release increments.

                                                                                  1. 18

                                                                                    In my view, adopting SemVer is a statement of values and intention. It communicates that you value backwards compatibility and intend to maintain it as much as is reasonably possible, and that you will only knowingly break backwards compatibility on major release increments.

                                                                                    A “statement of values and intention” carries no binding commitment. And the fact that you have to hedge with “as much as is reasonably possible” and “only knowingly break” kind of gives away what the real problem is: every change potentially alters the observable behavior of the software in a way that will break someone’s reliance on the previous behavior, and therefore the only way to truly follow SemVer is to increment major on every commit. Which is the same as declaring the version number to be meaningless, since if every change is a compatibility break, there’s no useful information to be gleaned from seeing the version number increment.

                                                                                    And that’s without getting into some of my own direct experience. For example, I’ve been on the Django security team for many years, and from time to time someone has found a security issue in Django that cannot be fixed in a backwards-compatible way. Thankfully fewer of those in recent years since many of them related to weird old functionality dating to Django’s days as a newspaper CMS, but they do happen. Anyway, SemVer’s answer to this is “then either don’t fix it, or do but no matter how you fix it you’ve broken SemVer and people on the internet will scream at you and tell you that you ought to be following SemVer”. Not being a fan of no-win situations, I am content that Django has never and likely never will commit to following SemVer.

                                                                                    1. 31

                                                                                      A “statement of values and intention” carries no binding commitment.

                                                                                      A label on a jar carries no binding commitment to the contents of the jar. I still appreciate that my salt and sugar are labelled differently.

                                                                                      1. 2

                                                                                        Selling the jar with that label on it in many countries is a binding commitment and puts you under the coverage of food safety laws, though.

                                                                                      2. 6

                                                                                        Anyway, SemVer’s answer to this is “then either don’t fix it, or do but no matter how you fix it you’ve broken SemVer and people on the internet will scream at you and tell you that you ought to be following SemVer”.

                                                                                        What do you mean? SemVer’s answer to “this bug can’t be fixed in a backwards-compatible way” is to increment the major version to indicate a breaking change. You probably also want to get the message across to your users by pushing a new release of the old major version which prints some noisy “this version of blah is deprecated and has security issues” messages to the logs.

                                                                                        It’s not perfect, and I’m not saying SemVer is a silver bullet. I’m especially worried about the effects of basing automated tooling on the assumption that no package would ever push a minor or patch release with a breaking change; it seems to make ecosystems like npm’s highly fragile. But when taken as a statement of intent rather than a guarantee, I think SemVer has value, and I don’t understand why you think your security issue anecdote requires breaking SemVer.

                                                                                        1. 7

                                                                                          What do you mean? SemVer’s answer to “this bug can’t be fixed in a backwards-compatible way” is to increment the major version to indicate a breaking change.

                                                                                          So, let’s consider Django, because I know that well (as mentioned above). Typically Django does a feature release (minor version bump) every 8 months or so, and every third one bumps the major version and completes a deprecation cycle. So right now Django 3.1 is the latest release; next will be 3.2 (every X.2 is an LTS), then 4.0.

                                                                                          And the support matrix consists of the most recent feature release (full bugfix and security support), the one before that (security support only), and usually one LTS (but there’s a period at the end of each where two of them overlap). The policy is that if you run on a given LTS with no deprecation warnings issued from your code, you’re good to upgrade to the next (which will be a major version bump; for example, if you’re on 2.2 LTS right now, your next LTS will be 3.2).

                                                                                          But… what happens when a bug is found in an LTS that can’t be fixed in a backwards-compatible way? Especially a security issue? “Support for that LTS is cut off effective immediately, everybody upgrade across a major version right now” is a non-starter, but is what you propose as the correct answer. The only option is to break SemVer and do the backwards-incompatible change as a bugfix release of the LTS. Which then leads to “why don’t you follow SemVer” complaints. Well, because following SemVer would actually be worse for users than this option is.

                                                                                          1. 3

                                                                                            But… what happens when a bug is found in an LTS that can’t be fixed in a backwards-compatible way?

                                                                                            Why do people run an LTS version, if not to be able to avoid worrying about it as a dependency? If you’re making incompatible changes: forget about semver, you’re breaking the LTS contract, and you may as well drop the LTS tag and tell people to run the latest.

                                                                                            1. 1

                                                                                              you may as well drop the LTS tag and tell people to run the latest

                                                                                              I can think of only a couple instances in the history of Django where it happened that a security issue couldn’t be fixed in a completely backwards-compatible way. Minimizing the breakage for people – by shipping the fix into supported releases – was the best available option. It’s also completely incompatible with SemVer, and is a great example of why SemVer is at best a “nice in theory, fails in practice” idea.

                                                                                              1. 3

                                                                                                Why not just tell them to upgrade? After all, your argument is essentially that stable APIs are impossible, so why bother with LTS? Every argument against semver also applies against LTS releases.

                                                                                                1. 3

                                                                                                  After all, your argument is essentially that stable APIs are impossible

                                                                                                  My argument is that absolute perfect 100% binding commitment to never causing a change to observable behavior ever under any circumstance, unless also incrementing the major version at the same time and immediately dropping support for all users of previous versions, is not practicable in the real world, but is what SemVer requires. Not committing to SemVer gives flexibility to do things like long-term support releases, and generally people have been quite happy with them and also accepting of the single-digit number of times something had to change to fix a security issue.

                                                                                            2. 2

                                                                                              “Support for that LTS is cut off effective immediately, everybody upgrade across a major version right now” is a non-starter

                                                                                              If it’s a non-starter then nobody should be getting the critical security patch. You’re upgrading from 2.2 to 3.0 and calling it 2.2.1 instead. That doesn’t change the fact that a breaking change happened and you didn’t bump the major version number.

                                                                                              You can’t issue promises like “2.2.X will have long term support” because that’s akin to knowing the future. Use a codename or something.

                                                                                              1. 7

                                                                                                It’s pretty clear you’re committed to perfect technical adherence to a rule, without really giving consideration to why the rule exists. Especially if you’re at the point of “don’t commit to supporting things, because supporting things leads to breaking SemVer”.

                                                                                                1. 4

                                                                                                  They should probably use something like SemVer but with four parts, e.g. Feature.Major.Minor.Patch

                                                                                                  • Feature version changes -> We’ve made significant changes / a new release (considered breaking)
                                                                                                  • Major version change -> We’ve made breaking changes
                                                                                                  • Minor version change -> Non breaking new features
                                                                                                  • Patch version change -> Other non-breaking changes

                                                                                                  That way 2.*.*.* could be an LTS release, which would only get bug fixes, but if there was an unavoidable breaking change to fix a bug, you’d signal this in the version by e.g. going from 2.0.5.12 to 2.1.0.0. Users will have to deal with the breaking changes required to fix the bug, but they don’t have to deal with all the other major changes which have gone into the next ‘Feature’ release, 3.*.*.*. The promise that 2.*.*.*, as an LTS, will get bug fixes is honored. The promise that the major version must change on a breaking change is also honored.
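
                                                                                                  (The ordering such a scheme implies is just lexicographic tuple comparison, e.g.:)

                                                                                                  # Feature.Major.Minor.Patch versions order correctly as plain tuples:
                                                                                                  lts_before = (2, 0, 5, 12)            # 2.0.5.12, the current LTS release
                                                                                                  lts_after  = (2, 1, 0, 0)             # 2.1.0.0, the unavoidable breaking fix
                                                                                                  assert lts_after > lts_before         # users see a "major" bump...
                                                                                                  assert lts_after[0] == lts_before[0]  # ...while staying on the 2.x LTS line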

                                                                                                  SemVer doesn’t work if you try to imbue the numbers with additional meanings that can contradict the SemVer meanings.

                                                                                                  1. 3

                                                                                                    This scheme is very similar to Haskell’s Package Versioning Policy (PVP).

                                                                                                  2. 1

                                                                                                    I’m saying supporting things and adhering to SemVer should be orthogonal.

                                                                                            3. 5

                                                                                              every change potentially alters the observable behavior of the software

                                                                                              This is trivially false. Adding a new helper function to a module, for example, will never break backwards compatibility.

                                                                                              In contrast, changing a function’s input or output type is always a breaking change.

                                                                                              By failing to even attempt to distinguish between non-breaking and breaking changes, you’re offloading work onto the package’s users.

                                                                                              Optimize for what should be the common case: non-breaking changes.

                                                                                              Edit: to expand on this, examples abound in the Python ecosystem of unnecessary and intentional breaking changes in “minor” releases. Take a look at the numpy release notes for plenty of examples.

                                                                                              1. 7

                                                                                                Python’s dynamic nature makes “adding a helper function” a potentially breaking change. What if someone was querying, say, all definitions of a module and relying on the length somehow? I know this is a bit of a stretch, but it is possible that such a change would break code. I still value semver though.
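
                                                                                                (For instance, downstream code like the following, where some_module is a hypothetical dependency, breaks when a helper is added:)

                                                                                                import inspect
                                                                                                import some_module  # hypothetical third-party dependency

                                                                                                # Breaks the moment some_module gains one new helper function,
                                                                                                # even though no existing function changed at all:
                                                                                                functions = inspect.getmembers(some_module, inspect.isfunction)
                                                                                                assert len(functions) == 3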

                                                                                                1. 3

                                                                                                  The number of definitions in a module is not a public API. SemVer only applies to public APIs.

                                                                                                  1. 4

                                                                                                    If you can access it at run-time, then someone will depend on it, and it’s a bit late to call it “not public”. Blame Python for exposing stuff like the call stack to introspection.

                                                                                                    1. 2

                                                                                                      Eh no? SemVer is very clear about this. Public API is whatever software declares it to be. Undeclared things can’t be public API, by definition.

                                                                                                      1. 7

                                                                                                        Python has no concept of public vs private. It’s all there all the time. As they say in python land, “We’re all consenting adults here”.

                                                                                                        I’m sure, by the way, when Hettinger coined that phrase he didn’t purposely leave out those under the age of 18. Language is hard. :P
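
                                                                                                        Even the underscore prefix is only a convention; nothing stops a caller from relying on a “private” name:

                                                                                                        class Vault:
                                                                                                            def _internal(self):      # "private" by convention only
                                                                                                                return "secret"

                                                                                                        print(Vault()._internal())    # works fine; Python won't stop anyone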

                                                                                                2. 1

                                                                                                  Adding a new helper function to a module, for example, will never break backwards compatibility.

                                                                                                  Does this comic describe a violation of SemVer?

                                                                                                  You seriously never know what kinds of things people might be relying on, and a mere definition of compatibility in terms of input and output types is woefully insufficient to capture the things people will expect in terms of backwards compatibility.

                                                                                                  1. 6

                                                                                                    No, it does not describe a violation of SemVer, because spacebar heating is not a public API. SemVer is very clear about this. You are right that people will still complain about backward compatibility even if you are keeping 100% correct SemVer.

                                                                                              2. 6

                                                                                                I would agree if violations were rare. Every time I’ve tried to solve dependency issues on Python, about 75% of the packages I look into have broken semver on some level. Granted, I probably have a biased sampling technique, but I find it extremely hard to believe that it’s a rare issue.

                                                                                                Backwards compatibility is hard to reason about, and the skill is by no means pervasive. Even having a lot of experience looking for compatibility breaks, I still let things slip, because it can be hard to detect. One of my gripes with semver is that it doesn’t scale. It assumes that tens of thousands of open source devs with no common training program or management structure all understand what a backwards breaking change is, and how to fix it.

                                                                                                Testing for compatibility breaks is rare. I can’t think of any Python frameworks that help here. Nor can I think of any other languages that address this (Erlang might, but I haven’t worked with it first-hand). The most likely projects to test for compatibility between releases are those that manage data on disk or network packets. Even among those, many rely on code & design review to spot issues.

                                                                                                It communicates that you value backwards compatibility and intend to maintain it as much as is reasonably possible, and that you will only knowingly break backwards compatibility on major release increments.

                                                                                                It’s more likely that current package managers force you into semver regardless if you understand how it’s supposed to be used. The “statement of values” angle is appealing, but without much evidence. Semver is merely popular.

                                                                                                1. 7

                                                                                                  I guess this depends on a specific ecosystem? Rust projects use a lot of dependencies, all those deps use semver, and, in practice, issues rarely arise. This I think is a combination of:

                                                                                                  • the fact that semver is the only option in Rust
                                                                                                  • the combination of guideline to not commit Cargo.lock for libraries + cargo picking maximal versions by default. This way, accidental incompatibilities are quickly discovered & packages are yanked.
                                                                                                  • the guideline to commit Cargo.lock for binaries and otherwise final artifacts: that way folks who use Rust and who have the most of deps are shielded from incompatible updates.
                                                                                                  • the fact that “library” is a first-class language construct (crate) and not merely a package manager convention + associated visibility rules makes it easier to distinguish between public & private API.
                                                                                                  • Built-in support for writing test from the outside, as-if you are consumer of the library, which also catches semver-incompatible changes.

                                                                                                  This is not to say that semver issues do not happen, just that they are rare enough. I’ve worked with Rust projects with 200–500 different deps, and didn’t perceive semver breakage as being a problem.

                                                                                                  1. 5

                                                                                                    I would add that the Rust type system is expressive enough that many backwards incompatible changes require type signature changes which are much more obvious than violations of some implicit contract.

                                                                                                2. 6

                                                                                                  I don’t think I have a naïve view of versioning; putting on my professional hat here, I have a decade of experience dealing with a dependency modeling system that handles the versions of hundreds of thousands of interrelated software artifacts that are versioned more or less independently of each other, across dozens of programming languages and runtimes. So… some experience here.

                                                                                                  In all of this time, I’ve seen every single kind of breaking change I could imagine beforehand, and many I could not. They occurred independent of how the vendor of the code thought of it; a vendor of a versioned library might think that their change is minor, or even just a non-impacting patch, but outside of pure README changes, it turns out that they can definitely be wrong. They certainly had good intentions to communicate the nature of the change, but that intention can run hard into reality. In the end, the only way to be sure is to pin your dependencies, all the way down, and to test assiduously. And then upgrade them frequently, intentionally, and on a cadence that you can manage.

                                                                                                  1. 1

                                                                                                    I don’t think I have a naïve view of versioning; putting on my professional hat here, I have a decade of experience dealing with …

                                                                                                    Hear, hear. My experience isn’t exactly like @offby1’s, but I can vouch for the rest.

                                                                                                  2. 4

                                                                                                    to be either dangerously naive or outright dishonest

                                                                                                    This phrase gets bandied around the internet so much I’m surprised it’s not a meme.

                                                                                                    SemVer is … okay, but you make it sound like lives depend on it. There’s a lot of software running mission critical systems without using SemVer and people aren’t dying everyday because of it. I think we can calm down.

                                                                                                3. 3

                                                                                                  That’s the problem with Python’s package management being so old. Back then semantic versioning wasn’t that common, and it never really caught on. In my opinion the PyPA should make a push to get more packages to use semantic versioning. I’m seeing this trend already, but it’s too slow…

                                                                                                1. 3

                                                                                                  It was only a few months ago that I learned all Python integers are heavyweight, heap-allocated objects. This kind of boggled my mind. This post says that small ints in the 8-bit range are cached, but that’s not much of an optimization.

                                                                                                  Anyone know why Python never started using pointer-tagging for ints, the way tons of other dynamic languages (from LISP to Smalltalk to Lua) do?

                                                                                                  (Fun fact: Smalltalk-80 used pointer-tagging for ints up to +/-16384, and allocated objects for bigger ones. This was really apparent in the GUI: the text view would start to massively slow down as soon as the length of the text exceeded 16KB, because the layout and display would start allocating more and more numbers.)
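
                                                                                                  (You can poke at CPython’s cache from a REPL; it actually spans -5 through 256, and the objects are full-sized either way:)

                                                                                                  >>> a = 256
                                                                                                  >>> b = 256
                                                                                                  >>> a is b            # ints from -5 through 256 are cached singletons
                                                                                                  True
                                                                                                  >>> a = 300
                                                                                                  >>> b = 300
                                                                                                  >>> a is b            # anything outside that range is a fresh heap object
                                                                                                  False
                                                                                                  >>> import sys
                                                                                                  >>> sys.getsizeof(1)  # even 1 is a full object: 28 bytes on a 64-bit build
                                                                                                  28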

                                                                                                  1. 6

                                                                                                    Virtually everything in Python is a “heavyweight” object, and neither ultra-low memory use nor blazing-fast bare-metal performance are particularly important to the average Python user. So there’s no real incentive to complicate the implementation with multiple differently-sized or differently-allocated backing types for int. In fact, Python moved away from that. In the old old days, Python had two integer types: one could only handle 32-bit values, and the other was a bignum. You could request the bignum with a suffix on an integer literal – 42 versus 42L – but despite some effort they were never quite interchangeable. The unification to a single (bignum) integer type began in Python 2.2 (released 2001) and now the old L suffix is a syntax error.
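
                                                                                                    These days there is just the one arbitrary-precision type, and the old suffix no longer parses:

                                                                                                    >>> type(2 ** 100)    # a single int type with bignum semantics built in
                                                                                                    <class 'int'>
                                                                                                    >>> 2 ** 100
                                                                                                    1267650600228229401496703205376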

                                                                                                    1. 1

                                                                                                      neither ultra-low memory use nor blazing-fast bare-metal performance are particularly important to the average Python user

                                                                                                      They’re not THAT important, but they would sure be nice. It would probably result in me writing more Python and less, say, Rust.

                                                                                                    2. 2

                                                                                                      If you launch an interpreter and use sys.getrefcount, you can see that the first 10 numbers already sum up to over 800 references (Windows, Python 3.9.1). I’m guessing it does help, or it probably wouldn’t be there.
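
                                                                                                      (Easy to reproduce; the exact count varies by Python version and platform:)

                                                                                                      import sys

                                                                                                      # References the freshly started interpreter already holds
                                                                                                      # to the cached small-int objects 0 through 9:
                                                                                                      print(sum(sys.getrefcount(i) for i in range(10)))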

                                                                                                    1. 134

                                                                                                      For those who are curious about what’s going on and lack context…

                                                                                                      Let’s start with a quick refresher on Python packaging. When you use Python’s own default toolchain to package up and distribute some code, you can produce either or both of two types of artifact: a “source distribution”, a .tar.gz archive whose installation on the target system may involve a build step and thus require the target system to have the requisite compiler toolchain and non-Python dependencies (such as C libraries to link against); or a “binary distribution”, a file with the extension .whl [note 1] that contains everything already compiled and packed up for a specific target operating system and CPU architecture, so that installation only requires unpacking the .whl and putting files in the correct location.
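
                                                                                                      (A wheel’s filename encodes the interpreter, ABI, and platform it targets. As a sketch, the third-party packaging library, if installed, can list the tags the running interpreter will accept:)

                                                                                                      from packaging import tags

                                                                                                      # Most specific (most preferred) tags come first:
                                                                                                      for tag in list(tags.sys_tags())[:3]:
                                                                                                          print(tag)  # e.g. cp39-cp39-manylinux_2_17_x86_64 on a Linux build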

                                                                                                      And although Python technically exposes only a C API, that API is robust enough, and other languages’ interop with C is similarly robust enough, that it’s possible to have Python use compiled extensions that communicate with a variety of other languages. The numeric/scientific Python stack, for example, interfaces with linear-algebra libraries which are written in Fortran (and thus building from source requires a Fortran compiler; it’s much nicer to install from a precompiled .whl).

                                                                                                      So. This library – which is not part of the Python standard library, and also not required for common tasks like speaking HTTPS or writing scripts that will control other machines through SSH – had some compiled extensions which were written in C. Now, the library is in the process of moving the extensions from C to Rust. The latest release builds the Rust by default but does not have a hard dependency on it (you can disable with a build-time environment flag). The next release will have a hard dependency on it, and as such will require the now-ported-to-Rust extension code in order to work at all.

                                                                                                      There appear to be two groups affected by this.

                                                                                                      One group is people who were building Python applications in containers based on Alpine Linux. Python’s .whl format supports precompiled packages for Linux distros, but is based on a lowest-common-denominator definition of “Linux” that Alpine – with its alternative libc – does not meet. As a result, people using Alpine as their base image must build from source, and apparently only Alpine 3.13 and later have a sufficiently recent Rust to be able to do this. Inserting an aside of my own recommendation: in general, if you are looking to containerize Python applications, Alpine is a poor choice of base image. This article does a good job explaining why, but the tl;dr is the alleged benefits of Alpine mostly manifest when using other languages like Go, and actually become downsides when doing Python (since now you get to compile the world every time you rely on a Python package with an extension written in C or other non-Python language).

                                                                                                      The other group is people who are on certain types of systems that are unsupported by Rust itself, and thus cannot build the ported-to-Rust extension modules now used in this package.

                                                                                                      Now, that out of the way, I think the Cryptography team made the right choice. Their options are to stay in C and keep the attendant risks that brings, thus exposing every user to nebulous but likely future security issues, or switch to something memory-safe but low-level and performant enough to be a reasonable replacement, but at the cost of making Alpine users’ lives a bit harder and leaving a much smaller number of users completely unsupported. For a security-critical library which treats security as its first priority, I think this is an unpleasant (because nobody ever likes cutting someone else off, and because it generates heated complaint threads) but not particularly difficult choice to make.

                                                                                                      Side arguments about “$THING would have fixed this” where $THING is one of “alternative versioning scheme”, “alternative/additional announcement channel”, and so on also don’t really work for reasons that become clear after reading the GitHub thread.


                                                                                                      [note 1] This is a tortured joke. Python was named after Monty Python, and much of its early documentation contained references to Monty Python sketches. The original name of the Python Package Index was the Python Cheese Shop, in homage to one such sketch. Many types of cheese are distributed in large round blocks called wheels, so this type of Python package is called a “wheel” and has extension .whl.

                                                                                                      1. 8

                                                                                                        For those ecosystems without Rust, might something like mrustc be an option? It’d be a somewhat torturous path, but transpile Rust to C, then you could use the Python C bindings …

                                                                                                        Generalised, this could be a good option for dealing with Rust dependencies on a wide range of languages.

                                                                                                        1. 7

                                                                                                          This actually works pretty well. Someone brought up the Rust ESP8266 ecosystem, which uses mrustc.

                                                                                                        2. 11

                                                                                                          I really appreciate the in-depth write-up. Great stuff.

                                                                                                          1. 1

                                                                                                            Wouldn’t it be possible to ship with Rust by default, but also accept a C fallback for cases where Rust is not (easily) available? I agree that in principle Rust is probably the better choice, but it’s not like the C code has become insecure or “bad” all of a sudden, and practicality beats purity IMO.

This would meet both needs; it was mentioned in passing in one comment, but probably got lost in all the noise/rather off-topic complaining/ranting. The biggest downside is the increased maintenance burden, but I’d just offer an “if you want it, you can maintain it” and/or an “if you want it, this is how much it will cost”. If no one (or too few people) takes up the offer, then it probably wasn’t all that important after all.
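For illustration only, a dual-backend build along those lines might look roughly like this in a setup.py, assuming setuptools-rust for the Rust path; the USE_C_FALLBACK switch, the package name, and the file layout are all invented for the sketch:

    # Hypothetical sketch: build the legacy C backend when an (invented)
    # USE_C_FALLBACK flag is set, otherwise build the Rust extension.
    import os
    from setuptools import setup, Extension

    if os.environ.get("USE_C_FALLBACK") == "1":
        # Legacy C backend, kept alive only for platforms without Rust.
        ext_modules = [Extension("pkg._backend", sources=["src/backend.c"])]
        rust_extensions = []
    else:
        from setuptools_rust import Binding, RustExtension
        ext_modules = []
        rust_extensions = [RustExtension("pkg._backend", path="Cargo.toml",
                                         binding=Binding.PyO3)]

    setup(
        name="pkg",
        ext_modules=ext_modules,
        rust_extensions=rust_extensions,
    )

The “if you want it, you can maintain it” part is then mostly a matter of who keeps src/backend.c compiling.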

                                                                                                            1. 2

Practically speaking, if there is enough interest, somebody will fork it to keep the C backend alive. The level of disgruntlement alone will drive that along for a while; long term, I don’t think it will carry through. In some cases it makes more sense to just fix the Rust toolchain to support cross-compiling to your favorite architecture. Once a few more platforms take that initiative, the will to maintain a legacy fork will die off. Better for the health of the main project to be free of the burden early rather than late.

                                                                                                              1. 1

                                                                                                                The problem with a fork is that it won’t fix cases where this library is an indirect dependency (i.e. my app imports libfoo which imports this). I’m not sure if Python provides a good way to override this?

                                                                                                                it makes more sense to just fix the Rust toolchain to support cross compiling to your favorite architecture

                                                                                                                Yeah, this would be the best long-term solution, but also not something that will be done overnight. I’m just trying to think of a solution that will fix the problems of today, so that there’s time to fix the longer term issues without too much friction.

                                                                                                          1. 64

I find Docker funny, because it’s an admission of defeat: portability is a lie, and dependencies are unmanageable. Installing dependencies on your own OS is a lost battle, so you install a whole new OS instead. The OS is too fragile to be changed, so a complete reinstall is now a natural part of the workflow. It’s “works on my machine” taken to its logical conclusion: you just ship the machine.

                                                                                                            1. 17

                                                                                                              We got here because dependency management for C libraries is terrible and completely inadequate for today’s development practices. I also think Docker is a bit overkill, but I don’t think this situation can be remedied with anything short of NixOS or unikernels.

                                                                                                              1. 8

I place more of the blame on just how bad dynamic-language packaging is (pip, npm), intersected with how badly most distributions butcher their native packages for those same dynamic languages. The rule of thumb in several communities seems to be a recommendation to avoid using native packages altogether.

                                                                                                                Imagine if instead static compilation was more common (or even just better packaging norms for most languages), and if we had better OS level sandboxing support!

                                                                                                                1. 3

                                                                                                                  Can you explain what you find bad about pip/npm packaging?

                                                                                                                  1. 2

I don’t think npm is problematic to Docker levels. It has always supported project-specific dependencies.

                                                                                                                    Python OTOH is old enough that by default (if you don’t patch it with pipenv) it expects to use a shared system-global directory for all dependencies. This setup made sense when hard drive space was precious and computers were off-line. Plus the whole v2/v3 thing happened.

                                                                                                                    1. 5

                                                                                                                      by default (if you don’t patch it with pipenv)

                                                                                                                      pipenv is…controversial.

It also is not the sole way to accomplish what you want: isolated environments, which are called “virtual environments” in Python. pipenv does not itself provide those; it provides a hopefully-more-convenient interface to the tool that actually does.
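For reference, the thing that actually provides the isolation has shipped in the standard library as the venv module since Python 3.3; a minimal sketch of using it directly, no pipenv involved:

    # Create an isolated "virtual environment" with the stdlib venv module,
    # then install into it using the environment's own interpreter.
    import subprocess
    import venv

    venv.create("env", with_pip=True)  # ./env gets a private site-packages

    # Installing via the environment's python keeps packages out of the
    # system-global directory (the path is env\Scripts\python.exe on Windows).
    subprocess.run(["env/bin/python", "-m", "pip", "install", "requests"],
                   check=True)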

                                                                                                                2. 4

Yes, unikernels and “OS as static lib” seem like the sensible way forward from here to me as well. I don’t know why the idea never caught on.

                                                                                                                  1. 4

People with way more experience than me on the subject have made a strong point about debuggability. Also, existing software and libraries assume that a filesystem and other facilities not immediately available on unikernels will be there, and rewriting them to be reusable on unikernels is not an easy task. I’m also not sure about the state of the tooling for deploying unikernels.

Right now it’s an uphill battle, but I think we’re just a couple of years away; we’ll get there eventually.

                                                                                                                    1. 6

                                                                                                                      Painfully easy to debug with GDB: https://nanovms.com/dev/tutorials/debugging-nanos-unikernels-with-gdb-and-ops - Bryan is full of FUD

                                                                                                                      1. 4

                                                                                                                        GDB being there is great!

Now you might also want lsof, netstat, strace, iostat, ltrace… all the tools which exist for telling you what’s going on at the application-to-kernel interface are now gone, because the application is the kernel. Those interfaces are subroutine calls or queues instead.

                                                                                                                        It’s not insurmountable but you do need to recreate all of these things, no? And they won’t be identical to what people are used to.

                                                                                                                        I guess the upside is that making dtrace or an analogue of it in unikernel land is prolly easier than it was in split kernel userspace land: there’s only one address space in which you need to hot patch code. :)

                                                                                                                        1. 2

Perhaps some tools you’d put in as plugins, but most of the output from these tools would be better off exported through whatever you want to use for observability (such as Prometheus). One thing that confuses a ton of people is that they expect to be dealing with a full-blown general-purpose operating system, which this isn’t. Take your lsof example: suppose I’m trying to figure out which port is tied to which process; well, in this case you already know, because there’s only one.

As for things like strace: we actually already implemented something similar a year or so ago, as it was vital to figure out what applications were doing. We also have ftrace-like functionality.

Finally, as for tool parity: you’re right that if all you’re using is Linux, everything should be relatively the same; but if you jump between, say, macOS and Linux, you’ll find quite a few different flags and names.

                                                                                                                          1. 2

It obviously wouldn’t be “identical to what people are used to”, though; that’s kind of the point. And you don’t want a narrow slice of a full Linux system with just the syscalls you use compiled in. It’d be a completely different and much simpler system, designed without having to constantly jump up and down between privilege levels, which would make a regular debugger a lot more effective at tracking a wider array of things than it can now while living in the user layer of a full OS.

                                                                                                                    2. 1

Can you clarify further? With your distribution’s package manager and pkg-config, development in C and C++ seems fine. I could see Docker being more of a thing on Windows with C libraries, because package management isn’t really a thing on that OS (although msys seems to have pacman, which is nice). Also, wouldn’t you use the same C library dependency management inside the container?

                                                                                                                      Funny enough, we are using docker at work for non-C languages (dotnet/mono).

                                                                                                                    3. 6

That’s exactly what I said at work when we began Dockerization of our services. “We just concluded that dependency management is impossible, so we may as well hermetically seal everything into a container.” It’s sad that we’re here, but there are several reasons, both technical and business-related, why I see containerization as being useful for us at $WORK.

                                                                                                                      1. 5

Which is what we used to do back in the ’70s and ’80s. Then operating systems started providing a common set of interfaces so you could run multiple programs safe from each other (in theory). Then too many holes started opening up, programs began relying on specific global shared libs/state which would clash, and on too many assumptions about global filesystem layout, and now we’ve got yet another re-implementation of the original idea, just stacked atop and wrapped around the old: crud piling up around us, composed of yak hair, old buffer overflows, and decisions made when megabytes of storage were our most precious resource.

                                                                                                                        1. 1

What if I told you that you don’t need an OS at all in your Docker container? You can, and probably should, strip it down to the minimal dependencies required.

                                                                                                                          1. -1

                                                                                                                            This is amazing insight. Wow. :O Saving and sending this.

                                                                                                                              1. 1

                                                                                                                                Thanks for the laugh :’)

                                                                                                                          1. 2

                                                                                                                            Yeah, why would you ever expect a package manager to expose a package you were supposed to use?

                                                                                                                            1. 2

The problem here is that they’re using a package manager to install a language, then using a hacky tool (virtualenv) which symlinks binaries that are dynamically linked against libraries provided by that package manager, then installing and linking other libraries using an entirely different package manager of which the original package manager is completely oblivious (pip inside the virtualenv), and then upgrading the underlying package against which the extension libraries were linked.

                                                                                                                              Of course this is going to break! In the extreme, think of this like installing Python 2, installing some libraries with pip and then upgrading to Python 3, and expecting those pip packages to just work without first rebuilding them against the new Python.

                                                                                                                              The real solution is to either do everything through the package manager (which means you’re stuck with whatever versions of Python libraries are provided by homebrew) or nothing (which means installing Python with some other tool).

                                                                                                                              edit: After re-reading, it seems to me Homebrew supports having multiple Pythons installed at the same time. So I guess the real problem is that it doesn’t have a mechanism to remember which Pythons you manually installed and want to keep. Or maybe it does and people are just relying on Python being installed through whatever other package and are not explicitly installing a specific version they want to rely on.

                                                                                                                              1. 3

I think you’re missing an important part of my article. The workflow you described worked very well for years, right up until Homebrew introduced two changes that caused everything to break, most notably the automatic cleanup that deletes previous package versions.

                                                                                                                                To me, the real problem is that Homebrew does not mention this on their Homebrew and Python page. They could say, “Don’t rely on Homebrew Python versions, which can disappear at any time. You might be better off using asdf, pyenv, or Pythonz tooling to install and manage Python interpreter versions.”

                                                                                                                                1. 2

                                                                                                                                  To me, the real problem is that Homebrew does not mention this on their Homebrew and Python page.

                                                                                                                                  This would be a very helpful addition to the page. Have they refused the patch, or just not applied it yet?

                                                                                                                                  1. 1

                                                                                                                                    Fair enough; they broke something which used to work. Maybe it worked “by accident” in their vision, but if so many people relied on this accidental behaviour, it would behoove them to clearly inform users about it.

                                                                                                                                    Note that I was more responding to the parent comment than your post though.

                                                                                                                                  2. 3

                                                                                                                                    Think of this like installing Python 2, installing some libraries with pip then upgrading to Python 3.

                                                                                                                                    The problem here is that Homebrew decided to upgrade, not the user. If I upgraded to Python 3, I wouldn’t expect anything to work, but I didn’t - Homebrew did it for me without asking.

                                                                                                                                    But what I’m really complaining about is the way the article describes it as “a misunderstanding” like users should know better. It’s a package manager. Exposing packages under obvious names that the user is not supposed to install is a horrible design failure, and I don’t like the way the article blames the user.

                                                                                                                                    1. 4

                                                                                                                                      I don’t like the way the article blames the user.

                                                                                                                                      Article author here. Thanks for the constructive feedback. My initial reaction to this was, “Huh?? Did we read the same article?” Then I re-read it with your perspective in mind. And now I see your point. Allow me to explain how we got here. Short version: I was trying to be kind.

I actually wrote this article last summer, soon after Homebrew’s changes started wreaking havoc. Around that time, a user of VirtualFish (a virtual environment manager for Fish shell that I maintain) encountered the same infuriating problem and posted an issue about it in the VirtualFish repository tracker. I responded by posting a long rant venting my frustration with how poorly the transition to the new Homebrew behavior was managed. In that comment you will see the basis upon which my article was written, which I purposely softened so as to respect the hard work that volunteer Homebrew maintainers put into the project, for free.

                                                                                                                                      I fear that in the process of toning down my frustration, I inadvertently left out the part about Homebrew’s culpability in neglecting to inform users about its suitability for Python development, leaving the opportunity for folks to walk away with the impression that I think the fault lies with users and their inability to understand. That was absolutely not my intention — quite the opposite. Mea culpa.

                                                                                                                                      I hope that after reading my above-linked screed, you will see that I do not hold users accountable here. Perhaps I should amend the article to make that clearer. Thanks again for communicating your perspective, which will help me express myself better in the future.

                                                                                                                                      1. 1

                                                                                                                                        Well that makes sense. Thanks for the explanation!

                                                                                                                                    2. 1

                                                                                                                                      There is still one way that it can break things even if you used another tool like pyenv to install the Python interpreters you actually use: pyenv compiles the interpreter locally, linking against the set of shared libraries you have at the time. Which doesn’t matter much for Python, except in the case of OpenSSL. Sometimes, Homebrew will “helpfully” clean up an older OpenSSL that some pyenv-compiled interpreters linked against, and suddenly those interpreters fail.

                                                                                                                                      1. 1

                                                                                                                                        Can’t you tell pyenv to link Python against the system-provided OpenSSL though? Or is that so hopelessly out of date that it’s unusable?

                                                                                                                                        1. 1

                                                                                                                                          I don’t know how it is now. I know at one point you didn’t really have a choice because the system OpenSSL was too old for things like pip to even be able to speak to the Python Package Index.

                                                                                                                                  1. 1

I used pyenv for a minute, but it seemed to screw up a bunch of other stuff: specifically, IPython complained about running in a virtual environment, but there were other, mysterious problems as well. Maybe if I’d used it through asdf’s consistent frontend I wouldn’t have made whatever mistake it was that caused my entire Python toolchain to explode and become unusable.

                                                                                                                                    1. 3

                                                                                                                                      IPython complained about running in a virtual environment

The message IPython gives is not well-worded.

                                                                                                                                      The thing it’s trying to warn about is a case where the instance of IPython that’s running currently does not belong to the active virtual environment. IPython will do some gymnastics to make the virtual environment’s contents visible for you to import, but won’t be using the virtual environment’s Python interpreter, and won’t properly isolate what it can see or the side effects of what it’s doing. So the safe thing, and what it’s trying to suggest, is to create the virtual environment, activate it, and then pip install ipython so that the IPython you get when the virtual environment is active is one tied to that virtual environment.
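Roughly, the mismatch it warns about looks something like this; this is a sketch, and the exact check IPython performs differs:

    # Rough sketch of the situation being warned about: a virtualenv is
    # active (VIRTUAL_ENV is set) but this interpreter lives elsewhere.
    import os
    import sys

    active = os.environ.get("VIRTUAL_ENV")
    if active and not sys.prefix.startswith(active):
        print("warning: running an IPython that does not belong to "
              "the active virtual environment")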

                                                                                                                                      1. 1

                                                                                                                                        Yeah, see, that’s what I was doing. I looked it up at one point and IIRC there was a bug or something where it interpreted the pyenv environment as a virtual environment so it thought you were trying to run a virtual environment inside another virtual environment. There didn’t seem to be a reasonable solution at the time, and pyenv also seemed to cause some other problems I don’t recall specifically, so I gave up with the intention of trying again in the future.

                                                                                                                                    1. 18

                                                                                                                                      I’m increasingly coming around to the conclusion that there’s no such thing as a functional programming language; only programs that are written in a more or less functional style. A language can offer features that encourage or discourage the writing of functional programs, and when people say “language X is a functional language” what they mean is that it encourages the writing of functional programs.

                                                                                                                                      That said, any language with statements and no repl is difficult for me to describe as functional.

                                                                                                                                      1. 7

                                                                                                                                        I’ve joked before, but it’s somewhat true, that there’s really a hierarchy of “functional” which depends entirely on which other languages you sneer at for being insufficiently “functional” and which you sneer at for having uselessly gone too far in pursuit of purity.

                                                                                                                                        Like, the base level is you join the Church, you renounce von Neumann and all his works. There’s a level where you sneer at any poor pleb whose language doesn’t guarantee tail-call optimization. There’s a level where you sneer at any poor pleb whose language is based on untyped lambda calculus. There’s a level where you sneer at any poor pleb whose language doesn’t support fully monoiconic dyads bound over the closed field of a homomorphic Bourbaki trinoid. And at each level you also sneer at the people “above” you for going overboard and forgetting that programming needs to be practical, too.

                                                                                                                                        1. 4

                                                                                                                                          This is how I teach paradigms at the start of my Rust course! Programming paradigms are about a mental model for computation (“do I think of this in terms of functions / objects / logical relations / blocks / etc.”), and language support for a paradigm is about how well you can translate that mental model into code in that language. People sometimes don’t like this definition because it’s subjective, but it avoids the problems with defining paradigms extensionally or intensionally.

                                                                                                                                          If you define them extensionally, you’ll find that no one can agree on what the extent is. “Hey you left X language out of the functional list!” “Yes it isn’t functional” “But it is!” and so on.

                                                                                                                                          If you define them intensionally, you’ll find that no one can agree on what features constitute the necessary and sufficient conditions for inclusion in the set. “Functional programming requires automatic currying!” “No it doesn’t, but it does require partial function application!” “You’re both wrong, but it does require a powerful type system!” “What does ‘powerful’ mean?” and so on.

                                                                                                                                          So instead you say “well when I think of my program in terms of functions, I find that I can write the code that matches my mental model really easily in X language, so I say it’s functional for me!”

Honestly, part of why I like this is that I think it helps us get away from an endless and unsolvable definitional fight and into the more interesting questions of how features intersect to increase or decrease comfort with common mental models.

                                                                                                                                          1. 2

Honestly, part of why I like this is that I think it helps us get away from an endless and unsolvable definitional fight and into the more interesting questions of how features intersect to increase or decrease comfort with common mental models.

                                                                                                                                            I love how the other comments in this thread back this up. People are arguing over “no, a functional language must have X” / “no, it means Y” and they’re never going to agree with each other. Just acknowledge that fact and move on!

                                                                                                                                          2. 3

For the repl: give https://github.com/google/evcxr/blob/master/evcxr_repl a chance. I just became aware of that project recently (the fact that they have a standalone repl) and have not yet tried to push it. It certainly appears not as full-featured as dynamic languages or ghci, but it should be good enough to make an inferior-lisp mode out of it.

                                                                                                                                            1. 11

Thanks, but if the language doesn’t have a repl built in, it’s a good sign that its creators don’t value any of the same things I value in a language, so I don’t think Rust would be a good fit for me.

                                                                                                                                              1. 3

                                                                                                                                                Never change, Technomancy ❤️

                                                                                                                                            2. 3

                                                                                                                                              This is absolutely true, IMO. And the same can be said with OOP.

                                                                                                                                              A “functional language” would really be any language that strongly encourages a functional style or truly forbids an OOP style. I think Haskell and Elixir are pretty close to forbidding OOP programs.

                                                                                                                                              Likewise, an OOP language is one that strongly encourages an object oriented style. JavaScript is OO because even functions are objects and you can add behaviors and properties to any object.

                                                                                                                                              Etc, etc.

But I’m a bit confused by your comment about statements. Rust is pretty expression-oriented. if is an expression, match is an expression, loop is an expression that can yield a value via break, all functions return the final expression in their body as the return value, etc.

                                                                                                                                              1. 2

                                                                                                                                                I think Haskell and Elixir are pretty close to forbidding OOP programs.

                                                                                                                                                In what way do you see that? I guess it depends what you mean by “OOP” of course but Haskell has several powerful ways to do OOP-style message passing and encapsulation.

                                                                                                                                                1. 1

I’m sorry for the confusion. Everyone means something different when they say “OOP” (and “FP”). When I said OOP, I meant a style that revolves around “objects”. In my mind an object is something that has hidden, mutable state. A black box, if you will. You may call a method on an object (or send it a message), but you are not necessarily guaranteed the same response every time (think sending an HTTP request, an RNG; even a mutable Map/Dictionary is an object per my definition).
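To make that concrete, here’s a tiny sketch of an “object” in exactly this sense; it’s only an illustration of the definition above, not anything from a real codebase:

    import random

    # An "object" per the definition above: hidden mutable state behind a
    # method, so the same message need not get the same response twice.
    class Dice:
        def __init__(self, seed=None):
            self._rng = random.Random(seed)   # hidden, mutable state

        def roll(self):                       # the "message"
            return self._rng.randint(1, 6)    # response varies call to call

    d = Dice()
    print(d.roll(), d.roll())  # likely two different responses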

                                                                                                                                                  I’ve never used Haskell for serious work, so I could’ve been totally off base there. And, actually, I guess Elixir does have message passing between processes. I was only thinking of the code inside a process… So, I’m probably wrong on both counts!

                                                                                                                                                  1. 1

Just as an example, here’s one way to do mutable-state message passing in Haskell:

                                                                                                                                                    https://paste.sr.ht/~singpolyma/c618d894e7493d7197ef745035a8691d53e2a193

                                                                                                                                                    (This is an example and a real system would have to handle a few cases this does not.) In this case it’s a monitor (threaded and threadsafe) and you can’t do inheritance (but you can do composition).

                                                                                                                                                2. 1

                                                                                                                                                  A “functional language” would really be any language that strongly encourages a functional style or truly forbids an OOP style.

                                                                                                                                                  Gonna have to disagree here; whether something encourages object oriented programs or functional programs should be thought of as two orthogonal concerns that simply happen to be correlated in most widely-used languages. That said, “OOP” is such a poorly-defined term that I’m not even sure it’s worth spending any effort untangling this; IMO the term should be completely abandoned and more specific terms should be used in its place, like “message-passing”, “inheritance”, “encapsulation”, etc. (For instance, Elixir has great support for message passing and encapsulation, two cornerstones of what is often called “OOP”. No inheritance, but that’s great because inheritance is a mistake.)

                                                                                                                                                  But I’m a bit confused by your comment about statements.

                                                                                                                                                  I looked into it and … it IS confusing! Rust says that they “have statements” but what they really have is expressions that return Unit. Calling that a statement is pretty misleading IMO, because every other language I know that “has statements” means something completely different by it.

                                                                                                                                                  1. 2

                                                                                                                                                    That said, “OOP” is such a poorly-defined term that I’m not even sure it’s worth spending any effort untangling this; IMO the term should be completely abandoned and more specific terms should be used in its place, like “message-passing”, “inheritance”, “encapsulation”, etc. (For instance, Elixir has great support for message passing and encapsulation, two cornerstones of what is often called “OOP”. No inheritance, but that’s great because inheritance is a mistake.)

                                                                                                                                                    Yeah, it’s definitely poorly defined. And my examples of Haskell and Elixir were actually bad examples. When I think of OOP, I’m thinking about black boxes of (potentially) mutable state and “message passing” (which may just be calling methods). You can’t, in theory, expect to get the same “response” if you send multiple messages to an object.

                                                                                                                                                    As you said, Elixir is a good example of both FP and OOP. Kind of OOP in the large, FP in the small.

                                                                                                                                                    Apologies for the confusion.

                                                                                                                                                    I looked into it and … it IS confusing! Rust says that they “have statements” but what they really have is expressions that return Unit. Calling that a statement is pretty misleading IMO, because every other language I know that “has statements” means something completely different by it.

Yeah, it’s strange that the Rust docs would say they have statements… Maybe assignment is a statement? I don’t know. Most stuff in Rust is expressions, though, even the for loop (which always returns Unit/void/whatever). It’s a neat language. Definitely not (pure)-function-oriented, IMO, but really fun, ergonomic, and safe for a systems language.

                                                                                                                                                    I think the Rust docs also used to say that it’s not OOP. I think that’s wrong, too. It doesn’t have struct/class inheritance, but I think that you can get really far with an OOP style in Rust- struct fields are private by default; mutation is controlled via the borrow checker; traits are like type classes and if you write “object-safe” traits, you can pass trait objects around.

                                                                                                                                                3. 1

                                                                                                                                                  I’m increasingly coming around to the conclusion there’s no such thing as a functional programming language

The way I see it, a functional programming language is one that maintains referential transparency.

                                                                                                                                                  Lambda calculus, Haskell, Elm, Purescript, Futhark, Agda, Coq, and Idris are some examples.

                                                                                                                                                  Then, languages which don’t enforce referential transparency fall on the scale of “more” or “less” functional, based on how easy it is to write pure code, and how frequently it is done in practice.
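For what it’s worth, here’s a tiny Python illustration of what “maintains referential transparency” means; Python of course doesn’t enforce it, which is exactly why it sits further down that scale:

    # A pure call can be replaced by its result anywhere without changing
    # the program; a call that reads hidden mutable state cannot.
    def area(r):
        return 3.14159 * r * r  # pure: same input, same output

    _count = 0
    def tick():
        global _count
        _count += 1
        return _count           # impure: answer depends on hidden state

    assert area(2) == area(2)   # referentially transparent
    assert tick() != tick()     # 1 != 2: not referentially transparent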

                                                                                                                                                1. 9

                                                                                                                                                  I literally cannot make sense of this thread at all. Half of the messages are from someone who I would assume is a conspiracy-theory spammer. I could literally replace that person’s messages with things copy-pasted from Time Cube and get something about as coherent. Except other people in the thread appear to interact with that person and a quick glance at the group archive suggests this person is somewhat important to what they do?

                                                                                                                                                  Is this some sort of Poe’s Law parody group that I’m just not deep enough into the material to get?

                                                                                                                                                  1. 15

                                                                                                                                                    The poster you’re talking about is Carl Hewitt. He has become a crank, and he denies various basic facts about lambda calculus, Gödel’s work, and Turing’s work. In this thread, he is complaining about Mark S. Miller’s proposed taxonomy for side channels and similar security issues, a work which simply tries to put names to certain common bug classes. Miller has written papers like this before, and just like how plan interference was first discovered in the context of capability theory, we might hope that other non-obvious bug classes can also be found via similar inquiries.

                                                                                                                                                    The main thrust of my post is to rigorously show that Hewitt is wrong to claim that actors are above and beyond Turing machines; we can indeed compute with actors using standard computers. This is something that he and I have argued about before.

                                                                                                                                                    I have owed this particular proof to the group for a few years; during an academic conference, I responded to an open unanswered question by noting that object graphs behave somewhat like hypergraphs, which can be lifted to categories. However, it wasn’t yet obvious how to mutate those graphs, and so it was an idle curiosity. The rest of the proof just snapped together last night.

                                                                                                                                                    1. 5

                                                                                                                                                      The poster you’re talking about is Carl Hewitt. He has become a crank, and he denies various basic facts about lambda calculus, Gödel’s work, and Turing’s work.

Don’t argue with cranks; your life is worth more than that.

                                                                                                                                                      1. 3

                                                                                                                                                        some people watch TV, some play video games… others, they have less conventional hobbies…

                                                                                                                                                    2. 14

                                                                                                                                                      Carl Hewitt is famous for kinda sorta inspiring the actor model of concurrency, which is used by Erlang and Pony and stuff. In the mid-00’s or so he went completely bugnuts crazy and has since been banned from ArXiv, Wikipedia, and then Wikipedia a second time. However, since he did influential work in the 70’s and 80’s people still think he’s an intellectual giant.

                                                                                                                                                      I especially dislike him because he’s ruined a ton of Wikipedia and C2 pages on CS history. Every year Wikipedia admins find like another two dozen of his sockpuppets.

                                                                                                                                                      1. 1

                                                                                                                                                        went completely bugnuts crazy

                                                                                                                                                        Do you mind sharing some details?

                                                                                                                                                        1. 7

                                                                                                                                                          He’s been vandalizing Wikipedia pages for the past 15 years with falsehoods like “Gödel was wrong” and “all modern logic programming was inspired by PLANNER” and “Actors are stronger than Turing machines”.

If you want to learn more, the best place to start is the Wikipedia talk page: https://en.wikipedia.org/wiki/User_talk:Prof._Carl_Hewitt. That doesn’t cover everything; it misses a bunch of pages he vandalized, and the fact that, after having been banned for a decade, he was finally unbanned in October 2016 and then had to be banned again three weeks later.

                                                                                                                                                          1. 4

                                                                                                                                                            He gave a talk in Cambridge a couple of years back. I was excited to hear him speak, until about 5 minutes into the talk when it became clear that he was talking complete nonsense. It was pretty uncomfortable listening, he was making assertions that I’d expect any undergraduate to be able to disprove and no one wanted to interrupt because there’s no way of arguing usefully with someone that out of touch with reality.

                                                                                                                                                        2. 5

                                                                                                                                                          it’s category theory

                                                                                                                                                          1. 2

I’m gonna try, just because I have some time on my hands: because of this document, people wondered “do we have a formal model of this?”, as one would be quite useful for reasoning about the stuff. Then @corbin (please correct me) linked the “everything is an actor” premise to a category-theory notion: there must be a category of actors (he goes on to sketch one), and it might have some properties (can this thing actually compute everything?). A category of actors could be a Turing category, and this is where things get confusing: the category of actors as sketched would be typed, but we know Turing categories to be untyped. Contradiction.

                                                                                                                                                            Is this some sort of Poe’s Law parody group that I’m just not deep enough into the material to get?

                                                                                                                                                            I’m gonna go with maybe :)

                                                                                                                                                            1. 2

                                                                                                                                                              I think that you picked up everything. The paradox of Turing categories is that all of the different types exist, but also they can be erased.

                                                                                                                                                              Like, given some object O, and a Turing object A, the arrows O -> A freeze elements of O as code literals. For example, if O were a natural numbers object, then the elements of O might be 42 or 100, and the corresponding elements of A might be "42" or "100". Similarly, the arrows A -> O are like evaluation of code literals, with "42" being sent to 42 and e.g. "anyRandomIdentifier" being sent to failure/partiality.
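As a loose, non-categorical Python analogue of those two arrows, with repr standing in for O -> A and eval for A -> O (just an illustration, not the construction itself):

    # repr plays O -> A (freeze a value as a code literal); eval plays
    # A -> O (evaluate a literal), and is partial: bad literals fail.
    def quote(n):
        return repr(n)          # 42 becomes "42"

    def run(code):
        return eval(code)       # "42" becomes 42

    assert run(quote(42)) == 42
    try:
        run("anyRandomIdentifier")
    except NameError:
        print("partiality: this literal denotes no element of O")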

The effect of this is that an actor could both be fully, strictly, statically typed in the category-theory tradition, and also have the potential to evaluate code literals which could do anything, including representing elements that don’t belong to any type at all. I have mentioned this paradox before and given examples in C++ and Haskell.

                                                                                                                                                              Another effect, which contradicts Hewitt’s claims, is that actors can be truly Turing-complete; they can diverge.

                                                                                                                                                            2. 1

                                                                                                                                                              I’m curious, who? (you can respond privately if you don’t want to say in public)

Jonathan Shapiro and Mark Miller are two of the main figures behind capabilities. Shapiro used to be a professor at CMU. Miller has done lots of work related to sandboxing for JS (at Google).

                                                                                                                                                              Carl Hewitt is the guy behind the Actor Model.

Lots of Actor model stuff sounds crazy (since it seems to try awfully hard to be its own unique thing), but it’s definitely influential on lots of things.

                                                                                                                                                              1. 12

                                                                                                                                                                Carl Hewitt is the guy behind the Actor Model.

                                                                                                                                                                Carl Hewitt is the guy behind the Actor Model of computation, which was a CS dead end. Notably, Steele and Sussman tried to implement it in Scheme and quickly abandoned it as pointless.

                                                                                                                                                                The Actor Model of concurrency was invented by Gul Agha, one of Hewitt’s grad students. Hewitt’s been taking credit for it ever since.

                                                                                                                                                                1. 2

                                                                                                                                                                  OK, this plus the “yeah, he really is that sort of crank” stuff makes things make more sense, because I was reading “Actor” as the concurrency model and not getting anywhere from that.

                                                                                                                                                                2. 4

                                                                                                                                                                  I’m curious, who?

It’s pretty clear he’s referring to messages like this, which have a typography and writing style bearing more than a passing resemblance to the Timecube website.

This entire “Universal Intelligent Systems” thing (also see the video) seems to be missing some points. It looks like a mathematician thinking they can use mathematics to solve problems that are not mathematical in nature.

                                                                                                                                                                  1. 3

                                                                                                                                                                    I’ve seen his talks in person. They are much, much worse. But he keeps getting invited to prestigious places because he once did some interesting work. It’s kinda depressing all around.

                                                                                                                                                                    1. 5

                                                                                                                                                                      I’ve seen his talks in person. They are much, much worse. But he keeps getting invited to prestigious places because he once did some interesting work. It’s kinda depressing all around.

                                                                                                                                                                      I’ve been to one of these, and IMO the best part of the talk was when Hewitt got so caught up in his own conspiracy theory rant that he forgot where he was and literally fell off the stage.

                                                                                                                                                                      1. 3

                                                                                                                                                                        I feel bad, but fuck me that’s hilarious. Pure absent-minded professor.

                                                                                                                                                                      2. 1

                                                                                                                                                                        I really think there should be a limit on how much we tolerate cranks in the CS space. I get that some people are contrarian, but still.

                                                                                                                                                                        1. 6

                                                                                                                                                                          Contrarian implies some level of expertise. In a 2018 keynote I attended, he claimed he solved the halting problem.

                                                                                                                                                                          1. 1

I’m a contrarian (or maybe a partial crank) in that I think fear of the halting problem keeps us from doing a lot of really useful stuff, but it’s a pretty simple thing to grok; how could it be wrong/solved?

                                                                                                                                                                            1. 3

                                                                                                                                                                              I believe his specific claim was “actors can have timeouts, so they’re guaranteed to halt”
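For what it’s worth, here’s a quick Python sketch of why a timeout doesn’t touch the halting problem (my own illustration, not Hewitt’s actual argument): the wrapper is guaranteed to halt, but it can’t distinguish “diverges” from “merely slow”.

    import multiprocessing

    def loops_forever():
        while True:
            pass

    def run_with_timeout(fn, seconds):
        """Run fn in a separate process and kill it after `seconds`."""
        p = multiprocessing.Process(target=fn)
        p.start()
        p.join(seconds)
        if p.is_alive():
            p.terminate()
            p.join()
            return "TIMEOUT"  # we learned nothing about whether fn halts
        return "HALTED"

    if __name__ == "__main__":
        print(run_with_timeout(loops_forever, 1.0))  # TIMEOUT

A slow-but-terminating `fn` would print TIMEOUT too, so the timeout only bounds the actor’s response time; it decides nothing about whether the wrapped computation halts.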

                                                                                                                                                                      3. 2

                                                                                                                                                                        Someone should do a psychological study on the people who use colors and typefaces like this

                                                                                                                                                                        https://professorhewitt.blogspot.com/

                                                                                                                                                                        I’m unfamiliar with SSRN.com - it’s an open-access site run by Elsevier?

                                                                                                                                                                        1. 3

SSRN is a pre-print site owned by Elsevier.

                                                                                                                                                                  1. 3

                                                                                                                                                                    We should also recognize that Apple is hostile to developers. They don’t care about us anymore.

If they cared about us, they would get over GPLv3 and start upgrading Bash. Instead, we have to maintain back-compat with a 15-year-old version.

If they cared about us, they would provide us with a proper package manager. Instead, they break the OS slightly with each release and leave Homebrew and others to scramble. And there is nobody from Apple to help. Is it that hard to assign even a single developer who can communicate?

They don’t care that running macOS is prohibitively expensive for CI. Isn’t it more important to have software well tested in a VM than to rake in the last little dollar from rack-mounted Mac minis?

Every release, they break some kernel API, lock down the system even further, and take a bit of freedom away.

It’s sad, really. Apple used to have a vibrant community of passionate developers doing cool things with their OS. And that has been stripped away bit by bit.

                                                                                                                                                                    1. 4

                                                                                                                                                                      We should also recognize that Apple is hostile to developers. They don’t care about us anymore.

I hear the same thing from creatives, except it’s about how much Apple prefers developers instead. “The grass is greener on the other side” happens everywhere, and it’s pretty amusing when you know it’s not the case.

                                                                                                                                                                      1. 2

Can’t both be true? The Mac made large strides in the last two years (first the return to scissor switches, then the M1), but one can hardly be blamed for believing that their focus was primarily on the iPhone, iPad, and Apple Watch from 2012 to 2018.

                                                                                                                                                                      2. 3

                                                                                                                                                                        They don’t care about us anymore.

                                                                                                                                                                        The correct reply here is “they never did, in the sense you’re intending it to be read”. Being an acceptable-to-many Unix-y programming environment was a contingent side effect of the history that led to OS X, not a necessary or deliberately designed-in (as far as I’m aware) feature. Similarly, the fact that their laptops happened to be decent for a typical software developer’s daily driver was also not something that (again, as far as I’m aware) they ever specifically set out to achieve, just a side effect of decisions that they made for other reasons and in pursuit of other ways to differentiate themselves in the market.

                                                                                                                                                                        I saw an analogy once to developers being a kind of creepy guy who convinces himself a girl is in love with him because she was nice to him once and can never let that fantasy go, and as harsh as it is I think it’s fairly accurate.

                                                                                                                                                                        1. 2

I agree. I’m not a native macOS developer (and frankly, I don’t really get the hype around the “Mac design language”), but I read around the edges, and I get the impression that Apple takes decent care of the developers who build paid and shareware software with its tools. Maybe not as well as Microsoft does, but then who does? There are rumblings that the documentation is lacking nowadays and that Mac developers generally feel left out compared to iOS, but frankly, any decent Mac developer should have seen the writing on the wall years ago and pivoted to iOS apps.

My point is, I don’t think Apple has been especially friendly to FOSS developers (again, like Microsoft), and I don’t get why FOSS developers expect that it ever was.

                                                                                                                                                                        2. 2

Instead, they break the OS slightly with each release and leave Homebrew and others to scramble. And there is nobody from Apple to help. Is it that hard to assign even a single developer who can communicate?

To put it differently, it is surprising how many FLOSS developers are willing to work for Apple for free. When they originally announced at WWDC that the next macOS would run on Apple Silicon, they talked about how all the major open source projects would support macOS on Apple Silicon. I initially thought this could imply that they would make significant and proactive contributions to the FLOSS ecosystem. After six months, it’s clear that they just knew the FLOSS community would scramble to get their projects running on the M1.

It is kind of sad that people invest so much of their time in the platform of a company that rarely, if ever, gives back (when will FaceTime become the open industry standard they promised?).

                                                                                                                                                                          Apple used to have a vibrant community of passionate developers doing cool things with their OS.

I loved macOS when I started using it in 2007. It had a great ecosystem of independent developers. It was a great, reliable OS that was literally years ahead of the competition. Now the hardware is awesome, but Apple has destroyed much of the indie ecosystem with the App Store. Everyone is moving to subscriptions, because the App Store makes it hard to sell upgrades. In the meantime, macOS itself has become increasingly buggy and has seen questionable changes. Also, it’s barely an open platform.

                                                                                                                                                                          But people will buy Macs anyway, because Apple made everyone believe that M1 is years ahead of the competition, while in practice Ryzen APUs are not far behind.

                                                                                                                                                                          I got rid of my last Mac 2 months ago and there’s little chance I will buy a Mac again.

                                                                                                                                                                          1. 4

                                                                                                                                                                            But people will buy Macs anyway, because Apple made everyone believe that M1 is years ahead of the competition, while in practice Ryzen APUs are not far behind.

Yeah, no. My MBA is faster than my Ryzen-based gaming desktop (at CPU tasks; before I upgraded the GPU, it was even comparable graphics-wise), even in emulation. AnandTech doesn’t bullshit, and they would tell you as much.

macOS is a bit of a mess, I agree (I’m eagerly waiting to see how the porting of alternative OSes goes). People doing the porting themselves is just what naturally happens: if someone has a system, requires an application, and can do the porting, it’s likely that someone meeting those criteria will.

                                                                                                                                                                            1. 2

That sounds like you’re not doing an apples-to-apples comparison of the latest gen vs the latest gen. AnandTech benchmarks show that the latest-gen Ryzen desktop CPUs are close to (or, at the most expensive tier, comfortably exceed) the M1. And the only gaming GPUs that are even in the same ballpark as the M1 are multiple generations old; the M1 is equivalent to the lower end of the Nvidia 10-series. A 2070 Super comfortably leaves an M1 in the dust, and that’s not even considering the new 30-series.

                                                                                                                                                                              It’s still impressive that a laptop CPU is keeping up with desktop CPUs, given the different thermal profiles. But it’s not a blowout by any means compared to AMD, and it loses to the highest-end AMD chips. And the M1 GPU is just fine; it’s not even particularly great.

                                                                                                                                                                              1. 1

Yeah, no. My MBA is faster than my Ryzen-based gaming desktop

The Ryzen 3700X desktop that I built last year cost about the same as the Mac mini M1 with 16 GB RAM and a 256 GB SSD. However, it has twice the RAM and a four-times-larger SSD. The 3700X has slightly worse single-core performance and slightly better multi-core performance, and it was released in 2019. The GPU in that machine is also a fair bit faster than the M1’s GPU.

My laptop with a Ryzen 7 Pro is a bit slower than the M1, but not by a large margin, and that is a last-generation Renoir APU. The new Cezanne APUs are actually faster than the M1 (slightly lower single-core performance, better multi-core performance). At the same price as the M1, the laptop came with 16 GB RAM, which I extended to 32 GB, and that makes it faster than the M1 MacBook in practice for my work.

I had an M1 MacBook Air for a week, but I returned it because I was not impressed. Sure, the M1 is really impressive compared to the Intel CPUs in old MacBooks (which also had terrible, loud cooling), but as I said, modern AMD Ryzen CPUs (Zen 2/Zen 3) and the M1 pretty much go toe to toe. Besides that, the M1 MacBook Air felt like yet another step toward becoming an appliance, with more and more ‘general purpose computing’ features slowly being taken away. It’s not a future that I am interested in. So, that was the end of my 13-year Mac run.

Maybe they will be able to beat AMD by a wide margin with a successor that has more performance cores. But AMD isn’t resting on its laurels either. We’ll see.