Threads for hynek

  1. 1

    see also: https://explain.depesz.com/

    and there is another graph-like EXPLAIN visualizer whose name I can’t remember

    1. 2

      FTR the big difference is that pgMustard doesn’t just visualize the EXPLAIN output, but also interprets it and makes practical suggestions for how to improve your query and/or indexes. Which is a big deal if SQL performance isn’t your bread-and-butter.

      1. 1

        Shrug. I tried it once, and all it suggested was that I had already created the indices I needed.

        1. 1

          That’s odd; in my case it told me that my index was wrong and why, so they clearly have some awareness. Might be a different case, or they might have improved since you tried.

          1. 3

            OK, let’s consider https://app.pgmustard.com/#/explore/51f2da78-0b10-4a45-b82c-a67fd8942097 and compare it with https://explain.depesz.com/s/MH7Y (a slightly different query with a different parameter)

            This corresponds to this query (schema) which is the data source for this page. My site is a bit unusual in that I mostly do “OLAP” queries, and this page is perhaps the worst offender. It calculates statistics for a player grouped by other players they’ve played with. This query is rather pathological, since the player_stats_peers index’s column (logid aka the primary key for the games played) isn’t clustered by the access pattern (by steamid64 aka the player’s primary key). This results in a lot of blocks being read for each row. I’ve had a lot of trouble optimizing this query because of this.

            So what does mustard suggest?

            • Operation #11: Heap Fetches: 29,942
            • Operation #11: Read Efficiency
            • Operation #11: Cache Performance: 69.4%

            The cache performance is expected. This query completes around 10x faster if it is run in quick succession. Unfortunately, most pages are not reloaded immediately, and I use caches closer to the user to deal with those situations. This ties in with the poor read efficiency, which is due to the index’s column mismatch. These are both somewhat highlighted by the “read” column of depesz, though it shows absolute numbers rather than relative ones.

            Mustard suggests that there are a lot of heap fetches because the visibility map is out of date and that I should run VACUUM. Indeed, running VACUUM did increase the performance. Much of my dataset is effectively read-only, while most of the churn happens in the most recent hour of rows. This causes the autovacuum threshold to be set too high with the default scale factor of 0.2. I’ve been meaning to partition my tables to address this, but I haven’t gotten around to it. I adjusted my autovacuum settings, so hopefully this gets taken care of automatically next time.

            The heap fetches are visible in depesz, but not highlighted as concerning. So that’s a win for mustard I’d say. Is it worth $100/year? Maybe once I start spending more than $120/year on the server :)

    1. 1

      Right off the bat, this post misunderstands the point of versioning. The author opens with:

      Let’s set the stage by laying down the ultimate task of version numbers: being able to tell which version of an entity is newer than another.

      That is not quite right. It’s true that that’s one thing that versioning does, but it is not its ultimate task. Its ultimate task is to communicate to users of the software what has changed between different releases, and what impact that has on them (i.e. “why should I care”). Otherwise, why does anyone care which release is newer? What does that matter to a user of the software?

      The rest of the post seems to be reacting to people who believe that SemVer solves a lot of problems that it doesn’t, and throws out the baby with the bath water in doing so. SemVer is certainly imperfect. And maybe there are versioning schemes that are better! But it does have a legitimate claim to attempting to accomplish versioning’s “ultimate task”. And I think that this post fails to sufficiently recognize this fact.

      1. 4

        Its ultimate task is to communicate to users of the software what has changed between different releases, and what impact that has on them (i.e. “why should I care”).

        I’m sorry, but historically, in that general sense, that’s just not true. There’s been a wild mixture of version schemes, they still exist, and the only thing they have in common is that you can order them.

        I could start enumerating examples, but let’s assume you’re right, because that’s not my point. What bothers me is this:

        and throws out the baby with the bath water in doing so

        How does the post do that? That was entirely not my intent, and I state several times that there’s value to SemVer as a means of communication. As you correctly say, the rest goes to dispel some myths (thanks for actually reading the article!), so I’m a bit saddened that you came to that conclusion. I’ve gotten a lot of feedback in the form of “I like SemVer but the article is right”, so I’m a bit baffled.

        1. 3

          There’s been a wild mixture of version schemes and they still exist and the only thing that they have in common is that you can order them.

          You can’t though, not with the vast majority of large projects.

          Which is more recent:

          • firefox 78.15esr or firefox 80.0.1?
          • OpenSSL_1_0_2u or OpenSSL_1_1_1c?
          • Linux 5.4.99 or 4.19.177?
          • Postgres 12.6 or Postgres 13.1?
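          To make the point concrete: a naive numeric comparison happily orders each of these pairs, but the “larger” version isn’t necessarily the more recently released one, because these projects ship releases from several maintenance branches in parallel. A minimal sketch (the parsing and the `esr` stripping are my own simplification, not how any of these projects define ordering):

```python
def parse(v: str) -> tuple:
    """Naively parse a dotted version string into a tuple of ints.

    Suffixes like Firefox's "esr" are stripped for illustration only; real
    schemes (OpenSSL's letter suffixes, distro tags) need scheme-specific logic.
    """
    return tuple(int(part) for part in v.lower().replace("esr", "").split("."))

# Numerically these comparisons all hold...
print(parse("80.0.1") > parse("78.15esr"))   # True
print(parse("5.4.99") > parse("4.19.177"))   # True
# ...yet both branches in each pair receive new releases concurrently,
# so the "larger" version may well have been released earlier.
```

          In other words, the numbers give at best a partial order across branches, which is the point being made above.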
          1. 1

            To be specific, many versioning systems only guarantee a partial ordering. This arises because they use a tree-like structure. (Contrast this with a total ordering.)

            1. 1

              That’s a very good point, and it depends on how you define “newer”. It certainly doesn’t mean “released after”.

            2. 1

              There’s been a wild mixture of version schemes and they still exist and the only thing that they have in common is that you can order them.

              I do agree with that. But trying to establish what the “ultimate task” of a versioning scheme is means coming up with a description of what problem(s) versioning schemes are intended to solve. I don’t think that “being unable to figure out which software release is newer than another” is really a description of a problem, because it’s not yet clear why that is valuable. I can say personally as a user of software (thinking primarily of packages/libraries here) that I never just want to know whether some release is newer than another, I always want to know 1) what changed between subsequent releases and the one my current project uses, and 2) why or whether that matters to my project. I’d say then that the task of a versioning scheme is to help me solve those problems, and that we can judge different versioning schemes by how well they do that.

              How does the post do that?

              I think it’s a little hard to explain concisely, because my read of the post (and those of other commenters, I think) as unfairly criticizing the value of SemVer (and maybe versioning schemes in general) is at least somewhat a consequence of what is emphasized, and maybe exaggerated, and what’s not. But here’s an example: you say at one point, after talking about strategies “to prevent third-party packages from breaking your project or even your business,” that

              There is nothing a version scheme can do to make it easier.

              which I think is simply untrue. In fact, like I was saying above, I think that’s the whole point (task) of a versioning scheme—to make the process of upgrading dependencies easier/less likely to break your project. Just because they (including SemVer) sometimes fail at that task, or try to reflect things (e.g. breaking API changes) that aren’t necessarily enforceable by mathematical proof, doesn’t mean that they can’t do anything to help us have fewer problems when upgrading our dependencies.

              1. 1

                and those of other commenters, I think

                I mean this in the least judgy way I can summon: I don’t think most other commenters have read the (whole) article. Part of that is poor timing on my side, but I didn’t expect two other articles riffing on the topic to appear around the same time. :(

                Just because they (including SemVer) sometimes fail at that task, or try to reflect things (e.g. breaking API changes) that aren’t necessarily enforceable by mathematical proof, doesn’t mean that they can’t do anything to help us have fewer problems when upgrading our dependencies.

                I’m curious: how do you think that works in practice? Like, how does it affect your workflows?

                1. 2

                  I’m curious: how do you think that works in practice? Like, how does it affect your workflows?

                  For SemVer in particular, the MAJOR.MINOR.PATCH distinction helps give me a sense of how much time I should spend reviewing the changes and testing a new version of a package against my codebase. If I don’t want to audit every single line of code change of every package anytime I perform an upgrade (and I, like many people, don’t, or can’t), then I have to find heuristics for what subset of the changes to audit, and SemVer provides such a heuristic. If I’m upgrading a package from e.g. 2.0.0 to 4.0.0, it also gives me a sense of how to chunk the upgrade and my testing of it: in this case, it might be useful to upgrade first to 3.0.0 and test at that interval, and then upgrade from there to 4.0.0 and test that.

                  Of course, as you note in your post, this is imperfect in lots of ways, and things could still break—but it does seem clearly better than e.g. a versioning scheme that just increments a number every time some arbitrary unit of code is changed.
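                  As a concrete sketch of that heuristic (the triage tiers here are my own illustration, not something SemVer itself prescribes):

```python
def review_effort(old: str, new: str) -> str:
    """Rough triage of how much review an upgrade warrants under SemVer.

    Assumes plain MAJOR.MINOR.PATCH strings without pre-release tags.
    """
    o = tuple(int(p) for p in old.split("."))
    n = tuple(int(p) for p in new.split("."))
    if n[0] != o[0]:
        return "major: read the changelog, audit breaking changes, upgrade one major at a time"
    if n[1] != o[1]:
        return "minor: skim the new features, run the test suite"
    return "patch: run the test suite"

print(review_effort("2.0.0", "4.0.0").split(":")[0])  # major
```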

                  1. 1

                    How many dependencies do you have though? I understand this is very much a cultural thing but to give you a taste from my production:

                    • a Go project has 25 (from 9 direct)
                    • a Python project has 48 (from 28 direct, some are internal though)
                    • my homepage uses Tailwind CSS + PurgeCSS through PostCSS and the resulting package-lock.json has 171 dependencies (!!!!)

                    It’s entirely untenable for me to check every project’s changelog/diff just because their major bumped – unless it breaks my test suites.

                    I fully understand that there are environments that require that sort of diligence (health, automotive, military, …), but I’m gonna go out on a limb and say that most people arguing about SemVer don’t live in that world. We could of course open a whole new topic about supply-chain attacks, but let’s agree that’s an orthogonal topic.

                    P.S. All that said: nothing in the article said that SemVer is worthless, it explicitly says the opposite. I’m just trying to understand where you’re coming from.

                    1. 3

                      When I’m “reviewing my dependencies” I certainly don’t look at indirect dependencies! I don’t use them directly, so changes to their interfaces are (almost) never my problem.

                      1. 2

                        Like @singpolyma, I don’t bother with indirect dependencies either—I only review the changelogs of my direct dependencies.

                        The main project that I’m currently working on is an Elm/JS/TS app, and here’s the breakdown:

                        • Elm direct dependencies: 28
                        • JS direct dependencies: 22
                        • JS direct devDependencies: 70

                        I definitely read the changelog of every package that I update, and based on what I see there and what a smoke test of my app reveals I might dig in deeper, usually from there to the PRs that were merged between releases, and from there straight into the source code if necessary—although it rarely is. Dependabot makes this pretty easy, and upgrading Elm packages is admittedly much safer than upgrading JS ones. But I personally don’t find it to be all that time-consuming, and I think it yields pretty good results.

              2. -2

                Its ultimate task is to

                [citation needed]

                1. 1

                  Are you saying that my claim as to what “versioning’s ultimate task” is requires citation? Or that the author’s does? I’m making a claim about what that is, just as the author is—I’m not trying to make an appeal to authority here.

              1. 47

                I’m so tired of rehashing this. Pointing out that SemVer is not 100% infallible guarantee, or that major versions don’t always cause major breakage adds nothing new.

                Lots of projects have a changelog file where they document major changes, but nobody argues that reading changelogs will hurt you because they may not contain every tiniest change, or that they mention changes that would discourage people from upgrading, leaving them on insecure versions forever, etc.

                SemVer is just a machine-readable version of documentation of breaking changes.

                1. 23

                  Yes, and the article tries to succinctly sum up what value can be derived from that and what fallacies await. I’d be lying if I claimed to have ever seen it summed up through that lens in one place.

                  I’m sorry it’s too derivative for your taste, but when the cryptography fire was raging, I was wishing for this article to exist so I could just paste it instead of writing extensive elaborations in the comments section.

                  1. 11

                    I thought the same thing initially, but it could also be coming from the perspective of using Rust frequently, which is strongly and statically typed. (I don’t actually know how frequently you use it; just an assumption.)

                    A static/strong type system gives programmers a nice boundary for enforcing SemVer. You mostly just have to look at function signatures and make sure your project still builds. That’s the basic promise of the type system. If it builds, you’re likely using it as intended.

                    As the author said, with something like Python, the boundary is more fuzzy. Imagine you write a function in python intended to work on lists, and somebody passes in a numpy array. There’s a good chance it will work. Until one day you decide to add a little extra functionality that still works on lists, but unintentionally (and silently) breaks the function working with arrays.

                    That’s a super normal Python problem to have. And it would break SemVer. And it probably happens all the time (though I don’t know this).

                    So maybe for weakly/dynamically typed languages, SemVer could do more harm than good if it really is unintentionally broken frequently.
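                    That failure mode can be shown without numpy at all. Below, `ArrayLike` is a hypothetical stand-in for a numpy-style array that overloads `+` elementwise; a “patch” release that swaps iteration for concatenation then silently changes behavior for such callers:

```python
class ArrayLike:
    """Hypothetical stand-in for a numpy-style array (illustration only)."""

    def __init__(self, data):
        self.data = list(data)

    def __add__(self, other):
        # numpy-style elementwise addition instead of list concatenation
        other = other.data if isinstance(other, ArrayLike) else list(other)
        return ArrayLike(a + b for a, b in zip(self.data, other))

    def __iter__(self):
        return iter(self.data)


# v1.0.0: copies via iteration -- works for lists and array-likes alike
def padded_v1(xs):
    return list(xs) + [0, 0]

# v1.0.1 "patch": drops the list() call as a micro-optimization
def padded_v2(xs):
    return xs + [0, 0]

print(padded_v2([1, 2]))                    # [1, 2, 0, 0]
print(list(padded_v2(ArrayLike([1, 2]))))   # [1, 2]  -- silently elementwise!
```

                    The list caller sees no difference between the two releases, while the array-like caller gets wrong results with no exception raised, which is exactly the kind of break no version number can warn you about.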

                    1. 8

                      That’s all very true!

                      Additionally, what I’m trying to convey (not very successfully, it seems) is that relying on that property is bad – even in Rust! Any release can break your code just by introducing a bug, no matter what the version number says. Thus you have to treat all versions as potentially breaking. Given the discussions around pyca/cryptography, this is clearly not common knowledge.

                      The fact that this is much more common in dynamic languages, as you’ve outlined, is just the icing on top.

                      I really don’t know what I’ve done wrong to warrant that OP comment + upvotes except probably hitting some sore point/over-satiation with these topics in the cryptography fallout. That’s a bummer but I guess nothing I can do about it. 🧘

                      1. 7

                        Car analogy time: You should treat cars as dangerous all the time. You can’t rely on seatbelts and airbags to save you. Should cars get rid of seatbelts?

                        The fact that SemVer isn’t 100% right all the time is not a reason for switching to YOLO versioning.

                        1. 3

                          Except that SemVer is not a seatbelt, but – as I try to explain in the post – a sign saying “drive carefully”. It’s a valuable thing to be told, but you still have to take further measures to ensure safety and plan for the case when there’s a sign saying “drive recklessly”. That’s all that post is saying and nothing more.

                          1. 2

                            Seatbelts reduce the chance of death. Reading a changelog reduces the chance of a bad patch. Trusting semver does not reduce the chance of an incompatible break.

                            1. 6

                              I really don’t get why there’s so much resistance to documenting known-breaking changes.

                              1. 3

                                I really don’t get why there’s so much resistance to documenting known-breaking changes.

                                I mean you could just…like…read the article instead of guessing what’s inside. Since the beginning you’ve been pretending the article’s saying what it absolutely isn’t. Killing one straw man after another, causing people to skip reading because they think it’s another screech of same-old.

                                I’m trying really hard to not attribute any bad faith to it but it’s getting increasingly harder and harder so I’m giving up.

                                Don’t bother responding, I’m done with you. Have a good life.

                                1. -1

                                  I mean you could just…like…read the article instead

                                  So where in that article do you say why people don’t want to document known breaking changes?

                                  Offtopic: that was really hard to read. Too much bold text and

                                  quotes

                                  with some links in between. It just destroyed my reading flow.

                                  I also think the title “will not save you” tells you everything about why people are just not reading it. It already starts with a big “it doesn’t work”, so why should I expect it to be in favor of it?

                                  1. 4

                                    So where in that article do you say why people don’t want to document known breaking changes?

                                    Well, the pyca/cryptography team documented that they were rewriting in Rust far in advance of actually shipping it, and initially shipped it as optional. People who relied on the package, including distro package maintainers, just flat-out ignored it right up until it broke their builds because they weren’t set up to handle the Rust part.

                                    So there’s no need for anyone else to cover that with respect to the cryptography fight. The change was documented and communicated, and the people who later decided to throw a fit over it were just flat-out not paying attention.

                                    And nothing in SemVer would require incrementing major for the Rust rewrite, because it didn’t change public API of the module. Which the article does point out:

                                    Funny enough, a change in the build system that doesn’t affect the public interface wouldn’t warrant a major bump in SemVer – particularly if it breaks platforms that were never supported by the authors – but let’s leave that aside.

                                    Hopefully the above, which contains three paragraphs written by me and only two short quotes, was not too awful for you to read.

                                    1. 1

                                      Thanks, your summary makes a good point, and yes, the original blog post was hard to read; I did not intend this to be a troll.

                                      And nothing in SemVer would require incrementing major for the Rust rewrite

                                      Technically yes; practically, I know that many Rust crates do not raise the minimum required Rust compiler version until a major version. So fair enough, SemVer at its core isn’t enough.

                          2. 3

                            AFAIU, I think the OP comment may be trying to say that they agree with and in fact embrace the following sentence from your article:

                            Because that’s all SemVer is: a TL;DR of the changelog.

                            In particular, as far as I can remember, trying to find and browse a changelog was basically the only sensible thing one could do when trying to upgrade a dependency before SemVer became popular (plus keep your fingers crossed and run the tests). The main time waster was trying to even locate and make sense of the changelog, with basically every project showing it elsewhere, if at all. (Actually, I seem to remember that finding any kind of changelog at all was already a big plus for a project’s impression of quality.) As such, having a hugely popular semi-standard convention for a TL;DR of the changelog is something I believe many people do find super valuable. They know enough to never fully trust it, just as they’d know to never fully trust a changelog. Having enough experience with changelogs and/or SemVer, they do, however, now see substantial value in SemVer as a huge time saver, especially compared to what they had to do before.

                            Interestingly, there’s a bot called “dependabot” on GitHub. I’ve seen it used by a team, and what it does is track version changes in dependencies and generate a summary changelog of the commits since the last version. Which seems to more or less support what I wrote above, IMO.

                            (Please note that personally I still found your article super interesting, and nicely naming some phenomena that I only vaguely felt before. Including the one I expressed in this post.)

                            1. 2

                              I think there is something a bit wrong about the blanket statement that others shouldn’t rely on semver. I suspect that for many projects, trying one’s best to use the API as envisioned by the author, and relying on semver, will in practice provide you with bugfixes and performance improvements for free, while never causing any major problems.

                              I like the parts of this blog post that are pointing out the problems here, but I think it goes way too far in saying that I “need to” follow your prescribed steps. Some of my projects are done for my own enjoyment and offered for free, and it really rubs me the wrong way when anyone tells me how I “should” do them.

                              [edited to add: I didn’t upvote the top level comment, but I did feel frustrated by reading your post]

                              1. 1

                                I’m not sure how to respond to that. The premise of the article is that people are making demands, claiming they will have a certain effect. My clearly stated goal is to dissect those claims so people stop making those demands. Your use case is obviously very different, so I have no interest in telling you to do anything. Why am I frustrating you, and how could I have avoided it?

                                1. 3

                                  My negative reaction was mostly to the section “Taking Responsibility”, which felt to me like it veered a bit into moralizing (especially the sentence “In practice that means that you need to be pro-active, regardless of the version schemes of your dependencies:”). On rereading it more carefully/charitably, I don’t think you intended to say that everyone must do it this way regardless of the tradeoffs, but that is how I read it the first time through.

                            2. 9

                              Type systems simply don’t do this. Here’s a list of examples where Haskell’s type system fails and I’m sure that you can produce a similar list for Rust.

                              By using words like “likely” and “mostly”, you are sketching a sort of pragmatic argument, where type systems work well enough to substitute for informal measures, like semantic versioning, that we might rely on the type system entirely. However, type systems are formal objects and cannot admit such fuzzy properties as “it mostly works” without clarification. Further, we usually expect type-checking algorithms to not be heuristics; we expect them to always work, and for any caveats to be enumerated as explicit preconditions.

                              1. 2

                                Also, there were crate releases where a breaking change wasn’t caught because no tests verified that FooBar stayed Sync/Send.

                                1. 1

                                  All I meant is that languages with strong type systems make it easier to correctly enforce semver than languages without them. It’s all a matter of degree. I’m not saying that languages like Rust and Haskell can guarantee semver correctness.

                                  But the type system does make it easier to stay compliant because the public API of a library falls under the consideration of semver, and a large part of a public API is the types it can accept and the type it returns.

                                  I’m definitely not claiming that type systems prevent all bugs and that we can “rely entirely on the type system”. I’m also not claiming that type systems can even guarantee that we’re using a public API as intended.

                                  But they can at least make sure we’re passing the right types, which is a major source of bugs in dynamically typed languages. And those bugs are a prominent example of why OP argues that SemVer doesn’t work—accidental changes in the public API due to accepting subtly different types.

                            1. 2

                              You want to claim that version 3.2 is compatible with version 3.1 somehow, but how do you know that? You know the software basically “works” because of your unit tests, but surely you changed the tests between 3.1 and 3.2 if there were any intentional changes in behavior. How can you be sure that you didn’t remove or change any functions that someone might be calling?

                              Semantic versioning states that a minor release such as 3.2 should only add backwards compatible changes.

                              So all your existing unit tests from 3.1 should still be in place, untouched. You should have new unit tests, for the functionality added in 3.2.

                              I stopped reading after this, because the argument seems to boil down to either not understanding Semantic versioning, or not having full unit test coverage.

                              1. 20

                                I stopped reading after this

                                If you stopped reading at 10% of the article, you should probably also have stopped yourself from commenting.

                                not understanding Semantic versioning

                                The fallacy you’re committing here is very well documented.

                                1. 1

                                  If you are questioning whether the function you removed/changed is used by anyone when deciding the next version increment, you are not using semantic versioning correctly (unless you always increase the major, regardless of how many people used the feature you modified). As the parent said, if you need to edit 3.1 tests, you broke something, and the semver website is quite clear about what to do on breaking changes.

                                  1. 7

                                    If you don’t only test the public API, it’s entirely possible to introduce required changes in tests in bugfix versions.

                                    More importantly, my point about “no true Scotsman” was that saying “SemVer is great if and only if you follow some brittle manual process to the letter” proves the blog post’s narrative. SemVer is wishful thinking. You can have ambitions to adhere to it, and you can claim your projects follow it, but you shouldn’t ever blindly rely on others doing it right.

                                    1. 5

                                      The question then becomes: why does nobody do it, then? Do you truly believe that in a world where it’s super rare for a major version to exceed “5”, nobody ever had to change their tests because some low-level implementation detail changed?

                                      We’re talking about real packages that have more than one layer. Not a bunch of pure functions. You build abstractions over implementation details and in non-trivial software, you can’t always test the full functionality without relying on the knowledge of said implementation details.

                                      Maybe the answer is: “that’s why everybody stays on ZeroVer”, which is another way of saying that SemVer is impractical.

                                  2. 6

                                    The original fight about the PyCA cryptography package repeatedly suggested SemVer had been broken, and that if the team behind the package had adopted SemVer, there would have been far less drama.

                                    Everyone who suggested this overlooked the fact that the change in question (from an extension module being built in C, to being built in Rust) did not change public API of the deliverable artifact in a backwards-incompatible way, and thus SemVer would not have been broken by doing that (i.e., if you ran pip install cryptography before and after, the module that ended up installed on your system exposed a public API that was compatible after with what you got before).

                                    Unless you want to argue that SemVer requires a version bump for any change that any third-party observer might notice. In which case A) you’ve deviated from what people generally say SemVer is about (see the original thread here, for example, where many people waffled between “only about documented API” and “but cryptography should’ve bumped major for this”), and B) you’ve basically decreed that every commit increments major, because every commit potentially produces an observable change.

                                    But if you’d like to commit to a single definition of SemVer and make an argument that adoption of it by the cryptography package would’ve prevented the recent dramatic arguments, feel free to state that definition and I’ll see what kind of counterargument fits against it.

                                    1. 1

                                      Everyone who suggested this overlooked the fact that the change in question (from an extension module being built in C, to being built in Rust) did not change public API of the deliverable artifact in a backwards-incompatible way

                                      I think you’re overlooking this little tidbit:

                                      Since the Gentoo Portage package manager indirectly depends on cryptography, “we will probably have to entirely drop support for architectures that are not supported by Rust”. He listed five architectures that are not supported by upstream Rust (alpha, hppa, ia64, m68k, and s390) and an additional five that are supported but do not have Gentoo Rust packages (mips, 32-bit ppc, sparc, s390x, and riscv).

                                      I’m not sure many people would consider “suddenly unavailable on 10 CPU architectures” to be “backwards compatible”.

                                      But if you’d like to commit to a single definition of SemVer and make an argument that adoption of it by the cryptography package would’ve prevented the recent dramatic arguments, feel free to state that definition and I’ll see what kind of counterargument fits against it.

                                      If you can tell me how making a change in a minor release that causes the package to suddenly become unavailable on 10 CPU architectures it was previously available on is not considered a breaking change, I will give you $20.

                                      1. 8

                                        Let’s take a simplified example.

                                        Suppose I write a package called add_positive_under_ten. It exposes exactly one public function, with this signature:

                                        def add_positive_under_ten(x: int, y: int) -> int
                                        

                                        The documented contract of this function is that x and y must be of type int and must each be greater than 0 and less than 10, and that the return value is an int which is the sum of x and y. If the requirements regarding the types of x and y are not met, TypeError will be raised. If the requirements regarding their values are not met, ValueError will be raised. The package also includes an automated test suite which exhaustively checks behavior and correctness for all valid inputs, and verifies that the aforementioned exceptions are raised on sample invalid inputs.
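
                                        A minimal sketch of that contract might look like the following (the explicit bool check is my own addition, since bool is a subclass of int in Python and would otherwise slip through):

                                        ```python
                                        def add_positive_under_ten(x: int, y: int) -> int:
                                            """Return x + y, where x and y are ints with 0 < value < 10."""
                                            # bool subclasses int, so reject it explicitly.
                                            if any(not isinstance(v, int) or isinstance(v, bool) for v in (x, y)):
                                                raise TypeError("x and y must be of type int")
                                            if not (0 < x < 10 and 0 < y < 10):
                                                raise ValueError("x and y must be greater than 0 and less than 10")
                                            return x + y
                                        ```

                                        Whether this body is pure Python, C, or Rust underneath is invisible to a caller who stays within the documented contract, which is the point being made below.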

                                        In the first release of this package, it is pure Python. In a later, second release, I rewrite it in C as a compiled extension. In yet a later, third release, I rewrite the compiled C extension as a compiled Rust extension. From the perspective of a consumer of the package, the public API of the package has not changed. The documented behavior of the functions (in this case, single function) exposed publicly has not changed, as verified by the test suite.

                                        Since Semantic Versioning as defined by semver.org applies to declared public API and nothing else whatsoever, Semantic Versioning would not require that I increment the major version with each of those releases.

                                        Similarly, Semantic Versioning would not require that the pyca/cryptography package increment major for switching a compiled extension from C to Rust unless that switch also changed declared public API of the package in a backwards-incompatible way. The package does not adhere to Semantic Versioning, but even if it did there would be no obligation to increment major for this, under Semantic Versioning’s rules.

                                        If you would instead like to argue that Semantic Versioning ought to apply to things beyond the declared public API, such as “any change a downstream consumer might notice requires incrementing major”, then I will point out that this is indistinguishable in practice from “every commit must increment major”.

                                        1. 1

                                          We don’t need a simplified, synthetic example.

                                          We have the real world example. Do you believe that making a change which effectively drops support for ten CPU architectures is a breaking change, or not? If not, why not? How is “does not work at all” not a breaking change?

                                          1. 9

                                            The specific claim at issue is whether Semantic Versioning would have caused this to go differently.

                                            Although it doesn’t actually use SemVer, the pyca/cryptography package did not do anything that Semantic Versioning forbids. Because, again, the only thing Semantic Versioning forbids is incompatibility in the package’s declared public API. If the set of public classes/methods/functions/constants/etc. exposed by the package stays compatible as the underlying implementation is rewritten, Semantic Versioning is satisfied. Just as it would be if, for example, a function were rewritten to be more time- or memory-efficient than before while preserving the behavior.

                                            And although Gentoo (to take an example) seemed to be upset about losing support for architectures Gentoo chooses to support, they are not architectures that Python (the language) supported upstream, nor as far as I can tell did the pyca/cryptography team ever make any public declaration that they were committed to supporting those architectures. If someone gets their software, or my software, or your software, running on a platform that the software never committed to supporting, that creates zero obligation on their (or my, or your) part to maintain compatibility for that platform. But at any rate, Semantic Versioning has nothing whatsoever to say about this, because what happened here would not be a violation of Semantic Versioning.

                                        2. 7

                                          If you can tell me how making a change in a minor release that causes the package to suddenly become unavailable on 10 CPU architectures it was previously available on is not considered a breaking change, I will give you $20.

                                          None of those architectures were maintained or promised by the maintainers; they were added by third parties. No matter what your opinion on SemVer is, the activities of third parties, whose existence you possibly didn’t even know about, are not part of it.

                                          Keep your $20 but try to be a little more charitable and open-minded instead. We all have yet much to learn.

                                          1. 0

                                            Keep your $20 but try to be a little more charitable and open-minded instead. We all have yet much to learn.

                                            If you think your argument somehow shows that breaking support for 10 CPU architectures isn’t a breaking change, then yes, we all have much to learn.

                                            1. 8

                                              You still haven’t explained why you think Semantic Versioning requires this. Or why you think the maintainers had any obligation to users they had never made any promises to in the first place.

                                              But I believe I’ve demonstrated clearly that Semantic Versioning does not consider this to be a change that requires incrementing major, so if you’re still offering that $20…

                                              1. 0

                                                Part of what they ship is code that’s compiled, and literally the first two sentences of the project readme are:

                                                cryptography is a package which provides cryptographic recipes and primitives to Python developers. Our goal is for it to be your “cryptographic standard library”.

                                                If your self stated goal is to be the “standard library” for something and you’re shipping code that is compiled (as opposed to interpreted code, e.g. python), I would expect you to not break things relating to the compiled part of the library in a minor release.

                                                Regardless of whether they directly support those other platforms or not, they ship code that is compiled, and their change to that compiled code broke compatibility on those platforms.

                                                1. 8

                                                  Regardless of whether they directly support those other platforms or not, they ship code that is compiled, and their change to that compiled code broke compatibility on those platforms.

                                                  There are many types of agreements – some formal, some less so – between developers of software and users of software regarding support and compatibility. Developers declare openly which parts of the software they consider to be supported with a compatibility promise, and consumers of the software declare openly that they will not expect support or compatibility promises for parts of the software which are not covered by that declaration.

                                                  Semantic Versioning is a mildly-formal way of doing this. But it is focused on only one specific part: the public API of the software. It is not concerned with anything else, at all, ever, for any reason, under any circumstances. No matter how many times you pound the table and loudly demand that something else – like the build toolchain – be covered by a compatibility guarantee, Semantic Versioning will not budge on it.

                                                  The cryptography change did not violate Semantic Versioning. The public API of the module after the rewrite was backwards-compatible with the public API before the rewrite. This is literally the one, only, exclusive thing that Semantic Versioning cares about, and it was not broken.

                                                  Meanwhile, you appear to believe that by releasing a piece of software, the author takes on an unbreakable obligation to maintain compatibility for every possible way the software might ever be used, by anyone, on any platform, in any logically-possible universe, forever. Even if the author never promised anything resembling that. I honestly do not know what the basis of such an obligation would be, nor what chain of reasoning would support its existence.

                                                  What I do know is that the topic of this thread was Semantic Versioning. Although the cryptography library does not use Semantic Versioning, the rewrite of the extension module in Rust did not violate Semantic Versioning. And I know that nothing gives you the right to make an enforceable demand of the developers that they maintain support and compatibility for building and running on architectures that they never committed to supporting in the first place, and nothing creates any obligation on their part to maintain such support and compatibility. The code is under an open-source license. If you depended on it in a way that was not supported by the developers’ commitments, your remedy is to maintain your own fork of it, as with any other upstream decision you dislike.

                                      2. 4

                                        “Should” is the key word here, because I haven’t ever contributed to an open source project that has that as part of their policy, nor have I observed its wide application, given the state of third party packages.

                                        The article specifically speaks about the divergence between aspiration and reality and what conclusions can be drawn from that.

                                        1. 3

                                          Unfortunately the aspiration is broken too.

                                          1. 2

                                            Baby steps 😇

                                        2. 3

                                          It sounds like you’re proposing to use unit tests to prove that a minor release doesn’t introduce backwards-incompatible changes. However, tests cannot substitute for proofs; there are plenty of infinite behaviors which we want to write down in code but we cannot exhaustively test.

                                          All of these same problems happen in e.g. Haskell’s ecosystem. It turns out that simply stating that minor releases should only add backwards-compatible changes is just an opinion and not actually a theorem about code.

                                          1. 1

                                            No I think they have a valid point. “Surely” implies that it’s normal to “change” unittests between minor versions, but the term “change” here mixes “adding new” and “modifying existing” in a misleading way. Existing unittests should not change between minor versions, as they validate the contract. Of course, they may change anyway, for instance if they were not functional at all, or tested something wrong, but it should certainly not be common.

                                            edit: I am mixing up unittests and system tests, my apologies. Unit tests can of course change freely, but they also have no relation to SemVer; the debate only applies to tests of the user-facing API.

                                            1. 2

                                              I know people use different terminology for the same things, but if the thing being tested is a software library, I would definitely consider any of the tests that aren’t reliant on something external (e.g. if you’re testing a string manipulation method) to be unit tests.

                                              1. 1

                                                Take any function from the natural numbers to the natural numbers. How do you unit-test it in a way that ensures that its behavior cannot change between semantic versions? Even property tests can only generate a finite number of test cases.
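
                                                  To illustrate with two hypothetical functions of my own: they can agree on every input a bounded test suite will ever try, and still diverge elsewhere, so the tests prove nothing about the full contract:

                                                  ```python
                                                  def double(n: int) -> int:
                                                      return 2 * n

                                                  def double_until_big(n: int) -> int:
                                                      # Agrees with double() on every input a bounded test will try...
                                                      return 2 * n if n < 10**6 else 0

                                                  # An exhaustive-looking test cannot tell them apart:
                                                  assert all(double(n) == double_until_big(n) for n in range(1000))
                                                  # ...yet the behaviors differ outside the tested range:
                                                  assert double(10**6) != double_until_big(10**6)
                                                  ```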

                                                1. 2

                                                  I think the adage “code is written for humans to read, and only incidentally for computers to execute” applies to tests especially. Of course you can’t test every case, but intention does count.

                                              2. 1

                                                Aside:

                                                I just recently added a test that exercises the full API of a Rust library of mine, doing so in such a way that any backwards-incompatible changes would error if added. (The particular case was that I’d add a member to a config struct, and so anyone constructing that struct without including a ..StructName::default() at the end would suddenly have a compile error because they were missing a field.) This seemed to do the trick nicely and would remind me to bump the appropriate part of semver when making a release.

                                                I work on the library (and in the Rust ecosystem) infrequently so it’s not at the front of my mind. More recently I accepted a PR, and made a new release including it after. Then I got bitten by the same problem again: the failing test had been seen by the contributor and fixed up before they submitted the PR, so I never saw the alarm bells ringing.

                                            1. 2

                                              Nobody Has Suggested That Semantic Versioning Will Save Anyone

                                              1. 3

                                                Here on this site, several discussions in the original thread about the pyca/cryptography change brought up SemVer and certainly appeared to my eyes to be suggesting that it would have prevented or mitigated the drama.

                                                While it is possible that you personally have never made such claims about SemVer (I have not bothered to check), it is an easily-demonstrated fact that others have, and the OP here reads to me as an argument against those claims as made by those people.

                                                1. 2

                                                  Hmm. I remember that thread, and re-skimmed it now. I didn’t find anyone saying semver would have prevented the situation. It certainly would have mitigated it somewhat, though. And I don’t agree that “it is an easily-demonstrated fact” that any significant group of people believe that semver in itself is going to solve any problems. My experience has consistently been that most, or almost all, people in semver’s demographic understand it is an approximate tool and not a panacea.

                                                2. 3

                                                  Hey Peter big fan here! Sadly there’s been plenty suggesting that in that particular fiasco. Repeatedly. Even right now on Twitter in my mentions.

                                                  There’s still a lot of assumptions about what SemVer can do for someone. I needed to write down the explanation why that’s not the case so I don’t have to repeat myself.

                                                  1. 2

                                                    Can you link to one of these examples? As I said below, my experience has consistently been that most, or almost all, people in semver’s demographic understand it is an approximate tool and not a panacea.

                                                    1. 2

                                                      I have to admit that “maintainer of a popular package thinks ‘almost all users’ have a realistic expectation from SemVer” was not on my bingo card!

                                                      I suspect the kicker is

                                                      people in semver’s demographic

                                                      And that your demographic is simply different from mine. Maybe Python vs Go is all that it takes. Who knows. One of the main drivers why I write is to avoid repeating myself, and I assure you I wouldn’t have taken the time to write it if I didn’t expect to save time in the future.

                                                      Can you link to one of these examples?

                                                      I don’t want to call out people in public and if in your lived reality this isn’t a problem that’s fair enough.

                                                      I state my premise in the first paragraph, and if it doesn’t apply to you or your users, it’s fair to skip it. Not sure the sardonic dunk without reading it was necessary, though.

                                                      1. 3

                                                        What makes you think I didn’t read the article?

                                                        Like others, I think you’re “dunking” on semver unnecessarily. I generally agree with your description of it as a tl;dr of the changelog — but that’s incredibly valuable! 99% of the time I can trust it’s accurate and that’s a huge boon to my productivity. I understand it’s not necessarily accurate — and I haven’t yet encountered anyone who doesn’t understand it’s not necessarily accurate — but that’s fine, when it fails it’s detected with tests and that’s just an inconvenience more than anything.

                                                      2. 2

                                                        The entire Haskell ecosystem overly depends on semantic versioning. As a result, there are over 8000 Haskell packages in nixpkgs which are broken:

                                                        $ git grep ' broken = true;' pkgs/development/haskell-modules/ | wc -l
                                                        8835
                                                        
                                                  1. 2

                                                    Enthusiasts of 32-bit hardware from the 1990s aside

                                                    Dedicated to Alex and Paul who are willing to take the heat for the rest of us.

                                                    Right from the start this seems inflammatory.

                                                    1. 3

                                                      I don’t think people who openly stated they love their Amigas see that as inflammatory, because it’s not. That part is in no way judging; it’s just one half of the complaints.

                                                      What’s supposed to be inflammatory about dedicating a post that tries to dispel some myths that caused massive abuse against two of my friends is also entirely unclear to me.

                                                      1. 1

                                                        There’s been 4 posts in just the last couple weeks on this topic, they all immediately rise to the top of the front page and create hundreds of comments. At this point I don’t think the community is gaining anything by reading more takes on this topic that do not attempt to come to a solution or a compromise.

                                                        What’s supposed to be inflammatory about dedicating a post that tries to dispel some myths that caused massive abuse against two of my friends is also entirely unclear to me.

                                                        I want to be clear that it is not cool that anyone receives abuse. No matter what position you take, as long as you do not hurl abuse at someone else, you should not receive abuse. That said, I think authoring a post on a controversial topic like this isn’t helped by immediately laying out that you’re here to defend your friends. Anger or defensiveness is probably not the spice for reasonable debate.

                                                        Anyway I don’t want to belabor a thread on this so that’s my $0.02.

                                                        1. 2

                                                          There’s been 4 posts in just the last couple weeks on this topic, they all immediately rise to the top of the front page and create hundreds of comments.

                                                          I think that has been my original sin, but just in my defense: that draft has been sitting around for a year and the whole thing made me finish it. I had my first draft ready when the last two bigger articles appeared last weekend, but I’m generally slower at writing.

                                                          But I hope that the article delivers some timeless value that will be seen more kindly down the road.

                                                    1. 3

                                                      I agree with the sentiment that you should pin all dependencies.

                                                      But I never had the idea that SemVer would „save“ me - I only ever saw it as a means of communicating expected impact.

                                                      1. 3

                                                        That’s very correct, but look at the other comments and you’ll see that it isn’t universal consensus and even suggesting it seems rather triggering to some. 🤷‍♂️

                                                      1. 6

                                                        Where does it say that a version change means no bugs? Maybe I’m wrong but I’ve always understood SemVer to be a means of communicating the scope of changes. A patch change means I shouldn’t have to change anything, minor means I can access new features, and major means I may have to change my code. Nothing there says anything about lack of bugs though.

                                                         I’ve also come to believe less and less in placing 1.0 on a pedestal. So many companies and devs use 0.x software that the idea of 1.0 == production is just a silly thing we tell ourselves to feel cozy. Many popular tools either stay at 0.x for years while being used in production, while others hit 1.0, then 2.0, 3.0, etc. all in quick succession, making the idea of 1.0 meaning stability just a joke.

                                                        1. 8

                                                          Ehm, you have just somewhat summed up parts of the article making it sound like you’re contradicting. What did you think it was trying to say? 😅

                                                        1. 5

                                                          YAML is to config syntaxes what Python is to programming languages. Seriously, significant whitespaces?

                                                          Wow. Great way to weaken one’s argument by incorporating completely unnecessary cheap shots.

                                                          1. 7

                                                             I’m gonna pass over the problems of the methodology that have been pointed out before and stop right at the title: “Async Python is not faster”. The click-baitiness aside, it demonstrates a prevalent misconception that – as a card-carrying async aficionado – I’ve always found problematic.

                                                            Asynchronous IO is about one thing only: a better usage of resources. Never has anyone claimed that your code will get magically faster by switching from preemptive multitasking to cooperative multitasking. The promise was always that if you get 1,000 instead of 10 simultaneous connections, your code will slow down linearly and not fall over. And in the case of higher-level languages like Python, you also get better ergonomics through explicit concurrency and nicer APIs (lol epoll, BTDT).

                                                             So if you want to serve many clients – some of which have huge latencies – at once? Use async. You want to run I/O-bound code (doesn’t have to be network – black achieves some great feats with asyncio) concurrently with nice APIs? Use async. Long lived connections like websockets in a GIL-ed runtime? Use async.

                                                             But if all you do is get data from a database and serve it over HTTP? You’re gaining nothing, and you’re paying by having to sprinkle your code with async and await for no measurable gain. You also lose all benefits of database drivers and SQLAlchemy using C to release the GIL. So yeah, even if the benchmarks weren’t flawed: they don’t matter if you choose the right tools for the job.

                                                             The fact that async is a poor match for this job does not mean it’s bad for the jobs that it was built for.
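
                                                             A minimal sketch of that scaling claim, with asyncio.sleep standing in for network latency: a single thread services 1,000 “connections” in roughly the time one of them takes, because the waits overlap instead of stacking up.

                                                             ```python
                                                             import asyncio
                                                             import time

                                                             async def handle_client(delay: float) -> None:
                                                                 # Stand-in for a slow client: awaiting I/O yields control,
                                                                 # so one thread can service many connections at once.
                                                                 await asyncio.sleep(delay)

                                                             async def main() -> None:
                                                                 start = time.monotonic()
                                                                 # 1,000 "connections", each blocked on I/O for 0.1s:
                                                                 await asyncio.gather(*(handle_client(0.1) for _ in range(1000)))
                                                                 elapsed = time.monotonic() - start
                                                                 # Wall time stays near 0.1s rather than 100s -- the waits overlap.
                                                                 print(f"{elapsed:.2f}s")

                                                             asyncio.run(main())
                                                             ```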

                                                            1. 0

                                                              Never has anyone claimed that your code will get magically faster by switching from preemptive multitasking to cooperative multitasking

                                                              If only - performance claims from async web frameworks are in fact extremely common. I covered Vibora in the article. Starlette and Sanic both also make prominent claims in their documentation vs alternatives (which, let’s be honest, are Django and Flask). These claims were not proved out in my tests.

                                                              The promise was always that if you get 1,000 instead of 10 simultaneous connections, your code will slow down linearly

                                                              This promise seems very dubious because in fact I found, both in the benchmark and out of it, that the async frameworks dealt extremely poorly with high load.

                                                              Moreover, I think the question of dealing with connections rather than requests is moot, because very few people terminate a TCP connection that has arrived over the internet with their Python program. Amazon’s ELB, HAProxy, nginx etc. are used for that. And then of course there is the question of whether defining an autoscaling group is a more appropriate solution for this worry than writing your application in a special way.

                                                              So if you want to serve many clients – some of which having huge latencies – at once? Use async. You want to run I/O-bound code (doesn’t have to be network – black achieves some great feats with asyncio) concurrently with nice APIs? Use async. Long lived connections like websockets in a GIL-ed runtime? Use async.

                                                              I think some of this is fine as far as it goes. Using asyncio for an websocket service makes intuitive sense to me and especially if you avoid doing any CPU work I think that will probably work fine. However this is not as far as it goes - there is a profusion of general purpose web frameworks and other tools that are clearly intended to do much more than just TCP connection management. That is the problem.

                                                              1. 4

                                                                performance claims from async web frameworks are in fact extremely common

                                                                The boisterous claims of async frameworks have irked me for a long time, however the title of your post is not “async frameworks make misleading claims about performance” or “simple web apps don’t need async” but “Python async is not faster”. The irony of quoting NJS, an author of an async framework, as an argument against async in general has been already pointed out too.

                                                                It seems like to me you were disappointed by the characteristics of a web app running asynchronously, drew the wrong conclusions, and quoted any material that remotely seemed to confirm your case – not even shying away from pulling gevent into the mix.

                                                                I can assure you that watching people froth over async as the silver bullet for anything is just as frustrating from my end however I’m afraid you took the wrong turn. I’d suggest to have a look what the original promises were and what good fits for that are – the comments here should have given you a few good pointers. If you judge a technology by what excited kids push to GitHub or Medium and compare it to reality, you’re always gonna be disappointed.

                                                                This promise seems very dubious because in fact I found, both in the benchmark and out of it, that the async frameworks dealt extremely poorly with high load.

                                                                Yes, but the problems with your benchmark have been discussed elsewhere so no need to reiterate them here.

                                                                Moreover I think the topic of dealing with connections rather than requests is in my opinion moot because very few people terminate a TCP connection that has arrived over the internet with their Python program. Amazon’s ELB, HAProxy, nginx etc are used for that.

                                                                That is only true for short-lived, stateless HTTP requests, which async is only a mediocre fit for. Only few people will argue this point. However there are many more types of connections, the most common one probably being web sockets – and good luck handling those with a sync framework with more than ten clients. But I can assure you there are many more, and I wouldn’t want to miss async networking when dealing with them.

                                                                1. 0

                                                                  Originally you claimed that

                                                                  Never has anyone claimed that your code will get magically faster by switching from preemptive multitasking to cooperative multitasking

                                                                  And now you admit that

                                                                  The boisterous claims of async frameworks have irked me for a long time

                                                                  I think you are right the second time and this is my feeling too.

                                                                  Re: NJS – for what it’s worth, I didn’t quote him, but I wouldn’t feel bad if I did. I don’t think it’s wrong to surmise from the progression of asyncio -> curio -> trio that async is difficult. I am not “out to get” async but I do strongly dislike the chronic over-application of it – which, it sounds to me, you also recognise as a problem.

                                                                  1. 3

                                                                    I think the problem here is that I was talking about the people that built async APIs (epoll/kqueue/…) and low-level frameworks (asyncio, Twisted, trio, …) and you about applications/frameworks that build on them (not gonna name them to avoid unnecessary shaming).

                                                                    I absolutely see the problem of its misapplication, which is why I didn’t argue about the benchmarks at all: I don’t find them interesting for that use case because the use case isn’t interesting.

                                                                    But I also don’t see how your post is conveying that point neither from reading it myself nor from the reception it got.

                                                            1. 4

                                                              I’d generally this a bit more and say “you should treat tests like any other code”. This includes making sure it’s sure it’s understandable (using e.g. documentation), but also includes things like running your linters, making sure it can be understood easily, and generally just making sure it doesn’t become this kludge I’ve seen far too often.

                                                              1. 5

                                                                So, I generally agree with this, but there are some important differences.

                                                                For instance: Test code is not usually itself tested. If you fail to hook up a feature correctly, users will notice it isn’t there; but if you fail to hook up the test code correctly, it’ll never fail.

                                                                As a result, it’s valuable to keep test code very simple, even if doing so generates verbosity. Multiple times I’ve discovered tests that are either not running at all or not executing any checks, because some clever abstraction was introduced to make them easy to write.

                                                                1. 2

                                                                  I’ve actually joked that some tests need tests themselves.

                                                                  Yeah, I agree tests should be very simple; overcomplicated tests and testing frameworks are one of the few things I have very strong negative opinions on. Not just because of what you’re saying, but also because when writing tests you need to think of both the testing code and the code being tested. Reducing the cognitive load here really helps in my experience.

                                                                  1. 2

                                                                    when writing tests you need to think of both the testing code and the code being tested

                                                                    Conversely, thinking about both the code itself and how you will test it when you write the code can help you to end up with more testable code which avoids the need for overcomplicated tests.

                                                                    1. 2

                                                                      A test by itself doesn’t have to be over complicated to need an explanation. The connection between “what am I testing” and “how do I verify I have achieved it” can and often is though.

                                                                      1. 1

                                                                        A test by itself doesn’t have to be over complicated to need an explanation

                                                                        Sorry. I didn’t mean my comment as “If you write testable code, your tests will be simple enough to be self documenting”, although given the context, I now see that it could be read as such. While it might be true that testable code leading to simpler tests might let you get away with poorer documentation, I wouldn’t encourage it. I agree with the idea that you should document tests.

                                                                        As I see it, the two issues (how simple your tests are vs. how well documented your tests are) are largely orthogonal.

                                                                      2. 1

                                                                        Perhaps; but it’s still a lot more to keep in your head.

                                                                1. 6

                                                                  I don’t like these “you must add comments” policies as more often than not it leads to a lot of noise and duplicate or obsolete information. The test should be well written enough that what it does is obvious and, just like for regular code, comments should be added only when something may be unclear.

                                                                  Also I don’t know about Python but, with most JavaScript test unit libraries, the test will be described directly in code:

                                                                  it('should find a character in a string', () => {
                                                                      expect('abcd'.indexOf('b')).toBe(1);
                                                                  }) 
                                                                  

                                                                  This is useful because that string, unlike comments, would show up when a test doesn’t pass.

                                                                  But the intent can’t always be summed up in a few words

                                                                  It implies that it can often be summed up in a few words, which in turn implies that most of the time comments would be useless.

                                                                  1. 9

                                                                    it('should find a character in a string'

                                                                    This is pretty much exactly what the article is arguing for. The rest is just a matter of syntax and tooling features, so why the contrarianism? There’s an entire paragraph talking about avoiding the kind of bad comments you mention.

                                                                    This is useful because that string, unlike comments, would show up when a test doesn’t pass.

                                                                    Ironically, Python’s unittest module (the one in the standard library; nowadays most people use pytest fortunately) does that with doctests which made CPython Core ban the usage of doctests because they found that confusing.

                                                                    It implies that it can often be summed up in a few words, which in turn implies that most of the time comments would be useless.

                                                                    This wildly depends on the project and type of code. It’s also really difficult to judge right now – when the code is fresh on your mind – what will confuse you in a year. At least that was my experience.

                                                                  1. 2

                                                                    Random 16 bit numbers that are prefixed by the type and that can be translated offline to private IP addresses:

                                                                    E.g. c-1000 is a container whose IP address ends in 10.0. So if 10.1.0.0/16 is the network for containers, this one’s main address would be 10.1.10.0.
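If I read that right, the decimal digits of the number split into the last two octets. A tiny sketch under that assumption – the `c` → `10.1` mapping is from your example, everything else in the table is made up:

```python
# Hypothetical helper for the scheme described above: the 16-bit
# suffix is read as two decimal octets (1000 -> 10.0). The split rule
# and the non-"c" prefixes are my guesses, not from the comment.
SUBNETS = {"c": "10.1"}  # "c" = container; other prefixes would go here

def to_ip(name):
    prefix, num = name.split("-")
    high, low = divmod(int(num), 100)  # 1000 -> (10, 0)
    return f"{SUBNETS[prefix]}.{high}.{low}"
```

So `to_ip("c-1000")` gives `10.1.10.0`, matching the example.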

                                                                    1. -1

                                                                      The best SRE recommendation around Memcached is not to use it at all:

                                                                      • it’s pretty much abandonware at this point
                                                                      • there is no built-in clustering or any of the HA features that you need for reliability

                                                                      Don’t use memcached, use redis instead.

                                                                      (I do SRE and systems architecture)

                                                                      1. 30

                                                                        … there was literally a release yesterday, and the project is currently sponsored by a little company called …[checks notes]…. Netflix.

                                                                        Does it do everything Redis does? No. Sometimes having simpler services is a good thing.

                                                                        1. 11

                                                                          SRE here. Memcached is great. Redis is great too.

                                                                          HA has a price (Leader election, tested failover, etc). It’s an antipattern to use HA for your cache.

                                                                          1. 9

                                                                            Memcached is definitely not abandonware. It’s a mature project with a narrow scope. It excels at what it does. It’s just not as feature rich as something like Redis. The HA story is usually provided by smart proxies (twemcache and others).

                                                                            1. 8

                                                                              It’s designed to be a cache, it doesn’t need an HA story. You run many many nodes of it and rely on consistent hashing to scale the cluster. For this, it’s unbelievably good and just works.

                                                                              1. 3

                                                                                seems like hazelcast is the successor of memcached https://hazelcast.com/use-cases/memcached-upgrade/

                                                                                1. 3

                                                                  I would put it with a bit more nuance: if you already have Redis in production (which is quite common), there is little reason to also add memcached – it adds complexity and new software you may not have as much experience with.

                                                                                  1. 1

                                                                                    this comment is ridiculous

                                                                                    1. 1

                                                                                      it’s pretty much abandonware at this point

                                                                                      i was under the impression that facebook uses it extensively, i guess redis it is.

                                                                                      1. 10

                                                                                        Many large tech companies, including Facebook, use Memcached. Some even use both Memcached and Redis: Memcached as a cache, and Redis for its complex data structures and persistence.

                                                                                        Memcached is faster than Redis on a per-node basis, because Redis is single-threaded and Memcached isn’t. You also don’t need “built-in clustering” for Memcached; most languages have a consistent hashing library that makes running a cluster of Memcacheds relatively simple.
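A hedged sketch of what such a client-side consistent-hashing library does under the hood (node names and replica count are illustrative, and real libraries use tuned hash functions):

```python
import bisect
import hashlib

class Ring:
    """A minimal consistent-hash ring mapping cache keys to nodes."""

    def __init__(self, nodes, replicas=100):
        # Each node gets many virtual points on the ring so keys
        # spread evenly and removal only shifts a small slice.
        self._keys = []
        self._ring = {}
        for node in nodes:
            for i in range(replicas):
                h = self._hash(f"{node}:{i}")
                self._ring[h] = node
                bisect.insort(self._keys, h)

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first virtual point at or after the key.
        h = self._hash(key)
        idx = bisect.bisect(self._keys, h) % len(self._keys)
        return self._ring[self._keys[idx]]
```

Because each node owns many points on the ring, adding or removing a node only remaps roughly 1/N of the keys – the rest keep hitting the same Memcached and stay warm.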

                                                                                        If you want a simple-to-operate, in-memory LRU cache, Memcached is the best there is. It has very few features, but for the features it has, they’re better than the competition.

                                                                                        1. 1

                                                                          Just as an FYI: most folks run multiple Redis instances per node (CPU count minus one is pretty common), so the “single process thing” is probably moot.

                                                                                          1. 5

                                                                                            N-1 processes is better than nothing but it doesn’t usually compete with multithreading within a single process, since there can be overhead costs. I don’t have public benchmarks for Memcached vs Redis specifically, but at a previous employer we did internally benchmark the two (since we used both, and it would be in some senses simpler to just use Redis) and Redis had higher latency and lower throughput.

                                                                                            1. 2

                                                                                              Yup. Totally. I just didn’t want people to think that there’s all of these idle CPUs sitting out there. Super easy to multiplex across em.

                                                                              Once you start wanting to do more complex things – structures, caching policies – then it may make sense to reach for Redis.

                                                                                              1. 1

                                                                                                Yeah agreed, and I don’t mean to hate on Redis — if you want to do operations on distributed data structures, Redis is quite good; it also has some degree of persistence, and so cache warming stops being as much of a problem. And it’s still very fast compared to most things, it’s just hard to beat Memcached at the (comparatively few) operations it supports since it’s so simple.

                                                                                    1. 7

                                                                                      I’m currently a Python dev (apparently this is the most recent turn my career has taken), and I’m really bummed out by its web story outside of Django.

                                                                                      My last gig was Elixir, before that Node, and some Rails and Laravel in there. The tooling in the Python ecosystem, especially around migrations and dependency management, just feels clunky.

                                                                                      It singlehandedly sold me on Docker just so I didn’t have to mess with virtualenvs and multiple runtimes on my system and all of that. Like, what happened? Everybody groused about 2-to-3 (which is still hilarious) but like even without that I feel like the ecosystem has been vastly outstripped by “worse” technologies (see also, NodeJS).

                                                                                      1. 4

                                                                                        It singlehandedly sold me on Docker just so I didn’t have to mess with virtualenvs

                                                                                        One thing that made virtualenvs almost entirely painless for me was using direnv: in all my python project directories I have a bash script named .envrc that contains source .venv/bin/activate, and now cd-ing in/out of that directory will enter/exit the virtualenv automatically and instantaneously. It’s probably possible to set it up to switch pyenv environments as well.
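For the record, the setup is roughly this (assuming a POSIX shell, that direnv is installed, and the `.venv` path from my projects):

```shell
python3 -m venv .venv                       # create the virtualenv once
echo 'source .venv/bin/activate' > .envrc   # direnv runs this on every cd in
if command -v direnv >/dev/null; then
  direnv allow .                            # direnv refuses unapproved .envrc files
fi
```

After that, entering the directory activates the virtualenv and leaving it deactivates it, with no manual `source` calls.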

                                                                                        1. 3

                                                                                          One of the reasons why Python packaging still feels so clunky compared to other ecosystems is that the Python ecosystem is a lot more diverse thanks to e.g. the scientific stack that has very different needs than the web peeps so there’s never gonna be an all-encompassing solution like Cargo. Pipenv tried and failed, poetry is carving a niche for itself.

                                                                                          But the primitives are improving. pip is currently growing a proper resolver, and doesn’t e.g. Ruby still need a compiler to install binary packages? As long as you don’t use Alpine for your Docker images, Python’s wheels are great (they’re just a bit painful to build).

                                                                                          1. 1

                                                                                            How did pipenv fail?

                                                                                            1. 4

                                                                                              Short answer: it’s too complex which makes it buggy and there wasn’t a release in over a year. IOW: It’s falling over it’s own weight.

                                                                                              Long answer: https://hynek.me/articles/python-app-deps-2018/

                                                                                          2. 3

                                                                                            The tooling in the Python ecosystem, especially around migrations and dependency management, just feels clunky.

                                                                                            Currently working on a Rails app, coming from the Flask ecosystem. You have no idea how much I can miss SQLAlchemy and Alembic.

                                                                                            I agree about dependency management, but certainly not about migrations. Modifying models and auto-generating migrations works much better than the other way around for me.

                                                                                          1. 3

                                                                                            LOVE this article! Especially the offer to help folks in need to get more visibility. Good on you for being willing to put elbow grease into moving the needle!

                                                                                            A question about your reactions to people writing about building in the cloud. Are you saying you’d like to see less of that, or that you think doing so is a bad idea to begin with?

                                                                                            I ask because the former seems a perfectly reasonable preference, but I’d argue that the latter could be a reactionary stance we might think carefully before taking.

                                                                                            The cloud is a GREAT tool for certain use cases and an AWFUL one for others. I’d love to see some of the hype and acrimony get stripped away so we could all just use the right tool for the right job and get on with our lives :)

                                                                                            Your chances of me helping are increased if you’re part of an URM and/or if I find your topic interesting. Please accept my apology if I can’t help you specifically, but I’ll try to find time for as many people as my time permits.

                                                                                            Just in case anyone else read this and feels too abashed to ask: URM is an acronym for Underrepresented Minority.

                                                                                            1. 1

                                                                                              A question about your reactions to people writing about building in the cloud. Are you saying you’d like to see less of that, or that you think doing so is a bad idea to begin with?

                                                                                              Not at all! I’m just somewhat annoyed by the fact that given a lot of public discourse is dominated by paid cloud advocates you could get the impression, that everyone is is running their stuff in their clouds, on top of their products like hosted Kubernetes.

                                                                                              That’s obviously wrong but they’re paid to give you that impression and I suspect that many people feel inadequate due to that despite having good reasons to run their stuff differently. We need to get those people too – not asking for exclusivity. :)

                                                                                              1. 1

                                                                                                I think you’re right, and I also think some people end up feeling like they SHOULD run their workloads in the cloud even when maybe doing an actual evaluation of their situation might serve them better. Heck, a cloud solution may well be exactly the right fit, but you won’t know until you really survey all the options and figure out what works best for you.

                                                                                            1. 2

                                                                                              We are one of those Python web services companies running on Docker (AWS Fargate). One of our big problems has been dependency management in our monorepo. We want to be able to build and deploy to production quickly and often (more than a dozen times per day), so we want our CI job to run in ~10 minutes or less and our deploys in ~30 minutes or less.

                                                                                              We also care about reproducible builds, so we initially looked at pipenv, but it took 30 minutes just to resolve dependencies for any change in any container. Eventually we moved to https://github.com/pantsbuild/pants which has solved many problems, but it’s an awful piece of engineering that happens to do the job right so long as you never deviate from the happy path and you don’t need to do reasonable things like ask, “what is the unique hash of this version of my system? [so I can use that to tag my Docker images]”.

                                                                                              In general, dependency management and ecosystem tooling is still a big pain, and we spent a lot of time to find something that worked only passably. Others have probably found a solution that works well, but we haven’t stumbled upon it yet (possibly because there isn’t enough attention devoted to running web services among the Python community?).

                                                                                              1. 9

                                                                                                Without knowing your details, I can’t give advice, but as a data point: I’ve personally found joy in the flexibility of pip-tools. I’ve blogged about it (well, I mostly blogged about why I don’t use Pipenv and poetry) in 2018 and updated it November 2019 with my personal workflow: https://hynek.me/articles/python-app-deps-2018/
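The core of the workflow is small enough to sketch here. The package name is just an example; `pip-compile` and `pip-sync` come from the pip-tools package (the latter two commands are shown as comments since they resolve against PyPI):

```shell
echo 'flask' > requirements.in    # your direct, unpinned dependencies
# Then, inside the project's virtualenv:
#   pip-compile requirements.in   # resolves and writes a fully pinned requirements.txt
#   pip-sync requirements.txt     # makes the environment match the pins exactly
```

The pinned requirements.txt is what you commit and build Docker images from, which keeps builds reproducible without pipenv’s resolver cost.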

                                                                                                1. 2

                                                                                                  I haven’t even heard of pip-tools. Reading through your blog post now. Thanks for the recommendation!

                                                                                              1. 3

                                                                                                That’s really cool. Btw, I`ve heard about their plans to remove or significantly modify the GIL logic after 3.8.0, so that there would be a distinct interpreter lock for each thread. How is it going so far?

                                                                                                1. 4

                                                                                                  Do you mean subinterpreters? Eric talked about it recently on a podcast: https://talkpython.fm/episodes/show/225/can-subinterpreters-free-us-from-python-s-gil

                                                                                                  3.8’s multiprocessing.shared_memory is almost as interesting because it allows “free” one-way communication between processes.
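For example (both handles live in one process here to keep it short; in real use you’d pass the segment’s name to a child process):

```python
from multiprocessing import shared_memory

# Sketch of 3.8's shared-memory blocks: one side creates a named
# segment, the other attaches by name and sees the same bytes --
# no pickling or copying in between.

def writer_side():
    shm = shared_memory.SharedMemory(create=True, size=4)
    shm.buf[:4] = b"ping"
    return shm  # shm.name is what you hand to the other process

def reader_side(name):
    shm = shared_memory.SharedMemory(name=name)  # attach, don't create
    data = bytes(shm.buf[:4])
    shm.close()
    return data
```

The creator should eventually call `close()` and `unlink()` to release the segment.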

                                                                                                1. 2

                                                                                                   I agree that macOS has arrived at a complexity that Apple apparently can’t handle anymore (which is very different from Linux’s problem in 2000 – I’ve been there).

                                                                                                  But I’m gonna leave here that if you have problems with projectors, it’s probably the fault of the USB-C to HDMI converters. Which is most probably caused by USB-C/TB3 being a shit show so far. You can put it on Apple that they miscalculated the trajectory of USB-C but I’ve seen other notebooks fail and honestly MacBooks still seem to make the least problems at conferences. They got the same shit for dropping disk drives and going all in on USB-A and it worked fine. There had to come a miss (no I don’t consider dropping 3.5mm a miss, it’s a mixed bag at best). 🤷‍♂️

                                                                                                   FWIW, I have a 2018 MBP with the Belkin USB-C to HDMI converter sold directly by Apple, have spoken at conferences on three continents, and have had zero problems so far. Even so, I always travel with my own HDMI and VGA adapters. I have already saved other speakers’ butts with them too, so I can very much recommend that – it might help you make friends. :)

                                                                                                  1. 2

                                                                                                    …but ask yourself why that USB-HDMI dongle is needed in the first place. I don’t need one because several laptops around here have a native HDMI interface.

                                                                                                    When form goes over function, function is lost.

                                                                                                    1. 2

                                                                                                      That’s not a uniquely Apple problem though. Hard to say if it’s caused by others aping Apple or whether it’s a natural progression, but the “naked robotic core” seems to be an ideal that is favored generally.

                                                                                                       USB-C/TB3 is, as far as I can tell, one of the biggest consumer-hostile failures of the tech industry in recent years: no good hubs, wonky dongles, five different cables that look the same but do different things. But that’s not on Apple (alone).

                                                                                                      1. 1

                                                                                                         The last MacBook Pro with HDMI was circa 2015 I think – the last year before Apple decided that a terrible keyboard and a single USB-C port ought to be enough for anybody. :p

                                                                                                        1. 2

                                                                                                          Ironically, my current USB-C to HDMI dongle is more reliable than my 2014’s built-in HDMI port. At some point I started carrying an extra dongle to be sure too. ¯\_(ツ)_/¯ (Having to present from a stranger’s notebook is one of the biggest nightmares of most speakers.)