1. 48

  2. 31

    Python is as messy as, if not messier than, Node

    I’d say Node is, like, not messy at all. Due to Node being younger, its ecosystem developed and matured in an age where per-project isolated local dependencies were becoming the norm.

    Python is arguably uniquely messy due to the invention of virtualenv. I’m not aware of any other language community where, for ages, the default way to make an isolated project was a tool that created a whole local prefix, with symlinks to the interpreter and whatnot, plus a script you had to source to bring it onto your $PATH. Python’s closest cousin (in terms of 2000s web dev evolution, anyway), Ruby, did it all correctly relatively early on: the Bundler experience is just like npm/cargo/stack/etc.

    But forget virtualenv, the real messiness of Python packaging is the freaking layering. Distutils, setuptools, pip, virtualenv (or the new project-based managers) — it’s all there and which part is handled by which layer gets confusing quickly.
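
    To make the layering concrete, here’s a rough sketch of which tool ends up doing what in a plain setup today (flask is just an example package):

      python -m venv .venv          # venv/virtualenv: the isolation layer (local prefix + activate script)
      source .venv/bin/activate     # the script you have to source to get it onto $PATH
      pip install flask             # pip: resolves and installs distributions
      # and under the hood pip invokes a build backend (setuptools, the
      # descendant of distutils) to build anything that ships as an sdist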

    1. 10

      The thing that worries me is that I don’t think the core Python devs get it: the packaging situation is so bad that it very well may kill the language. It should be all hands on deck; this is the most important thing for us to fix. But I don’t see that happening at all, and instead there’s fiddling with esoterica like walrus and match…

      1. 13

        So one interesting twist here is that many core devs work for big companies that use their own build and deployment systems for Python. (e.g. I worked with probably a dozen core Python devs at Google many years ago.) So they may not feel the issue on a daily basis. I certainly feel it more now that I do more open source, although I was aware of it back then whenever I had to install NumPy, etc.

        From what I hear, Jane St. is in a similar situation with OCaml and OPAM. They sponsor the open source package manager, but they don’t actually use it themselves, because they use a monorepo like Google does. A monorepo means “no version constraint solving”, which simplifies the problem drastically (it’s an NP-complete problem).


        I also think the problem is more complicated than “core devs don’t get it”. It’s more that the solutions are heavily constrained by what happened in the past. For a long time the import system / algorithm itself was a very big wart, and there was a huge effort to clean it up.

        I will say that I think most of these problems were known before Python 3, and I wish there had been a hard break then, but that was already 12-15 years ago at this point. And then people would probably have liked the 2->3 transition even less, etc.

        1. 5

          So one interesting twist here is that many core devs work for big companies that use their own build and deployment systems for Python. (e.g. I worked with probably a dozen core Python devs at Google many years ago.) So they may not feel the issue on a daily basis. I certainly feel it more now that I do more open source, although I was aware of it back then whenever I had to install NumPy, etc.

          This is definitely the case with my cloudy overlords, although I will say that this may be changing. I think some people are recognizing that there is wisdom in not rolling their own and allowing devs to leverage familiar interfaces for packaging.

        2. 4

          For what it’s worth, this is the #1 reason I keep not trying Python. It’s just a huge headache and I can’t care enough about it.

          1. 5

            I’ve been using Python professionally since 2010, I used to absolutely love it, and I’ve finally reached the point where I no longer consider it an acceptable choice for a greenfield project of any scale, no exceptions.

            Someone will always come around to suggest that it’s open source and you should fix what you don’t like, but in my experience the community culture has settled into a state that no project or ecosystem ever recovers from: issues start being answered with justifications along the lines of “this can’t change because of fundamental technical deficiency X, which, because of its magnitude, we’ve decided to stop treating as a technical deficiency and to treat instead as a fundamental invariant of the universe, in spite of ample evidence that there are better ways to do it.” Either that technical deficiency is truly insurmountable, in which case the tool is de jure broken and I have no good reason to use it, or it is surmountable but there will never be any will to fix it, in which case the tool is de facto broken and I have no good reason to use it.

            I feel deep sadness about this, but at this point there is too little that is exceptional about Python to justify putting effort into fixing what’s broken. No tool lasts forever; maybe we should just accept that it’s time to sunset this one.

            1. 1

              That’s a great point, and a good answer to the “Why don’t you just fix it yourself?” that gets thrown out any time you complain about any open source project, especially established ones like Python.

              Python especially has this cultish “all is perfect, how dare you suggest otherwise” mentality.

          2. 3

            The thing that worries me is that I don’t think the core Python devs get it: the packaging situation is so bad that it very well may kill the language. It should be all hands on deck; this is the most important thing for us to fix. But I don’t see that happening at all, and instead there’s fiddling with esoterica like walrus and match…

            Python’s governance is pretty transparent. Do you have concrete suggestions for improvement? If you do, consider coming up with even a proof of concept implementation and creating a PEP.

            Be the change you want to see in the world :)

          3. 8

            the real messiness of Python packaging is the freaking layering. Distutils, setuptools, pip, virtualenv (or the new project-based managers) — it’s all there

            Yup +100 to this … This is why I sometimes download tarballs with shell scripts and use “python setup.py build” instead. That’s only one layer :)

            That approach doesn’t work all the time, e.g. if you have a big NumPy stack, or a big web framework with transitive dependencies.

            On the other hand, if it works, then you know exactly what your dependencies are, and you can archive the tarballs somewhere for reproducible builds, etc.
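
            Roughly, the single-layer flow I mean (somepkg-1.2.3 is just a placeholder name):

              tar xzf somepkg-1.2.3.tar.gz && cd somepkg-1.2.3
              python setup.py build
              python setup.py install --prefix="$HOME/opt/somepkg"   # or --user
              # keep the tarball around and you can rebuild the same thing later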

            1. 3

              the Bundler experience is just like npm/cargo/stack/etc

              That’s mostly splitting hairs, but for me there’s a big difference with cargo when it comes to experience: cargo run / cargo build just work, while bundle exec and npm run require running an install command manually first.
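
              Concretely, the day-to-day difference looks something like this:

                cargo run                             # fetches and builds any missing deps by itself
                npm install && npm run start          # npm wants the explicit install step first
                bundle install && bundle exec rake    # same story with Bundler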

              1. 1

                There’s a lot that Node gets right simply by virtue of being a much younger language and community than Python or Ruby, so it could benefit from observing the effects of their critical decisions over time.

                Unfortunately that lack of maturity can express itself in various ways, some merely cosmetic, some less so.

                Overall, even as someone who doesn’t love the language, having now learned it I can appreciate that there’s a lot of interesting work going on in that community that bears watching and learning from, whatever your first choice of programming language may be.

              2. 15

                Packaging is usually people’s #1 problem with the Python ecosystem. I do empathise with people struggling, but when I read the specifics of the problems people have, they often seem to be doing what I would consider “ambitious” things, like mixing package managers (conda and pip in one example in TFA), trying to do data science in a musl libc Docker container, or trying to take their app to prod without packaging it up.

                In Python’s case I think the packaging system is solid overall and does work, though there are beartraps that unfortunately aren’t labelled well. That said, I think poetry does provide the right UX “affordances” and I think it should be the default choice for many.

                1. 14

                  My latest problem with Python packaging was with setting up JupyterHub at $JOB. Here is what happened:

                  • building a notebook docker image FROM jupyter/scipy-notebook
                  • RUN conda update
                  • environment completely broken
                  • change conda update to only update the specific package we need a bug fix for
                  • environment completely broken
                  • ADD fix_a_specific_bug_in_that_specific_package_myself.diff /tmp/patch.diff
                  • WORKDIR /opt/conda
                  • RUN patch -p3 < /tmp/patch.diff

                  I did nothing ambitious, unless updating packages is “ambitious” in Python. The entire packaging ecosystem seems like a huge dumpster fire. Why are there so many different package managers? I can only imagine because the authors of each package manager thought all the other ones suck too much to bother using, which seems completely true to me.

                  Ruby just has gem. Rust just has cargo. They both just work. Python has really dropped the ball here.
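
                  For what it’s worth, the usual mitigation people suggest (it doesn’t excuse the breakage) is to pin and recreate environments instead of updating them in place, roughly:

                    conda env export > environment.yml    # snapshot the exact versions that currently work
                    conda install somepackage=1.2.3       # bump only the one package that needs the fix (placeholder name/version)
                    conda env create -f environment.yml   # recreate the known-good env from the snapshot elsewhere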

                  1. 6

                    Have you tried just using pip? pip install jupyterlab just works for me.

                    To be honest, I see a lot of people complain about conda, which makes me suspect it doesn’t work. Like, I have gotten a lot of mileage out of pip and python -m venv (and even then, if you’re in Docker you don’t need to mess with virtual envs). (Though maybe you’re on Windows?)
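
                    A minimal sketch of the pip-only route:

                      python3 -m venv .venv && .venv/bin/pip install jupyterlab
                      # and inside a Docker image you can usually skip the venv entirely:
                      pip install jupyterlab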

                    1. 2

                      The official Docker image we based ours on uses Conda, and mixing Conda and pip is asking for even more pain. And I was under the impression Conda has packages that pip doesn’t, but maybe that’s not true anymore.

                    2. 4

                      At $WORK, for the past decade-plus our policy has been: only dependencies that our OS packages. If we need more than what our OS (typically Ubuntu LTS) packages for Python dependencies, then the only option is to import the package into the tree as a direct dependency, and we now OWN that code. This is the only sane way to do it, as the Python packaging stuff is, after many decades, still mostly broken and not worth trying to fight with.

                      We’ve since started moving to playing with Nix and Docker containers, which mostly solve the problem a different way, but it’s still 90% saner than whatever Python packaging people keep spouting. Note, the technical issues are basically 100% solved; it’s all community buy-in and miserable work, nothing anyone wants to do, which is why we are stuck with Python packaging continually being a near-complete disaster.

                      Maybe the PSF will decide it’s a real problem eventually, hire a few community organizers full time for a decade and a developer (maybe 2, but come on, it’s not really a technical problem anymore), and solve it for real. I’m not holding my breath, nor am I volunteering for the job.

                      1. 2

                        My solution has been the opposite: do not rely on brew or apt for anything. Their release strategies (like “oh we’ll just upgrade python from 3.8 to 3.9 for you”) just don’t work, and cause pain. This has solved so much for me (and means I’m not breaking the system Python when doing “weird” stuff). Python.org has installers, after all!

                        Granted, I’m not on Windows but everything just works for me and I think it’s partly cuz package maintainers use the system in a similar way.

                        I kinda think Python needs a “don’t use anything from dist-packages and fully-qualify the Python version everywhere” mode, to prevent OS updates from breaking everything else.
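
                        You can get part of the way there today, though it’s opt-in rather than a real mode (a sketch, with app.py as a placeholder):

                          python3.11 -m venv .venv      # venvs leave out dist-packages unless you ask for --system-site-packages
                          .venv/bin/python -I app.py    # -I additionally ignores user site-packages and PYTHON* env vars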

                        1. 1

                          There is definitely work involved in “porting” our software to the next LTS branch, but it’s usually not miserable work. Plus, we only have to do it every few years. The good thing is, by the time a Python package makes it into a .deb in the -stable branch, the Python code is also pretty stable, so upgrades and bugfixes are usually pretty painless.

                      2. 1

                        Yes, I’m afraid I would consider blithely updating lots of software to the latest version very ambitious. Only a loon would update a number of different pieces of software at once with the serious expectation that everything would continue working; surely you can’t mean that?

                        I can’t speak to other package managers in general except to contradict you on RubyGems. Last time I was running Ruby in prod we had to increase the VM size to run RubyGems in order to run Puppet. Our RubyGems runs were unreliable (transient failures a la npm) and this was considered common. Perhaps this is an out-of-date view; it was some years ago.

                        1. 15

                          That’s insane. apt-get upgrade works fine. gem update works fine. cargo update works fine. Each updates the packages while maintaining dependency version invariants declared by those packages. Conda seemed to be trying to do that, but couldn’t do it correctly, and then trashed the system. This is not a failure mode I consider acceptable.

                          Sure, sometimes upgrading to newer software causes issues with backwards compatibility, hence stable and LTS release channels. But that’s not what happened here. After conda update completed, it was impossible to do anything else with conda at all. It was utterly broken, incapable of proceeding in any direction; no conda commands could successfully change anything. And the packages were all in completely inconsistent states. Stuff didn’t stop working because the newer packages had bugs or incompatibilities; it stopped working because package A required B version 3+, but version 2 was still installed because conda self-destructed before it finished installing B version 3. Has that ever happened to you with gem?

                          If updating my packages in the Python ecosystem makes me a loon, what is normal? Does anyone care about security patches? Bug fixes?

                          I can’t comment on gem’s memory usage; I’ve never run it on such a small machine that I’ve had that problem. But I haven’t had transient failures for any reason other than networking issues. I have used gem to update all my --user installed gems many times. Conda literally self-destructed the first time I tried. Literally no issue with gem comes remotely close to the absolute absurdity of Python packaging.

                          1. 4

                            apt-get upgrade does work pretty reliably (but not always, as you imply) because it is a closed universe, as described in TFA, in which the totality of packages is tested together by a single party, more or less anyway. Cargo I have not used seriously.

                            Upgrading packages does not make you a loon, but blindly updating everything and expecting it all to work arguably does - at least not without expecting to debug stuff, though I wouldn’t expect to have to debug the package manager. It sounds like you also ran into a conda bug on your first (and apparently sole?) experience with python packaging. I can’t help you there, except to say that it is not the representative experience you are extrapolating it to be.

                            I don’t want to get deep into a rubygems tit-for-tat except to repeat that it has been a huge problem for me professionally, and yes, including broken environments. Rubygems/bundler performance and reliability were among the reasons that team abandoned Puppet. Ansible used only system python on the target machine, a design I’m sure was motivated by problems having to bootstrap ruby and bundler for puppet and chef. I’m sure there were other ways to surmount that, but this was a JVM team and eyes were rolling each time DevOps was blocked on stuff arising from Ruby packaging.

                            1. 2

                              your first (and apparently sole?) experience with python packaging

                              Latest experience. Previous experiences have not been so egregiously bad, but I’ve always found the tooling lacking.

                              having to bootstrap ruby and bundler for puppet

                              That’s the issue. As a JVM team, would you run mvn install on production machines, or copy the jars? PuppetLabs provides deb and rpm packages for a reason.

                              Regardless, even though gem has problems, I can still run gem update on an environment and resolve any problems. What about Python? Apparently that’s so inadvisable with Conda that I was a loony for trying. It’s certainly not better with pip, which doesn’t even try to provide an update subcommand. There’s pip install -U, which doesn’t seem to provide any way to ensure you actually end up with a coherent version set after upgrading. Conda, though it blew up spectacularly, at least tried.

                              Seriously, how are you supposed to keep your app dependencies updated in Python?

                              1. 2

                                how are you supposed to keep your app dependencies updated in Python?

                                In the old days, there were scripts that told you what’s outdated and even generated new requirements.txt. Eventually pip gained the ability to list outdated deps by itself: https://superuser.com/questions/259474/find-outdated-updatable-pip-packages#588422

                                State of the art is pyproject.toml projects: your manager (e.g. pdm or poetry or whatever else, because There’s One Way To Do It™) just provides all the things you want.
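
                                In command form, that progression looks roughly like this (somepackage is a placeholder):

                                  pip list --outdated          # see what's behind
                                  pip install -U somepackage   # the old-school way, one package at a time
                                  poetry update                # or: pdm update, which re-resolves the whole lock file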

                          2. 8

                            Yes, I’m afraid I would consider blithely updating lots of software to the latest version very ambitious. Only a loon would update a number of different pieces of software at once with the serious expectation that everything would continue working; surely you can’t mean that?

                            You have been Stockholm syndromed. Yes, in most languages, you just blithely do upgrades and only run into problems when there are major version bumps in the frameworks you are using and you want to move up.

                        2. 5

                          I’ve been in the Python ecosystem now for about six months after not touching it for 15+ years. Poetry is a great tool but has its warts. I worked around a bug today where two sources with similar URLs can cause one to get dropped when Poetry exports its data to a requirements.txt, something we need to do to install dependencies inside of the Docker container in which we’re running our app.

                          As Poetry matures, it’ll be great.

                          1. 2

                            Poetry is already three years old.

                            1. 2

                              A young pup!

                          2. 2

                            I agree with your post, but this is a stretch:

                            In Python’s case I think the packaging system is solid overall and does work

                            You define your dependencies, but they will update their own dependencies down the tree, to versions incompatible with each other or with your code. This problem is left unsolved by the official packaging systems.

                            But in all frankness… this practice of pulling in dependencies for something that would otherwise take 5 minutes to implement, and being OK with having dozens of moving targets as dependencies, is something that just can’t be solved. I’d rather avoid it: use fewer, well-defined dependencies.

                            1. 7

                              If you ship an app, this can be solved by “pip freeze” which saves the deep dependencies as well. If you ship a library, limit the dependency versions.

                              This is not really a Python problem - you have to do the same thing in all languages. Similar languages, like Ruby and JS, suffer from the same issue.
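
                              A rough sketch of both cases:

                                # application: pin the fully resolved tree and reinstall from it
                                pip freeze > requirements.txt
                                pip install -r requirements.txt
                                # library: don't pin, just bound versions in your packaging metadata,
                                # e.g. declare a dependency as "requests>=2.25,<3"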

                              1. 2

                                Pip freeze helps, but you’re still screwed if you rely on some C wheel (you probably do) or Python breaks something between versions (3.7 making async a keyword in particular was brutal for the breakage it caused).

                                1. 1

                                  I’m not sure what you mean - wheels are precompiled, so get frozen like everything else. What do you think breaks in that situation?

                                  1. 2

                                    If I knew, it wouldn’t be broken, would it? All I can tell you is that wheels routinely work on one machine but not another or stop working when something on a machine gets upgraded, whether deliberately or accidentally. Why? I have no idea.

                          3. 12

                            I was trying a new gaming library, and it just wouldn’t install on Windows, but worked fine on Linux. Windows is still a second-class citizen in the Python world.

                            We hear this a lot in the CHICKEN community too. But where are the people willing to actually put time into making things work on Windows? Making things work takes effort. And making things work under Windows takes at least twice as much effort, especially if you’re not using it as your daily driver, which most people in OSS aren’t doing.

                            1. 3

                              EXACTLY this.

                              Gaming is an extreme corner case in the wider Python ecosystem. Sure, there are pockets of real activity there with Pygame, Arcade and Kivy, but these are often built and promoted by a single person or a tiny handful of people who almost certainly aren’t testing on Windows, where the hurdles for interfacing with a whole different set of lower-level graphics primitives can be rather high.

                              I’ve transitioned to being 99% Windows on the desktop over the last year and I have yet to find a Python library that doesn’t work there, but I’m not trying to do gaming development either :)

                            2. 11

                              Packaging problems are not unique to Python.

                              There is a tipping point in recent language history, sometime after Go I would say, where things got better; but before that tipping point, most language package managers carry a large body of tech debt from trying to deal with how muddy the line between global and localized package installs is. It affects Python, Perl, Ruby, PHP and a host of others.

                              1. 17

                                That is, IMHO, the one true reason Docker got popular. It said: screw it, we’ll just package up the whole filesystem and run that in prod. Now that is the de facto standard everywhere.

                                1. 7

                                  Oh undoubtedly. It made every language effectively a statically linked application for deployment purposes. Opinions are still divided on whether that was the best solution to this problem though. It’s definitely a solution and maybe the only one with a chance of succeeding.

                                  1. 4

                                    Sometimes I fantasize about an alternative universe in which dynamic linking is not used at all, everything is statically linked, and we dealt with the RAM and disk footprint by coming up with clever hacks for deduping code pages instead. :)

                                    1. 2

                                      we dealt with the RAM and disk footprint by coming up with clever hacks for deduping code pages instead

                                      Oh, like dynamic linking :)

                                      1. 2

                                        Dynamic linking solves the problem by introducing new problems, which then spawned Docker, which reintroduces the original problem again, only it’s less of a problem now because disk and memory are “cheap”.

                                        1. 1

                                          I wrote clever.

                                    2. 3

                                      It’s why even though I know that Docker is solving real people’s real problems today, I simply can’t accept that this is the best we as an industry can do, or, rather, I don’t want to accept it, because it is such an indictment.

                                      1. 3

                                        I totally agree, but I have left the “denial” phase behind and am now in the “acceptance” phase. I am running some self-hosted stuff and one of the services is written in Python. I am glad it is a Docker container tbh, b/c setting that up on my server would be a mess.

                                        I would love for some cleaner, purer solution to come along, but I fear that ship has sailed.

                                        1. 2

                                          I also am free to be hyper arch about my technology choices because I no longer write software for a living and thus don’t continually run into disappointments, at least in my choices of tooling.

                                    3. 10

                                      Packaging problems are not unique to Python.

                                      True, I’ve seen this problem with JS, Ruby and Haxe (and more languages, I’m sure).

                                      But I wrote about Python because:

                                      • It’s my main language

                                      • The problems have been around for years, and have become a joke now

                                      • Python is sold as a “beginners” language, but many beginners quit Python because they can’t even install the libraries

                                      Edit: And to add, looking at some comments here and on Reddit, there seems to be an attitude of “Lol, I’ve never seen this problem, you must be doing something silly/stupid” in the Python community, and I wanted to address this. (And yes, I’m sure other communities have this problem too; it’s just that in my limited experience, the JS crowd are more open and honest about it.)

                                      1. 7

                                        Python is sold as a “beginners” language, but many beginners quit Python because they can’t even install the libraries

                                        I am not a beginner, but I have quit using Python for new projects because it’s not worth my time to figure out how to install the libraries.

                                        1. 4

                                          I regularly walk away from Javascript projects for the same reason.

                                    4. 5

                                      This is timely. I spent the better part of last week trying to get a Python library working on my NixOS system. I eventually gave up and installed Python inside Arch inside a headless qemu virtual machine. Now I build using ssh -t my_qemu_instance 'cd src/foo ; cargo build'. I guess it could be worse.

                                      1. 2

                                        I’m a little surprised that that’s “ssh… cargo …” rather than “ssh… pip…” or something.

                                        1. 3

                                          Oh, right. It’s a Rust project that uses the inline_python crate. It’s been a long week…

                                          To my surprise (& to that crate’s credit), invoking Python from Rust worked like a charm. The hard part was installing the Python dependencies in the first place.

                                        2. 1

                                          Are you blaming NixOS or the Python packaging system?

                                        3. 4

                                          One fun thing I recently ran into is that if you want a “clean” Docker image for your application (because that’s the easiest way to distribute Python stuff), you end up needing a requirements.txt file even if you’re using Pipenv because Pipenv itself pulls in a bunch of dependencies. This is fairly straightforward to do with a multi-stage build, but it’s still annoying.
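
                                          The shape of that multi-stage trick, roughly (the exact Pipenv export command depends on the Pipenv version):

                                            # builder stage:
                                            pipenv requirements > requirements.txt   # older Pipenv releases used: pipenv lock -r
                                            pip wheel -r requirements.txt -w /wheels
                                            # final stage: install only the wheels, so Pipenv never lands in the image
                                            pip install --no-index --find-links=/wheels -r requirements.txt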

                                          1. 2

                                            @shantnu_tiwari, Which of the multitude of environment managers listed do you recommend?

                                            1. 6

                                              I would try to stick to standard virtualenv + pip, if you can.

                                              Some libraries may not work, but if you try the other package managers, you end up going down a rabbit hole…

                                            2. 2

                                              What is needed to achieve something like what node or go does? Python has venvs for separate library environments, it has pipenv for project-local environments, and it has venv for project-local environments with a specific version of Python.

                                              What specific UX changes are needed to assemble these into something similar to what node or go does?

                                              What is done in these other communities to handle the other half of the problem, which is the health of library dependencies? Automated testing of all permutations seems impractical for large packages.

                                              1. 3

                                                pipenv/poetry/pdm/etc. do achieve npm/cargo/stack-style “project” UX. It just took so long… and all the layers are still hidden inside; it’s an abstraction waiting to leak.
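
                                                The day-to-day flow does look npm/cargo-ish now, e.g. with poetry (pdm is analogous):

                                                  poetry new demo && cd demo   # scaffolds a pyproject.toml project
                                                  poetry add requests          # records the dependency and updates the lock file
                                                  poetry run python -c "import requests; print(requests.__version__)"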

                                              2. 2

                                                Eh, I mean, they’re not wrong about the packaging situation.

                                                That said, my experience is that if you pick one way of doing things and stick with it, don’t use Windows, and don’t use system packages, you’re probably okay. I haven’t written any Python in a couple of years, so I haven’t used poetry, but pipenv worked great for me then. Pyenv if your development or deployment environment is old or weird in any way.

                                                I agree that this is not obvious to beginners, and One Way To Do It should be blessed by the language maintainers.