1. 44

  2. 38

    To me, this really drives home the need for language projects to treat dependency and build tooling as first-class citizens and integrate good, complete tools into their releases. Leaving these things to “the community” (or a quasi official organization like PyPA) just creates a mess (see: Go).

    1. 9

      100% agree. I recently adopted a Python codebase and have delved into the ecosystem headfirst from a high precipice, only to find that it’s improved drastically since the last time I wrote an app in Python — 2005 — but it still feels like it’s in disarray relative to the polish of the Rust ecosystem and the organized chaos of the Ruby and JVM ecosystems in which I’ve swum for the last decade. I’ve invested considerable time in gluing together solutions and updating tools to work on Python 3.x.

      The article under-examines Poetry, which I find to meet my needs almost perfectly and have thus adopted despite some frustrating problems with PyTorch packages as dependencies (although PyTorch ~just fixed that).

      1. 5

        I also think Poetry isn’t being considered enough. The article gives the impression that the author doesn’t have a lot of hands-on experience with Poetry but is curious about it. I’d recommend further exploring that curiosity. I understand that it’s hard to cover everything in a short article like this. If you’ve got an existing project with a working setup, a lot of the points make sense and there’s no need to hurry up and change your setup. But I wouldn’t really call it a fair assessment of “The State of Python Packaging in 2021”.

        From my point of view it’s clear that pyproject.toml is the way forward and is growing in popularity, especially considering that modern setuptools now also requires it for specifying the build system.

        As for setup.cfg requiring a setup.py with an empty setup(): that’s a half-truth at best. It’s true that PEP 517 purposely defers editable installs to a later standard in order to reduce the complexity of the PEP. But in practice a setup.py isn’t required if you use setuptools v40.9 or greater, released in spring of 2019. This is documented in the setuptools developer’s guide: if setup.py is missing, setuptools emulates a dummy file with an empty setup() for you. If you build your project with a PEP 517/518 frontend, you don’t need the setup.py.

        Having a static setup.cfg is a massive improvement for the ecosystem as a whole, since we can actually start resolving dependencies statically without running code; this benefit should not be downplayed.
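
        As a rough sketch (the package name, version, and dependency here are made up), a project set up this way needs only two small static files, and a PEP 517/518 frontend like python -m build or pip install . can build it with no setup.py on disk:

            # pyproject.toml (PEP 518) -- declares the build backend
            [build-system]
            requires = ["setuptools>=40.9", "wheel"]
            build-backend = "setuptools.build_meta"

            # setup.cfg -- static, declarative metadata
            [metadata]
            name = example-pkg
            version = 0.1.0

            [options]
            packages = find:
            install_requires =
                requests>=2.0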

        I get the feeling that the author wants to wait for a pipe-dream future where everything is perfectly specified and standardised before adopting any of the new standards. That’s completely fine and valid if you’re working on your own project, especially if you’ve already got existing working code. That said, in my opinion it’s not the approach I’d recommend for everyone. We need to start using the new standards on new projects so that we can move forward; if we always cling to the old way of doing things, progress will be hampered.

        I get the impression that the author is very knowledgeable and has plenty of experience in the area, and I see the article as reflecting the author’s opinion, which I respect but don’t fully agree with. I would love to have a chat with the author given the opportunity and hear more about their opinions. I’m also looking forward to reading the 2022 edition next year. It would be easy for me to contest some of the points here, but it wouldn’t be completely fair without a reply from the original author, where they’re given a chance to elaborate and defend their choices.

        Full disclosure: I’m currently writing a book on the subject and have researched the strides in Python packaging quite heavily recently.

      2. 3

        > just creates a mess (see: Go).

        It’s fair to say that packaging is a mess in Python, but why exactly is packaging in Go a mess? Since 1.13 we have Go modules, which solve packaging very elegantly, at least in my opinion. What I especially like is that no central index service is required; to publish a package, just tag a public git repo (there are also other ways to do it).
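
        To sketch what that looks like (the module path is hypothetical):

            $ go mod init example.com/alice/mylib   # writes go.mod with the module path
            $ git tag v1.0.0                        # versions are plain git tags
            $ git push origin v1.0.0
            # consumers just run: go get example.com/alice/mylib@v1.0.0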

        1. 8

          Yeah, Go is fine now, but in the past, when the maintainers tried to have “the community” solve the packaging problem it was a mess. There were a bunch of incompatible tools (Glide, dep, “go get”, and so many more) and none of them seemed to gain real traction. Prior to Go modules the Go situation looked similar to the current Python situation. To their credit, the Go developers realized their mistake and corrected it pretty quickly (a couple years, versus going on a couple decades for Python, so far).

          1. 1

            Thank you for the explanation.

            > Prior to Go modules the Go situation looked similar to the current Python situation.

            Yes, I agree with you that the situation was similar before Modules were a thing. I was fed up with the existing solutions around that time and had written my own dependency management tool as well.

          2. 4

            > It’s fair to say that packaging is a mess in Python but why exactly is packaging in Go a mess?

            Not the original poster, but I think it’s because modules weren’t there from the start, and this allowed dep, glide, and others to pop up and further fragment dependency management.

        2. 16

          Bullshit around Python at a previous gig is what finally convinced me to just shove things into containers and say fuck it, embrace Docker.

          Thanks Python. :|

          1. 2

            I work in ML and the fact that the whole ecosystem is founded on Python spaghetti has made me seriously reconsider working in this field long-term. Network architectures are the perfect use case for static (if not fully dependent) types. I’m at least hoping Julia will disrupt things.

            1. 1

              Julia is still dynamically typed and in my very limited experience the type system doesn’t help as much with catching bugs as one would expect.

              Maybe I was just doing it wrong and you’re supposed to run a separate linter to catch trivial mistakes. But you can do the same thing in Python with mypy and type annotations, so I’m not sure that counts.
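
              To illustrate what I mean (a contrived example; the function is made up), the annotation-plus-linter workflow on the Python side looks like this, and mypy flags the bad call without running anything:

                  # shapes.py -- hypothetical; run `mypy shapes.py` to catch the bug statically
                  from typing import List

                  def flatten(batch: List[List[float]]) -> List[float]:
                      return [x for row in batch for x in row]

                  flatten([1.0, 2.0, 3.0])  # mypy: incompatible type "List[float]"; expected "List[List[float]]"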

          2. 10

            PyPA still advertises pipenv all over the place and only mentions poetry a couple of times, although poetry seems to be the more mature product.

            I definitely remember at my old job having significantly less pain after switching to poetry from pipenv. I think part of that was that some of the changes we needed in pipenv were in master but not released, because the sole person with authority to cut releases had disappeared for… a year? (The latest release is 2021.5.something, so presumably at some point they started releasing again).

            Also, I’m firmly convinced that storing your metadata in a file written in your own language, as opposed to a json/xml/toml/ini/whatever file, is a mistake. It makes code analysis harder (want to figure out the dependencies of a package from source? execute some arbitrary code!), and it makes it harder to write tools to modify it… it’s just a mess.
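
            A contrived setup.py shows why: the only way to learn this package’s dependencies is to execute it.

                # setup.py -- hypothetical; install_requires is computed at runtime
                import sys
                from setuptools import setup

                deps = ["requests"]
                if sys.platform == "win32":
                    deps.append("pywin32")  # invisible to any static analyzer

                setup(name="example-pkg", version="0.1.0", install_requires=deps)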

            1. 8

              Having recently struggled with this topic (again), I think there are some inaccuracies in the article; in particular, it seems to overstate the roles of both setup.py and the PyPA in the ecosystem. E.g.:

              > poetry has an interesting approach: it will allow you to write everything into pyproject.toml and generate a setup.py for you at build-time, so it can be uploaded to PyPI.

              …but a setup.py file isn’t a requirement for uploading to PyPI; Flit can happily build and publish distributions to PyPI without ever reading or generating a setup.py file, and those same packages can be consumed by pip without fanfare.

              What I want to see is an article that tries to unbundle what is meant by “Python Packaging” and which use cases are covered by various common tools (setuptools, pip, build, twine, virtualenv, venv, virtualenvwrapper, pip-tools, flit, poetry, pipenv, conda, pants, pdm, etc), because there are at least half a dozen distinct tasks in the Python development + publishing lifecycle, and part of the confusion is that you can mix and match tools to cover as many or as few of those cases as you want. In a way that’s almost antithetical to “There should be one– and preferably only one –obvious way to do it.”

              For example, build only concerns itself with creating distributions, and twine only with publishing those to PyPI. Both are complementary to modern Setuptools, which can be driven solely by a declarative setup.cfg file.
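
              Concretely, that whole pipeline is two commands (assuming the project is already described by a setup.cfg):

                  $ python -m build      # build: writes dist/*.tar.gz and dist/*.whl
                  $ twine upload dist/*  # twine: publishes those artifacts to PyPI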

              …or if you wanted a unified tool to handle describing + building + publishing your packages, you could switch to Flit, which covers all three tasks.
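
              With Flit the equivalent is just (metadata read from pyproject.toml, no setup.py involved):

                  $ flit build     # sdist + wheel
                  $ flit publish   # builds and uploads in one step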

              …but that still leaves you manually managing dependencies and environment isolation during development. If you wanted tools for that, you could reach for pip and venv, which complement either Flit or Setuptools + Build + Twine workflows.
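
              That part is the familiar incantation, the same regardless of which build tool you picked:

                  $ python -m venv .venv           # isolated environment per project
                  $ . .venv/bin/activate
                  (.venv) $ pip install requests   # installs into .venv, not system-wide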

              …or you could switch to something like Poetry or Pipenv which handle all of those tasks in a single, omnibus tool.

              Edit: Pipenv doesn’t handle building + publishing, but it does consolidate dependency and virtualenv management into a single tool.

              1. 2

                Poetry does handle all of these, but last time I looked at pipenv it didn’t really handle tasks related to packaging and publishing; it focuses on consumption of packages instead.
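
                For what it’s worth, the Poetry side of that is roughly:

                    $ poetry build     # sdist + wheel from pyproject.toml
                    $ poetry publish   # upload to PyPI (or: poetry publish --build)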

                1. 2

                  Holy cow, you’re right :) I could’ve sworn there was something, but nope.

                2. 1

                  > What I want to see is an article that tries to unbundle what is meant by “Python Packaging” and which use cases are covered by various common tools

                  Mentioned in another comment, but I made an attempt at this a while back.

                3. 7

                  > So I’m sticking to the official PyPI distribution wherever possible. However, compared to the Debian distribution it feels immature. In my opinion, there should be compiled wheels for all packages available that need it, built and provided by PyPI. Currently, the wheels provided are the ones uploaded by the upstream maintainers. This is not enough, as they usually build wheels only for one platform. Sometimes they don’t upload wheels in the first place, relying on the users to compile during install.

                  For this and other reasons I prefer to use Nix as my Python toolchain. The packages are up to date, they all come with binaries, and they’re guaranteed to work no matter what distribution I’m running on top of. (Plus I don’t have to worry about the constantly changing “official” packaging workflow…)
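
                  For example (numpy is just a stand-in for any binary-heavy package), a throwaway environment with prebuilt packages is one command:

                      $ nix-shell -p "python3.withPackages (ps: [ ps.numpy ])"
                      [nix-shell]$ python -c "import numpy; print(numpy.__version__)"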

                  1. 1

                    Adding onto this: I experimented with Nix for this use as well. My problems are slightly different from most, though: I need to ship Python code (with C source dependencies) to an ARM SoC. I made a proof of concept that cross-compiled and ran on the target seamlessly.

                    It’s a bit convoluted exporting derivations to non-nix systems, however.

                  2. 5

                    A while back I wrote a long-ish blog post explaining what I see as the different things people actually mean when they say “Python packaging”, and what the landscape looked like at the time. It still looks like that.

                    The tl;dr is that I break things down into three operations, each of which has a default, if sometimes low-level-feeling, tool available which does the job well:

                    • Producing a distributable artifact from some code (setuptools)
                    • Installing distributable artifacts, once produced, at some other location (pip)
                    • Isolating different Python codebases’ potentially-conflicting dependencies from one another (venv)

                    The thing that didn’t exist then, and doesn’t exist now, is a single universally-agreed-on high-level tool wrapping all three of the above operations. Several attempts have been made at doing that. Some have gained traction with the general community. None have reached ubiquity, or anything close enough to it, to become the One True Way™ to do all things related to packaging. Though of note, the evil horrible terrible no-good very-bad PyPA has been doing a ton of thankless work to standardize APIs and metadata and other things and make lots of components swappable so that you can get to a situation where a tool like, say, Poetry (which is one of the popular high-level tools right now) can even exist without making everyone everywhere have to adopt its tooling and workflows.

                    Most pessimistic views of Python packaging are based entirely on the last bit, and on the fact that for a long stretch (2000s into early/mid 2010s) the base low-level tools had issues or didn’t exist, and so people developed a whole hodgepodge of different one-off custom approaches and wrote about them, leading to an explosion in the perception of the size and complexity of “Python packaging”.

                    Personally, I stick to the relatively low-level defaults because they’re not actually difficult or complex to use (you can walk through a tutorial and be packaging and distributing/installing your code pretty quickly – nearly all the real complexity is for cases that are actually complex, like NumPy/SciPy needing code from four different languages in the package). The only thing I add on top of that is pip-tools to automate compiling a list of direct dependencies into a full pinned/hashed tree of everything that will be needed. At work I’ve been helping to standardize a lot of stuff related to our Python development and deployment tooling, around the same principles (the default tools, plus pip-tools to compile) and took the step of putting the dependency management behind a small layer of abstraction in a Makefile, so that we could switch to a specific high-level tool without changing anyone’s day-to-day workflow.
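
                    The pip-tools step is small enough to show here (requirements.in lists only the direct dependencies; the package names are just examples):

                        $ cat requirements.in
                        django
                        requests>=2.0
                        $ pip-compile --generate-hashes requirements.in   # emits a fully pinned requirements.txt
                        $ pip-sync requirements.txt                       # makes the environment match it exactly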

                    1. 5

                      Surprising no one, it’s still a mess, with no end in sight. Maybe another decade is needed. Meanwhile, many other languages have managed a sane story, so it’s not like it’s a hard technical problem to solve anymore.

                      1. 9

                        Right, just a problem that needs a large amount of attention to detail, tons of unforgiving work, practical buy-in from all sorts of people, large amounts of documentation and marketing, and a team to run the infrastructure. ezpz.

                        1. 5

                          Probably less work than having 5 teams running various incompatible half-assed efforts in parallel?

                          1. 4

                            I agree completely. I never said it was easy, you don’t see me volunteering to do that work!

                            We went with @friendlysock’s solution: hide the problem in containers. It’s sucky, but it’s practical.

                        2. 4
                          $ nix-shell -p python3 --run 'python -c "import this" | sed -n 15p'
                          There should be one-- and preferably only one --obvious way to do it.
                          

                          Maybe one day.

                          1. 5

                            > I don’t understand why PyPA is trying so hard to make us believe pipenv would be the standard tool for Python packaging

                            I can answer that: because of the disgusting corruption of the PyPA. Yes, that’s really the reason.

                            At the time, they were all friends with Kenneth Reitz and adopted it because of that. Reitz even had the nerve to admit it in an interview on the podcast “Talk Python To Me”: “we did the due diligence to make it recommended, to work with those people… it is a political sphere and politics are involved so 🤷”, he says, like a politician offering bribes would. (https://talkpython.fm/episodes/show/208/packaging-making-the-most-of-pycon-and-more at 35:38)

                            Reitz was later caught trying to scam companies into giving him money for work done by a requests contributor (see https://archive.is/OA0F1), but the PyPA didn’t care. It seems that it really was business as usual.

                            1. 4

                              It’s very easy to make this sound sinister, of course, but the simple fact is that Python packaging is a relatively small world and everybody knows everybody. That doesn’t make it an evil “corrupt” cabal; it’s literally the same as any other niche open-source interest group: the people who show up and participate naturally gain influence just from the fact that they’re there and taking part. (And a lot of alternative packaging tools, both some that utterly failed and some that enjoyed a bit of success, seemed to take it as their mission in life not to collaborate with anyone who’d worked on prior art, which, humans being social animals, certainly did dampen their chances of gaining influence with those folks.)

                              And honestly, a few years back you could hardly open a thread on reddit or HN without seeing people giving themselves repetitive-stress injuries from how hard they were spamming for pipenv; it was perfectly reasonable to pick it as a recommendation for the use case it tries to handle. Today, of course, you can hardly open a thread without seeing people trip over themselves to spam for Poetry, and so if it were being done today I expect Poetry would probably have a good chance of being the one picked as the recommendation. But the only thing worse than no recommended tool is changing the recommended tool way too often; in a couple years, perhaps everyone will hate Poetry and be spamming something else!

                              1. 1

                                The fact is, this is not about something subjective like art. The alternatives could have been evaluated by objective measures (works with x, has y feature, etc.), with all the results published and a recommendation made based on that (or even no recommendation at all!). But this wasn’t that.

                                The library was new and the features were still just a promise. The only merit it had was its author. This was a favor, and a betrayal of the community.

                            2. 2

                              Poetry + taskipy gives you bundler + scripts from package.json. It’s great. I dearly miss RSpec and Rails magic, but it’s so close. It’s just surprising how much force is behind the community yet how little cross-pollination there is. I’ve seen some ports of ideas, and that’s good. I don’t even want to bring up Go. Language wars, sure, but it’s also hugely about ergonomics. I have to explain so much history to interns and fresh grads. It’s zero fun.
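
                              For anyone curious, the taskipy half is a few lines of pyproject.toml (the task names and commands here are just examples), after which poetry run task test works like an npm script:

                                  [tool.taskipy.tasks]
                                  test = "pytest"
                                  lint = "flake8 ."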

                              The old thing about “what you get right they never mention”. Yeah.