1. 48

  2. 20

    Almost every nontrivial C extension out there has threading bugs that are currently impossible to hit because of the GIL. It will take an enormous amount of effort to make all that old C code thread-safe. In practice, that means you may not be able to use a lot of useful libraries that rely on C code in a multithreaded Python interpreter.

    1. 8

      Sure, but this is how we make progress with software:

      1. Put the feature behind a flag.
      2. Test with flag on.
      3. Find bug, fix.
      4. Goto 2.
      5. Make flag default on.
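
      That loop is easy to state as code. A schematic sketch (the `test_suite` and `fix` callables are stand-ins, not any real API):

```python
def rollout(test_suite, fix):
    """Schematic feature-flag rollout loop; test_suite returns a list of bugs."""
    flag_on = True                  # 1. feature is behind a flag; turn it on
    while True:
        bugs = test_suite(flag_on)  # 2. test with the flag on
        if not bugs:
            break
        for bug in bugs:
            fix(bug)                # 3. find bug, fix
        # 4. loop back to step 2
    return "flag default on"        # 5. ship it enabled by default
```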

      When I introduced Sidekiq to the Ruby community, there were still lots of gems with threading issues, including Rails itself. Now the ecosystem is thread-safe: as Sidekiq usage grew, folks fixed the threading issues. It just takes a few years.

      1. 5

        I love how your algorithm will never reach point 5.

        1. 3

          It will, after you hit undefined behaviour in step 2.

          1. 2

            “There was an issue found during peer review”

          2. 1

            Data races are famously non-deterministic. Your tests will likely pass and you will still release a heisenbug into the wild.
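
            A minimal illustration of why such races slip past tests. This unsynchronized counter is a sketch; because `+=` is a read-modify-write, not an atomic operation, increments can be lost, and whether a given run loses any depends entirely on thread timing:

```python
import threading

counter = 0

def bump(n):
    global counter
    for _ in range(n):
        counter += 1  # read-modify-write: not atomic across threads

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# At most 400_000; on racy runs it is less. Whether a particular run
# loses updates is timing-dependent -- the definition of a heisenbug.
print(counter)
```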

          3. 6

            I hate to be that guy who suggests rewriting in Rust, but I came to this thread to lament the fact that my current favourite pastime (writing Python modules in Rust for the enormous performance boost) would be going away.

            Maybe though… it wouldn’t be a bad idea?

            1. 13

              A lot of these extensions are wrappers for third-party libraries, so what you’d be doing really is rewriting the third-party library and committing to maintain it forever.

              Also, it’s not always only C – if you think you can build a better BLAS/LAPACK implementation (required for NumPy) than the current Fortran versions, you’re welcome to try it. But I think there’s a reason why Fortran continues to be the base for that stuff, and it’s not “because they hate Rust”.

              1. 8

                A Rust adapter/wrapper from Python to some Rust and some C is still a totally valid pattern: you don’t rewrite the whole stack, but you add some correctness where it matters most (say, parsing, state machines, thorny in-place processing). It can be a spectrum.

                1. 3

                  A lot of these extensions are wrappers for third-party libraries, so what you’d be doing really is rewriting the third-party library and committing to maintain it forever.

                  I don’t mean to be glib, but isn’t that a bit of an entitled attitude that places an unfair burden on open-source maintainers? I see no reason they shouldn’t commit to maintaining it only as long as it’s useful for them. If someone else finds it useful, they can either pay for support or maintain it themselves.

                2. 8

                  That’s assuming the Python API even makes that possible. For example, the buffer API allows the Python side to cause data races, and Rust can’t prevent that: https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/
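
                  The linked post is about guarantees across the FFI boundary, but the underlying issue is visible from pure Python: CPython’s buffer protocol only guards against *resizing* while a view is exported, not against concurrent writes through the two handles:

```python
ba = bytearray(b"hello")
mv = memoryview(ba)        # a second handle onto the same memory

try:
    ba.clear()             # resizing with a live export is refused...
except BufferError as e:
    print("refused:", e)

ba[0] = ord("H")           # ...but mutation through either handle is allowed,
assert mv[0] == ord("H")   # so two threads writing concurrently would race
mv.release()
```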

                  1. 2

                    Parallelism is not the same as a speed-up. If a task takes 100 CPU seconds with Python, 10 threads/processes with full parallelism means 10 seconds to get a result, and those cores can’t be used for other things. The Rust equivalent might only take 1 CPU second on a single core. In some situations parallelism is good enough, but optimization tends to be much more valuable if you can manage it.

                    (Longer discussion: https://pythonspeed.com/articles/do-you-need-cluster-or-multiprocessing/)
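
                    The arithmetic, spelled out with the comment’s own illustrative numbers:

```python
python_cpu_seconds = 100      # total CPU work in the pure-Python version
workers = 10
parallel_wall = python_cpu_seconds / workers
assert parallel_wall == 10.0  # 10 s to a result, but all 10 cores are occupied

rust_wall = 1                 # optimized version: ~1 CPU second on one core
speedup = parallel_wall / rust_wall
assert speedup == 10.0        # 10x faster than parallel Python, on 1/10 the cores
```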

                    1. 2

                      Rust has a very strong unique-ownership model; Python allows arbitrary aliasing, and two Python threads can share any object that is returned from Rust. I think the only solution is to make everything that is exposed to Python implement the Sync trait. Anything in Rust that is mutable and Sync requires unsafe code. The simplest way of doing that mostly correctly is to expose Sync proxies that acquire a lock before calling into the real Rust objects (Rust has some nice lock-guard types for this, so the unsafe bit stays in the standard library). It’s not clear to me that this ends up being easier than exposing C/C++ things.

                      Rust’s concurrency benefits rely on the Rust type system. When you are interoperating with code that has weaker constraints then there are lots of places for bugs to sneak in.
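
                      The proxy idea translates directly: every call from the Python side passes through a lock before touching the real object. Sketched here in Python terms (a Rust version would hold a `std::sync::Mutex` and hand out lock guards); `LockedProxy` is a made-up name, not a real library:

```python
import threading

class LockedProxy:
    """Hypothetical sketch: serialize every call into a wrapped object,
    the way a Sync-exposing extension would have to internally."""

    def __init__(self, inner):
        self._inner = inner
        self._lock = threading.Lock()

    def __getattr__(self, name):
        # Only invoked for names not found on the proxy itself.
        attr = getattr(self._inner, name)
        if callable(attr):
            def guarded(*args, **kwargs):
                with self._lock:  # the "lock guard": held for the whole call
                    return attr(*args, **kwargs)
            return guarded
        with self._lock:
            return attr

shared = LockedProxy([])  # any thread may now call shared.append(...) safely
```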

                      1. 3

                        Unless the C code is doing stuff like using OS-level resources on the assumption that there’s only one instance of it running (which is, ekhm, shaky anyway, because there can be multiple CPython processes), it’s still going to run under the GIL within the subinterpreter.

                      2. 7

                        The article is about subinterpreters, but there is also nogil fork: https://github.com/colesbury/nogil

                        1. 1

                          I looked at this years ago: so. many. global. variables. I can’t imagine how much work this took, unless there was a simpler way than putting everything into a Lua-style environment struct.

                        2. 4

                          The channels code appears to do rather more than just move the objects in memory (and if you think about having multiple interpreters, it kinda has to do some sort of serialization to be safe), which, depending on how inefficient that is, may make this in many ways no better than multiprocessing.

                          1. 2

                            Could it be used as another mode for multiprocessing, instead of the existing spawn and fork?

                          2. 3

                            This tutorial is excellent: it’s not an easy feature to play with, but I followed the steps and got it running exactly as described.

                            Also interesting is this library which the tutorial links to at the end: https://github.com/jsbueno/extrainterpreters