1. 6

  2. 3

    The mailing list discussion is not great. The thread starting here discusses PyPy’s architecture in brief but fails to explain how this proposal meaningfully improves on what PyPy does.

    I wonder about this claim made at the top of the thread:

    > I already have working code for the first stage.

    This implies that the first stage of the plan yields a 50% speedup. However, the first stage is:

    > The interpreter will adapt to types and values during execution, exploiting type stability in the program, without needing runtime code generation.

    Monomorphizing at runtime is not new, and in some sense it’s obligatory; what matters is how fast the implementation details are, and not whether the code is lowered to work on concrete values before it operates on…well, concrete values. Moreover, although I’m having trouble finding it, there has been an attempt to specialize CPython bytecode for various types like float and int before, and it failed to manifest the desired speedups on real-world code.
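The kind of runtime specialization being described can be sketched with a toy interpreter (an illustration of the general "quickening" technique, assumed for this example; it is not the proposal's or CPython's actual design): a generic ADD opcode rewrites itself in place into an int-specialized variant after observing int operands, and deoptimizes back to the generic form if the guard ever fails, all without generating machine code.

```python
PUSH, ADD, ADD_INT, HALT = range(4)

def run(code):
    """Interpret a list of (opcode, argument) pairs, specializing in place."""
    stack = []
    pc = 0
    while True:
        op, arg = code[pc]
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            if type(a) is int and type(b) is int:
                code[pc] = (ADD_INT, None)   # quicken: types look stable
            stack.append(a + b)              # generic dispatch
        elif op == ADD_INT:
            b = stack.pop()
            a = stack.pop()
            if type(a) is int and type(b) is int:
                stack.append(a + b)          # fast path: guard held
            else:
                code[pc] = (ADD, None)       # deoptimize back to generic
                stack.append(a + b)
        elif op == HALT:
            return stack.pop()
        pc += 1
```

Real implementations specialize far more than arithmetic (attribute lookups, call sites) and pair opcode rewriting with inline caches, but the shape is the same: observe, guard, fast path, deoptimize. Whether that shape yields 50% on real workloads is exactly the open question.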

    So I would ask on which benchmark he expects to demonstrate this speedup. (And frankly, the reason I am skeptical of this entire effort is that even after three stages they'll still trail PyPy, and meanwhile PyPy themselves can duplicate whatever in the first two stages leads to real speedups!)

    1. 3

      IMHO Python is fast enough, and speed will never be its distinguishing factor, so the focus should be on further enhancing Python's strengths. Additionally, easier and better multicore support is much more needed on the performance side of Python.

      1. 2

        As far as I recall, the greatest hurdle to applying these JIT-style techniques has been that the CPython maintainers explicitly want to keep such advanced techniques out of the CPython implementation, for understandability reasons.

        So this is a great list and all, but what's changed such that any of this plan will make it into the releases you say it will? And how did you arrive at "four distinct stages, each stage increasing the speed of CPython by (approximately) 50%" when none of the work has been done to show that that's even possible?

        The funding plan you give is more-or-less “it should be funded”. And there’s no mention in that repo of where this discussion with the CPython folk is happening.

        1. 2

          > you

          Note that I’m not the OP, just sharing this interesting-to-me plan.

        2. 1

          > Improved performance for integers of less than one machine word.

          I think they mean less than two machine words… (edit: or rather “less than or equal to one full machine word”)

          1. 1

            I think they actually mean less than one machine word – which gives you some spare bits for tagging and boxing.

            1. 1

              Yeah, I made the edit after realising that they were probably referring to "fractional" words. I guess I had it stuck in my mind that a word is indivisible: you can't have "less than one machine word" (e.g. how many less-than-one-word integers do you store in 10 words? Typically, 10). Though that's open to interpretation, I think I would have written it as "integers whose value can be stored in a single word with at least one bit to spare", assuming that's what they mean (and I think you're probably right: they're referring to the trick of using a single bit to "tag" object pointers vs simple integers, or something along those lines).