1. 22

  2. 7

    It should be possible to define a subset of the Python language, uninspiredly dubbed “TurboPython”, that excludes those features that stand in the way of high-performance JIT execution (or compilation).

    This sounds a lot like RPython, the language that PyPy is written in.

    1. 2

      Or Cython?

      1. 3

        They are a little different. RPython is a subset of Python which can be compiled. Cython is a superset, and only the superset (C-like) parts are fast. That said, adding C type annotations can be pretty simple in some cases.

        1. 1

          Hm, I thought that not all Python code would work in Cython? e.g. I wouldn’t really expect Cython to implement exec statements.

          1. 2

            Exec works in Cython; it just calls the C interface to exec.

            1. 1

              All Python code can be Cythonized; it is simply translated into the equivalent C API calls.

              For example, the Python code

              o.attr_name = v

              could become the C code

              PyObject_SetAttrString(o, "attr_name", v);

              but this doesn’t really provide any speed benefits. The real advantage of Cython is using the extra C type annotations, which allow Cython to convert the code into a speedier form.
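To illustrate why the untyped translation gains little, here is a small pure-Python sketch. The class name `Traced` is made up for this example. Plain attribute assignment and `setattr` (the Python-level counterpart of `PyObject_SetAttrString`) both end up in the same dynamic `__setattr__` dispatch, so turning one into the other does not remove the lookup itself.

```python
class Traced:
    """Hypothetical class that records every attribute assignment."""

    def __init__(self):
        # Bypass our own hook while creating the log.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append((name, value))
        object.__setattr__(self, name, value)


o = Traced()
o.attr_name = 1              # ordinary syntax
setattr(o, "attr_name", 2)   # generic call, same dispatch

print(o.log)  # [('attr_name', 1), ('attr_name', 2)]
```

Both assignments show up in the log, which suggests that a compiler cannot skip the hook unless it knows the static type of `o`.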

      2. 5

        Here’s a good presentation from Alex Gaynor about why Python is slow. https://speakerdeck.com/alex/why-python-ruby-and-javascript-are-slow

        Very crudely, it’s not about dynamism at all. It’s about APIs convenient to programmers but forcing excessive allocation and copying.

        Also, all data is on the heap, so you can’t have cache-friendly dense arrays of even simple structures, and no vectorization since everything is accessed indirectly.
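A rough way to see the boxing cost is to compare a list of floats with the standard library's `array` module, which only handles primitive element types. This is a sketch, not a benchmark: the byte counts are CPython-specific and the variable names are chosen for illustration.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]

# A list stores pointers; each float is a separate heap object.
boxed_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# array('d') stores raw C doubles back to back in one buffer.
packed = array("d", values)
packed_bytes = sys.getsizeof(packed)

print(f"list of floats: {boxed_bytes} bytes")
print(f"array('d'):     {packed_bytes} bytes")
```

The packed form works only because every element is a plain double; there is no equivalent in the core language for a dense array of small user-defined structures.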

        1. 3

          While the fact that everything is mutable in Python is useful for mocking purposes during testing, it is generally considered extremely bad style to override builtin functions, constants or class methods in production code.
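As a minimal sketch of the kind of override meant here (the function name `count_items` is made up), replacing a builtin changes the behaviour of code that was compiled long before the replacement happened:

```python
import builtins


def count_items(seq):
    # `len` is looked up at call time: module globals first, then builtins.
    return len(seq)


original_len = builtins.len
before = count_items([1, 2, 3])

try:
    # Replace the builtin for the whole process (bad style outside tests).
    builtins.len = lambda seq: 42
    during = count_items([1, 2, 3])
finally:
    builtins.len = original_len

after = count_items([1, 2, 3])
print(before, during, after)  # 3 42 3
```

Because this is legal, an optimizer that wants to inline `len` has to guard against the name being rebound at any point.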

          I haven’t used Lua heavily, but from what I understand it can do all of that too. Does anyone know why Lua is (apparently) so much easier to JIT than Python? Other than Mike Pall being a conglomerate of robots from the future, I mean.

          1. 2

            I don’t know for sure, but two things come to mind:

            1. LuaJIT is the primary runtime for Lua, which means that almost all community effort goes into it, and decisions are made about the language with LuaJIT in mind. Note: I’m way wrong, see mjn’s comment.

            2. Lua is a much simpler language than Python, and that almost certainly translates to more room for optimization.

            1. 6

              Re #1, I always viewed LuaJIT as an “alternative” optimized runtime if you needed maximum speed and could live with being behind on features, compared to the primary implementation. It appears it’s still quite a bit behind the main Lua language, compatible only with 5.1, which was released in 2006 and EOLed (in the main Lua implementation) in 2012.

              1. 3

                Oh wow, I’m way wrong, thanks!

              2. 2

                Lua is a much simpler language than Python, and that almost certainly translates to more room for optimization.

                But that doesn’t tell me anything. I still have no idea what features it excludes that are so problematic in Python.

                1. 4

                  In Python you can do some odd things like:

                  import sys
                  def abomination():
                      # get the stack frame of the function which called me
                      fr = sys._getframe(1)
                      # create a global variable visible to caller
                      fr.f_globals['z'] = "z wasn't defined, but it is now."
                      # spy on caller's local variable
                      print ("Supposedly hidden value of x: %s" % (fr.f_locals['x'],))
                      # spy on caller's name
                      return "I was called by a function called %r" % (fr.f_code.co_name,)
                  def victim():
                      x = "Some hidden secret, never to be revealed."
                      y = abomination()
                      print ("y = %s" % (y,))
                      print ("z = %s" % (z,))  # note the lack of 'z' in scope!
                  if __name__ == '__main__':
                      victim()

                  Which will output:

                  Supposedly hidden value of x: Some hidden secret, never to be revealed.
                  y = I was called by a function called 'victim'
                  z = z wasn't defined, but it is now.

                  Lots of code in real-life libraries that people depend upon in production (transitively) depends on code that does all of these things, and more. (Very little code does this directly, but projects have dependencies which have dependencies which have…).

                  Making odd things like these work (and programs that sometimes do them run fast) requires either a somewhat naïve execution strategy that materialises a lot of these objects all the time and leaves a lot of performance on the table (e.g. touching global variables is noticeably slower than local variables in CPython), or a really complicated execution strategy that materialises them only when precisely necessary.
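The global-versus-local difference mentioned above can be glimpsed in CPython's bytecode. This is only a sketch: the names `LIMIT`, `use_global`, `use_local` and `load_ops` are made up, and exact opcode names vary between CPython versions, so the helper matches on prefixes.

```python
import dis

LIMIT = 10


def use_global():
    return LIMIT + 1      # name resolved through the globals dict


def use_local(limit):
    return limit + 1      # name resolved by index into the frame's local slots


def load_ops(func):
    """Return the names of the LOAD_* instructions in func's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)
            if ins.opname.startswith("LOAD_")]


print(load_ops(use_global))
print(load_ops(use_local))
```

The global access compiles to a `LOAD_GLOBAL` instruction, while the local one uses an instruction from the `LOAD_FAST` family that reads a fixed slot in the frame.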

                  1. 7

                    Lua has debug libraries which allow caller stack introspection and such as well.

                    1. 1

                      Huh. Do they work under LuaJIT? Do they cause deoptimisation under LuaJIT?

                      1. 6

                        I don’t know how well LuaJIT actually optimizes such code, but it certainly doesn’t prevent LuaJIT from optimizing other code.

                        local debug = require "debug"
                        local function leaker()
                                -- level 2 is leaker's caller; index 1 is its first local
                                local n, v = debug.getlocal(2, 1)
                                print(n, v)
                        end
                        local function foo()
                                local x = "not a secret"
                                leaker()
                        end
                        foo()

                        $ luajit t.lua
                        x       not a secret

              3. 2

                I suspect some of the following:

                1. Python has a much richer class system with multiple inheritance, static methods, class methods, class attributes, metaclasses, etc. When you do self.foo it has to look at the object and then the class, and superclasses, at least. It also has the __dict__ attribute on every class which probably has some implications for the representation.

                2. Python has more extension points. It has __eq__ and __iter__ and so forth. It has “properties” and I think a more complicated version of __get__ and __set__. And __getitem__ and __setitem__ (which also receive slice objects), etc.

                3. It’s true everything is mutable in Lua too, but Python just has more stuff that’s mutable. You can look at type.__bases__ etc. to get the base class, etc.

                4. The bigness probably makes a difference. Python has dicts, tuples, and lists, while Lua just has the table. Python 3 strings are Unicode by default.
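Points 1 and 3 above can be sketched in a few lines. The class names are made up for illustration. The same `c.foo` expression resolves to three different things depending on the current state of the class hierarchy and of the instance:

```python
class Base1:
    def foo(self):
        return "Base1.foo"


class Base2:
    def foo(self):
        return "Base2.foo"


class Child(Base1):
    pass


c = Child()
first = c.foo()                 # found on Base1 via Child's MRO

Child.__bases__ = (Base2,)      # re-parent the class at runtime
second = c.foo()                # same object, now resolves to Base2

c.foo = lambda: "instance foo"  # instance __dict__ shadows the class
third = c.foo()

print(first, second, third, sep=" | ")
# Base1.foo | Base2.foo | instance foo
```

An implementation that caches the result of the first lookup therefore needs some way to notice both kinds of change.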

                A lot of things like for x in y and len(x) are polymorphic in Python; they either aren’t or are less so in Lua.
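A short sketch of that polymorphism, with a hypothetical `Countdown` class standing in for any user-defined type:

```python
class Countdown:
    """Hypothetical user type that plugs into len() and for-loops."""

    def __init__(self, start):
        self.start = start

    def __len__(self):
        return self.start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


def describe(x):
    # The same two operations run different code for each type of x.
    return len(x), [item for item in x]


print(describe([1, 2]))        # (2, [1, 2])
print(describe("ab"))          # (2, ['a', 'b'])
print(describe({"k": 1}))      # (1, ['k'])
print(describe(Countdown(3)))  # (3, [3, 2, 1])
```

Without type information, `describe` cannot be compiled down to a single fixed loop; each call site has to dispatch on the runtime type of `x`.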

                1. 8

                  I think you are underestimating the extensibility and mutability of things like metatables in Lua. I see the “Python is just too complex; you don’t understand” argument a lot, but mostly from people who don’t seem to know Lua all that well either. My apologies if you’re a seasoned Lua developer, but I don’t think you’ve really demonstrated that Python is intractably more complex.

                2. 1

                  One reason that few people understand is that it is possible to write a fast Lua interpreter. LuaJIT with the JIT turned off is already very fast; its interpreter is written in assembly for that purpose.

                  A fast interpreter is a prerequisite for a good tracing JIT compiler, because the code must be interpreted from time to time. That is one of the reasons why attempts at pure tracing JavaScript JIT compilers have failed. Current JS JIT compilers typically compile methods instead.

                  If you haven’t read this LtU thread and are interested in this topic, go do it now :) http://lambda-the-ultimate.org/node/3851

                  1. 2

                    I’ve read it, but I’m still not sure why all of the attempts at making Python fast seem to run into walls.

                3. 1

                  It would be possible to define such a subset, sure. But what would it gain from being a subset rather than a new language? It wouldn’t be able to use the Python library ecosystem (because that’s written in naturally aspirated Python) and would probably have trouble differentiating itself enough from Python to form its own ecosystem (in the same way as e.g. CoffeeScript). If you want a somewhat-Python-like language with much higher performance there are plenty of existing options (e.g. Crystal, and I don’t even like Crystal); what advantage would an actual subset have?
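For a sense of what "defining a subset" could mean mechanically, here is a toy checker built on the standard `ast` module. Everything about it is an assumption made for illustration: "TurboPython" has no specification, and the ban lists below are simply drawn from features mentioned in this thread.

```python
import ast

# Assumed ban lists, drawn from features mentioned in this thread.
BANNED_CALLS = {"exec", "eval", "globals", "locals"}
BANNED_ATTRS = {"_getframe", "f_globals", "f_locals", "__bases__"}


def subset_violations(source):
    """Return (line, description) pairs for constructs outside the toy subset."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BANNED_CALLS):
            problems.append((node.lineno, f"call to {node.func.id}()"))
        elif isinstance(node, ast.Attribute) and node.attr in BANNED_ATTRS:
            problems.append((node.lineno, f"use of .{node.attr}"))
    return sorted(problems)


sample = """\
import sys
fr = sys._getframe(0)
exec("x = 1")
total = sum([1, 2, 3])
"""
print(subset_violations(sample))
# [(2, 'use of ._getframe'), (3, 'call to exec()')]
```

A purely syntactic check like this is easy to evade (for example through `getattr`), and it says nothing about third-party dependencies, which is where the ecosystem concern above comes in.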