I find it amusing that this is tagged ‘plt’ because the PLT in ELF linkage is an example of inline caching for C-like languages (on some architectures, exactly as inline caching was originally described, on most these days with a slight tweak to avoid needing writable and executable memory).
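To make the comparison concrete, here is a toy model of lazy binding through a patched data slot, roughly the "slight tweak" variant where the call site stays read-only and only a table entry is rewritten. It is a sketch, not how any real dynamic linker is written; `SYMBOLS`, `got`, `make_resolver`, and `plt_call` are names made up for illustration.

```python
# Toy model of lazy PLT/GOT-style binding. All names are hypothetical.

SYMBOLS = {"puts": lambda s: f"puts({s!r})"}  # stands in for a shared library
resolve_count = 0  # how many times the slow "dynamic linker" path ran

got = {}  # writable data table: one slot per imported symbol


def make_resolver(name):
    """Initial slot contents: resolve the symbol, patch the slot, call the target."""
    def resolver(*args):
        global resolve_count
        resolve_count += 1
        target = SYMBOLS[name]  # the expensive symbol lookup
        got[name] = target      # patch the data slot, not the calling code
        return target(*args)
    return resolver


def plt_call(name, *args):
    """The stub: always an indirect call through the slot."""
    return got[name](*args)


got["puts"] = make_resolver("puts")

first = plt_call("puts", "hi")   # goes through the resolver once
second = plt_call("puts", "hi")  # now goes straight to the target
```

After the first call the slot holds the resolved target, so later calls skip the lookup entirely, which is the same shape as a monomorphic inline cache that never needs to miss again.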
Hm, what kind of speedup can you get from this? For some reason I am skeptical that it could speed up bytecode interpreters like Python or Lua. Is the method lookup enough of a bottleneck? I thought that these kinds of techniques mainly applied to JIT compilers.
I have seen the “quickening” line of work, and portability is a big appeal, although I’m hesitant to use it without seeing “production” use of it first.
It is used in PyPy. Because of the way that Python’s lookups work, PyPy has several optimized bytecodes to help overcome the fact that Python method lookup and attribute lookup use lots of the same machinery. I don’t know how much of a speedup is obtained. Note that the cache is only used when the JIT is not tracing this code.
I used the technique in Monte, too. It is localized to one method because Monte transforms method signatures into keys called “atoms” for quick lookup. We, too, only use the inline method cache when we are not in the JIT; according to this commit, it can be up to 40% faster for the JIT to skip the cache and instead specialize for a monomorphic call. The specific benchmark is our good friend richards, which has several single-method objects that can be optimized in this way.
pfalcon of MicroPython/pycopy suggests it provides a big speedup. I think for dictionary-based attribute lookup this could be a sizeable win. I don’t think that any benchmark on this tiny demo would be representative of any real workload, unfortunately.
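For anyone wondering where the win would come from with dictionary-based lookup, here is a minimal sketch of a per-call-site cache. It is not MicroPython's or CPython's implementation; `Klass`, `slow_lookup`, and `CallSite` are hypothetical names, and the sketch assumes class dictionaries are never mutated after the cache is filled, so it has no invalidation.

```python
# Minimal monomorphic inline cache over dict-based method lookup.
# Hypothetical object model; no cache invalidation is modeled.

slow_lookups = 0  # counts how often the full dictionary walk runs


class Klass:
    def __init__(self, name, methods, parent=None):
        self.name = name
        self.methods = methods  # dict: method name -> function
        self.parent = parent


def slow_lookup(klass, name):
    """Walk the class chain, probing one dict per level."""
    global slow_lookups
    slow_lookups += 1
    while klass is not None:
        if name in klass.methods:
            return klass.methods[name]
        klass = klass.parent
    raise AttributeError(name)


class CallSite:
    """One per call instruction; remembers the last class seen and its method."""

    def __init__(self, name):
        self.name = name
        self.cached_class = None
        self.cached_method = None

    def call(self, receiver_class, *args):
        if receiver_class is not self.cached_class:  # miss: refill the cache
            self.cached_method = slow_lookup(receiver_class, self.name)
            self.cached_class = receiver_class
        return self.cached_method(*args)  # hit: one identity check, no dict probes


base = Klass("Base", {"area": lambda w, h: w * h})
square = Klass("Square", {}, parent=base)

site = CallSite("area")
results = [site.call(square, 3, 4) for _ in range(1000)]
lookups_after_loop = slow_lookups

other = site.call(base, 2, 5)  # different class at the same site: one more miss
```

In this toy the 1000 calls from one site do a single chain walk; whether that translates into a measurable speedup depends on how much of a real interpreter's time goes to lookup in the first place.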
I think Brunthaler found it effective in speeding up CPython in one of his papers.
I think this would speed up Python by a notable factor, since dispatch can account for a large share of the cost of any given line of code.
I did something similar-ish for PISC, and it helped a fair bit.
I added a new post on quickening.
It’s a significant benefit. This is how Objective-C’s objc_msgSend (and probably the OO parts of Swift) works.