This might interest you: https://egraphs-good.github.io/
Their new paper was also posted recently: “Better Together: Unifying Datalog and Equality Saturation”
Very creative! So cool. You might have fun with solvers like Z3 so you don’t have to brute-force it.
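For instance, the z3-solver Python package can decide whether two candidate expressions agree on every input in a single query, instead of enumerating them all. A minimal sketch, assuming 8-bit values and made-up rewrites (not anything from the actual project):

    # Solver-based equivalence check with Z3 (pip install z3-solver).
    # The 8-bit width and the rewrites below are illustrative assumptions.
    from z3 import BitVec, Solver, sat

    def equivalent(lhs, rhs):
        # Ask Z3 for an input where the expressions differ; if none exists,
        # they agree on all 256 inputs with no brute-force enumeration.
        s = Solver()
        s.add(lhs != rhs)
        return s.check() != sat

    x = BitVec("x", 8)
    print(equivalent(x + x, x << 1))  # True: doubling and shifting left by 1 always agree
    print(equivalent(x * 3, x << 1))  # False: Z3 finds a counterexample immediately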
It would be fun to have a benchmark so that people can do targeted work on emulator perf and see it make the superoptimizer faster. But maybe that’s just my bias as a runtimes optimization person :)
Also, consider using PyPy; it’s at least 5x faster on my machine at doing the superoptimization.
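A throwaway harness along these lines would make both the CPython-vs-PyPy comparison and any emulator-level speedups easy to track over time; superopt, superoptimize, and CORPUS are hypothetical placeholders for whatever entry point the project actually exposes:

    # Hypothetical benchmark harness: run the same script under CPython and PyPy
    # and compare best-of-N wall-clock time. The imported names are placeholders,
    # not the project's real API.
    import sys
    import time

    def bench(fn, *args, repeats=3):
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            fn(*args)
            best = min(best, time.perf_counter() - start)
        return best

    if __name__ == "__main__":
        from superopt import superoptimize, CORPUS  # hypothetical module
        secs = bench(superoptimize, CORPUS)
        print(f"{sys.implementation.name}: best of 3 = {secs:.2f}s")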
Superoptimisation is still somewhat infeasible in the general case, but I really like the work that the Souper project did for LLVM peephole optimisations, effectively adding a memoisation layer. The goal was to replace the IR-to-IR peephole optimiser with something that was machine-generated and was optimal. They used Alive, which gave a formal model of the semantics of the instructions, to generate every pattern of three IR instructions and find a two-instruction sequence or single instruction that was equivalent (or was equivalent with some additional constraints). It probably doesn’t scale much beyond three-instruction sequences, but the current hand-written code doesn’t either.
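Very roughly, the search is shaped like the toy sketch below: enumerate strictly shorter candidate sequences and keep one that behaves identically on every input. This is a deliberately tiny model (one 8-bit value, unary ops, exhaustive testing) standing in for Souper’s real pipeline over LLVM IR with Alive’s formal semantics:

    from itertools import product

    # Toy "IR": each op transforms a single 8-bit value. Souper itself works over
    # LLVM IR with multi-operand instructions and Alive-style formal semantics.
    OPS = {
        "add1": lambda x: (x + 1) & 0xFF,
        "sub1": lambda x: (x - 1) & 0xFF,
        "shl1": lambda x: (x << 1) & 0xFF,
        "neg":  lambda x: (-x) & 0xFF,
        "not":  lambda x: (~x) & 0xFF,
    }

    def run(seq, x):
        for op in seq:
            x = OPS[op](x)
        return x

    def equivalent(a, b):
        # Exhaustive check over the whole 8-bit domain; a solver scales better.
        return all(run(a, x) == run(b, x) for x in range(256))

    def shorten(seq):
        # Try every strictly shorter sequence and return the first equivalent one.
        for length in range(len(seq)):
            for cand in product(OPS, repeat=length):
                if equivalent(seq, list(cand)):
                    return list(cand)
        return seq

    # not(x) followed by neg(x) is x + 1 in 8-bit two's complement.
    print(shorten(["not", "neg"]))  # -> ['add1']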
Do you know the current status? Has it been merged, and has it replaced the hand-written code? If not, what remains to be done before it can be merged?
I think most of the folks working on it moved on to other things, unfortunately, and it wasn’t merged. Part of the problem was with moving code into the tree that could only be modified with additional tools. There was also some disagreement over Alive’s interpretation of the semantics (personally, I’d rather have a formal semantics that people disagree over than an informal semantics that people disagree over; at least you can converge the former onto something unambiguous).