I’d need to be convinced that another interpreter solves any major problems of cross-language interoperability. As far as I can tell, the challenge is squaring the circle of different semantics.
What happens when you take a Rust affine value, pass it to Python, which assigns a few copies of it around, and passes all of those copies back? What happens when you pass something with a destructor to a language that doesn’t support destructors? What happens with weak references that are implicitly nullable when you pass them to a language without NULL?
Getting all of that to work is where the effort in cross-language interop lies.
You might be able to get it to work in languages that feel close – the .NET CLR does this, for example. But getting Haskell, C++, Rust, and Python to easily interoperate without tons of glue code seems… unlikely.
And if you’re ok with glue code, the C ABI and the various language FFIs more or less let you do the job.
The C ABI does the job, but I think I am not revealing anything shocking when I say the C ABI could be greatly improved. Have you seen Fine-grained Language Composition by Laurence Tratt? It is a clear proof of concept of what is possible, and we are very far from it.
Of course there’s room for improvement – the x86_64 calling convention is way too complicated. But that’s beside the point. The hard part isn’t the calling convention or the VM, it’s figuring out how to handle all of the language semantics.
That proof of concept is interesting – but I didn’t see any challenging features shared. What does PHP do with a StopIteration exception from Python? Can it consume a Python iterator at all? From that description, it seems like you can’t easily pass a list from PHP to Python – you need to manually adapt it. And you can get exceptions if the adapted PHP variable stops looking like a list.
And, these are two languages that most people would consider close.
And, these are two languages that most people would consider close.
I made that mistake when we started that work, but was quickly disabused of it. PHP and Python look close only if you don’t look closely, mostly because PHP is a very unusual language (PHP’s references, for example, are rather unique). You might find the full paper (https://soft-dev.org/pubs/html/barrett_bolz_diekmann_tratt__fine_grained_language_composition/) has more detail about the challenges and how they were resolved. For example, Section 6 tackles cross-language variable scoping, which stretched my (admittedly meagre) language design abilities to, and possibly beyond, their maximum!
the x86_64 calling convention is way too complicated
It’s quite simple. First few integral parameters in registers. First few floating parameters in floating registers. Further arguments on the stack. Large parameters (structs) passed either as pointers or on the stack.
Large parameters (structs) passed either as pointers or on the stack.
The complexity comes from how small structs are passed and returned. It’s not unmanageable, but it’s enough to be painful compared to the Windows ABI. The rules for how types interact within small structs are more complicated than they need to be, which makes determining which registers to use to pass a struct overly complex.
Rather than cross-language interoperability, I think that the real value would be in providing a single platform that could be ported to different systems and bring tons of existing code to them. Basically, taking the JVM portability advantage and democratizing it. It would really benefit niche systems. (Commoditize your complements!)
But getting Haskell, C++, Rust, and Python to easily interoperate without tons of glue code seems… unlikely.
But that doesn’t mean there wouldn’t be wins in other areas. Programming in Haxe was like programming in 1 + (targets * .5) languages because you had to be mindful of how everything would be translated. However, things like unit testing were O(1) because the test suites would be automagically generated and run for each target.
And if you’re ok with glue code, the C ABI and the various language FFIs more or less let you do the job.
I agree with you insofar that I’m skeptical language interoperability can go beyond functional interfaces passing serializable data structures. However, I think we can significantly improve on the C FFI baseline and piping strings around.
TruffleRuby’s implementation of the Ruby C FFI requires interpreting C code, yet it still manages to beat the pants off of mainline because the common AST representation of the two languages allows the JVM to perform optimizations on a shared in-memory representation of their data. Or take the WASM type spec, which would allow for similar optimizations on similar data structures while providing sane defaults when converting between primitive values.
For reference: note that this appears to be a collaborative research project. So it’s not necessarily aimed at producing Real Systems, but more at exploring what Real Systems could look like and feeding that back into industry.
There’s a spectrum to how close you want your VM to be to your language. Closer means better performance (e.g. LuaJIT using bytecode tailored to Lua 5.1’s semantics), farther means easier support for a variety of languages.
But farther also means that you’ll eventually end up adding new intermediate representations that better suit a specific language (e.g. Rustc having MIR, Swiftc having SIL…) and then are you really better off than where you started? In the case of LLVM’s IR/GCC’s tree representation you are because you get a ton of optimizations and targets for free, but if the IR is very low level and only has a single backend I’m not so sure.
I’ll keep an eye on this project, I’m not sure anything usable will come out of it but I’m certain many interesting insights will.
There’s a spectrum to how close you want your VM to be to your language. Closer means better performance (e.g. LuaJIT using bytecode tailored to Lua 5.1’s semantics), farther means easier support for a variety of languages.
While I largely agree with you, I think that organizational and personal preferences have a lot more to do with VM/language infrastructure choices. Graal.js is competitive with Node, and Oracle, Mozilla, Apple, and Google would spend far fewer engineering resources optimizing a shared runtime than they currently spend on Graal, SpiderMonkey, FTL, and V8.
But they don’t do that for the same reasons that Google developed Dalvik or V8 or Blink: because they wanted to do their own thing. Google didn’t want to license HotSpot from Sun for Android and they wanted to try different techniques than what was in WebKit. Similarly, I suspect Python, Ruby, and Lua developers enjoy programming in C and don’t want to spend their time in Truffle or RPython, even though it would be very beneficial to their community.
In the case of LLVM’s IR/GCC’s tree representation you are because you get a ton of optimizations and targets for free, but if the IR is very low level and only has a single backend I’m not so sure.
I’ll keep an eye on this project, I’m not sure anything usable will come out of it but I’m certain many interesting insights will.
I think that organizational and personal preferences have a lot more to do with VM/language infrastructure choices.
I think I agree.
Graal.js is competitive with Node
While that’s true for hot-loop performance, it isn’t for startup time. Graal would be a very poor choice for CGI-style execution of JS scripts.
Similarly, I suspect Python, Ruby, and Lua developers enjoy programming in C and don’t want to spend their time in Truffle
Truffle is a very poor choice if startup times matter (can you imagine recompiling your scripts every time you want to run them?). I would wager the same thing about code size, although I don’t have any data to prove that. Truffle being worse at some things is normal - you have to sacrifice some aspects to win on others, but it also means that you can’t have a generic technology that does everything better than a non-generic technology. The non-generic technology can be tailored to a use case.
Any thoughts on MLIR?
I know nothing about MLIR - from what I’ve read it’s a machine-learning IR that targets LLVM’s IR. This fits the pattern my original comment described: LLVM was too generic for ML applications and now it makes economical sense to build something else on top that better describes the actual programs and allows better optimizations. To me, MLIR makes sense because it’s higher-level, contrary to SOIL which seems lower-level.
While that’s true for hot-loop performance, it isn’t for startup time. Graal would be a very poor choice for CGI-style execution of JS scripts.
OpenJ9 has experimented with the full range of trade-offs including AOT, AOT+JIT, caching JIT output, and server-based JIT (transcript, slides). I can’t find the paper right now, but I know that there was an academic publication exploring a shared Android JIT for phones, which showed a lot of energy savings.
Similarly, I suspect Python, Ruby, and Lua developers enjoy programming in C and don’t want to spend their time in Truffle
Truffle is a very poor choice if startup times matter (can you imagine recompiling your scripts every time you want to run them?). I would wager the same thing about code size, although I don’t have any data to prove that. Truffle being worse at some things is normal - you have to sacrifice some aspects to win on others, but it also means that you can’t have a generic technology that does everything better than a non-generic technology. The non-generic technology can be tailored to a use case.
Interpreted languages in general don’t have great startup times 😉. I know there are some asterisks associated with Graal/Truffle AOT, but Graal/Truffle can perform AOT compilation.
I know nothing about MLIR - from what I’ve read it’s a machine-learning IR that targets LLVM’s IR. This fits the pattern my original comment described: LLVM was too generic for ML applications and now it makes economical sense to build something else on top that better describes the actual programs and allows better optimizations. To me, MLIR makes sense because it’s higher-level, contrary to SOIL which seems lower-level.
It started in Machine Learning land, but it (now?) stands for Multi-Level Intermediate Representation and is designed to make it easier to apply common optimizations in IRs above the LLVM IR (explainer).
There will always be room for language specific optimizations, hand-tooled machine code, etc. But on the technical merits, I think the “roll everything yourself” crowd is more wrong than it is right.
Does GCC really have an IR you can inspect (like Clang’s -emit-llvm)?
Yes, GCC has a higher-level AST named Tree/Generic and a lower-level, SSA-based representation named Gimple. You can see the various stages the program goes through by passing the -fdump-tree-all option to GCC. There’s also a register-level representation named RTL but I don’t remember the flags you need to use in order to dump that.
I’m not a GCC expert, but I think GCC has two intermediate representations, one called GIMPLE and the other called RTL. I think GIMPLE is a higher-level IR, while RTL is a very low level representation using s-expressions. Not 100% on this.
You can get a GIMPLE dump with gcc -fdump-tree-gimple
Is the characterisation as intermediate language fair? This reads like an extension of WebAssembly, which I would consider more a compilation target than an intermediate language. While you can compile several languages to WebAssembly and link them, how do the necessary runtime systems cooperate? This is already difficult when using one foreign function interface but this would need some fresh ideas to make it work beyond that.
I think it depends on how far up the stack you want to crawl. The WASM type specification will (in theory) provide nice defaults for wrangling low-level primitives. However, this can break because languages change over time.
This happened to me when working with Haxe’s transpilation system: for performance reasons, they chose to stick to their hand-rolled JS implementation of a Map even though JavaScript had introduced a native Map object. However, there are no stability guarantees WRT their implementation, so they could have decided to extend the native Map object with a semver patch bump. As a result I had to resort to passing JSON around despite Haxe producing readable JavaScript code.
This could have been avoided with more effort, but they didn’t have the resources to manage multiple compilation targets and backport changes. My memory is fuzzy, but I know that a/some JVM language(s) changed their implementation(s) of a specific feature when Java implemented it in a way that was represented differently at the bytecode level.
I miss a chapter about GraalVM in the “First questions” section.
(and BTW: Java/JVM is not proprietary)