Bear in mind this is a proposal and is not part of the roadmap. You can read it on a sprint report:
Another interesting vision statement was about using Rust in Mercurial. Most people agreed that Mercurial would benefit from porting its native C code in Rust, essentially for security reasons and hopefully to gain a bit of performance and maintainability. More radical ideas were also proposed such as making the hg executable a Rust program (thus embedding a Python interpreter along with its standard library) or reimplementing commands in Rust (which would pose problems with respect to the Python extension system). Later on, Facebook presented mononoke, a promising Mercurial server implemented in Rust that is expected to scale better with respect to high committing rates.
That’s an impressively thorough proposal. I’m quite happy that Facebook is using Mercurial, it helps drive innovation and helps having an alternative to Git that works as scale. Now, I’m not sure how much performance they’ll get: many extensions are in Python, which is good thing for extensibility, so unless they port extensions to Rust (and lose the flexibility of Python), they won’t get that much of an improvement. Or did I miss something?
Extensions would still benefit from their primitives being faster. They appreciate that issues around FFI might arise and passing from Rust to Python and back in quick succession is definitely one of them.
Yeah, FFI speed is a concern, and ideally it’d be easier to implement an entire class in Rust and expose some methods to Python, because then it’d be easier to move some low-level parsers into Rust. I did a naive implementation of one of our parsers using nom and it was 100x (not a typo, one hundred times) faster than the C version, but the FFI overhead ended up making it not a win.
Out of curiosity, why is the Rust-Python FFI slower than C-Python FFI? I thought that Rust could generate C-callable symbols and call C directly. On that topic, I wrote a PEG parsing library in C with Python bindings and in production workloads 90% of the time is spend in FFI and object creation.
Well, with C, there’s always the possibility to write a python extension. This is not a generic FFI then.
Often, the issue there - and, reading python.h etc. at a glance - is that many interpreters allow direct manipulation of their memory structures (including creating and destroying objects). For that, they ship a lot of macros and definitions. You cannot use those in Rust directly. There’s two approaches to that: write a library that does exactly what python.h (on every version of the interpreter!) does and use that. Alternative: write a small C shim over your Rust code that does just the parts you need.
The big issue seemed to be in creating all the Python objects - I was returning a fairly large list of tuples, and the cpython extension could somewhat intelligently preallocate some things, whereas the Rust I think was having to be a bit dumber due to the wrappers in play.
As an addition to the point about primitives: there are cheap operations that are fast even in Python, there are expensive operations you run several times a day and there are rare operations where you need flexibility and mental-model-fit but can accept poor performance. Having better performance for frequent operations while keeping the flexibility for the long tail could be a win (depends on the effort required and usage patterns, of course).
Bear in mind this is a proposal and is not part of the roadmap. You can read it on a sprint report:
That’s an impressively thorough proposal. I’m quite happy that Facebook is using Mercurial, it helps drive innovation and helps having an alternative to Git that works as scale. Now, I’m not sure how much performance they’ll get: many extensions are in Python, which is good thing for extensibility, so unless they port extensions to Rust (and lose the flexibility of Python), they won’t get that much of an improvement. Or did I miss something?
Extensions would still benefit from their primitives being faster. They appreciate that issues around FFI might arise and passing from Rust to Python and back in quick succession is definitely one of them.
Yeah, FFI speed is a concern, and ideally it’d be easier to implement an entire class in Rust and expose some methods to Python, because then it’d be easier to move some low-level parsers into Rust. I did a naive implementation of one of our parsers using nom and it was 100x (not a typo, one hundred times) faster than the C version, but the FFI overhead ended up making it not a win.
Out of curiosity, why is the Rust-Python FFI slower than C-Python FFI? I thought that Rust could generate C-callable symbols and call C directly. On that topic, I wrote a PEG parsing library in C with Python bindings and in production workloads 90% of the time is spend in FFI and object creation.
Well, with C, there’s always the possibility to write a python extension. This is not a generic FFI then.
Often, the issue there - and, reading python.h etc. at a glance - is that many interpreters allow direct manipulation of their memory structures (including creating and destroying objects). For that, they ship a lot of macros and definitions. You cannot use those in Rust directly. There’s two approaches to that: write a library that does exactly what python.h (on every version of the interpreter!) does and use that. Alternative: write a small C shim over your Rust code that does just the parts you need.
The big issue seemed to be in creating all the Python objects - I was returning a fairly large list of tuples, and the cpython extension could somewhat intelligently preallocate some things, whereas the Rust I think was having to be a bit dumber due to the wrappers in play.
As an addition to the point about primitives: there are cheap operations that are fast even in Python, there are expensive operations you run several times a day and there are rare operations where you need flexibility and mental-model-fit but can accept poor performance. Having better performance for frequent operations while keeping the flexibility for the long tail could be a win (depends on the effort required and usage patterns, of course).