This seems like a blog post using a PDF paper format rather than a publishable paper. I appreciate the entertainment value, but the order I’ve learned over the years for reading an academic paper in the way that best suits my needs is totally thrown out of whack here. (Usually I read the abstract, then the conclusion, and based on those go back and skim or read specific sections in depth.)
That abstract is a non-abstract. It’s somewhat interesting, but completely unhelpful in understanding what the paper is even about other than “a story about ffi” by someone who is entertaining and snarky.
There’s also no conclusion section. The Discussion section could arguably be renamed Conclusion. Reading it helps me understand what the paper is even about (the attack surface area of FFI between Rust and C, I think).
On the second page, halfway through the introduction:
In the rest of this paper, we look at real-world attempts to rewrite components of large C/C++ systems in Rust— and the new class of bugs and issues that developers introduce when writing FFI code
This seems like a blog post using a PDF paper format rather than a publishable paper.
I think it’s a preprint for HotOS (see the URL title). It says “Anonymous Authors” because the conference requires anonymized papers. A bit of a shame that this paper is being widely distributed if so.
It’s a shame that the paper decided to open with cheap shots and stir needless controversy. The problems it describes are real. The C ABI boundary is unchecked and wildly unsafe, and the glue FFI language they propose could make it safer.
The thing that makes me the most sad about Rust is that a new language introduced foreign code interoperability without sandboxing. I understand why this happened: it makes it easy to incrementally add code to existing projects. But I think it’s the wrong approach long term. I want to write new programs in safe languages and use existing libraries in them. I want the guarantee that memory- (or type-) safety bugs in the libraries are no worse than logic errors in my code: they can’t escape from the abstract machine or break any of the type system’s invariants. I’ve been working on tools for building such a system for a long time.
Rewriting existing code in a safe language is very rarely the right thing to do. The existing implementation will have been widely tested, the new version will not. You can easily introduce logic errors when you rewrite anything and it takes a long time for a new project to reach the same level of stability as an existing one. There is also a huge opportunity cost. There are around 40 MLoC in Rust today and over 10 BLoC of C/C++. There is more C/C++ running with memory safety on Morello, where it can be compartmentalised and can share objects safely with safe-language code, than there is Rust code in total (including things that use unsafe). The more time spent rewriting things in Rust, the less time spent writing new things in Rust.
I’d love to see Rust move away from a foreign function interface as an interoperability layer and adopt a foreign library interface model, so that C/C++ libraries can be isolated behind a clear boundary: things in the box may break the rules, but the compiler can restrict what is shared and not trust types for things passed across the boundary. This would let the 10% of new code in a program be written in Rust without giving up the safety properties as soon as you call any of the 90% that you get from existing libraries. You’d easily get ten times as many safe programs from this model as from the one that requires rewriting all of your dependencies in safe Rust.
I didn’t know this was valid. I guess I assumed that pointer arguments in extern functions always had to be raw pointers? It seems like the semantics guaranteed by & and &mut could never be correct when passing pointers from C.
calling, say, add_twice(&bar, &bar) from C results in undefined behavior
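For anyone who wants to see the difference concretely, here is a minimal sketch. The exact signature of add_twice is an assumption on my part (the quote only shows the call), and add_twice_raw is a hypothetical name for a raw-pointer variant that promises the compiler nothing about aliasing or null. A real export would also need a no_mangle-style attribute, which is left out here.

```rust
/// Assumed shape of the function under discussion: reference parameters in
/// an exported function. Rust may assume `a` is unique and `b` is not
/// mutated while borrowed, so a C caller passing the same pointer for both
/// arguments triggers undefined behavior.
pub extern "C" fn add_twice(a: &mut i32, b: &i32) {
    *a = a.wrapping_add(*b);
    *a = a.wrapping_add(*b);
}

/// Hypothetical raw-pointer variant: no aliasing or non-null assumptions.
/// Returns false (and does nothing) if either pointer is null.
///
/// # Safety
/// Non-null pointers must be aligned and point to initialized `i32` values
/// that are valid for the accesses performed here.
pub unsafe extern "C" fn add_twice_raw(a: *mut i32, b: *const i32) -> bool {
    if a.is_null() || b.is_null() {
        return false;
    }
    unsafe {
        // `b` is re-read through the raw pointer before each addition, so
        // the result is well defined even when `a` and `b` alias.
        // wrapping_add avoids an overflow panic at the FFI boundary.
        *a = (*a).wrapping_add(*b);
        *a = (*a).wrapping_add(*b);
    }
    true
}
```

The raw-pointer version pushes the validity checks into the function body, where they are visible, instead of into the type signature, where a C caller cannot see or honor them.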
Rust gives ABI layout guarantees for a few types. For example extern fn free_t(_: Option<Box<T>>){} is a valid, useful implementation taking a nullable pointer and freeing it.
But it’s the caller’s responsibility to pass valid data. Rust enforces this on its side, but C can’t.
Tangent, but note that assuming you care about portability and haven’t done things like set up custom allocators, you should only free something allocated by Rust like this, not something allocated in C with malloc. Freeing Rust-allocated pointers in C or C-allocated pointers in Rust is a way to get undefined behaviour.
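A rough sketch of keeping allocation and deallocation on the same side of the boundary. Widget, widget_new and widget_free are hypothetical names for illustration, not anything from the paper, and export attributes are omitted:

```rust
/// Hypothetical payload type, laid out like a C struct.
#[repr(C)]
pub struct Widget {
    pub value: i32,
}

/// Allocates with Rust's allocator and hands ownership to the caller as a
/// raw pointer. The caller must give it back via `widget_free`, not free().
pub extern "C" fn widget_new(value: i32) -> *mut Widget {
    Box::into_raw(Box::new(Widget { value }))
}

/// Takes ownership back and drops it. `Option<Box<Widget>>` has the same
/// representation as a nullable pointer, so a null argument arrives as
/// `None` and is a no-op.
pub extern "C" fn widget_free(w: Option<Box<Widget>>) {
    drop(w);
}
```

On the C side this pair would be declared roughly as struct Widget *widget_new(int32_t); and void widget_free(struct Widget *);. The point is that the pointer from widget_new only ever goes back to widget_free, never to free().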
Neat. Sounds like a worthwhile read.
Dammit I read this comment before clicking the link and still did the exact same thing
It’s like when someone posts a rickroll link and the first comment is “aww you got me” and I click it every time anyway.
The motivating example in the introduction is interesting: