I think the compiler would have to do some analysis and know that qux modifies big.a, and that bar calls qux and so also modifies big.a by proxy. Because of this, bar would be responsible for making a copy of big.a before it calls qux.
What makes this tricky is that you have to know which functions modify what, even through nested function calls, and that information isn’t available in C headers, so you would have to do some link-time analysis. That’s a lot of extra linking complexity for some perf gains, but it could be worth it.
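For anyone following along without the original snippet, here is a rough reconstruction of the aliasing case being discussed. The names big, bar and qux and the values 11 and 42 come from the thread; the struct layout, the return-instead-of-print style and the bar_by_ref variant are my own assumptions, not the OP's code.

```c
#include <stdio.h>

/* Hypothetical layout: only the field `a` is mentioned in the thread. */
struct Big {
    int a;
    int padding[63]; /* assumed: large enough to be passed by reference */
};

struct Big big = { .a = 11 };

/* qux modifies the global directly. */
void qux(void) {
    big.a = 42;
}

/* By-value parameter: C semantics say b is a private copy of the argument. */
int bar(struct Big b) {
    qux();       /* changes big.a, must not change b.a */
    return b.a;  /* still the value the caller passed */
}

/* What a by-reference ABI degenerates to if neither side makes a copy. */
int bar_by_ref(const struct Big *b) {
    qux();        /* if b points at big, this changes *b as well */
    return b->a;
}
```

With big.a set to 11, bar(big) gives back 11, while bar_by_ref(&big) gives back 42, which is the "42 instead of 11" problem.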
A correctly-specified ABI should pass large structures by immutable reference, usually obviating the copy. In the event that a copy is needed, it will happen only once, in the callee, rather than needing to be repeated by every caller. The callee also has more flexibility, and can copy only those portions of the structure that are actually modified.
Looks like the immutability only applies to the reference that the callee holds, not the original lvalue. This paragraph also says that only the callee will ever need to do the copy. The only fix I see is to copy on the caller side if the caller cannot guarantee that the callee holds the only mutable pointer to the data.
When you say “will print 42 instead of 11”, what does that mean? Is it the intended semantics, or is it the semantics guaranteed by C? If no copy is ever intended, how can it be advantageous to have an ABI copy more, as is suggested in OP?
It’s a good point. I guess my answer is that structures should generally be passed by value, and that global variables should generally be avoided, meaning that there won’t generally be opportunity for aliasing. In the event that there is aliasing you have to make a copy on the caller side, but you’re not much worse off than you would otherwise have been.
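A minimal sketch of that caller-side rule, assuming a by-reference ABI written out by hand. Everything here (struct Big, bar_ref, the two caller functions) is illustrative and not taken from any real ABI document: the caller copies only when the object might be reachable some other way, such as a global.

```c
/* Hypothetical types and names, for illustration only. */
struct Big {
    int a;
    int rest[63];
};

struct Big big = { .a = 11 };

void qux(void) {
    big.a = 42;
}

/* Callee under the by-reference scheme: reads through the pointer, never copies. */
int bar_ref(const struct Big *b) {
    qux();
    return b->a;
}

/* big is a global that callees may modify, so the caller copies it first. */
int call_bar_on_global(void) {
    struct Big tmp = big;
    return bar_ref(&tmp);
}

/* A local whose address has not escaped cannot be aliased, so no copy is needed. */
int call_bar_on_local(void) {
    struct Big local = { .a = 7 };
    return bar_ref(&local);
}
```

In the global case you pay for one copy, the same as today's by-value ABIs; in the local case you pay for none.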
Aggregates larger than 2×XLEN bits [side note: why the hell are you talking about bits?] are passed by reference and are replaced in the argument list with the address
XLEN is the width of registers. So you should read this as
Aggregates larger than two registers are passed by reference and are replaced in the argument list with the address
see also:
Aggregates whose total size is no more than XLEN bits are passed in a register, with the fields laid out as though they were passed in memory. If no register is available, the aggregate is passed on the stack. Aggregates whose total size is no more than 2×XLEN bits are passed in a pair of registers; if only one register is available, the first half is passed in a register and the second half is passed on the stack. If no registers are available, the aggregate is passed on the stack. Bits unused due to padding, and bits past the end of an aggregate whose size in bits is not divisible by XLEN, are undefined.
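To make the thresholds in those quotes concrete, here is a small sketch of the size classification. It is a simplification under my own assumptions: the enum and function names are made up, and it ignores register availability (the fall-back-to-stack cases) and anything the quoted text does not cover.

```c
#include <stddef.h>

/* Hypothetical names; only the size thresholds come from the quoted text. */
enum PassKind {
    PASS_ONE_REGISTER,   /* size <= XLEN bits */
    PASS_TWO_REGISTERS,  /* size <= 2*XLEN bits */
    PASS_BY_REFERENCE    /* anything larger: address replaces the argument */
};

enum PassKind classify_aggregate(size_t size_bytes, unsigned xlen_bits) {
    size_t size_bits = size_bytes * 8;
    if (size_bits <= xlen_bits) {
        return PASS_ONE_REGISTER;
    }
    if (size_bits <= 2u * xlen_bits) {
        return PASS_TWO_REGISTERS;
    }
    return PASS_BY_REFERENCE;
}
```

So with XLEN = 64, a 16-byte struct still fits in a register pair, and a 17-byte one is passed by reference.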
Honestly even the fact that structs can be passed by value (in C specifically) gets me every time. I really should not be surprised but I am surprised, every time :D
Oh wow, I’ve spent some time trying to figure out the best rules of thumb for when to go from by-value to by-immutable-reference, and this ABI argument was never mentioned. Thanks for bringing it up!
I don’t know, but I will say: it doesn’t really matter. There is no stable rust ABI, so poor decisions made by the rust compiler don’t have long-reaching effects that are difficult to reverse. If some choice they make negatively affects performance, they can simply change it, and—as happens frequently enough—when you upgrade your compiler, performance improves. Which really makes it a not particularly interesting question.
Flaws in stable ABIs are like optimizations you’re not allowed to perform because they would break compatibility.
I wouldn’t say that it doesn’t matter. Performance is important, and “is passing by value idiomatic?” has rather long-reaching effects. But it is indeed significantly less important than the system’s ABI.
In Rust, passing by value or by (actually immutable) reference is idiomatic. Both give the compiler the same degree of freedom as the C compiler has when passing a structure by value.
But the question is not whether performance is important, the question is whether you’re locked into something that prevents you from improving your performance later. C on most popular architectures with most popular implementations is. Rust is not.
I mean, once you are not locked into something, the next question is “ok, is the current implementation the best one, or is there room for improvement?” There were cases where Rust, despite having flexibility, managed to implement an ABI worse than C’s. The result being some Russian C++ trolls pointing fingers at it and saying “look, Rust’s slower than C++” :-)
And it seems to me that in this particular case, the actual existing implementation decisions do affect library design, so this can create a bit of technical debt. To give a specific example, let’s say I have a largish Foo struct, and it is used in a library’s public API. Due to separate compilation, the actual calling convention used by the compiler matters. If there are two translation units, the compiler can’t see through the call, so by value or by ref matters.
My current rule of thumb is “pass by value if smaller than X bytes, and by ref otherwise”, where X is two pointer sizes.
But this rule doesn’t make sense if the actually implemented ABI works like you suggested. In that case, it’s better to always pass by value, since X is, effectively, determined by the compiler.
So the answer to the “are large by-value structs passed without copying” question directly affects the API of the libraries I author.
In other words, I completely agree with everything you say about system ABI, which is set in stone. It’s just that I live in a happy Rust universe which is malleable, so the question I personally am interested in is different.
How do you solve this problem with the proposed ABI?
The proposed ABI says that the passed object should be immutable, I think.
Passing by value is semantically a copy
It wasn’t always that way. As I recall, K&R uses memcpy to assign structs for the same reason.
Curious, does the ABI used by the Rust compiler by default repeat this mistake? Rust doesn’t use the system’s ABI, so this should be fixable there.