Maybe I’m missing something, but I get the sense this would be harmful to the memory-efficiency of Rust’s ecosystem because I don’t feel you’ve convincingly argued that:
…this wouldn’t establish an “If you make the more manual way optional, then people will generally choose the more automatic one.” dynamic that results in sloppier, more leak-prone memory management in the general ecosystem, similar to how D more or less gives you a “Do you want no GC or do you want an ecosystem?” choice.
One of my biggest issues with GCed languages in general (and I count automatic reference counting as a form of GC, as academia does) is that it leads to more wasteful memory usage and more chance of “accidentally held reference” leaks compared to the current state of Rust, where you get compiler errors that boil down to “Whoa, there! Clarify this for me. Did you intend to reference this but get your lifetimes wrong? Did you want multiple ownership? Did you want a clone?”
This idea feels like it’d be too much of a lean toward “do what I mean” design and, though not strictly the same thing, its spirit reminds me of this passage from Boats’s Notes on a smaller Rust:
In other words, the core, commonly identified “hard part” of Rust - ownership and borrowing - is essentially applicable for any attempt to make checking the correctness of an imperative program tractable. So trying to get rid of it would be missing the real insight of Rust, and not building on the foundations Rust has laid out.
So far, it looks like claim() would just be a C++-style copy constructor by another name, guarded by a pinkie promise to not impl it in heavy ways more suited to Clone.
While automatically memcpying a [u8; 1024] is an issue, it doesn’t follow that, because this approach to “make costs explicit” is imperfect, other “trust the human not to do that” solutions are justified.
Likewise, I don’t think y.clone() is visual clutter. To me, “y.clone() is visual clutter” feels like a PHP or JavaScript developer arguing that .to_string() is visual clutter rather than protecting you from accidentally doing something like 5 + " 10-ton weights" and getting 15… it’s just a question of where in the Blub paradox you personally stand when you draw the line between clutter and useful hints for compile-time correctness checking.
By distinguishing claim from clone, your project avoids surprising performance footguns (this seems inarguably good).
I still don’t see how that follows from the arguments made. Developers could already have used Vec for large arrays. Large on-stack arrays occur because of an ill-considered use of the tools, and this would give another tool that a developer could use with insufficient consideration.
How does Claim magically draw a bright line between the lightweight uses of arrays, where people want to auto-copy a collection of primitives, and the big ones where, for whatever reason, someone didn’t realize the implications of what they wrote? From the compiler’s perspective, it’s a continuous, quantitative difference, not a stepwise or qualitative one.
It just feels like giving users the same footgun, but with different engraving on the handle and a slightly different firing mechanism.
As for your “Doesn’t having these ‘profile lints’ split Rust?”: my view is that, sure, I can set #![deny(automatic_claims)], but we already have enough trouble with the lack of a maintained cargo-geiger analogue for panicking code paths (something Rustig or findpanics could have become).
I don’t want yet another, far less greppable thing that I have to manually audit my dependencies for to feel confident that they’re not going to undermine my efforts to write correct, robust code.
In a sense, arguing that I can set #![deny(automatic_claims)], so it’s OK, is like a less hyperbolic version of arguing that I can embed a Python runtime and depend on Python packages, so it’s OK that Rust doesn’t have a crate for X, Y, or Z.
As for the loss in “Whoa, there! Double-check this.”-ness, if anything, it feels like a small step toward being C++.
In the end, this feels to me like false simplicity… the human willingness to fool ourselves into believing a simple solution exists because we don’t want to believe the problem is as complex as it actually is, so we implement a solution that just moves the source of complexity from using the language to writing quality packages… packages Rust explicitly chooses to rely on by not having a batteries-included standard library.
So far, it looks like claim() would just be a C++-style copy constructor by another name, guarded by a pinkie promise to not impl it in heavy ways more suited to Clone.
To me this is the biggest issue - it has the potential to make Rust code considerably harder to reason about, at least in terms of performance. Whether there are correctness bugs possible to introduce with auto-claim isn’t clear to me, but regardless it does make implicit an operation that should probably be explicit in a performance-oriented language IMO.
I think the effect would be marginally less bad than C++ at least, since Rust makes it difficult to do really egregious things in Clone (and thus Claim), but it wouldn’t do anything to prevent violating the intended semantics of Claim (i.e. cheap, infallible, transparent), at least as far as I can tell. The only thing stopping you from violating all three seems to be someone saying “don’t do that”.
You’ve quoted my blog post to support your case, but I actually think this change would be a huge improvement to Rust. I can’t find a good counterargument for this change anywhere in your very long comment.
First of all, from the perspective of correctness as I meant it in the post of mine you quoted, literally every type that implements Clone could implement Claim, and nothing would change. Requiring explicit clone rather than giving types “normal type” semantics (in the sense of substructural types: https://en.wikipedia.org/wiki/Substructural_type_system) is strictly meant as a performance guidance to help users avoid computationally expensive algorithms, and has nothing to do with correctness in that sense (it has something to do with “fitness,” in that an unnecessarily slow but correct algorithm may still not be fit). I don’t think every type that implements Clone should implement Claim, but there’s nothing semantically sketchy about pure copy constructors that aren’t memcpy.
Tying “normal type semantics” to “can be copied by executing a memcpy” was always ridiculous. Huge arrays that are very expensive to memcpy have normal type semantics, whereas extremely cheap ref increments require adding a call to clone. This doesn’t make any sense! Yes, it prevents bad libraries from implementing normal type semantics which are unduly expensive (unless they’re unduly expensive because they’re huge memcpys), but there are way worse things you can do in a Rust library (like write undefined behavior using unsafe) and the way we police that is by establishing community standards and not using libraries that are terrible. This is how “implements Claim even though its slow,” which is a way less serious problem, should also be policed.
You can make some inane guilt-by-association with languages you don’t like, like C++ or JavaScript or PHP, but my experience is that I often have to read some new code which spawns threads or async tasks and shares state between them, and the clone ceremony really obscures the meaningful behavior that is happening in this code. It sucks! In contrast, being forced to clone an Rc or an Arc has never benefited me in the slightest; I do not refactor my algorithm, I just perform the required ritual to clone it. This is very different from - for example - the affine semantics of Vec or String - which do help alert me to potential performance problems in my code.
I don’t have any specific loyalty to keeping Copy and memcpy equivalent in theory.
However, in practice, my concerns are:
I don’t think it’s reasonable to make such a fundamental change to users’ mental models of Rust at this point in Rust’s lifecycle (as opposed to closer to v1.0).
While I certainly see plenty of solid argumentation that the status quo has problems, I didn’t see convincing argumentation that the proposed solution is the best replacement. (Without sufficiently compelling argumentation for the replacement, it feels like those Young Earth Creationist arguments from the 2000s which were built around the fundamental misconception that, if only they could disprove evolution, everyone would recognize that their denomination of Christianity was the only remaining option.)
One of the properties that has become an established trait of Rust is that its “make costs explicit” design tends to lead to memory-efficient code and clear lifespans for data. From my perspective, you’re arguing to pessimize writing any code that values that property (especially if it’s synchronous code) for the benefit of asynchronous code. Given that Rust is not Go, I don’t think that can be taken as self-evident. (And, given that it hasn’t bothered me in the asynchronous code I write, the degree to which it’s a benefit is in dispute.)
It was points 2 and 3 that I felt are not unlike your argument against throwing the baby out with the bathwater when it comes to ownership and borrowing. I’m not talking about correctness, but fitness. There are a lot of things people have come to expect from Rust, and language semantics that encourage a more laissez-faire attitude toward memory management in dependencies are not something I see as desirable, any more than it would be to make it easier to abuse catch_unwind as an exception-handling mechanism. It’s bad enough that tools like Rustig and findpanics are abandoned.
I mention JavaScript because, in my experience, the world of JavaScript struggles with needless memory consumption caused by data being held alive unintentionally through things like overlooked event handlers, and I foresee auto-claiming Rc/Arc making it as difficult to audit for unexpected liveness as exceptions make auditing for unexpected failure paths.
I mention both JavaScript and PHP because they both have “do what I mean” features in the name of convenience which are now agreed to have been bad ideas that make it more difficult to ensure well-behaved programs (e.g. type coercion), and I see auto-claiming in the name of things like Rc/Arc to be another such “feature we’ll regret”… especially since, unlike with a tracing garbage collector, all it takes to leak an Rc or Arc is a reference cycle.
I mention C++ because we are already seeing the 2023 Rust Survey saying that people’s biggest worry is about Rust’s complexity rising, and that worry has grown 5 percentage points just from the 2022 to 2023 survey. This feels like that proposal to do automatic Result wrapping… a minor ergonomic gain in exchange for a significant additional thing that needs to be internalized in order for a newcomer to read code.
…and I say minor because, just from the two of us as data points, it’s clear to me that .clone() obscuring the meaning of code is a subjective thing. I don’t find it obscures the meaning at all… but one of the things I treasure most in Rust is the explicitness of incrementing reference counts, because it allows me to more efficiently gain an understanding of how the program makes use of memory and data lifespans. (Indeed, the fact that you can’t ripgrep for an implicit .claim() call is one of the reasons I’m against this.)
…but the main thing for me is that “OK, you’ve argued strongly that the status quo is wrong… but where is the compelling argument that your proposed replacement is right?” part.
There may also be other reasons, but it’s getting late for me, so I probably won’t remember them until tomorrow.
Having slept now, I realize that I should clarify a few things:
While I certainly see plenty of solid argumentation that the status quo has problems, I didn’t see convincing argumentation that the proposed solution is the best replacement. (Without sufficiently compelling argumentation for the replacement, it feels like those Young Earth Creationist arguments from the 2000s which were built around the fundamental misconception that, if only they could disprove evolution, everyone would recognize that their denomination of Christianity was the only remaining option.)
What I meant it to be read as is:
Yes, it argues well that there are problems with locking Copy to memcpy
It’s not that I find the case for auto-claiming unconvincing… it just feels like it assumes the desirability of it will be obvious enough that it need not be argued… even in the face of the cons that come just from changing things at all.
It also feels, to me at least, that the argument made for Claim is so dependent on auto-claiming that, if auto-claiming doesn’t make its case, then the case for Claim on its own as being worth the churn is left effectively un-argued.
I’m not talking about correctness, but fitness.
I’m coming from an “If correctness mattered so much more than fitness, we’d all be writing Haskell or Agda or Coq or Idris or something else that prioritizes correctness at the expense of other things” angle. A big part of Rust’s success is the balance-point it occupies between various concerns and I think this change, as argued, would reduce its fitness for usually-synchronous, often low-level-sensitive coding (the niche that only Rust, C, and C++ have broken into in any significant way) in pursuit of pushing it toward suitability for niches occupied by languages like Go and CLR- and JVM-based languages. (And Python, Ruby, PHP, and JavaScript, to a lesser extent.)
and I see auto-claiming in the name of things like Rc/Arc to be another such “feature we’ll regret”… especially since, unlike with a tracing garbage collector, all it takes to leak an Rc or Arc is a reference cycle.
…and, unlike Go, CLR languages, JVM languages, Python, Ruby, etc., Rust is AOT-compiled to native code and lacks RTTI, which makes it more difficult to design an intuitive, ergonomic tool for diagnosing things like reference cycle-related leaks at runtime, even if you are using an allocator with an API for retrieving a list of all active allocations.
but one of the things I treasure most in Rust is the explicitness of incrementing reference counts, because it allows me to more efficiently gain an understanding of how the program makes use of memory and data lifespans.
This actually does have relevance to your “In contrast, being forced to clone an Rc or an Arc has never benefited me in the slightest; I do not refactor my algorithm, I just perform the required ritual to clone it.” because, whenever I see the compiler complain, I do stop to double-check that my mental model of what should be going on matches what the code is doing… even a brief stop when encountering an Rc/Arc where I didn’t write .clone() from the outset. That’s why I consider removing the need to .clone() them a “‘Do what I mean’ footgun”.
Now, I would be receptive to something along the lines of reviving the old @ sigil as shorthand for .clone(), similar to how ? became shorthand for try!(), but I think it’s important to retain that speed bump on having your mental model of the code diverge from the compiler’s.
It also feels, to me at least, that the argument made for Claim is so dependent on auto-claiming that, if auto-claiming doesn’t make its case, then the case for Claim on its own as being worth the churn is left effectively un-argued.
Yes, there would be no reason to add Claim without auto-claiming. The whole purpose is to detach copy by memcpy (Copy) from implicitly copy (Claim).
whenever I see the compiler complain, I do stop to double-check that my mental model of what should be going on matches what the code is doing…
I really think that you’re performing premature optimizations that really don’t matter and wasting your own time. I’m open to being wrong: I would be very interested in examining evidence to the contrary.
I really think that you’re performing premature optimizations that really don’t matter and wasting your own time. I’m open to being wrong: I would be very interested in examining evidence to the contrary.
I’ll need to think about the best way to make a strong argument.
It feels like something that could land very differently with different people, similar to how someone’s prior experience can easily colour their perception of the relative values of monadic vs. exception-based error handling.
For example, personally, I find a big part of its value is in helping to amortize the cost of being aware of the relationship between my designs and their memory usage across the whole development process.
My gut reaction to this was not positive, but I couldn’t initially explain to myself why, and thought that perhaps it was just due to not wanting such a fundamental change to the status quo. So I started trying to write down my thoughts, and eventually got at the core of what was bothering me, and I think there are some actual problems with this proposal:
Just for context: I haven’t felt the pain mentioned in the post regarding ref-counted values and closures - the intuition around how Rust captures values for a closure is not especially difficult to build (IMO), but maybe I’m missing the specific pain point that others have encountered. I would be in favor of supporting more explicit capture syntax, e.g. move(&a, b) as mentioned in the post; but I actually don’t find it annoying that I have to explicitly clone values and use move when needed. Abstracting that away via Claim/Capture or whatever, just so that the compiler can generate that kind of code automatically, seems like a fair amount of effort to make something more ergonomic that maybe should be explicit.
I think the essence of what bothers me, though, boils down to auto-claim:
It was mentioned in the post a few times, but I want to highlight that auto-claim in conjunction with ref-counted values seems like a major footgun. It might be nice in cases where you are just trying to get GC-like behavior for some data in your program, but I don’t think Rust should be optimizing for that kind of thing (and I totally understand that this is just my opinion). The fact that Rust isn’t as ergonomic as a managed language for things like this isn’t a problem in my mind - it is part of the tradeoff that you get by using Rust instead of a managed language. Today, when you are writing performance-sensitive code where you need to be aware of every point where a reference count might be modified (and as a result, may be a performance bottleneck if that point in the program is on a hot path), you can trust that Rust isn’t doing something expensive behind your back. With auto-claim however, that would no longer be the case - you’d need to explicitly opt-in to a separate “profile” of the language for performance-sensitive folk, or at least that’s the impression I get, and that feels super weird to me. To be clear, I don’t so much object to the idea of Claim, as I do the notion of auto-claim. It makes writing code that does things behind the scenes that you don’t expect so much more likely, with little gained for that kind of risk IMO.
As an aside, the current way that Copy works has never burned me, but it is also just memcpy, so the worst-case scenario is that you end up implicitly copying excessive amounts of data around - but that could be trivially linted, either by rustc or clippy. In general though, you have to be really not paying attention to end up in a situation where you are copying around something like [u8; 1024] - there just aren’t many situations where large Copy types can arise, at least in my experience.
On the other hand, Claim (in conjunction with auto-claim) is a can of worms - it would occur implicitly by default, like Copy, so you get no visual signal at points where claim() is invoked by the compiler. Furthermore, because claim() invokes clone(), it can hide all sorts of implicit behavior, so even if you are aware that an auto-claim will occur, unless you know exactly how Clone is implemented for a given type, you are completely reliant on the implementor of that type to have respected the intended semantics of Claim. This is to some extent true with Clone as well, but at least in the case of Clone, cloning is explicit - so there is always a visual reminder that a potentially-expensive operation is happening at that point in the program.
As an aside, would Claim be mutually-exclusive with Copy? If both are implemented for a type, does the compiler implicitly-copy or implicitly-claim? As you’ve described it, it seems like the latter would be the case, otherwise the point about [u8; 1024] and such wouldn’t be solved via Claim (unless types like that would no longer be Copy, but since that would be a pretty major breaking change, I’m assuming not).
I find it very hard to believe that you write code where you need to carefully track increments of a ref count on your hot path for performance but aren’t worried about memcpying arbitrarily large structs. My assumption when I read comments like this is that the author has been misled by Rust’s current behavior about what is computationally expensive.
If you just assume that Rust’s current distinction between Copy/Clone is an accurate division between “fast to copy” and “potentially slow to copy,” of course the idea of changing the distinction for implicit copy vs explicit clone would make no sense! But the problem is that the current distinction is not always right: it is indeed so cheap to increment a ref counter that you do not actually need to be alerted to it, and some Copy types are so expensive to copy that you really should be.
I find it very hard to believe that you write code where you need to carefully track increments of a ref count on your hot path for performance but aren’t worried about memcpying arbitrarily large structs. My assumption when I read comments like this is that the author has been misled by Rust’s current behavior about what is computationally expensive.
I guess you misunderstood - I’m not worried about memcpy’ing arbitrarily large structs, because I’m not writing code with large Copy-able structs that get passed around by value to begin with. It wouldn’t even occur to me to pass around an array with 1k elements in it by value - that’s asinine. Obviously there are people out there that maybe don’t pay close attention to the types they are coding against, and assume that if it is Copy it is cheap to pass around by value - this feels like exactly the sort of thing tools like Clippy are intended to solve.
But the problem is that the current distinction is not always right: it is indeed so cheap to increment a ref counter that you do not actually need to be alerted to it, and some Copy types are so expensive to copy that you really should be.
It is unfortunate that my point has gotten tied up in the weeds of refcounting - but refcounting is not always cheap when thread contention is in play, though I’ll readily admit that there are many uses of refcounting for which the overhead is essentially irrelevant, maybe most uses. Regardless, my point wasn’t about refcounting specifically, I just picked on the mention of it in the OP.
The bigger issue in my mind has to do with auto-claim effectively injecting .clone() where the compiler sees fit, under the assumption that Claim impls are “cheap”, “infallible”, and “transparent”. None of those properties are upheld by the compiler (so far as I can determine from what is outlined in the OP anyway), so it seems that no matter how you slice it, you can’t avoid the necessity to know what types you are working with, and whether (and importantly, how) they implement a given trait. Except rather than the worst case scenario being copying too much data around, the scope of things that might happen via compiler magic grows larger.
Now, if the compiler could actually enforce those properties? I’d absolutely be in favor - but without that, this change as described seems like it would make most Rust code harder to review and reason about, not easier.
The bigger issue in my mind has to do with auto-claim effectively injecting .clone() where the compiler sees fit, under the assumption that Claim impls are “cheap”, “infallible”, and “transparent”. None of those properties are upheld by the compiler (so far as I can determine from what is outlined in the OP anyway), so it seems that no matter how you slice it, you can’t avoid the necessity to know what types you are working with, and whether (and importantly, how) they implement a given trait.
Now, if the compiler could actually enforce those properties? I’d absolutely be in favor - but without that, this change as described seems like it would make most Rust code harder to review and reason about, not easier.
Panicking is a necessary evil. To me, any use of the possibility to panic to justify something else the compiler can’t check is fundamentally an “It’s not perfect. Therefore, it’s OK to make it worse.” argument and, thus, doesn’t change the fact that a feature needs to prove itself on its own merits.
I’m in love with this proposal tbh. I understand the concerns that autoclaim loses “explicitness” (the word already appears over 20 times in these comments), but I view this change in behavior as a huge ergonomic win. I was just reading @withoutboats’ post Not Explicit (2017) earlier this week and find its splitting of “explicit” into a few different buckets quite useful here. Allow me to quote a bit:
Sometimes in frustration at “explicit is better than implicit” I am tempted to take the opposite position just to be contrarian - explicitness is bad, implicitness is better. In reality I do think that Rust is an uncommonly explicit language, but when I use the word explicit, I mean something more specific than most users seem to mean. To me, Rust is explicit because you can figure out a lot about your program from the source of it.
…
Sometimes, explicit is used to refer to requiring users to write code to make something happen. But if the thing will happen deterministically in a manner that can be derived from the source, it is still explicit in the narrow sense that I laid out earlier. Instead, this is about saying that certain actions should be manual - users have to opt in to making them happen.
Now allow me to juxtapose this with a snippet from Niko’s post:
// Rust today
tokio::spawn({
    let io = cx.io.clone();
    let disk = cx.disk.clone();
    let health_check = cx.health_check.clone();
    async move {
        do_something(io, disk, health_check)
    }
})

// Rust with autoclaim
tokio::spawn(async move {
    do_something(cx.io, cx.disk, cx.health_check)
})
This is a very manual ritual we are all familiar with. Manual actions can be great for making expensive operations clear to readers, something Rust users are keenly aware of, but these .clone() calls are cheap operations that do not contribute to any meaningful insight of your code. You know what’s going on here: a(n atomic ref to a) number has been incremented. This is not a cost that I believe should be manual.
Even worse, if these types weren’t Rc/Arc, the .clone() ritual wouldn’t communicate whether these clones are cheap or expensive, and our existing mental pattern-matching on this lambda ceremony may let us erroneously assume they are Rc/Arc. Further, refactors to these types won’t require updating usages. With this proposal, if cx.disk were replaced with a non-Claim type that is expensive to copy, it could not be autoclaimed and this usage would be a compilation error (iiuc).
On top of removing the auto-copying of arrays, this proposal looks great! I hope it goes far.
but these .clone() calls are cheap operations that do not contribute to any meaningful insight of your code.
That’s not necessarily true. When I see this code, there’s no guarantee that the nearest indication that those values are Rc/Arc‘d will be close by. Unless it is, that’s a loss of locality.
The explicit reading of it, currently, is that there will be no aliasing going on, lock-guarded or otherwise, because they’ll be moved unless they’re memcpy-safe.
That said, I wouldn’t have anything against something like this hypothetical syntax, which relies on some hypothetical RefCount trait that non-Rc/Arc Clone-ables don’t implement (thus addressing the “One-clone-fits-all creates a maintenance hazard” aspect of the proposal):
I write a lot of code similar to the Cloudflare example and I would love to have a better solution for capturing cheap refcounted clones, however this autoclaim notion makes me uncomfortable. It’s not that there’s anything wrong with the definition - it totally makes sense that there’s “true copy” vs “cheap copy” vs “expensive copy” and we’re failing to capture that. The problem is that this mechanism will kick in for regular function parameters and now it’s much more appealing to pop something in an Arc than to use references properly. I predict some crates will have a MyRealType and then export a MyType which is an alias for Arc<MyRealType> which implements Claim and this will be seen to be a good thing in some circles because it helps with using your types inside closures. I love me some everything-is-refcounted semantics but that’s not the point of Rust.
I would much rather tackle this with explicit capture clauses, as described in the FAQ. There are two entangled problems with cloning an Arc - in my view, 90% of the problem is the tedious syntax, 10% of the problem is that it’s “fairly cheap actually”.
So I’m reading this and I think I’m semi-sold. I wondered at first about a few things:
Shouldn’t some of/all of the cases where we care about Claim be handled by references? References are cheap for [u8; 1024] and avoid the accidental copy. But I guess the accidental copy can happen trivially with a * and that * is hiding a very large memcpy. But do we need Claim to solve that vs, say, a lint for “copying large value”?
Ah, yes. The closure ceremony of let _foo = foo.clone(); or let foo2 = foo.clone() just to pass it into the closure. Solving this is very compelling.
What you really want is to just write something like this, like you would in Swift or Go or most any other modern language:
I think there was a crate like this. Obviously this requires entirely new language features to work in full, but I believe a crate provided a macro along these lines, and I remember thinking “that looks nice”.
I see, continuing to read, that this is addressed in the FAQ with closure capture clauses.
The real goal should be to disconnect “can be memcopied” and “can be automatically copied”.
nod.
I totally get that! And in fact I think this proposal actually helps your code:
Indeed, I think this is actually quite interesting for the stated reason. You can monitor your code for copy, clone, and claim, which have different characteristics and indicators. I think this would be valuable. IDEs could help here as well, perhaps.
I am curious to learn more. I’m not “afraid” of this getting approved at all. Nothing about this sets off alarm bells to me, it seems to address interesting problems, and it feels fairly rusty. The iterator and closure examples are pretty compelling to me, and footgun of [u8; 1024] is as well.
The real goal should be to disconnect “can be memcopied” and “can be automatically copied”.
nod.
I’ll second this - the desire of the proposal to accomplish this goal is certainly something I’m on board with.
Ah, yes. The closure ceremony of let _foo = foo.clone(); or let foo2 = foo.clone() just to pass it into the closure. Solving this is very compelling.
I’d like to better understand why you find this of particular compelling interest. I mean, I can relate to the feeling that it is verbose, but on the flip side, explicitly cloning a refcounted value feels to me like something that should be explicit, since it represents a side-effect. Cloning a value which requires allocating, even more so. Rust making such things explicit is one of its strengths IMO. However, maybe I just am not writing the sort of programs where this pain is truly felt; or maybe it occurs as a result of using Rust more like a managed language via reference counting, where you really don’t care when/why something gets cloned - in any case, I definitely feel like I’m missing something.
I do think it would be nice to have something like Claim, to allow one to opt-in to automatic cloning on a case-by-case basis by implementing it for your own types. I really don’t like the idea of implementing it for Rc or Arc and things like that, since there would be no way to meaningfully constrain its use in such a way as to make it more of an opt-in kind of feature. Same with automatically deriving it. Otherwise I think it would be really easy for it to become a situation where it is either all or nothing - you can’t let your own types be auto-claimed without letting all claimable types be auto-claimed, so if you aren’t willing to be hyper-vigilant, you end up better off disabling auto-claim entirely.
but on the flip side, explicitly cloning a refcounted value feels to me like something that should be explicit, since it represents a side effect.
I don’t really care about that side effect too much tbh but if I did it sounds like it would be something I can still lint for, or I’d just wrap Rc in a non-Claim type. And my mental model of the code will just tell me “that’s a refcount” now, which should be something I can pretty easily understand.
Contrast that to the many times I’ve had to do the ‘clone 5 things before moving into a closure’ and I don’t think it’s really close in terms of value tbh. I’ve seen that in just about every Rust codebase I’ve dug into (primarily web services).
However, maybe I just am not writing the sort of programs where this pain is truly felt; or maybe it occurs as a result of using Rust more like a managed language via reference counting, where you really don’t care when/why something gets cloned - in any case, I definitely feel like I’m missing something.
Do you do a lot of async/web services work? That’s primarily where I run into it, I suppose. And it can be quite annoying, often adding 5+ lines and taking up valuable vertical screen space.
I do think it would be nice […]
I guess I don’t see the problem. Correct me if I am wrong, but it seems as if the reason you don’t want Claim to be implemented on Rc/Arc has to do with Claim having a side effect, right? But why does that matter? I don’t see any issue with side effects occurring, in and of themselves. Lots of side effects occur and we don’t think much about them - register states, cache lines, etc, all get mucked up implicitly. I don’t see much of a problem unless your code relies on an Rc not being incremented, but then don’t use Rc, wrap up your type in a type that does not impl Claim, right?
Otherwise I think it would be really easy for it to become a situation where it is either all or nothing - you can’t let your own types be auto-claimed without letting all claimable types be auto-claimed, so if you aren’t willing to be hyper-vigilant, you end up better off disabling auto-claim entirely.
But why would you want to let your own types be auto-claimed but not all claimable types be auto-claimed? What’s the purpose there? It seems to me that what you would actually end up with is a default of “all claimable types are autoclaimed, but for some reason I have some types I don’t want autoclaimed that are claimable, so I wrap them up in a non-Claim impl” or something like that.
Contrast that to the many times I’ve had to do the ‘clone 5 things before moving into a closure’ and I don’t think it’s really close in terms of value tbh. I’ve seen that in just about every Rust codebase I’ve dug into (primarily web services).
To be perfectly frank, to me this feels like a problem that would be better and more idiomatically solved by extending the closure syntax for explicit capture or, if a strong argument does exist for it occurring in more places, coming up with a try!→?-esque sigil for .clone().
Maybe it’s just how I write my closure-using code, but I haven’t felt much pain from this. Maybe it’s just my Python background showing through but, on the rare occasions when my use_small_heuristics = "Max" mind does have a problem with it (yes, I’m one of those people who would manually cherry-pick rustfmt changes before use_small_heuristics = "Max"), I use tuple-unpacking syntax to put them on a single line:
let (x, y, z) = (x.clone(), y.clone(), z.clone());
Well yeah, I mention that and it’s in the FAQ. The closure syntax is fixed by the capture changes. But the approach in the article solves it too, and also addresses other issues.
The problem is that, as argued, I feel that it’s a case of “Killing a fly with a sledgehammer and ignoring the potential for leaving holes in the floor in the process”.
A trait for cheap clones would be nice to distinguish between deep clones and duplicating handles. Even if just delegated to Clone, it’d make code more locally explicit, and less noisy than Arc::clone(&).
However, I’m not sure about auto incrementing refcounts. For example, it’s dangerous for channels which can deadlock if the last sender isn’t dropped when expected. So Channels would probably opt out of Claim like Cell and Range opt out of Copy?
Could claim try to minimize the number of auto-clones? Copy being semantically always a copy isn’t a big deal, because the optimizer can mostly remove redundant memcpys. However, Clone has bigger side effects, and that would make programs have observable differences depending on the optimizer. OTOH, if it always naively cloned on redundant assignments, I’d be unhappy about it (I like micro-optimizing such things, and this is not rational…)
So Channels would probably opt out of Claim like Cell and Range opt out of Copy?
That is another concern that I forgot to mention.
The argumentation doesn’t feel like it adequately addresses the risk that actually benefiting from auto-claim might be blocked by the whole “When in doubt, C and C++ trust the programmer. By contrast, safe Rust must fail safe.” aspect of things if the APIs are misaligned to the invariants they need to uphold.
(Especially since, given the limited set of supporting examples, it feels like a broad-spectrum fix for specific issues… which is always harder to get right.)
However, Clone has bigger side effects, and that would make programs have observable differences depending on the optimizer. OTOH if it always naively cloned on redundant assignments, I’d be unhappy about it (I like microoptimizing such things, and this is not rational…)
That’s another reason I’m against this. My mental model of Rust is that an assignment without a .clone() can be elided by the optimizers until proven otherwise and auto-claim would increase the odds that, when it comes time to profile my code, I might run into something that requires more significant re-architecting to make it still pass borrow-check.
Nitpick. So they would choose not to opt in to Claim.
Channels are equivalent to (and often literally) Arc<Inner>, so implementation-wise they’re cheaply refcounted. It’s just that the count is particularly important for them.
It felt worth noting that it’s opt-in, not opt-out. That’s a significant difference, although perhaps not to your specific point.
I think, to your point, there is some contention between the various definitions. There’s “memcpy” (Copy), “cheap to copy” (Claim), and “deep copy” (Clone). But nowhere in there is “copy, but be careful because different copies actually care about one another in some way” the way that channels might. In those cases it may make sense to rely on Clone because there’s care needed in managing the copies.
I see the theoretical value in making a distinction between deep clones and incref; I’ve already wished this existed a few times, when I was unsure whether a clone was deep and ended up reading some source code.
On the other hand, on a practical level I don’t find it that useful. I see the following happening:
Some lib authors start requiring bounds on Claim instead of Clone in generics
Some lib users start implementing Claim on objects that have no… claim to implement it, so they can use the trait bound
back to square one, now with 3 traits
Rather than a separate trait, maybe an annotation (possibly automatically derived?) on the Clone implementation would be better IMO.
I don’t see much value in auto-claim. Cloning in a lambda is an ergonomic issue, but IMO a small one. I add a scope to perform the clone before the lambda. I don’t think it warrants such complicated machinery, especially when ref-counting isn’t that common.
I think tying the move behaviour to the size of objects could be a backwards-compatibility issue for lib authors. When deciding to make a type Copy, it is already hard enough to decide whether or not the type will manage resources in the future. Adding a condition on the size of the type sounds like an ergonomic hurdle much bigger than the Clone issue. As a lib author, will I even notice when my type stops auto-implementing Claim because I added some bytes to an internal array?
Rather than this, I’d like:
memcpy being elided whenever possible, with tail call optimization if needs be (become) or some ergonomic syntax for places (which would also solve other issues)
If we want to change the semantics of move, I think adding some kind of transfer semantics where self-referential pointers are “fixed up” would be a much better… move (Niko also had an article about “self-referential references” recently)
Totally caveman suggestion: could you mitigate the pain of using .clone() for Rc and Arc by using a shorter different method name like .inc() or .i() or something?
This is distinct from .clone() so it’s visually obvious that it doesn’t malloc+memcpy a large structure. It won’t accidentally start copying a large structure if the object is later changed from Rc<BigThing> to BigThing. It’s short.
No, all Clone types, and by extension Copy, must be Sized, and that implies knowing the size at compile-time. Clippy has some lints related to large values, like large_enum_variant and trivially_copy_pass_by_ref, but nothing for this specific problem, though it seems like it would be trivial to catch.
Maybe I’m missing something, but I get the sense this would be harmful to the memory-efficiency of Rust’s ecosystem because I don’t feel you’ve convincingly argued that:
…this wouldn’t establish an “If you make the more manual way optional, then people will generally choose the more automatic one.” dynamic that results in sloppier, more leak-prone memory management in the general ecosystem, similar to how D more or less gives you a “Do you want no GC or do you want an ecosystem?” choice.
One of my biggest issues with GCed languages in general (and I count automatic reference counting as a form of GC, as academia does) is that they lead to more wasteful memory usage and more chance of “accidentally held reference” leaks compared to the current state of Rust, where you get compiler errors that boil down to “Whoa, there! Clarify this for me. Did you intend to reference this but get your lifetimes wrong? Did you want multiple ownership? Did you want a clone?”
This idea feels like it’d be too much of a lean toward “do what I mean” design and, though not strictly the same thing, its spirit reminds me of this passage from Boats’s Notes on a smaller Rust:
So far, it looks like claim() would just be a C++-style copy constructor by another name, guarded by a pinkie promise to not impl it in heavy ways more suited to Clone.

While automatically memcpying a [u8; 1024] is an issue, it doesn’t naturally follow that, because this solution for “make costs explicit” is imperfect, that should justify allowing other “trust the human to not do that” solutions.

Likewise, I don’t think y.clone() is visual clutter. To me, “y.clone() is visual clutter” feels like a PHP or JavaScript developer arguing that .to_string() is visual clutter rather than protecting you from accidentally doing something like 5 + " 10-ton weights" and getting 15… it’s just a question of where in the Blub paradox you personally stand when you draw a line between clutter and useful hints for compile-time correctness checking.

I still don’t see how that follows from the arguments made. Developers could already have used Vec for large arrays. Large on-stack arrays occur because of an ill-considered use of the tools, and this would give another tool that a developer could use with insufficient consideration.

How does Claim magically draw a bright line between the lightweight uses of arrays, where people want to auto-copy a collection of primitives, and the big ones where, for whatever reason, someone didn’t realize the implications of what they wrote? From the compiler’s perspective, it’s a continuous, quantitative difference, not stepwise to a relevant degree or qualitative.

It just feels like giving users the same footgun, but with different engraving on the handle and a slightly different firing mechanism.
As for your “Doesn’t having these ‘profile lints’ split Rust?”, my view is that, sure, I can set #![deny(automatic_claims)], but we already have enough trouble with the lack of a maintained cargo-geiger analogue for panicking code paths, like Rustig or findpanics could have been. I don’t want yet another, far less greppable thing that I have to manually audit my dependencies for to feel confident that they’re not going to undermine my efforts to write correct, robust code.

In a sense, arguing that I can set #![deny(automatic_claims)], so it’s OK, is like a less hyperbolic version of arguing that I can embed a Python runtime and depend on Python packages, so it’s OK that Rust doesn’t have a crate for X, Y, or Z.

As for the loss in “Whoa, there! Double-check this.”-ness, if anything, it feels like a small step toward being C++.

In the end, this feels to me like false simplicity… the human willingness to fool ourselves into believing a simple solution exists because we don’t want to believe the problem is as complex as it actually is, so we implement a solution that just moves the source of complexity from using the language to writing quality packages… packages Rust explicitly chooses to rely on by not having a batteries-included standard library.
To me this is the biggest issue - it has the potential to make Rust code considerably harder to reason about, at least in terms of performance. Whether there are correctness bugs possible to introduce with auto-claim isn’t clear to me, but regardless it does make implicit an operation that should probably be explicit in a performance-oriented language IMO.
I think the effect would be marginally less bad than C++ at least, since Rust makes it difficult to do really egregious things in Clone (and thus Claim), but it wouldn’t do anything to prevent violating the intended semantics of Claim (i.e. cheap, infallible, transparent), at least as far as I can tell. The only thing stopping you from violating all three seems to be someone saying “don’t do that”.

You’ve quoted my blog post to support your case, but I actually think this change would be a huge improvement to Rust. I can’t find a good counterargument for this change anywhere in your very long comment.
First of all, from the perspective of correctness as I meant it in the post of mine you quoted, literally every type that implements Clone could implement Claim, and nothing would change. Requiring explicit clone rather than giving types “normal type” semantics (in the sense of substructural types: https://en.wikipedia.org/wiki/Substructural_type_system) is strictly meant as a performance guidance to help users avoid computationally expensive algorithms, and has nothing to do with correctness in that sense (it has something to do with “fitness,” in that an unnecessarily slow but correct algorithm may still not be fit). I don’t think every type that implements Clone should implement Claim, but there’s nothing semantically sketchy about pure copy constructors that aren’t memcpy.
Tying “normal type semantics” to “can be copied by executing a memcpy” was always ridiculous. Huge arrays that are very expensive to memcpy have normal type semantics, whereas extremely cheap ref increments require adding a call to clone. This doesn’t make any sense! Yes, it prevents bad libraries from implementing normal type semantics which are unduly expensive (unless they’re unduly expensive because they’re huge memcpys), but there are way worse things you can do in a Rust library (like write undefined behavior using unsafe) and the way we police that is by establishing community standards and not using libraries that are terrible. This is how “implements Claim even though its slow,” which is a way less serious problem, should also be policed.
You can make some inane guilt-by-association with languages you don’t like like C++ or JavaScript or PHP, but my experience is that I often have to read some new code which spawns threads or async tasks and shares state between them, and the clone ceremony really obscures the meaningful behavior that is happening in this code. It sucks! In contrast, being forced to clone an Rc or an Arc has never benefited me in the slightest; I do not refactor my algorithm, I just perform the required ritual to clone it. This is very different from - for example - the affine semantics of Vec or String - which do help alert me to potential performance problems in my code.
I don’t have any specific loyalty to keeping Copy and memcpy equivalent in theory. However, in practice, my concerns are:

It was points 2 and 3 that I felt are not unlike your argument against throwing the baby out with the bathwater when it comes to ownership and borrowing. I’m not talking about correctness, but fitness. There are a lot of things people have come to expect from Rust, and language semantics which encourage a more laissez-faire attitude toward memory management in dependencies are not something I see as desirable, any more than it would be to make it easier to abuse catch_unwind as an exception-handling mechanism. It’s bad enough that tools like Rustig and findpanics are abandoned.

I mention JavaScript because, in my experience, the world of JavaScript struggles with needless memory consumption caused by data being held alive unintentionally through things like overlooked event handlers, and I foresee auto-claiming Rc/Arc making it as difficult to audit for unexpected liveness as exceptions make auditing for unexpected failure paths.

I mention both JavaScript and PHP because they both have “do what I mean” features, added in the name of convenience, which are now agreed to have been bad ideas that make it more difficult to ensure well-behaved programs (eg. type coercion), and I see auto-claiming things like Rc/Arc as another such “feature we’ll regret”… especially since, unlike with a tracing garbage collector, all it takes to leak an Rc or Arc is a reference cycle.

I mention C++ because we are already seeing the 2023 Rust Survey saying that people’s biggest worry is about Rust’s complexity rising, and that worry grew 5 percentage points just from the 2022 to the 2023 survey. This feels like that proposal to do automatic Result wrapping… a minor ergonomic gain in exchange for a significant additional thing that needs to be internalized in order for a newcomer to read code.

…and I say minor because, just from the two of us as data points, it’s clear to me that .clone() obscuring the meaning of code is a subjective thing. I don’t find it obscures the meaning at all… but one of the things I treasure most in Rust is the explicitness of incrementing reference counts, because it allows me to more efficiently gain an understanding of how the program makes use of memory and data lifespans. (In fact, that you can’t ripgrep for an implicit .claim() call is one of the reasons I’m against this.)

…but the main thing for me is the “OK, you’ve argued strongly that the status quo is wrong… but where is the compelling argument that your proposed replacement is right?” part.
There may also be other reasons, but it’s getting late for me, so I probably won’t remember them until tomorrow.
Having slept now, I realize that I should clarify a few things:
What I meant it to be read as is: my lack of loyalty to tying Copy to memcpy notwithstanding, Claim is so dependent on auto-claiming that, if auto-claiming doesn’t make its case, then the case for Claim on its own as being worth the churn is left effectively un-argued.

I’m coming from an “If correctness mattered so much more than fitness, we’d all be writing Haskell or Agda or Coq or Idris or something else that prioritizes correctness at the expense of other things” angle. A big part of Rust’s success is the balance-point it occupies between various concerns, and I think this change, as argued, would reduce its fitness for usually-synchronous, often low-level-sensitive coding (the niche that only Rust, C, and C++ have broken into in any significant way) in pursuit of pushing it toward suitability for niches occupied by languages like Go and CLR- and JVM-based languages. (And Python, Ruby, PHP, and JavaScript, to a lesser extent.)

…and, unlike Go, CLR languages, JVM languages, Python, Ruby, etc., Rust is AOT-compiled to native code and lacks RTTI, which makes it more difficult to design an intuitive, ergonomic tool for diagnosing things like reference-cycle-related leaks at runtime, even if you are using an allocator with an API for retrieving a list of all active allocations.

This actually does have relevance to your “In contrast, being forced to clone an Rc or an Arc has never benefited me in the slightest; I do not refactor my algorithm, I just perform the required ritual to clone it.” because, whenever I see the compiler complain, I do stop to double-check that my mental model of what should be going on matches what the code is doing… even a brief stop when encountering an Rc/Arc where I didn’t write .clone() from the outset. That’s why I consider removing the need to .clone() them a “‘Do what I mean’ footgun”.

Now, I would be receptive to something along the lines of reviving the old @ sigil as shorthand for .clone(), similar to how ? became shorthand for try!(), but I think it’s important to retain that speed bump on having your mental model of the code diverge from the compiler’s.

Yes, there would be no reason to add Claim without auto-claiming. The whole purpose is to detach “copy by memcpy” (Copy) from “implicitly copy” (Claim).
I really think that you’re performing premature optimizations that really don’t matter and wasting your own time. I’m open to being wrong: I would be very interested in examining evidence to the contrary.
I’ll need to think about the best way to make a strong argument.
It feels like something that could land very differently with different people, similar to how someone’s prior experience can easily colour their perception of the relative values of monadic vs. exception-based error handling.
For example, personally, I find a big part of its value is in helping to amortize the cost of being aware of the relationship between my designs and their memory usage across the whole development process.
My gut reaction to this was not positive, but I couldn’t initially explain to myself why, and thought that perhaps it was just due to not wanting such a fundamental change to the status quo. So I started trying to write down my thoughts, and eventually got at the core of what was bothering me, and I think there are some actual problems with this proposal:
Just for context: I haven’t felt the pain mentioned in the post regarding ref-counted values and closures - the intuition around how Rust captures values for a closure is not especially difficult to build (IMO), but maybe I’m missing the specific pain point that others have encountered. I would be in favor of supporting more explicit capture syntax, e.g.
move(&a, b) like mentioned in the post; but I actually don’t find it annoying that I have to explicitly clone values and use move when needed. Abstracting that away via Claim/Capture or whatever, just so that the compiler can generate that kind of code automatically, seems like a fair amount of effort to make something more ergonomic that maybe should be explicit.

I think the essence of what bothers me, though, boils down to auto-claim:
It was mentioned in the post a few times, but I want to highlight that auto-claim in conjunction with ref-counted values seems like a major footgun. It might be nice in cases where you are just trying to get GC-like behavior for some data in your program, but I don’t think Rust should be optimizing for that kind of thing (and I totally understand that this is just my opinion). The fact that Rust isn’t as ergonomic as a managed language for things like this isn’t a problem in my mind - it is part of the tradeoff that you get by using Rust instead of a managed language. Today, when you are writing performance-sensitive code where you need to be aware of every point where a reference count might be modified (and as a result, may be a performance bottleneck if that point in the program is on a hot path), you can trust that Rust isn’t doing something expensive behind your back. With auto-claim however, that would no longer be the case - you’d need to explicitly opt-in to a separate “profile” of the language for performance-sensitive folk, or at least that’s the impression I get, and that feels super weird to me. To be clear, I don’t so much object to the idea of
Claim, as I do the notion of auto-claim. It makes writing code that does things behind the scenes that you don’t expect so much more likely, with little gained for that kind of risk IMO.

As an aside, the current way that Copy works has never burned me, but it is also just memcpy, so the worst-case scenario is that you end up implicitly copying excessive amounts of data around - but that could be trivially linted, either by rustc or clippy. In general though, you have to be really not paying attention to end up in a situation where you are copying around something like [u8; 1024] - there just aren’t many situations where large Copy types can arise, at least in my experience.

On the other hand, Claim (in conjunction with auto-claim) is a can of worms - it would occur implicitly by default, like Copy, so you get no visual signal at points where claim() is invoked by the compiler. Furthermore, because claim() invokes clone(), it can hide all sorts of implicit behavior, so even if you are aware that an auto-claim will occur, unless you know exactly how Clone is implemented for a given type, you are completely reliant on the implementor of that type to have respected the intended semantics of Claim. This is to some extent true with Clone as well, but at least in the case of Clone, cloning is explicit - so there is always a visual reminder that a potentially-expensive operation is happening at that point in the program.

As an aside, would Claim be mutually exclusive with Copy? If both are implemented for a type, does the compiler implicitly copy or implicitly claim? As you’ve described it, it seems like the latter would be the case, otherwise the point about [u8; 1024] and such wouldn’t be solved via Claim (unless types like that would no longer be Copy, but since that would be a pretty major breaking change, I’m assuming not).

I find it very hard to believe that you write code where you need to carefully track increments of a ref count on your hot path for performance but aren’t worried about memcpying arbitrarily large structs. My assumption when I read comments like this is that the author has been misled by Rust’s current behavior about what is computationally expensive.
If you just assume that Rust’s current distinction between Copy/Clone is an accurate division between “fast to copy” and “potentially slow to copy,” of course the idea of changing the distinction for implicit copy vs explicit clone would make no sense! But the problem is that the current distinction is not always right: it is indeed so cheap to increment a ref counter that you do not actually need to be alerted to it, and some Copy types are so expensive to copy that you really should be.
I guess you misunderstood - I’m not worried about memcpy’ing arbitrarily large structs, because I’m not writing code with large
Copy-able structs that get passed around by value to begin with. It wouldn’t even occur to me to pass around an array with 1k elements in it by value - that’s asinine. Obviously there are people out there who maybe don’t pay close attention to the types they are coding against, and assume that if it is Copy it is cheap to pass around by value - this feels like exactly the sort of thing tools like Clippy are intended to solve.

It is unfortunate that my point has gotten tied up in the weeds of refcounting - but refcounting is not always cheap when thread contention is in play, though I’ll readily admit that there are many uses of refcounting for which the overhead is essentially irrelevant, maybe most uses. Regardless, my point wasn’t about refcounting specifically, I just picked on the mention of it in the OP.
The bigger issue in my mind has to do with auto-claim effectively injecting
.clone() where the compiler sees fit, under the assumption that Claim impls are “cheap”, “infallible”, and “transparent”. None of those properties are upheld by the compiler (so far as I can determine from what is outlined in the OP anyway), so it seems that no matter how you slice it, you can’t avoid the necessity to know what types you are working with, and whether (and importantly, how) they implement a given trait. Except rather than the worst-case scenario being copying too much data around, the scope of things that might happen via compiler magic grows larger.

Now, if the compiler could actually enforce those properties? I’d absolutely be in favor - but without that, this change as described seems like it would make most Rust code harder to review and reason about, not easier.
These sorts of conventions are prevalent in Rust. For example, https://doc.rust-lang.org/std/convert/trait.From.html says
This is not enforced by the compiler. Nothing is stopping me from implementing this trait in such a way that it does fail (via
unwrap, etc).

Given this, are you auditing every call to .into() to make sure the implementation does not panic?

Panicking is a necessary evil. To me, any use of the possibility to panic to justify something else the compiler can’t check is fundamentally an “It’s not perfect. Therefore, it’s OK to make it worse.” argument and, thus, doesn’t change the fact that a feature needs to prove itself on its own merits.
Similarly, nothing fundamentally stops you doing cursed things with
deref, and that gets called implicitly.

April fools proposal: copy(cycles: u32). It specifies how many cycles the compiler is allowed to use for the copy. If it is too expensive, it fails.
I’m in love with this proposal tbh. I understand the concerns that autoclaim loses “explicitness” (the word already appears over 20 times in these comments), but I view this change in behavior as a huge ergonomic win. I was just reading @withoutboats’ post Not Explicit (2017) earlier this week and find its splitting of “explicit” into a few different buckets quite useful here. Allow me to quote a bit:
Now allow me to juxtapose this with a snippet from Niko’s post:
This is a very manual ritual we are all familiar with. Manual actions can be great for making expensive operations clear to readers, something Rust users are keenly aware of, but these
.clone() calls are cheap operations that do not contribute any meaningful insight into your code. You know what’s going on here: a(n atomic ref to a) number has been incremented. This is not a cost that I believe should be manual.

Even worse, if these types weren’t Rc/Arc, the .clone() ritual doesn’t communicate whether these clones are cheap or expensive, and our existing mental pattern-matching on this lambda ceremony may let us erroneously assume these are Rc/Arc. Further, refactors to these types won’t require updating usages. With this proposal, if ctx.disk is replaced with a non-Claim type that is expensive to copy, it could not be autoclaimed and this usage would be a compilation error (iiuc).

On top of removing the auto-copying of arrays, this proposal looks great! I hope it goes far.
That’s not necessarily true. When I see this code, there’s no guarantee that the nearest indication that those values are `Rc`/`Arc`’d will be close by. Unless it is, that’s a loss of locality. The explicit reading of it, currently, is that there will be no aliasing going on, lock-guarded or otherwise, because they’ll be moved unless they’re `memcpy`-safe.

That said, I wouldn’t have anything against a hypothetical syntax relying on some hypothetical `RefCount` trait that non-`Rc`/`Arc` `Clone`-ables don’t implement (thus addressing the “one-clone-fits-all creates a maintenance hazard” aspect of the proposal).

I write a lot of code similar to the Cloudflare example and I would love to have a better solution for capturing cheap refcounted clones; however, this autoclaim notion makes me uncomfortable. It’s not that there’s anything wrong with the definition - it totally makes sense that there’s “true copy” vs “cheap copy” vs “expensive copy” and we’re failing to capture that. The problem is that this mechanism will kick in for regular function parameters, and now it’s much more appealing to pop something in an `Arc` than to use references properly. I predict some crates will have a `MyRealType` and then export a `MyType` which is an alias for `Arc<MyRealType>` which implements `Claim`, and this will be seen as a good thing in some circles because it helps with using your types inside closures. I love me some everything-is-refcounted semantics, but that’s not the point of Rust.

I would much rather tackle this with explicit capture clauses, as described in the FAQ. There are two entangled problems with cloning an `Arc` - in my view, 90% of the problem is the tedious syntax, 10% of the problem is that it’s “fairly cheap actually”.

So I’m reading this and I think I’m semi-sold. I wondered at first about a few things:
Shouldn’t some or all of the cases where we care about `Claim` be handled by references? References are cheap for `[u8; 1024]` and avoid the accidental copy. But I guess the accidental copy can happen trivially with a `*`, and that `*` is hiding a very large `memcpy`. But do we need `Claim` to solve that vs, say, a lint for “copying large value”?

Ah, yes. The closure ceremony of `let _foo = foo.clone();` or `let foo2 = foo.clone()` just to pass it into the closure. Solving this is very compelling.

TBH I would also settle for this:
I think there was a crate like this. Obviously this requires entirely new language features to work as actual syntax, but I recall a crate that provided a macro along these lines, and I thought “that looks nice”.
I see, continuing to read, that this is addressed in the FAQ with closure capture clauses.
nod.
Indeed, I think this is actually quite interesting for the stated reason. You can monitor your code for copy, clone, and claim, which have different characteristics and indicators. I think this would be valuable. IDEs could help here as well, perhaps.
I am curious to learn more. I’m not “afraid” of this getting approved at all. Nothing about this sets off alarm bells for me, it seems to address interesting problems, and it feels fairly rusty. The iterator and closure examples are pretty compelling to me, and the `[u8; 1024]` footgun is as well.
I’ll second this - the desire of the proposal to accomplish this goal is certainly something I’m on board with.
I’d like to better understand why you find this particularly compelling. I mean, I can relate to the feeling that it is verbose, but on the flip side, explicitly cloning a refcounted value feels to me like something that should be explicit, since it represents a side effect. Cloning a value which requires allocating, even more so. Rust making such things explicit is one of its strengths, IMO. However, maybe I’m just not writing the sort of programs where this pain is truly felt; or maybe it occurs as a result of using Rust more like a managed language via reference counting, where you really don’t care when/why something gets cloned. In any case, I definitely feel like I’m missing something.
I do think it would be nice to have something like `Claim`, to allow one to opt in to automatic cloning on a case-by-case basis by implementing it for your own types. I really don’t like the idea of implementing it for `Rc` or `Arc` and things like that, since there would be no way to meaningfully constrain its use in such a way as to make it more of an opt-in kind of feature. Same with automatically deriving it. Otherwise I think it would be really easy for it to become all or nothing - you can’t let your own types be auto-claimed without letting all claimable types be auto-claimed, so if you aren’t willing to be hyper-vigilant, you end up better off disabling auto-claim entirely.

I don’t really care about that side effect too much, tbh, but if I did, it sounds like it would be something I could still lint for, or I’d just wrap `Rc` in a non-`Claim` type. And my mental model of the code will just tell me “that’s a refcount” now, which should be something I can pretty easily understand.

Contrast that with the many times I’ve had to do the “clone 5 things before moving into a closure” dance, and I don’t think it’s really close in terms of value, tbh. I’ve seen that in just about every Rust codebase I’ve dug into (primarily web services).
Do you do a lot of async / web-service work? That’s primarily where I run into it, I suppose. And it can be quite annoying, often adding 5+ lines and taking up valuable vertical screen space.
I guess I don’t see the problem. Correct me if I am wrong, but it seems as if the reason you don’t want `Claim` to be implemented on `Rc`/`Arc` has to do with `Claim` having a side effect, right? But why does that matter? I don’t see any issue with side effects occurring, in and of themselves. Lots of side effects occur that we don’t think much about - register states, cache lines, etc., all get mucked up implicitly. I don’t see much of a problem unless your code relies on an `Rc` not being incremented - but then don’t use `Rc`: wrap your type in a type that does not impl `Claim`, right?
But why would you want to let your own types be auto-claimed but not all claimable types be auto-claimed? What’s the purpose there? It seems to me that what you would actually end up with is a default of “all claimable types are autoclaimed, but for some reason I have some types I don’t want autoclaimed that are claimable, so I wrap them up in a non-Claim impl” or something like that.
To be perfectly frank, this feels to me like a problem that would be better and more idiomatically solved by extending the closure syntax for explicit capture or, if a strong argument does exist for it occurring in more places, coming up with a `try!` → `?`-esque sigil for `.clone()`.

Maybe it’s just how I write my closure-using code, but I haven’t felt much pain from this. And maybe it’s just my Python background showing through, but on the rare occasions when my `use_small_heuristics = "Max"` mind does have a problem with it (yes, I’m one of those people who would manually cherry-pick `rustfmt` changes before `use_small_heuristics = "Max"`), I use tuple-unpacking syntax to put the clones on a single line.

Well yeah, I mention that, and it’s in the FAQ. The closure syntax is fixed by the capture changes. But the approach in the article solves it too, and also addresses other issues.
The problem is that, as argued, I feel that it’s a case of “Killing a fly with a sledgehammer and ignoring the potential for leaving holes in the floor in the process”.
Sure, I’m not totally swayed at this point. It just seems to make sense.
A trait for cheap clones would be nice, to distinguish deep clones from duplicating handles. Even if it just delegated to `Clone`, it’d make code more locally explicit, and less noisy than `Arc::clone(&)`.

However, I’m not sure about auto-incrementing refcounts. For example, it’s dangerous for channels, which can deadlock if the last sender isn’t dropped when expected. So channels would probably opt out of `Claim`, like `Cell` and `Range` opt out of `Copy`?
Could claim try to minimize the number of auto-clones? `Copy` being semantically always a copy isn’t a big deal, because the optimizer can mostly remove redundant `memcpy`s. However, `Clone` has bigger side effects, and that would make programs have observable differences depending on the optimizer. OTOH, if it always naively cloned on redundant assignments, I’d be unhappy about it (I like micro-optimizing such things, and this is not rational…).
That is another concern that I forgot to mention.
The argumentation doesn’t feel like it adequately addresses the risk that actually benefiting from auto-claim might be blocked by the whole “When in doubt, C and C++ trust the programmer. By contrast, safe Rust must fail safe.” aspect of things, if the APIs end up misaligned with the invariants they need to uphold.
(Especially since, given the limited set of supporting examples, it feels like a broad-spectrum fix for specific issues… which is always harder to get right.)
That’s another reason I’m against this. My mental model of Rust is that an assignment without a `.clone()` can be elided by the optimizers until proven otherwise, and auto-claim would increase the odds that, when it comes time to profile my code, I might run into something that requires more significant re-architecting to make it still pass borrow-check.

There’s no opt-out; `Claim` would be opt-in. And channels aren’t `Copy`, so they wouldn’t fall under any sort of auto-claim rules.
Not very big though. Incrementing an integer is something that the compiler should be able to reason about well.
Nitpick: so they would choose not to opt in to `Claim`.
Channels are equivalent to (and often literally) `Arc<Inner>`, so implementation-wise they’re cheaply refcounted. It’s just that the count is particularly important for them.

It felt worth noting that it’s opt-in, not opt-out. That’s a significant difference, although perhaps not to your specific point.
I think, to your point, there is some contention between the various definitions. There’s “memcpy” (`Copy`), “cheap to copy” (`Claim`), and “deep copy” (`Clone`). But nowhere in there is “copy, but be careful, because the different copies actually care about one another in some way”, the way channels might. In those cases it may make sense to rely on `Clone`, because there’s care needed in managing the copies.
I’m really not sure about that one: `Claim` instead of `Clone` in generics, and `Claim` on objects that have no… claim to implement it, just so they can use the trait bound.

With `Copy`, it is already hard enough to decide whether or not the type will manage resources in the future. Adding a condition on the size of the type sounds like an ergonomic hurdle much bigger than the `Clone` issue. As a lib author, will I even notice when my type stops auto-implementing `Claim` because I added some bytes to an internal array?

Rather than this, I’d like:

- `memcpy` being elided whenever possible, with tail call optimization if needs be (become), or some ergonomic syntax for places (which would also solve other issues)
- `transfer` semantics where self-referential pointers are “fixed up”, which would be a much better… move (Niko also had an article about “self-referential references” recently)

Totally caveman suggestion: could you mitigate the pain of using `.clone()` for `Rc` and `Arc` by using a shorter, different method name like `.inc()` or `.i()` or something?
This is distinct from `.clone()`, so it’s visually obvious that it doesn’t malloc+memcpy a large structure. It won’t accidentally start copying a large structure if the object is later changed from `Rc<BigThing>` to `BigThing`. And it’s short.

Dumb question: is there any type which is `Copy` but has a size not known at compile time?
If not, can’t an automatic `memcpy` bigger than, let’s say, 64 bytes just be a warning/diagnostic?

No; all `Clone` types, and by extension `Copy` types, must be `Sized`, and that implies knowing the size at compile time. Clippy has some lints related to large values, like `large_enum_variant` and `trivially_copy_pass_by_ref`, but nothing for this specific problem, though it seems like it would be trivial to catch.