One additional thing worth mentioning: when you deref a Cow<'a, T>, you don’t get back a borrow with a lifetime of 'a. (Where 'a in this case refers to the lifetime of any borrowed data that might be inside the Cow. If the Cow is owned, then 'a can be as long as 'static.) Instead, you get back a lifetime derived from the Cow itself. You can see this in the Deref impl.
If we write in the elided lifetime, then the method signature becomes:
fn deref<'b>(&'b self) -> &'b B {
In other words, the lifetime associated with &B is tied to the borrow of the Cow itself and not to the data that is inside the Cow.
This is of course because the data inside the Cow might be owned. And in that case, borrowing data from it requires borrowing from the Cow. There is no “longer” lifetime available.
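A minimal sketch of the point above, not taken from the standard library source. The helper names via_deref and via_match are made up for illustration. The first one goes through Deref and so can only return a borrow tied to the Cow; the second one inspects the variant, which is the only way to get the longer 'a back.

```rust
use std::borrow::Cow;

// Hypothetical helper: the result borrows from `cow` itself ('b),
// not from whatever data `cow` might be borrowing ('a).
// Changing the return type to &'a str is rejected by the compiler.
fn via_deref<'a, 'b>(cow: &'b Cow<'a, str>) -> &'b str {
    &**cow // goes through Deref::deref(&'b self) -> &'b str
}

// Hypothetical helper: only the Borrowed variant can hand back &'a str.
// For Owned there is no borrow with lifetime 'a to give out.
fn via_match<'a>(cow: &Cow<'a, str>) -> Option<&'a str> {
    match cow {
        Cow::Borrowed(s) => Some(*s),
        Cow::Owned(_) => None,
    }
}
```

With via_match, the returned &str can outlive the Cow it came from, as long as the original data is still alive. With via_deref it cannot.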
Thanks for the extra insight. That helps unravel the puzzle a bit more for me. I have always been confused about Cow 'static str vs Cow String and what the natural thing to use is, but this comment uncovers some related characteristics that start to explain the differences.
This made me wonder how we’d implement the same thing in Verona. I believe a Cow[T] would just be a wrapper around a T & (imm | iso), i.e. a thing that is either isolated or immutable. A T & iso in Verona is the entry point to a region, an isolated (possibly cyclic) object graph where every live object is reachable from the entry point. A T & imm is an immutable object graph constructed by freezing a region identified by a T & iso. The type would have two accessors, one that returned a T & mut (i.e. a mutable view of an object in a region owned by the Cow[T]), the other that returned a T & readonly (i.e. a read-only view of a thing that may or may not be immutable but definitely can’t be mutated via this view). If the Cow[T] currently holds a T & imm and you ask for a T & mut, it would clone the object, otherwise it returns a reference to whatever it holds. The clone implementation would freeze the region if it currently holds a T & iso. This means that you’d get shallow CoW for free: any object that contained a Cow[T] field would get a shallow clone that shared the immutable object pointed to by the field.
Unlike Rust, this will be possible to implement entirely in type-safe Verona; it doesn’t require an unsafe escape hatch.
Eh? Where is unsafe used in Rust’s Cow implementation?
Looking at the code, it doesn’t seem to. I assumed it used [A]RC under the hood. I can’t see from the implementation how it handles deallocation when multiple threads are accessing the same underlying immutable object, but the trait itself doesn’t seem to be Sync or Send, so maybe it just doesn’t support multiple threads?
No, it doesn’t use reference counting. Cow implements Send and Sync.
The key here is that Cow::to_mut requires a mutable borrow. So if you’re using a Cow from multiple threads simultaneously, the compiler will force you to do your own synchronization to call Cow::to_mut.
Ah, I see, that makes sense. How does deallocation work if multiple Cow objects are sharing access to the same immutable object?
I think this is less about Cow and just more about how Rust works. If a bunch of Cow values are sharing access to &T, then dropping any or all of them does nothing to deallocate &T because &T is just a borrow. What matters is when T gets dropped.
So in consumers of Cow, who owns the data? If I create a Cow[T] and initialise it with a T, who is responsible for deallocating it? If I clone the Cow[T] a few times and they’re all sharing the same object, is one the canonical owner? Do I need to statically (with the aid of the borrow checker) ensure that the clones don’t outlive the original? In the Verona version that I outlined, I wouldn’t have to think about any of these things.
To be honest, I think you should go write some Rust code if you want to compare other languages to it. The questions you’re asking are somewhat difficult to answer as asked, because there is some tangled misunderstanding of Rust lurking beneath them. You also seem to be trying to make a comparison with another language, but clearly, neither of us knows both. And you’ve kinda already got the Rust side of things severely wrong…
If the Cow owns the data, then dropping the Cow drops the data. You don’t have to think about it. If you clone a Cow that owns its data, then you clone the data too. There is no automatic reference counting going on.
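A rough sketch of what that means in practice, assuming nothing beyond std::borrow::Cow. The names same_buffer and clone_and_to_mut_demo are made up for illustration. It checks three things mentioned in this exchange: cloning a borrowed Cow just copies the reference, cloning an owned Cow copies the data, and to_mut needs a mutable borrow of the Cow and leaves the originally borrowed data alone.

```rust
use std::borrow::Cow;

// Hypothetical helper: true if both Cows point at the same bytes in memory.
fn same_buffer(a: &Cow<str>, b: &Cow<str>) -> bool {
    a.as_ptr() == b.as_ptr()
}

// Hypothetical demo; returns (shares_borrow, owned_copies, became_owned).
fn clone_and_to_mut_demo() -> (bool, bool, bool) {
    let text = String::from("moo");

    // Cloning a borrowed Cow copies only the reference: both still point at `text`.
    let borrowed: Cow<str> = Cow::Borrowed(&text);
    let borrowed_clone = borrowed.clone();
    let shares_borrow = same_buffer(&borrowed, &borrowed_clone);

    // Cloning an owned Cow clones the String, so the two buffers differ.
    let owned: Cow<str> = Cow::Owned(String::from("moo"));
    let owned_clone = owned.clone();
    let owned_copies = !same_buffer(&owned, &owned_clone);

    // to_mut takes &mut self. On a borrowed Cow it first clones into an
    // owned String, so `text` itself is never modified.
    let mut writable = borrowed.clone();
    writable.to_mut().push('!');
    let became_owned = matches!(writable, Cow::Owned(_)) && text == "moo";

    (shares_borrow, owned_copies, became_owned)
}
```

Each owned Cow frees its own String when it goes out of scope; the borrowed ones free nothing, and `text` is freed when `text` itself is dropped.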
I think I understand. I assumed Cow was a lazy copy-on-write, but it sounds as if it’s actually an eager copy on clone that can give you a mutable copy if required. This means that you can’t use Cow fields to do lazy deep copies when you mutate a part of a deep object tree.
I think “cannot” is perhaps too strong of a word, but I think the essence of what you’re saying is probably right. The lifetime would need to be dealt with somehow. It should be possible to get what you’re asking for, it just might take a bit of extra work, e.g., something like Cow<Rc<T>> along with Rc::make_mut.
Are Cows only useful for performance reasons? Or are there ergonomic reasons to use them?
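For the Rc::make_mut suggestion in the reply above, here is a minimal sketch of the lazy part only. It does not wrap anything in Cow, and the Doc struct and lazy_copy_demo are hypothetical names chosen for illustration. Cloning the struct only bumps reference counts; the data behind a field is copied the first time that field is written through Rc::make_mut while it is still shared.

```rust
use std::rc::Rc;

// Hypothetical two-field "tree"; cloning it only bumps two reference counts.
#[derive(Clone)]
struct Doc {
    title: Rc<String>,
    body: Rc<Vec<String>>,
}

// Hypothetical demo; returns (original title, edited title, body still shared).
fn lazy_copy_demo() -> (String, String, bool) {
    let original = Doc {
        title: Rc::new(String::from("draft")),
        body: Rc::new(vec![String::from("line one")]),
    };
    let mut edited = original.clone(); // no String or Vec data copied yet

    // `title` is shared, so make_mut clones just that String before mutating.
    Rc::make_mut(&mut edited.title).push_str(" v2");

    // `body` was never written to, so both Docs still share one allocation.
    let body_shared = Rc::ptr_eq(&original.body, &edited.body);

    (
        original.title.as_str().to_owned(),
        edited.title.as_str().to_owned(),
        body_shared,
    )
}
```

Unlike Cow, this does rely on reference counting, which is what lets the copy be deferred until the write.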
Cow is entirely for performance.
To expound on this point slightly, there is no logical difference between a program using a Cow and a program that simply copies everything. Cows simply allow a convenient interface for eliding copies when they are semantically unnecessary.
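To make that concrete, here is a small sketch with two made-up functions, expand_tabs and expand_tabs_always_copy. Both produce the same text for every input; the Cow version just skips the allocation when the input has nothing to replace.

```rust
use std::borrow::Cow;

// Hypothetical example: replace each tab with a space, copying only if needed.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if input.contains('\t') {
        // A change is required, so allocate a new String.
        Cow::Owned(input.replace('\t', " "))
    } else {
        // Nothing to change: hand the caller's data back without copying.
        Cow::Borrowed(input)
    }
}

// The "simply copies everything" version: same result, always allocates.
fn expand_tabs_always_copy(input: &str) -> String {
    input.replace('\t', " ")
}
```

A caller that only reads the result can treat both return values as a &str and will not see a difference apart from the saved copy.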