RAII is far from perfect… here are a few complaints:
Drop without error checking is also wrong by default. It may not be a big issue for closing files, but in the truly general case of external resources to-be-freed, you definitely need to handle errors. Consider if you wanted to use RAII for cloud resources, for which you need to use CRUD APIs to create/destroy. If you fail to destroy a resource, you need to remember that so that you can try again or alert.
High-performance resource management utilizes arenas and other bulk acquisition and release patterns. When you bundle a destructor/dispose/drop/whatever into some structure, you have to bend over backwards to decouple it later if you wish to avoid the overhead of piecemeal freeing as things go out of scope.
The Unix file API is a terrible example. Entire categories of usage of that API amount to open/write/close and could trivially be replaced by a bulk-write API that is correct-by-default regardless of whether it uses RAII internally. But beyond the common, trivial bulk case, most consumers of the file API actually care about file paths, not file descriptors. Using a descriptor is essentially an optimization to avoid path resolution. Considering the overhead of kernel calls, file system access, etc, this optimization is rarely valuable & can be minimized with a simple TTL cache on the kernel side. Unlike a descriptor-based API, a path-based API doesn’t need a close operation at all – save for some cases where files are being abused as locks or other abstractions that would be better served by their own interfaces.
It encourages bad design in which too many types get tangled up with resource management. To be fair, this is still a cultural problem in Go. I see a lot of func NewFoo() (*Foo, error) which then of course is followed by an error check and potentially a .Close() call. Much more often than not, Foo has no need to manage its resources and could instead have those passed in: foo := &Foo{SomeService: svc} and now you never need to init or cleanup the Foo, nor check any initialization errors. I’ve worked on several services where I have systematically made this change and the result was a substantial reduction in code, ultimately centralizing all resource acquisition and release into essentially one main place where it’s pretty obvious whether or not cleanup is happening.
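To make the shape of that refactor concrete, here is a minimal Go sketch of the "pass the dependency in" style described above (the Foo and SomeService types are purely illustrative, not taken from any real codebase):

package main

import "fmt"

// SomeService stands in for a dependency acquired and cleaned up elsewhere (e.g. once, in main).
type SomeService struct{ name string }

// Foo holds only what it is given: it owns no resources, so there is no
// constructor that can fail and nothing to Close.
type Foo struct {
	SomeService *SomeService
}

func (f *Foo) Greet() { fmt.Println("using", f.SomeService.name) }

func main() {
	svc := &SomeService{name: "svc"} // acquisition centralized here
	foo := &Foo{SomeService: svc}    // no init error to check, nothing to clean up
	foo.Greet()
}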
This is super informative, thanks! Probably worth it to turn this comment into a post of its own.
The error handling question for RAII is a very good point! This is honestly where I’m pretty glad with Python’s exception story (is there a bad error? Just blow up! And there’s a good-enough error handling story that you can wrap up your top level to alert nicely). As the code writer, you have no excuse not to at least throw an exception if there’s an issue that the user really needs to handle.
I’ll quibble with 2 though. I don’t think RAII and arenas conflict too much? So many libraries are actually handles to managed memory elsewhere, so you don’t have to release memory the instant you destruct your object if you don’t want to. Classically, reference counted references could just decrement a number by 1! I think there’s a lot of case-by-case analysis here but I feel like common patterns don’t conflict with RAII that much?
EDIT: sorry, I guess your point was more about decoupling entirely. I know there are libs that parametrize by allocator, but maybe you meant something even a bit more general
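As a rough sketch of the bulk-release idea from point 2 (the pool type and names below are hypothetical, not from any particular library), resources can be registered with a pool and released together in one place instead of piecemeal as each value goes out of scope:

package main

import (
	"fmt"
	"io"
	"os"
)

// pool collects resources so they can be released in bulk, in one obvious place.
type pool struct {
	closers []io.Closer
}

func (p *pool) track(c io.Closer) { p.closers = append(p.closers, c) }

// releaseAll frees everything in reverse acquisition order and reports
// errors instead of silently dropping them.
func (p *pool) releaseAll() {
	for i := len(p.closers) - 1; i >= 0; i-- {
		if err := p.closers[i].Close(); err != nil {
			fmt.Fprintln(os.Stderr, "cleanup failed:", err)
		}
	}
	p.closers = nil
}

func main() {
	p := &pool{}
	defer p.releaseAll()

	for _, name := range []string{"a.txt", "b.txt"} {
		f, err := os.Create(name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			return
		}
		p.track(f) // no per-file defer; the pool owns cleanup
	}
}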
from open(2):
A careful programmer will check the return value of close(), since it is quite possible that errors on a previous write(2) operation are reported only on the final close() that releases the open file description. Failing to check the return value when closing a file may lead to silent loss of data. This can especially be observed with NFS and with disk quota.
so it’s actually quite important to handle errors from close!
The reason for this is that (AFAIK) write is just a request to do an actual write at some later time. This lets Linux coalesce writes/reschedule them without making your code block. As I understand it this is important for performance (and OSs have a long history of lying to applications about when data is written). A path-based API without close would make it difficult to add these kinds of optimizations.
My comment about close not being a big issue is with respect to disposing of the file descriptor resource. The actual contents of the file is another matter entirely.
Failing to check the return value when closing a file may lead to silent loss of data.
Is that still true if you call fsync first?
A path-based API without close would make it difficult to add these kinds of optimizations.
Again, I think fsync is relevant. The errors returned from write calls (to either paths or file descriptors) are about things like whether or not you have access to a file that actually exists, not whether or not the transfer to disk was successful.
This is also related to a more general set of problems with distributed systems (which includes kernel vs userland) that can be addressed with something like Promise Pipelining.
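A small Go sketch of what this subthread amounts to in practice: surface errors from both Sync and Close instead of dropping them (the writeFile helper is only for illustration):

package main

import (
	"log"
	"os"
)

// writeFile makes the error paths discussed above explicit: write errors,
// errors surfaced by Sync (flush to stable storage), and errors reported
// only at Close are all propagated to the caller.
func writeFile(path string, data []byte) (err error) {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer func() {
		// Close can report errors from earlier buffered writes; don't drop it.
		if cerr := f.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	if _, err = f.Write(data); err != nil {
		return err
	}
	// Sync asks the OS to flush to disk, so failures show up here rather
	// than being deferred to Close (or lost entirely).
	return f.Sync()
}

func main() {
	if err := writeFile("example.txt", []byte("hello\n")); err != nil {
		log.Fatal(err)
	}
}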
Note that in the context of comparison with defer, it doesn’t improve the defaults that much. The common pattern of defer file.close() doesn’t handle errors either. You’d need to manually set up a bit of shared mutable state to replace the outer return value in the defer callback before the outer function returns. OTOH you could throw from a destructor by default.
I disagree about “bend over backwards”, because destructors are called automatically, so you have no extra code to refactor. It’s even less work than finding and changing all relevant defers that were releasing resources piecemeal. When resources are owned by a pool, its use looks like your fourth point, and the pool can release them in bulk.
3/4 are general API/architecture concerns about resource management, which may be valid, but not really specific to RAII vs defer, which from perspective of these issues are just an implementation detail.
I disagree about “bend over backwards”, because destructors are called automatically, so you have no extra code to refactor
You’re assuming you control the library that provides the RAII-based resource. Forget refactoring, thinking only about the initial code being written: If you don’t control the resource providing library, you need to do something unsavory in order to prevent deconstructors from running.
Rust has a ManuallyDrop type wrapper if you need it. It prevents destructors from running on any type, without changing it.
Additionally, using types via references never runs destructors when the reference goes out of scope, so if you refactor T to be &T coming from a pool, it just works.
In Zig you have to handle errors or explicitly discard them, in defer and everywhere else.
I believe it is also the point at which permissions are validated.
Not quite sure I understand what you mean… Something like Ruby IO.write()? https://ruby-doc.org/core-3.1.2/IO.html#method-c-write
I think that’s right, though the same “essentially an optimization” comment applies, although the TTL cache solution is less applicable.
Yeah, pretty much.
Fully agree. I also think it’s important to note that RAII is strictly more powerful than most other models in that they can be implemented in RAII. Some years ago I made this point for implementing defer in Rust: https://code.tvl.fyi/about/fun/defer_rs/README.md
What other models are you talking about? Off the top of my head, linear types are more powerful; with-macros (cf common lisp) are orthogonal; and unconstrained resource management strategies are also more powerful.
How is that more “powerful” in OP’s sense? Can you implement RAII within the semantics of such a language?
I should perhaps have said ‘expressive’. There are programs you can write using such semantics that you cannot write using RAII.
That’s interesting, but it wouldn’t work in Go because of garbage collection. You could have a magic finalizer helper, but you wouldn’t be able to guarantee it runs at the end of a scope. For a language with explicit lifetimes though, it’s a great idea.
Lua (which has GC) has the concept of a “to-be-closed”. If you do:
local blah <close> = ...
That variable will be reclaimed when it goes out of scope right then and there (no need to wait for GC). Also, if an object has a __close method, it will be called at that time.
Sounds like the Python with statement.
It doesn’t have to be such a clear-cut distinction. C# is a GC’d language but also has a using keyword for classes that implement IDisposable, which runs their Dispose method at the end of a lexical scope. This can be used to implement RAII and to manage the lifetimes of other resources.
What do you do when you want the thing to live longer? For the Rust case, you just don’t let the variable drop. For Go, you can deliberately not defer a close/unlock. What do you do in C#?
Hold onto a reference, don’t use using.
Ah. Seems a lot like with in Python.
Basically except I believe C# will eventually run Dispose if you don’t do it explicitly unlike Python. I can’t find evidence of when C# introduced using but IDisposable has been there since 1.0 in 2002 while Python introduced with since 2.5 in 2005.
Python explicitly copied several features from C#. I wouldn’t be surprised if with was inspired by using.
That’s a nice approach. Wish more languages would do something like this.
It still suffers from the point in the article where you don’t know who held a reference to your closeable thing and it’s not always super clear what is IDisposable in the tooling. I think VS makes you run the code analytics (whatever the full code scan is called) to see them.
Has anyone written a language where stack / manual allocation is the default but GC’d allocations are there if you want them?
It seems mainstream programming jumped to GC-all-the-things back in the 90s with Java/C# in response to the endless problems commercial outfits had with C/C++, but they seem to have thrown the proverbial baby out with the bathwater in the process. RAII is fantastic & it wasn’t until Rust came along & nicked it from C++ that anyone else really sat up & took notice.
D?
Ruby does this right “open do..”. Python does it right, “with … do”. Any reasonably wrapped smart resource will catch this at compile time; we used to do this for memory and files in the ’80s. ‘defer’ is just a way to make mistakes.
Don’t these still require you to opt-in to the correct behavior? You can just forget to use those constructs too.
open is the classic example where not using with will still allow you to do your work. But most libs will do with thing() as actual_object: ... . If you do actual_object = thing() then you’re not getting the object you want but just a manager object.
The way to get around using with is doing actual_object = thing().__enter__(), which is a bit unwieldy.
Ruby does it right…
Whatever route you may take out of the block… there’s an “ensure” section inside the open that will catch it and close the file.
This is not “right by default” though. I’ve reviewed (and requested changes to) plenty of Ruby code that uses file = File.open(path) because the choice of using the block API is opt-in. If Ruby was “right by default” no one would have spent cycles on the issue at all.
True, but that usage does have a number of valid use cases.
RuboCop, thanks to @bbatsov, is your friend. https://www.rubydoc.info/gems/rubocop/RuboCop/Cop/Style/AutoResourceCleanup
RAII style is similar too, just without an additional nesting/indentation, unlike python or ruby.
I do C++ and Ruby… There are many places where GC is just soooo much easier.
C/C++ is, as Rich Hickey describes it, “Place Oriented Programming”… so much of your work and code is around specifying where something is going to be stored and when you can reuse it, and so much of that vanishes in a GC’d language like Ruby.
However Ruby’s yield and ensure paradigm is so neat when you need it.
What happens if I store that file variable in a global variable? Can I still try and read/write to it, resulting in runtime errors? It’s certainly more right than an explicit open/close pair.
You don’t have to use the block syntax like in the example above. You can simply do File.new("foo") to open the file and store that into a variable if you so desire.
Sort of what you have with stdin, stdout, stderr. And I use that for my logging output.
But too many of those and you hit a “too many file descriptors open” error.
Go does let you register a callback for when a value’s memory is collected. See runtime.SetFinalizer. It’s used in the standard library to automatically close files if the garbage collector collects the os.File structure. I also use it in my WebSocket library to close the connection if the websocket.Conn structure is collected.
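For reference, a minimal sketch of the runtime.SetFinalizer mechanism mentioned above (the wrappedFile type is hypothetical, and, as the next comment explains, finalizers are only suitable as a last-resort cleanup):

package main

import (
	"fmt"
	"os"
	"runtime"
)

// wrappedFile closes its descriptor in a finalizer if the owner forgets to.
type wrappedFile struct {
	f *os.File
}

func openWrapped(path string) (*wrappedFile, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	w := &wrappedFile{f: f}
	// Last-resort cleanup: runs at some point after w becomes unreachable,
	// whenever the GC gets around to it, not at the end of a scope.
	runtime.SetFinalizer(w, func(w *wrappedFile) {
		fmt.Println("finalizer closing", w.f.Name())
		w.f.Close()
	})
	return w, nil
}

func main() {
	if _, err := openWrapped("/etc/hosts"); err != nil {
		fmt.Println(err)
		return
	}
	runtime.GC() // the finalizer may run after this, but there is no guarantee
}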
The problem with finaliser-based cleanup is that it relies on GC being run promptly. You can’t use it for things like lock release because it will run an unbounded amount of time in the future. It’s dangerous to use it for file descriptor cleanup because you can rapidly accept a lot of connections and then drop the last reference to the file descriptor object, but if the GC doesn’t catch up fast enough then you don’t close the file descriptors and they will hit OS-defined limits in the worst case (and consume a lot of kernel resources in the best case). The Java documentation explicitly tells you not to use finalisers for file descriptor cleanup except as a last resort (i.e. if you’ve accidentally forgotten to close an open descriptor) for precisely this reason.
Fully agreed, it’s not something I’d recommend using as your primary cleanup mechanism either. Just wanted to put it out there as it’s not a well known feature.
I’ve always wondered why a GC’d language couldn’t know about a DeterministicRelease interface that users implement to switch to a classic ref-counting scheme for values of that type. There’s no need to force every value to be GC’d, or even GC’d the same way, especially when compiling it (be that AOT or JIT).
What would happen if you have a GC’d container of refcounted objects, like say an ArrayList<File>? When the ArrayList becomes unreachable, it won’t decrement the refcount on the Files right away, so the Files wouldn’t be released deterministically.
Maybe the compiler should specialize ArrayList<File> so that the whole thing is refcounted?
You’d have to propagate the marker, so ArrayList: DeterministicRelease if T: DeterministicRelease. That can be annoying and one of the reasons why rust doesn’t implement optional fully linear types (types which need to be explicitly dropped).
Which is part of the answer. Your type system would need to understand that any object that had a DeterministicRelease field was, itself, a DeterministicRelease type. This gets fun with generics, because a Foo[T] is either DeterministicRelease or not depending on whether T is, but only if T is used as a field.
This gets more difficult once you have any kind of structural typing. If I have an interface (or union type) I then this must correctly propagate the DeterministicRelease attribute such that cleanups are run if a variable of type I goes out of scope, if it holds a reference to an object that is DeterministicReleased, but not otherwise.
All of this is solvable but it touches a lot of the type system.
Sure. The task is quite clear, the devil is in some details. FWIW, there’s a nice article about what such a retrofit would mean for Rust. https://gankra.github.io/blah/linear-rust/
Early Rust was going to be this way, IIRC.
Go’s and Zig’s defer are rather different beasts
Go runs deferred statements at the end of the function, Zig at the end of scope.
Want to lock a mutex inside a loop? Can’t use Go defer for that.
destructors can’t take arguments or return values
While most destructions only release acquired resources, passing an argument to a deferred call can be very useful in many cases
hidden code
all defer code is visible in the scope. Look for all lines starting with defer in the current scope and you have all the calls.
Looking for destructors means looking at how drop is implemented for all the types in the scope.
There are multiple points here I disagree with:
Go’s and Zig’s defer are rather different beasts. Go runs deferred statements at the end of the function, Zig at the end of scope. Want to lock a mutex inside a loop? Can’t use Go defer for that.
This distinction doesn’t really matter in a language with first-class lambdas. If you want to unlock a mutex at the end of a loop iteration with Go, create and call a lambda in the loop that uses defer internally.
destructors can’t take arguments or return values
But constructors can. If you implement a Defer class to use RAII, it takes a lambda in the constructor and calls it in the destructor.
hidden code all defer code is visible in the scope
I’m not sure I buy that argument, given that the code in defer is almost always calling another function. The code inside the constructor for the object whose cleanup you are defering is also not visible in the calling function.
hidden code all defer code is visible in the scope
I’m not sure I buy that argument, given that the code in defer is almost always calling another function. The code inside the constructor for the object whose cleanup you are defering is also not visible in the calling function.
The point is that as a reader of zig, you can look at the function and see all the code which can be executed. You can see the call and breakpoint that line. As a reader of c++, it’s a bit more convoluted to breakpoint on destructors.
As someone that works daily with several-hundred-line functions, that sounds like a con way more than a pro.
This can work sometimes, but other times packing pointers in a struct just so you can drop it later is wasteful. This happens a lot with for example the Vulkan API where a lot of the vkDestroy* functions take multiple arguments. I’m a big fan of RAII but it’s not strictly better.
At least in C++, most of this all goes away after inlining. First the constructor and destructor are both inlined in the enclosing scope. This turns the capture of the arguments in the constructor into local assignments in a structure in the current stack frame. Then scalar replacement of aggregates runs and splits the structure into individual allocas in the first phase and then into SSA values in the second. At this point, the ‘captured’ values are just propagated directly into the code from the destructor.
If you want to unlock a mutex at the end of a loop iteration with Go, create and call a lambda in the loop that uses defer internally.
Note that Go uses function scope for defer. So this will actually acquire locks slowly then release them all at the end of function. This is very likely not what you want and can even risk deadlocks.
Is a lambda not a function in Go? I wouldn’t expect defer in a lambda to release the lock at the end of the enclosing scope, because what happens if the lambda outlives the function?
Sorry, I misread what you said. I was thinking defer func() { ... }() not func() { defer ... }().
Sorry, I should have put some code - it’s much clearer what I meant from your post.
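To spell out the difference being discussed, a throwaway Go sketch: wrapping the loop body in a function literal gives defer per-iteration scope, whereas a bare defer in the loop body would only run when the enclosing function returns:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var mu sync.Mutex
	items := []string{"a", "b", "c"}

	// Wrap the loop body in a function literal so defer runs at the end of
	// each iteration rather than at the end of main.
	for _, it := range items {
		func() {
			mu.Lock()
			defer mu.Unlock() // released when this anonymous function returns
			fmt.Println("processing", it)
		}()
	}

	// By contrast, `defer mu.Unlock()` written directly in the loop body of
	// main would only run when main returns, holding the lock across
	// iterations (and deadlocking on the second Lock).
}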
The first point is minor, and not really changing the overall picture of leaking by default.
Destruction with arguments is sometimes useful indeed, but there are workarounds. Sometimes you can take arguments when constructing the object. In the worst case you can require an explicit function call to drop with arguments (just like defer does), but still use the default drop to either catch bugs (log or panic when the right drop has been forgotten) or provide a sensible default, e.g. delete a temporary file if temp_file.keep() hasn’t been called.
Automatic drop code is indeed implicit and can’t be grepped for, but you have to consider the trade-off: a forgotten defer is also invisible and can’t be grepped for either. This is the change in default: by default there may be drop code you may not be aware of, instead of by default there may be a leak you may not be aware of.
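The temp_file.keep() idea could look roughly like this with Go's defer (all names here are hypothetical; the "default drop" is a deferred Cleanup that deletes the file unless Keep was called):

package main

import (
	"fmt"
	"os"
)

// tempFile removes itself on Cleanup unless Keep() was called explicitly.
type tempFile struct {
	*os.File
	kept bool
}

func newTempFile() (*tempFile, error) {
	f, err := os.CreateTemp("", "example-*")
	if err != nil {
		return nil, err
	}
	return &tempFile{File: f}, nil
}

func (t *tempFile) Keep() { t.kept = true }

// Cleanup is the sensible default: close, then delete unless kept.
func (t *tempFile) Cleanup() {
	t.Close()
	if !t.kept {
		os.Remove(t.Name())
	}
}

func main() {
	t, err := newTempFile()
	if err != nil {
		fmt.Println(err)
		return
	}
	defer t.Cleanup()

	fmt.Fprintln(t, "scratch data")
	// Without t.Keep(), the file is removed when Cleanup runs.
}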
destructors can’t take arguments or return values. While most destructions only release acquired resources, passing an argument to a deferred call can be very useful in many cases.
Yes, more than useful:
Zero-cost abstraction in terms of state: A deferred call doesn’t artificially require objects to contain all state needed by their destructors. State is generally bad, especially references, and especially long lived objects that secretly know about each other.
Dependencies are better when they are explicit: If one function needs to run before another, letting it show (in terms of what arguments they require) is a good thing: It makes wrong code look wrong (yes, destruction order is a common problem in C++) and prevents it from compiling if you have lifetimes like Rust.
Expressiveness: In the harsh reality we live in, destructors can fail.
I think the right solution is explicit destructors: Instead of the compiler inserting invisible destructor calls, compilation fails if you don’t write them yourself. This would be a natural extension to an explicit language like C – it would only add safety. Not only that: It fits well with defer too – syntactic sugar doesn’t matter, because it just solves the «wrong default» problem. But more than anything, I think it would shine in a language with lifetimes, like Rust, where long lived references are precisely what you don’t want to mess with.
You could run an anonymous function within a loop in Go, just to get the per-loop defer. Returning a value in a defer is also possible.
What if there was a way to opt-in to RAII? i.e., just like you can tell C++ to infer a variable’s type from the value assigned, you could tell Zig to “infer” the deferred cleanup to be run. The cleanup would then be written elsewhere in a destructor.
What do people think? Would this be convenient? Or would it produce clashing coding styles—namely, to a greater extent than systems programming languages already allow and/or encourage?
For me, the point of the article was that if it’s opt-in, it’s “wrong by default”. Leaving that aside, I feel like it’s a bad idea generally to have different “modes” for programming languages. If I’m looking at some piece of code, I have to remember what mode it works in to know if it’s correct; I have to be much more careful in copying chunks of code around, as well (sure, “shouldn’t do that”, but it happens).
I have to be much more careful in copying chunks of code around…
I mean, whether you’re using RAII or defer, any relevant block of code that isn’t purposefully obtuse will be contiguous. So I don’t think refactoring or moving code around should be an issue.
For me, the point of the article was that if it’s opt-in, it’s “wrong by default”.
As far as I can see, the whole point of system programming languages is that almost everything is opt-in. Rust is basically the sole exception. So I’m not sure I buy that this principle is generally applicable, at least not to the extent you’re agreeing with the article that, say, C++ does more than Zig to protect programmers from themselves.
What do people think? Would this be convenient? Or would it produce clashing coding styles—namely, to a greater extent than systems programming languages already allow and/or encourage?
I feel like it’s a bad idea generally to have different “modes” for programming languages.
I personally try to avoid sweeping statements about whether a feature is subjectively good or bad. I’m in more of a utilitarian camp. Does a feature generally help or hurt in practice? Perhaps I should have stated that more explicitly in my initial question.
I mean, whether you’re using RAII or defer, any relevant block of code that isn’t purposefully obtuse will be contiguous. So I don’t think refactoring or moving code around should be an issue.
What I meant was, you need to be careful about moving code that is written for one mode into a context where the other mode is active.
… at least not to the extent you’re agreeing with the article that, say, C++ does more than Zig to protect programmers from themselves.
I didn’t say that I agreed with the article (but I also don’t think the article makes that claim).
RAII comes with its own set of interesting footguns. Not to say that it’s a bad feature, but it’s not perfect. Languages that don’t employ RAII have a right to exist, and not just in the name of variety.
That particular example is not a problem with RAII though, it is specific to the API of shared_ptr and C++’s flexible evaluation order.
This should be fixed in C++17, at least partially. TL;DR – see the “The Changes” section. https://www.cppstories.com/2021/evaluation-order-cpp17/
As that article points out, this is solved in C++11 with std::make_shared: any raw construction of std::shared_ptr is code smell. This kind of footgun is not really intrinsic to RAII, but to the way that C++ reports errors from constructors: the only mechanism is via an exception, which means that anything constructing an object as an argument needs to be excitingly exception safe. The usual fix for this is to have factory methods that validate that an object can be constructed from the arguments and return an option type.
The more subtle footgun with RAII is that it requires the result to be bound to a variable that is not necessarily used. In C++, you can write something like:
{
    std::lock_guard(mutex);
    // Some stuff that expects the lock to be held
} // Expect the lock to be released here.
Only, because you didn’t write the first line as std::lock_guard g{mutex} you’ve locked and unlocked the mutex in the same statement and now the lock isn’t held. I think the [[nodiscard]] attribute on the constructor can force a warning here but I’ve seen this bug in real-world code a couple of times.
The root cause here is that RAII isn’t a language feature, it’s a design pattern. The problem with a defer statement is that it separates the initialisation and cleanup steps. A language that had RAII as a language feature would want something that allowed a function to return an object that is not destroyed until the end of the parent scope even if it is not bound to a variable.
I completely agree this is a downside of defer, but I do think there are a few (contingent on the task in hand) reasons to prefer it over RAII. One thing that’s pretty nice about Zig code is that it’s really easy to reason about its performance due to things being pretty explicit, and RAII does erode this slightly as there is (if I understand correctly) potentially arbitrary amounts of code being executed when exiting some scope, so you have to read a lot more code before fully understanding the performance of a leaf function. I admit this is not really a real issue for most programmers most of the time though. Another more common case where defer is really nice is when calling C libraries, which there are of course a huge number of. defer in Zig lets you use those libraries and have fairly nice resource clean-up happening without needing to build and maintain any wrappers. RAII also seems like it pushes you down the exceptions route for handling errors during initialisation and clean-up, and exceptions certainly have their downsides.
It would be interesting to have a language with “explicit RAII”. You can define a destructor, but instead of calling it for you the compiler just complains if you forget to call it. This way it would be visible in your code (likely with a defer statement) but hard to forget.
when calling C libraries
When interfacing with a C library in C++ or Rust one of the first things I tend to do is make small wrappers for resources that implement RAII for them. I find they pay off very quickly in terms of productivity and correctness. Even if I don’t wrap much else this is very useful.
Heyyy, similar idea! See above.
I read that comment differently. I read that comment as code such as:
autoclose File f = open(...)
I am talking about “regular” RAII as seen in C++ or Rust except that it fails to compile if you forget it. It isn’t opt-in because it is required, but the caller still needs to write cleanup code (well, call the cleanup function). For example:
f = open(...)
f.write("foo")
// Error here: "f implements Drop and must be cleaned up."
Of course you could combine these ideas because something like an autoclose attribute would manage the cleanup and prevent the error from firing.
Good post. In Dawn, this will be solved by linear types. In order to “drop” a value that requires cleanup, you must call the appropriate cleanup term. No additional compiler magic needed, beyond the type system.
How would you notice an error returned by the close() syscall for a file?
Exceptions, maybe?
In Rust?
Panic, probably.
Error handling in drop is problematic indeed. Some libraries provide close(self) -> Result for handling this the hard way when you really care about the result.
std::fs::File chooses to ignore those errors. https://doc.rust-lang.org/stable/std/fs/struct.File.html
Ah, but importantly, it gives you the option to explicitly handle them and also explicitly documents what happens in the default case: https://doc.rust-lang.org/stable/std/fs/struct.File.html#method.sync_all
To be honest how would you like to handle that situation in your program? Terminate it completely? Retry closing? What if you can’t close the file at all? This is one of those scenarios where error handling isn’t obvious.
I agree that there is no obvious right answer. But hiding the error is obviously wrong.
Seeing “just remember” in documentation is an easy watch phrase to note when language devs forgot something.
I personally want a language that uses defer, but issues compiler warnings when you leak a value by accident. Basically a hybrid of manual calls and enforced destructors.
On the other hand I want a language that uses RAII but has support for linear types, where you are forced by the type system to close it manually, otherwise it will fail to compile.
That’s sort of what I mean - defer can be used in conjunction with what you are proposing. defer is just a form of control flow, similar to if. The compiler can still enforce free in the same way.
The thing is that I understood your comment as “defer by default, warn if omitted”. What I am saying is “RAII by default, a type can explicitly opt out of RAII, and then compilation fails if the destruction is not explicitly handled”. Using a Rust-like API:
struct Foo;
// Explicitly opt-out from supporting dropping this value
impl !Drop for Foo {}
impl Foo {
    fn destroy(self) {
        std::mem::forget(self);
    }
}
// This will fail to compile, as `Foo` was not destructed and there is no `Drop` for `Foo`
// fn foo() {
//     let f = Foo;
// }
// This compiles fine as destruction of `f` was handled explicitly
fn bar() {
    let f = Foo;
    f.destroy();
}
I see - you want implicit by default and I want explicit by default, but the idea is the same.
Need more flags or a downvote.
I suggest a “yeah, well, that’s just like your opinion, man” flag.