It will allow recursive JSON-like data structures in Oil.
This is an odd justification for a GC, because if you only want JSON-like things (i.e. trees) you can represent this with reference counting. Non-atomic reference counting (which is all that you need since shells are single threaded) can handle this. Reference counting with autorelease pools[1] is a fairly nice programming model for single-threaded systems and has some very nice properties with respect to deterministic memory consumption.
[1] I’ve not seen these anywhere other than OpenStep / Cocoa, but they basically give you an operation that is ‘don’t decref this now, but do it later’, so the object remains valid until the end of the autorelease scope. This is really nice for interop with C things, because you can have temporary pointers floating around and as long as they’re gone by the end of the scope of the autorelease pool (which may be a few stack frames up), they remain valid. This addresses the f(g(), h()) problem for cases where f wants to take arguments that are not smart pointers.
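For anyone who hasn’t used them, here is a minimal single-threaded sketch of the idea in C++ (the metalanguage under discussion). RefCounted, AutoreleasePool, and the function names are made up for illustration; this is not OpenStep’s actual API, just the shape of it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical refcounted base class; counts are non-atomic since the
// whole thing is single-threaded.
struct RefCounted {
  int rc = 1;
  virtual ~RefCounted() = default;
};

inline void Retain(RefCounted* o) { o->rc++; }
inline void Release(RefCounted* o) {
  if (--o->rc == 0) delete o;
}

// An autorelease pool: "don't decref this now, do it later".  Objects added
// to the pool stay valid until the innermost live pool goes out of scope,
// which may be a few stack frames up from where the object was created.
class AutoreleasePool {
 public:
  AutoreleasePool() { current_ = this; }
  ~AutoreleasePool() {
    for (RefCounted* o : deferred_) Release(o);
    current_ = outer_;
  }
  static void Defer(RefCounted* o) {
    assert(current_ != nullptr && "no autorelease pool in scope");
    current_->deferred_.push_back(o);
  }

 private:
  std::vector<RefCounted*> deferred_;
  AutoreleasePool* outer_ = current_;  // captured before the ctor body runs
  static inline AutoreleasePool* current_ = nullptr;
};

// Return a pointer the caller does not own; it is safe to pass around as a
// plain pointer, e.g. f(g(), h()), until the enclosing pool is drained.
inline RefCounted* Autorelease(RefCounted* o) {
  AutoreleasePool::Defer(o);
  return o;
}
```

A caller would open an AutoreleasePool near the top of a loop or request, and anything Autorelease()d inside it stays alive until that pool is destroyed, which is where the deterministic-consumption property comes from: all the deferred decrefs happen at a known point.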
That’s occurred to me, but how do you limit your structures to trees when the mutator can do arbitrary computation?
You would need some kind of cycle detection algorithm, which means you need basically all the metadata that a tracing GC has
The problems are roughly equivalent, and I think there’s a fundamental reason that essentially no language has trees but not cyclic graphs
Also note the other justification is memory safety at the C++ metalanguage level. The C++ code does have a few cycles, with the mutually recursive evaluators. Right now, it would probably be OK if they leak, but that could change and I’d rather not have that in the back of my mind :)
That’s occurred to me, but how do you limit your structures to trees when the mutator can do arbitrary computation?
It depends on whether you have aliasing in the language. If you have a JSON-like data model, then you can’t express cycles because you have no way of naming an object that is contained in another object, everything is an explicit path.
You would need some kind of cycle detection algorithm, which means you need basically all the metadata that a tracing GC has
Not quite. You remove the need to track roots. Whenever an object’s reference count is decremented and the object is not deleted, you colour it in the header to mark it as possible garbage and add it to a queue. When an object’s reference count is incremented, you colour it to mark it as not-garbage. When the queue is full, you scan it and, for any objects that are possibly garbage, you do a graph walk and decrement reference counts of every object that you find but don’t delete them. If the object that you started with ends with a reference count of 0, then it is garbage and you free it and decref all objects reachable from it. If there are external references, the reference count will not be 0, but you don’t care where they are: they could be on the stack, in globals, in C memory, or anywhere else that you can stash roots.
If your data structures are mostly acyclic then this is very efficient. I’m curious how it would be for your use case.
There are also incremental variants of this scheme that rely on noting when you incref an object that is being scanned and causing the cycle detector to backtrack. There’s also a fully concurrent version developed by David F. Bacon, but I can only keep how it works in my head for about 20 minutes after I read the paper so I’m not even going to try to explain it here.
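For concreteness, here is a rough, single-threaded sketch of the synchronous (non-incremental) version of the scheme described above, which is essentially Bacon and Rajan’s trial-deletion algorithm. The Obj type, the colour fields, and the queue handling are hypothetical, not anything from Oil’s codebase.

```cpp
#include <cstdint>
#include <vector>

enum class Color : uint8_t { Black, Purple, Gray, White };

struct Obj {
  uint32_t rc = 0;
  Color color = Color::Black;
  bool buffered = false;         // currently sitting in the candidate queue?
  std::vector<Obj*> children;    // outgoing references
};

std::vector<Obj*> candidates;    // the "possible garbage" queue

void Incref(Obj* o) {
  o->rc++;
  o->color = Color::Black;       // definitely not garbage right now
}

void Decref(Obj* o) {
  if (--o->rc == 0) {            // ordinary, acyclic refcount death
    o->color = Color::Black;
    for (Obj* c : o->children) Decref(c);
    if (!o->buffered) delete o;  // queued objects are freed by the collector
  } else if (o->color != Color::Purple) {
    o->color = Color::Purple;    // still referenced: possibly a dead cycle
    if (!o->buffered) {
      o->buffered = true;
      candidates.push_back(o);   // run CollectCycles() when this gets full
    }
  }
}

// Trial deletion: walk the graph, subtracting internal references,
// but don't free anything yet.
void MarkGray(Obj* o) {
  if (o->color == Color::Gray) return;
  o->color = Color::Gray;
  for (Obj* c : o->children) { c->rc--; MarkGray(c); }
}

// An external reference was found: undo the trial deletion below this object.
void ScanBlack(Obj* o) {
  o->color = Color::Black;
  for (Obj* c : o->children) {
    c->rc++;
    if (c->color != Color::Black) ScanBlack(c);
  }
}

void Scan(Obj* o) {
  if (o->color != Color::Gray) return;
  if (o->rc > 0) {
    ScanBlack(o);                // reachable from outside: keep it
  } else {
    o->color = Color::White;     // only internal references: cyclic garbage
    for (Obj* c : o->children) Scan(c);
  }
}

void CollectWhite(Obj* o, std::vector<Obj*>* garbage) {
  if (o->color != Color::White || o->buffered) return;
  o->color = Color::Black;
  for (Obj* c : o->children) CollectWhite(c, garbage);
  garbage->push_back(o);
}

void CollectCycles() {
  // 1. Trial-delete from every candidate that is still possibly garbage.
  for (Obj*& o : candidates) {
    if (o->color == Color::Purple && o->rc > 0) {
      MarkGray(o);
    } else {                     // re-increfed, or already dead: drop it
      Obj* done = o;
      o = nullptr;
      done->buffered = false;
      if (done->color == Color::Black && done->rc == 0) delete done;
    }
  }
  // 2. Decide white (cyclic garbage) vs. black (externally reachable).
  for (Obj* o : candidates) {
    if (o) Scan(o);
  }
  // 3. Free the white objects; their internal references are already counted.
  std::vector<Obj*> garbage;
  for (Obj* o : candidates) {
    if (o) { o->buffered = false; CollectWhite(o, &garbage); }
  }
  candidates.clear();
  for (Obj* o : garbage) delete o;
}
```

Note the point above about roots: nothing here ever asks where the external references live. If the refcount is still positive after trial deletion, the object is kept, whether the extra reference is on the stack, in a global, or in C memory.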
The problems are roughly equivalent, and I think there’s a fundamental reason that essentially no language has trees but not cyclic graphs
Erlang springs to mind and it’s far from the only one. So does Rust, unless you use unsafe (explicitly or via a crate that uses unsafe internally).
Well, remember that in this case the language is C++, which has aliasing and cycles. The collector works on all C++ objects in the program, not just Oil objects. It’s basically the equivalent of having written the shell in Go and used its GC.
One downside of ref counting that I considered is that it has the “false write” problem, which I took pains to avoid with the separate mark bitmap. That is, the headers are immutable in our design, to promote page sharing with fork().
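To illustrate what avoiding false writes buys you, here is a tiny sketch of a side mark bitmap paired with an immutable header. The layout and names are hypothetical, not Oil’s actual design.

```cpp
#include <cstdint>
#include <vector>

// If the mark bit (or a refcount) lives in the object header, every
// collection writes to the pages holding the objects themselves, so the
// kernel has to copy them after fork().  With an immutable header and a
// side table, the marker only ever writes to the bitmap's pages.

struct ObjHeader {
  uint32_t type_tag;   // written once at allocation, never again
  uint32_t obj_id;     // dense index assigned at allocation
};

class MarkSet {
 public:
  void Resize(size_t num_objects) { bits_.assign((num_objects + 63) / 64, 0); }
  void Mark(const ObjHeader& h) { bits_[h.obj_id / 64] |= 1ull << (h.obj_id % 64); }
  bool IsMarked(const ObjHeader& h) const {
    return bits_[h.obj_id / 64] & (1ull << (h.obj_id % 64));
  }

 private:
  std::vector<uint64_t> bits_;   // the only memory the mark phase writes to
};
```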
The previous post has links about that: Ruby’s bitmap marking and a post about Instagram/Python/ref counting:
https://www.oilshell.org/blog/2022/10/garbage-collector.html
It doesn’t necessarily mean ref counting is worse … but yeah at this point I’m happy to be mostly past the design phase and more into optimization. As mentioned, it’s still spec-driven, so in theory it can be changed at any point, probably when a lot more people are using it and there are 10x demands in terms of heap size / perf.
The only thing that would prevent changes is if we expose some sort of shared library native interface (which bash actually has). But we don’t have that yet
Also I should repeat the point of the PyPy quote – any memory management scheme can be plugged into Oil
It’s a spec-driven project, which takes longer, but it means it’s decoupled from implementation details
It would actually be a great project for someone who wants to compare moving vs. non-moving, ref counting, etc. on a real workload
All the benchmarks and tests are already set up, and run in the CI on every commit. The codebase is tiny, with fast incremental builds.
The Cheney collector still builds – the main difference is that with Cheney you have to register all copies of a root, not just one copy, so they can be moved in addition to being traced. That’s in our “back pocket”, but I don’t think we’ll need it for the “main project”
Ref counting would be interesting as well
https://github.com/oilshell/oil/wiki/Oil-Native-Quick-Start