What are the benefits w.r.t. another high-performance GC? O(1) allocation and de-allocation are the norm.
Thank you for asking. This implementation provides short O(1) pause times, regardless of working-set size, and it also compacts. Maybe such implementations already exist, but I couldn’t find any when I searched while thinking about this project, so I thought this would be new. I’m happy to compare to another implementation if you could point me to one.
I see. Given the expensive read barrier, I think there are other ways to achieve similar characteristics with better trade-offs (I can find references tomorrow, but it is a bit late here, sorry). However, your design seems to make a lot of sense for a compacting database (the expensive read barrier can be factored into the cost of accessing external pages; storage overhead might be a problem though).
Small question: when the GC thread is active, what happens if the main thread fills the region before the GC can complete?
Thanks I’ll be happy to read them. Regarding your question, I use the term “region” in the code to refer to a range within the CB/ring-buffer. That region is allocated by the main thread for use by the GC thread, and then the main thread never touches it. If you’re referring to the CB/ring-buffer itself, as in “what happens if the main thread fills the CB before the GC completes?” then the answer is that the main thread will resize the CB upwards to the next power-of-2, copy all of the contents of the old CB (including the still-being-written GC consolidation region), and then when the main thread observes the GC to have completed, it will copy forward the data from the old CB’s consolidation region to the equivalent range in the new CB. In this way, the GC thread only knows about the CB that was given to it during the start of the GC/consolidation, and the main thread is responsible for the shift of the CB to a larger one.
Thanks! –Dan
Yeah, it looks like an exercise in writing such a GC. Other pauseless GCs have similar characteristics and are arguably cheaper in many cases. This implementation requires an on-the-side hash table for lookups and most likely requires a read barrier (to rewire references when objects move from zone A to B to C).
Thanks for taking a look! Yes it does require a read-barrier for the objects which have shifted locations. I’d be interested if you could link me to the best implementations you know of.
Reading what you did led me immediately to Azul’s Pauseless GC. Both do tons of work on the read side to ensure the GC thread can finish marking and compacting in a predictable way. In their case, I think they trap pointer reads to outdated pages to update the reference (in the compacting phase).
Edit: thinking through their marking phase and yours (from my understanding, your zone B is for marking). One thing I don’t quite understand is how you can make sure mutations in zone A have no pointers back to zone B / C objects, so that the marking phase can be done correctly without consulting zone A?
I cheat by not using pointers but instead handles. Each allocation is just given an increasing integer ID, which is “dereferenced” indirectly through the O(log32(N)) objtable. This should be a reasonably shallow data structure, but it is definitely not free, so it adds overhead to dereferences.
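Roughly, as a general illustration (not the actual code, and using a plain balanced map here rather than the log32 trie):

    (* Object references are integer handles; dereferencing goes through a
       side table, so the collector can relocate storage and only has to
       update the table entry. *)
    module ObjTable = Map.Make (Int)

    type handle = int

    let deref (table : 'a ObjTable.t) (h : handle) : 'a =
      ObjTable.find h table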
Yeah, by pointers, I meant handles. I guess whenever zone A tries to reference an object already moved to zone B, you copy it out of zone B back into zone A? Otherwise I cannot see how the marking phase can avoid missing some objects referenced from zone A.
Yeah, the vision is that all objects are either “small” (e.g. tuples under some arity [envisioned, but not yet present in this implementation]) and are copied from either B or C into A when they need to become mutable, or they are larger (e.g. maps/collections) but are persistent objects which are mutated via path-copying of the path-to-mutated-leaf, where such a copied path is written out in A and is similarly considered “small”.
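As a general illustration of the path-copying idea (using OCaml’s persistent Map here, not this project’s structures):

    (* "Mutating" one binding of a persistent map copies only the path from
       the root down to the changed leaf; old and new versions share the rest. *)
    module IntMap = Map.Make (Int)

    let v1 = IntMap.(empty |> add 1 "a" |> add 2 "b")
    let v2 = IntMap.add 2 "B" v1   (* v1 is unchanged; v2 shares its other nodes *)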
If you haven’t seen it already, you might find some of the work on Shenandoah interesting - specifically how they’ve optimized forwarding pointers/read barriers.
https://developers.redhat.com/blog/2019/06/27/shenandoah-gc-in-jdk-13-part-1-load-reference-barriers/ https://developers.redhat.com/blog/2019/06/28/shenandoah-gc-in-jdk-13-part-2-eliminating-the-forward-pointer-word/
What does O(n) memory usage mean? Every language has O(n) memory usage, where n is the number of allocated objects.
I slightly changed my mind about error reporting after implementing a parser myself.
Yes, getting all syntax errors at once is nice, but it tends to considerably bloat the code. So I just try to generate a precise message on the first error and exit.
(Whoa, I am putting my hat on!)
This is mostly about what you value.
People praise Rust’s error reporting. Also, most compiler projects put some value on implementation simplicity. These two facts are related. Rust’s error reporting is superb because Rust has near-complete disregard for implementation simplicity. According to the Rust project’s values as I understand them, as long as it’s simple for users, the implementation can be arbitrarily complex, almost without limit. This is different from most other projects, and it is probably also not what you want.
Exhibit A: Rust does this. “notice the capitalization”.
Parse errors in Rust are handled by the strategy of special-casing known issues, so it’s a mixed bag. If you feed it randomly broken code, you’ll get the usual crappy “expected x or y, but found z”. But there are cases where it will try a fix and re-parse. There are cases where it’ll go the extra mile to show a good error, e.g. for unbalanced parentheses or missing brackets it tries to guess which one was missing.
1 | fn foo() {}}
  |          --^ unexpected closing delimiter
  |          |
  |          this block is empty, you might have not meant to close it
Sorry, I didn’t articulate it well the first time. Trying again:
Deno is worth checking out for people who want JS/TS but don’t want Node’s design flaws. It comes from Node’s original creator.
TSC must be ported to Rust. If you’re interested in collaborating on this problem, please get in touch.
For once, I agree with a RIIR.
I’m excited for this as well. See also: https://github.com/swc-project/swc
A compiler needs to do a lot of symbolic manipulations.
More often than not, especially with a non-trivial type system such as TypeScript’s, this implies manipulating directed acyclic graphs, or even worse, arbitrary directed graphs if the processed language has a strong enough notion of recursion. Reasoning about lifetimes and sharing of graphs is notoriously difficult, even with logics stronger than Rust’s borrow model, such as separation logic. This alone is very annoying. If the graphs are local enough, as is the case with CFGs when doing intra-procedural analysis, you can just keep allocating and release everything when done with the function. But type systems do not always lend themselves to local analysis. Then, like C/C++, Rust code tends to be efficient if computations have a regular structure (e.g. traversing arrays), not when chasing pointers with hardly predictable lifetimes, which again is often the case when doing symbolic/static analysis.
So two of Rust’s selling points are not really applicable here; quite the opposite.
I firmly believe that Rust today is the best language for compilery tasks, based on two observations:
Rust, as an ML in C++ clothing, combines the relevant benefits of the two.
I specifically disagree that the “it’s hard to work with graph data in Rust” criticism is relevant. Even in a GC language, representing a complex graph as a literal graph of objects is often inconvenient/inflexible, and you would use an arena and indices, just like in Rust (see the sketch at the end of this comment).
My anecdotal experience tells me that Rust is a more convenient language than Kotlin for compiler front-ends.
I think I can get behind an idea that a sort of “idealized ML” would be a better language to implement a compiler in, but I think Rust’s quality of implementation, today, in practice, blows all existing functional languages out of the water.
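A minimal sketch of the arena-and-indices representation, written in OCaml to show the same idiom in a GC’d language (names are illustrative, and the O(n) append is just for brevity):

    (* Nodes live in one growable arena; edges are plain integer indices,
       so there is no pointer-shaped object graph to manage. *)
    type node = { label : string; mutable succs : int list }

    let arena : node array ref = ref [||]

    let add_node label : int =
      let id = Array.length !arena in
      arena := Array.append !arena [| { label; succs = [] } |];
      id

    let add_edge src dst =
      (!arena).(src).succs <- dst :: (!arena).(src).succs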
And if you don’t care about the C++ clothing, you can just use ML. The Rust compiler used to be fast when it was written in OCaml. ;)
Regarding the C++ clothing and the quality of the implementation :-) For example, Windows support and parallelism are just two major benefits of Rust in comparison to OCaml, and they are mostly unrelated to the language per se.
To be honest, anything but JS would be an improvement for this task.
I’m okay with Rust because it’s already used in Deno and the community has considerable inertia.
Your points seem reasonable. Would love to see, in general, web-related compilers/transpilers written in more efficient stuff than JS
Very cool idea! I have some questions about the semantics, which don’t seem very comprehensively described:
What is kept in memory, and what is not? How are changes in types handled, with regard to the things that are kept in memory? What happens to globals defined in reloaded modules (I don’t 100% remember if OCaml has those)? How can I control which side effects are re-run on reload, and which are not? How can Hotlink.is_hot_unloaded () ever be true? Does it mean the module will be unloaded, but not any running code from that module, so it needs to check this flag to self-terminate gracefully and do some custom communication if it needs to “pass the baton” to a newly loaded version of the module? Also, does the phrase “module is unloaded” mean that the old version of the module is gone (possibly upgraded to a newer version), or that every version of the module is now completely gone?

Hello, I am the author.
After compilation, an OCaml module is basically a record of values (some mutable ones and some that are functions/closures, if needed) paired with an initialisation function. During loading, the record is allocated, the initialisation function is called, and then we are done with that module. The module name now refers to the record.
Unloading the module simply means forgetting the association between the module name and the record. Other parts of the program can still refer to values (and functions) defined in the module, those are not going to disappear.
So the granularity of side effects that are rerun during reloading is the module (allocation of a new record, execution of the initialisation function). Modules that have changed and the whole cone of their reverse dependencies will be reevaluated.
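As a source-level illustration of that record/initialisation split (my own sketch, not Hotcaml’s internals):

    module Counter = struct
      (* These bindings become the module's "record": *)
      let count = ref 0                        (* a mutable value *)
      let next () = incr count; !count         (* a closure *)

      (* Top-level effects form the initialisation function; they run again
         each time the module is (re)loaded. *)
      let () = print_endline "Counter (re)loaded"
    end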
Yes.
It means that the old version of the module is gone. Either a new version has been provided, or the dependency has disappeared.
Changes in types do not receive any special treatment (as I said, the granularity is the module). There are good reasons to avoid that: OCaml type definitions are not structural, they aren’t even just nominal, they are generative. The action of defining a type is side effectful. Trying to handle changes in any clever way could easily result in breaking modular abstraction.
It is not as bad as it sounds: it means that it is the responsibility of the developer to opt in to special treatment. This does not have to be built into Hotcaml. It can safely be built on top of existing primitives, for instance by using some reification of types (e.g. using https://github.com/thierry-martinez/refl or https://github.com/pqwy/tpf), and then automating some of the migration process by defining a compatibility relation between types (by induction on their structure). It is not easy, but it is orthogonal to the Hotcaml machinery.
That being said, a ML-like language designed with reloading in mind would likely do things very differently.
Thanks for the reply!
A couple more questions if I may:
(Above quoted from the readme.) Does this mean that an entrypoint must not end with some blocking call, like main_loop ()? From a quick glance at hotcaml.ml, I seem to understand hotcaml.exe will ensure that the main thread never ends, instead of this being (I presume) the responsibility of the main app’s entry point in regular OCaml code?

I’m not good at type theory at all, so just to make sure I got it: after every reload, all types are incompatible with their “previous versions”, regardless of whether the structure changed or not, yes? (Unless some extra explicit migration is done, e.g. through the libs you linked, or presumably some explicit serialize+deserialize.) Though, I now understand it would typically not be that much of a problem, because any modules depending on the reloaded one will also be reloaded.
Yes, it should not end with a blocking call. This is where “Lwt” comes in handy: it provides an alternative main loop that allows concurrent and cooperative computations (async/promises/coroutines/...).
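A minimal sketch of that shape (my guess at the idiom, not Hotcaml’s actual API): the module’s initialisation registers a cooperative loop as an Lwt promise and returns, instead of blocking in main_loop ().

    (* The loop is an Lwt promise built from non-blocking sleeps... *)
    let rec tick () =
      Lwt.bind (Lwt_unix.sleep 1.0) (fun () ->
        print_endline "tick";
        tick ())

    (* ...and is started without blocking the loading thread; the host is
       assumed to keep the Lwt scheduler running. *)
    let () = Lwt.async tick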
Yes, the types are incompatible. Even without a reload, if you do something like:
you have two incompatible types t that have the same definition.
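    type t = Foo
    let x = Foo       (* x has the first type t *)
    type t = Foo      (* a second, generative definition of t *)
    (* x's type is the first t, which is now distinct from (and incompatible
       with) the new t, even though the two definitions are identical. *)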
Exhibiting an explicit migration function or, as you suggest, serialization+deserialization is the solution. With the proper framework, this can be done in a rather lightweight way.
The migration function is actually a proof that the types are, in some sense, compatible, and that the user agreed for this compatibility to be exploited (you could also have two types that are both integers but should not be seen as compatible, for instance a delay in seconds and an amount of dollars). It is quite similar to serialize+deserialize, but it can skip the actual transformation and directly share compatible values.
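For instance (a sketch with made-up types, not something Hotcaml provides):

    module V1 = struct type config = { port : int } end
    module V2 = struct type config = { port : int; host : string } end

    (* The migration function witnesses that a V1.config can be read as a
       V2.config; the user decides what the new field should default to. *)
    let migrate (c : V1.config) : V2.config =
      { V2.port = c.V1.port; host = "localhost" }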
There was a discussion on Hacker News where the author chimed in.