So, I am hugely biased because this is describing precisely that single-provenance semantics that CHERI C (which I am largely responsible for) enforces. The Provenance section is really well written as a description of this model but it would be good to be a bit more explicit that it is a specific (my favourite) single-provenance model, not describing what a provenance model is in general.
Other provenance models are possible. The C standards committee is, I believe, leaning towards defining a multi-provenance model and making it implementation defined if you enforce a stronger one. A multi-provenance model allows XOR linked lists, for example. In an XOR linked list, there is a single next pointer in each node that is the previous node’s address, XORed with the next node’s address. You can then use a single word to allow traversal in either direction (though given a single node, you can’t traverse in either direction, iterators need to be two words). In a multi-provenance model, the next pointer carries the provenance of both the next and previous nodes.
In Rust today, you can implement an XOR linked list using ptr as usize. This is not possible in a single-provenance model.
This is very interesting. There are language features, potential optimizations, and sanitizers that seem to be impossible to properly implement due to the integer-to-pointer conversion gotcha, as in “it’d be great, except you can’t do that, because pointers conjured from arbitrary integers would lack the extra metadata”. And the solution is incredibly straightforward, to the point of seeming obvious in hindsight: require the extra pointer metadata to be provided. I hope this model can work in practice.
Basically, the strict provenance model is expected to be stricter than any final formal model Rust or C would actually use, so while those more finicky models get worked out precisely, we can point most devs to Strict Provenance and say “comply with this” and we/they can expect their code to just work under any future model, because that model will almost certainly be weaker / allow strictly more code than Strict Provenance!
So, I am hugely biased because this is describing precisely that single-provenance semantics that CHERI C (which I am largely responsible for) enforces. The Provenance section is really well written as a description of this model but it would be good to be a bit more explicit that it is a specific (my favourite) single-provenance model, not describing what a provenance model is in general.
Other provenance models are possible. The C standards committee is, I believe, leaning towards defining a multi-provenance model and making it implementation defined if you enforce a stronger one. A multi-provenance model allows XOR linked lists, for example. In an XOR linked list, there is a single
next
pointer in each node that is the previous node’s address, XORed with the next node’s address. You can then use a single word to allow traversal in either direction (though given a single node, you can’t traverse in either direction, iterators need to be two words). In a multi-provenance model, the next pointer carries the provenance of both the next and previous nodes.In Rust today, you can implement an XOR linked list using
ptr as usize
. This is not possible in a single-provenance model.This is very interesting. There are language features, potential optimizations, and sanitizers that seem to be impossible to properly implement due to the integer-to-pointer conversion gotcha, as in “it’d be great, except you can’t do that, because pointers conjured from arbitrary integers would lack the extra metadata”. And the solution is incredibly straightforward, to the point of seeming obvious in hindsight: require the extra pointer metadata to be provided. I hope this model can work in practice.
The “Tower of Weakenings” concept is really cool!
Basically, the strict provenance model is expected to be stricter than any final formal model Rust or C would actually use, so while those more finicky models get worked out precisely, we can point most devs to Strict Provenance and say “comply with this” and we/they can expect their code to just work under any future model, because that model will almost certainly be weaker / allow strictly more code than Strict Provenance!
More related to this: https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html