This is very consciously designed to be a “minimal viable prototype” for a memory-safe subset of C, and as such is highly conservative.
But I think it raises the question: Is it actually viable? Does any existing useful program live in this subset? Failing that, can any existing useful program be modified to fit in this subset without an undue amount of effort? Since the main driver behind this idea seems to be “we have too much C code already that is impossible to rewrite”, I think the answers to these questions will be crucial.
This proposal is very shallow and full of wishful thinking.
It doesn’t get into any of the hard problems that break static analysis in C, but at the same time it leaves its soundness holes and severe restrictions to be “(hopefully)” fixed with static analysis later. It says the authors want fast compilation and to “avoid expensive or complicated static analysis”, without any plan to deliver that. It’s a “TODO: fix undecidability”.
The STATIC subset is restricted beyond any usefulness. It allows less than constexpr.
The DYNAMIC subset is still very limited (no memory management, only fixed-size arrays), but it adds pervasive run-time checks to everything and offers no language constructs to avoid them. restrict is banned instead of being required. There’s no word about use-after-free.
Their definition of “memory safety” appears to simply be “spatial safety” (bounds checks):
“We consider a memory-safe operation an operation that (under certain preconditions specified below) can be shown at compile-time not to cause any run-time out-of-bounds accesses, i.e. accesses outside the bounds of the objects accessible via reachable pointers.”
As such, given the bounds-checking changes to clang that are currently being upstreamed, this proposal strikes me as already obsolete. I intend to try those changes at work once they arrive in an upstream version of the compiler, and see what cost and impact they have for us.
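To make “spatial safety” concrete, here is a minimal sketch of the kind of check the quoted definition is about. The names int_span and span_get are hypothetical, made up for illustration; this is neither the paper's mechanism nor the clang work mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "fat pointer": a pointer that carries its element count. */
struct int_span {
    int   *data;
    size_t len;
};

/* Spatial check only: succeeds if idx lies inside [0, len).
   On failure *out is left untouched. */
bool span_get(struct int_span s, size_t idx, int *out)
{
    if (s.data == NULL || idx >= s.len)
        return false;           /* out-of-bounds access rejected */
    *out = s.data[idx];
    return true;
}
```

A check like this rejects an out-of-range index, but it cannot tell whether s.data still points at live storage. That is the use-after-free gap: bounds checks alone say nothing about lifetimes.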
As to some of the details of the proposal: in previous uses of “[static 1]” for pointers, I’ve found that the checks based on it are quite limited and easily skipped by the compiler. I can’t recall whether it was clang or gcc that had the better (but still limited) support.
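For anyone who hasn't used it, a small sketch of what “[static 1]” looks like and why the checking is easy to bypass. The function names are hypothetical, and the description of compiler behaviour is my general experience rather than a guarantee for any particular version.

```c
#include <stddef.h>

/* "[static 1]" is a promise made by the caller: p points to at least
   one int, so in particular it is not NULL. The callee does not check it. */
int first_element(const int p[static 1])
{
    return p[0];
}

/* Hypothetical wrapper. q is only known at run time, so a compiler will
   generally accept first_element(q) without any diagnostic. Passing a
   literal NULL directly may produce a warning, but a pointer that merely
   might be NULL usually slips through, hence the explicit check here. */
int first_or_default(const int *q, int fallback)
{
    if (q == NULL)
        return fallback;
    return first_element(q);
}
```

So the annotation documents intent and enables a warning in the obvious cases, but anything that flows through a variable or another function is back to being the programmer's responsibility.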
Hm, this is relatively rough, but I’ll take it.
I see a classic issue though. There’s a grey area between:
“The proposed modes will not make use of any language extensions and will not change semantics in incompatible ways,”
(Top of the paper)
and
“New rules to allow more operations.”
particularly
“Annotations for accessing objects via pointers in structures”
“Annotations for pointer ownership for dynamic memory management.”
(End of the paper)
This sounds like, at the very least, introducing a kind of dialect into the language that you need to follow. While it is true that C has all the features to make this happen without a lot of changes, this is still a different dialect that people need to learn and that codebases need to be ported to.
New rulesets, while theoretically compatible, can create something that is essentially the same as introducing breakages and new syntax.
This is a substantial effort at scale and may trigger the question “well, why not rewrite it in Rust and link it in?”.
However, such questions are useful to raise actively, because the outcome may well be a recommitment to C and its improvement.
I would like to see some comparison with FORTIFY_SOURCE. Will these proposals be compatible with existing safety enhancements or will they require rewrites?