1. 17
  1. 19

    See also Checked C, which addresses many of the same vulnerability and bug classes.

    As a bit of a rant: I wish academics wouldn’t say things like “the advantages of Rust” when describing these safer C subsets. These are admirable research areas to pursue, but they all lack the primary (IMO) advantage that languages like Rust have: that security is established on a program level by construction, rather than cobbling together individually verified components and hoping that invariants are not broken in an unchecked component. It would be fairer (and more useful!) to say “a subset of C without a subset of spatial memory safety bug classes.”

    1. 8

      Or even “cobbling together individually verified components and hoping that invariants are not broken in the ‘cobbling together’ process”, as one of the big problems with ‘C the language’ for this stuff is how composition does not preserve the kinds of invariants we are interested in, so even exclusively using individually verified components doesn’t save you.

      1. 4

        I don’t think your description is accurate. I’ve just read the paper, and my understanding is that those checks do compose. A checked function also keeps track of the functions it calls, and if the prototypes of the called functions are not properly annotated you get a warning. In addition, the tool could (maybe does) issue a list of all unchecked functions. We would know which parts need extra scrutiny, same as Rust’s unsafe.

        The only real problem here is the same as const and gradual typing: it’s unsafe by default.

        1. 4

          The only real problem here is the same as const and gradual typing: it’s unsafe by default.

          Another of the weaknesses of gradual typing: you can gradually stop doing it. You need a social component to force gradual improvement.

          1. 2

            I’ve worked in teams where “Thou shalt obey the Holy Code Formatter” was religiously enforced. We could likewise enforce that every piece of new code must be annotated, unless you can write a justification that has to be vetted by Q/A or whatever. Considering that the benefits of gradual typing & annotations are both indisputable and substantial (especially at scale), the social component should be relatively easy to put in place.

            Don’t get me wrong, I still prefer a language that’s safe by default, where the escape hatches read unsafe_dangerous_hazard in blinking blood letters.

            1. 2

              Good point. I said ‘social component’, but the improvement pressure can also be done technically. I should have written ‘external component’.

          2. 2

            They check that each function is internally consistent and can validate that a call graph contains only checked functions. However, they also explicitly flag their analysis as intraprocedural and context-insensitive, meaning that it won’t track invariants that are maintained across function boundaries or across loop iterations.

            In other words, using this as intended will give you a false sense of security: the paper encourages you to treat these macros as proving properties similar to “safe” Rust, when they really only prove that a function, considered in isolation, doesn’t violate its own invariants.
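
            A concrete (Rust-flavored, entirely my own) illustration of the kind of invariant I mean: in C, a pointer and its length travel separately, so each function can look locally consistent while the pair the caller and callee exchange is not; in Rust the invariant is bundled into the value being passed, which is why purely local checking is enough there.

            ```rust
            // The invariant "idx is in bounds" is carried by the slice itself,
            // so checking `read_at` in isolation is enough.
            fn read_at(buf: &[u8], idx: usize) -> Option<u8> {
                buf.get(idx).copied()
            }

            // The C-style version: pointer and length are separate values, and
            // the invariant "len really describes what ptr points to" lives
            // *between* caller and callee. An intraprocedural check of either
            // side can pass while the pair they agree on is wrong.
            unsafe fn read_at_raw(ptr: *const u8, len: usize, idx: usize) -> Option<u8> {
                if idx < len {
                    Some(unsafe { *ptr.add(idx) })
                } else {
                    None
                }
            }
            ```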

            1. 2

              Maybe I’m missing something, but is this any different from Rust? Rust’s borrow checker never does interprocedural analysis, yet it composes to ensure the overall program is safe.

              1. 3

                That part is the same as Rust! The difference that allows Rust to get away with it is aliasing and ownership/lifetime semantics: you don’t need to do a heavy interprocedural analysis if you know that writes are exclusive and that every binding has a known lifetime.

                It looks like C-rusted can do a subset of that, since its macros allow exclusive and shared ownership. But it’s unclear to me how they’d prevent an interprocedural UAF through an owned pointer, for example, or how they’d enforce the liveness of a shared pointer. Maybe they do that by providing their own annotations on top of malloc and free and other resource allocation primitives, or maybe they expect the user to do that.
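
                To make the Rust side concrete, here’s a minimal sketch (plain Rust, nothing from the paper) of why exclusivity and known lifetimes let purely local checking compose:

                ```rust
                // The signature says the returned reference borrows from `v`, so it
                // cannot outlive `v`, and `v` cannot be mutated or dropped while the
                // reference is alive. The checker never looks past signatures.
                fn first(v: &Vec<i32>) -> &i32 {
                    &v[0]
                }

                fn main() {
                    let mut v = vec![1, 2, 3];
                    let r = first(&v);   // shared borrow of `v` starts here
                    // v.push(4);        // rejected: needs an exclusive borrow while `r` lives
                    // drop(v);          // rejected: would end `v` while `r` still borrows it
                    println!("{r}");     // borrow ends after the last use of `r`
                }
                ```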

                1. 1

                  What’s “UAF”?

                  1. 2

                    Use-after-free. Most interprocedural UAFs are context sensitive, meaning that C-rusted either needs to reject large numbers of valid programs (larger than Rust, because Rust has total lifetime information) or accept programs without temporal memory safety.
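
                    A rough illustration (my own sketch, in Rust terms rather than the paper’s notation) of what “context sensitive” means here: whether the callee actually frees its argument can depend on runtime data, but a signature has to commit to one answer, so a signature-based checker conservatively treats the value as gone in every calling context.

                    ```rust
                    // The signature says `maybe_consume` takes ownership, even though at
                    // runtime it only frees the buffer on one path. A checker that relies
                    // on signatures must assume the worst for every caller; the price is
                    // rejecting callers that "know" the buffer survived this call.
                    fn maybe_consume(buf: Box<[u8]>, really: bool) -> Option<Box<[u8]>> {
                        if really {
                            drop(buf); // freed here
                            None
                        } else {
                            Some(buf)  // ownership handed back to the caller
                        }
                    }

                    fn main() {
                        let buf: Box<[u8]> = vec![0u8; 16].into_boxed_slice();
                        let survived = maybe_consume(buf, false);
                        // println!("{}", buf.len()); // rejected: `buf` was moved, even though
                        //                            // this particular call kept it alive
                        if let Some(buf) = survived {  // the only sanctioned way to get it back
                            println!("{}", buf.len());
                        }
                    }
                    ```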

                    1. 1

                      Okay, that’s the usual choice between safe and complete type systems: the only way to reject all incorrect programs is to reject some correct ones. That’s why Rust needs unsafe. It’s okay to reject too many programs, as long as we have a well-defined escape hatch. Likewise, we almost certainly can’t just annotate all C programs and make them fully C-Rusted.

                      Use-after-free is one of the problems Rust solves, and one C-Rusted is (as far as I can tell) trying to address. I believe both face the same safe/complete dilemma, so I’m not sure why you feel Rust fares any better in this particular department. Specifically, I don’t know what you mean by “total” lifetime information, nor why C-Rusted wouldn’t have that. The C-Rusted model seems pretty simple as far as I understand:

                      • Shared handles are like immutable borrows.
                      • Exclusive handles are like mutable borrows.
                      • Owning handles end the lifetime of the handled resource. I don’t know the Rust equivalent.

                      So when you call a function and give it some handle, and the annotation of the function says it takes ownership, the caller considers the resource “dead” right after the function call. At least that’s how I understood the process() example in the paper.
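
                      In Rust terms (my own sketch of that mental model, not code from the paper), it looks like this:

                      ```rust
                      // `process` takes its argument by value, i.e. it takes ownership.
                      // From the caller's point of view the buffer is "dead" right after
                      // the call, exactly like handing over an owning handle.
                      fn process(data: Vec<u8>) {
                          println!("processed {} bytes", data.len());
                          // `data` is freed automatically when it goes out of scope here
                      }

                      fn main() {
                          let data = vec![1, 2, 3];
                          process(data);                 // ownership moves into `process`
                          // println!("{}", data.len()); // rejected: `data` was moved above
                      }
                      ```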

                      Another thing that could happen is that a function you call gives you an owning handle. Just like malloc(). I guess you must then convince the checker that you either free that handle before returning to the caller (by calling a function like process() or free()), or return that handle to the caller one way or another. I’d wager the same is true with Rust too.

                      I bet we can also forward an owning handle from input argument to return value, but my guess here is that from the caller’s perspective, the handle they passed on and the handle they get are unrelated. I don’t think we can do better with intra-procedural analysis. Besides, why would we forward references like that? Such functions should take an ordinary exclusive handle/mutable borrow instead, that’d make liveness analysis easier.
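
                      Again in Rust terms (still my own sketch), both directions look like this:

                      ```rust
                      // Receiving an owning handle, like malloc(): the caller becomes the
                      // owner and either frees it (implicitly, when it goes out of scope),
                      // passes it on, or returns it.
                      fn make_buffer() -> Box<[u8]> {
                          vec![0u8; 64].into_boxed_slice()
                      }

                      // Forwarding an owning handle from argument to return value: the
                      // caller simply rebinds the result, so no stale "old" handle is left
                      // over. (As you say, taking `&mut [u8]` would usually be simpler.)
                      fn fill_buffer(mut buf: Box<[u8]>) -> Box<[u8]> {
                          buf.fill(0xff);
                          buf
                      }

                      fn main() {
                          let buf = make_buffer();        // we own it now
                          let buf = fill_buffer(buf);     // moved in, moved back out, rebound
                          println!("{}", buf.len());      // freed automatically at end of scope
                      }
                      ```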

                      It looks like C-rusted can do a subset of that, since its macros allow exclusive and shared ownership.

                      Wait, what you call “shared” sounds like std::shared_ptr, while the paper uses the same word to describe immutable borrows.

                      To be honest I have no idea how I might implement std::shared_ptr in a way that would satisfy a borrow checker — without using unsafe. I mean, the very point of a shared pointer is to split ownership, and as far as I understand that breaks the linearity I believe is necessary to borrow checkers, so I’d wager you need some of your code to be unsafe. (If you have a reference to Rust safe-only implementation of something like a std::shared_ptr, that means I’m dead wrong, and I’d be highly interested in correcting that error.)

                      1. 4

                        Owning handles end the lifetime of the handled resource. I don’t know the Rust equivalent.

                        Rust refers to any value that you hold directly (as opposed to through a reference) as “owned”, and they have similar semantics to what you describe: destructors, if any, are automatically invoked when they go out of scope. If you want an “owned pointer”, that’s std::boxed::Box, similar to C++’s std::unique_ptr.

                        Other types, like Vec and String, also act as “owning pointers” to a block of memory; Box is implemented via black magic for historical reasons, but fundamentally all these types could just be structs with destructors that call free.
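
                        Something like this (a minimal sketch, not how Box is actually defined) is what I mean by a struct with a destructor that calls free:

                        ```rust
                        use std::alloc::{alloc, dealloc, Layout};

                        // A minimal owning pointer: allocates in `new`, frees in `Drop`.
                        // The unsafe is confined to the inside of the type; code that
                        // merely uses `OwnedByte` never sees it.
                        struct OwnedByte {
                            ptr: *mut u8,
                        }

                        impl OwnedByte {
                            fn new(value: u8) -> Self {
                                let layout = Layout::new::<u8>();
                                let ptr = unsafe { alloc(layout) };
                                assert!(!ptr.is_null(), "allocation failed");
                                unsafe { ptr.write(value) };
                                OwnedByte { ptr }
                            }

                            fn get(&self) -> u8 {
                                unsafe { self.ptr.read() }
                            }
                        }

                        impl Drop for OwnedByte {
                            fn drop(&mut self) {
                                // The "destructor that calls free".
                                unsafe { dealloc(self.ptr, Layout::new::<u8>()) };
                            }
                        }
                        ```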

                        To be honest I have no idea how I might implement std::shared_ptr in a way that would satisfy a borrow checker — without using unsafe. I mean, the very point of a shared pointer is to split ownership, and as far as I understand that breaks the linearity I believe is necessary to borrow checkers, so I’d wager you need some of your code to be unsafe. (If you have a reference to Rust safe-only implementation of something like a std::shared_ptr, that means I’m dead wrong, and I’d be highly interested in correcting that error.)

                        Rust’s equivalents here are std::rc::Rc and std::sync::Arc (the latter uses an atomic reference count and thus is sharable across threads) and they do use unsafe internally.
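
                        The unsafe stays inside the standard library, though; code that merely uses them is entirely safe, e.g.:

                        ```rust
                        use std::rc::Rc;

                        fn main() {
                            // Two owners of the same string; the allocation is freed when
                            // the last Rc is dropped. No unsafe appears here; it lives
                            // inside Rc's implementation.
                            let a: Rc<String> = Rc::new(String::from("shared"));
                            let b = Rc::clone(&a);                        // bumps the refcount
                            println!("{} {}", a, b);
                            println!("owners: {}", Rc::strong_count(&a)); // prints 2
                        }
                        ```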

                        I believe both face the same safe/complete dilemma, I’m not sure why you feel Rust has to fare any better in this particular department.

                        It’s hard to tell from the paper, but as a broader comment: most “Rust replacements” I’ve looked at fall short in one of three ways.

                        • The “safe” subset isn’t fully safe, so you just have two subsets that allow incorrect programs.
                        • There’s no clear syntactical delineation between the safe and unsafe subsets; you need an encyclopedic knowledge of the syntax and standard library to recognize potentially unsafe code.
                        • The safe subset is insufficiently expressive to write meaningful programs in.

                        (To be clear, there are languages I like that fall short on all the above points! But I get annoyed when people bill them as “just as safe as Rust.”)

        2. 17

          Paper without accessible code. This is computer science’s reproducibility blind spot. I’ve always found papers that have no code, or only pseudocode for an algorithm, to be next to useless.

          1. 2

            Agreed, especially when the CS discipline makes it so easy (GitHub, etc.) to share code. Other hard sciences don’t have this advantage (sharing a chemical or physics process), and yet we fail to take advantage of it in CS. I wonder if it’s because academics practice LPU (least publishable unit), and don’t want others to scoop them on some improvement worthy of publishing.

          2. 8

            In the comparison, “Coding standards for security” has a green Yes for C and a red No for Rust, which is somewhat funny, because these coding standards exist to forbid all the dangerous features of C, which safe Rust doesn’t have. In a way, Rust has its own secure coding standard already built into the compiler, but that doesn’t sound as serious and professional as a certification process run by a consortium.

            1. 6

              Seems like having to follow all the Rust borrow checker rules without having the really nice compiler suggestions would be a massive disadvantage.

              1. 3

                I skimmed the paper, to be honest, but I didn’t see anything about automatic deallocation, stack unwinding, etc. Also, dereferencing pointers is still unsafe, isn’t it? And no code/implementation is provided, which makes the entire paper not useful until they give readers a way to test and validate this design.
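
                For reference, this is the sort of thing I mean by automatic deallocation and stack unwinding (plain Rust, nothing to do with the paper):

                ```rust
                struct Resource(&'static str);

                impl Drop for Resource {
                    fn drop(&mut self) {
                        // Runs on normal scope exit and during unwinding after a panic.
                        println!("freed {}", self.0);
                    }
                }

                fn main() {
                    let _a = Resource("a");
                    {
                        let _b = Resource("b");
                    }               // "freed b" printed here automatically
                    panic!("boom"); // unwinding still prints "freed a"
                }
                ```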

                1. 1

                  I think calling it worthless is unfair; it’s just not useful to end users.

                  1. 7

                    My impression is that there’s also a certain mismatch between what’s promised and what’s delivered. Like, “The Advantages of Rust, in C, without the Disadvantages” would be pretty big! I don’t want to dunk on the title in particular, as I myself overindulge in the art of clickbait headlines, but the body of the paper (e.g., Fig. 4) corroborates the “yes, it’s strictly better than Rust” reading.

                    And yet, it seems that one would need much more machinery to bring reasonably-complete linguistic safety to C than what is demonstrated. So far it looks much closer to “some advantages of modern C++ in C”, which is a marked improvement relative to C, but a far cry from Rust.

                    This unclear messaging makes it hard not to choose some flamboyant words if you have a stake in the game :)

                    1. 1

                      It’s clearly some of the advantages of Rust, for the annotated parts. And I’m not sure they address things like signed integer overflow. Still, even if it were only about ownership of memory & resources (it’s a bit more than that), it would still be huge.

                    2. 1

                      True, wrong choice of words, I didn’t notice. I will edit my post, thanks :)