1. 34

    1. 3

      I see a lot of chatter about the arrogance of not using memory-safe languages for libraries like this that are deployed so widely and so likely to consume untrusted inputs. I’m generally inclined to agree. However, after reading this (excellent!) explanation of the library’s decoding process I’m really quite skeptical that a rust implementation would have avoided unsafe behaviors when implementing these multi-level huffman tables. Hopefully someone proves that wrong though!

      1. 8

        However, after reading this (excellent!) explanation of the library’s decoding process I’m really quite skeptical that a rust implementation would have avoided unsafe behaviors when implementing these multi-level huffman tables.

        I don’t understand what you mean here? This is just an out of bounds access which is not possible outside of unsafe blocks in rust, so even a near verbatim translation from C to rust would not have been exploitable

        1. 1

          Right, rust provides all of the tools to avoid this. My suspicion is about whether engineers would choose to use those tools versus reaching for the unsafe escape hatch. Looking at the diff, it seems to me that the original bugged implementation felt an extreme pressure to avoid over-allocating. So a lot of the bog standard rust approaches like relying on a Vec to reallocate would have likely been avoided. There are other memory-safe ways to do it in rust that they may have used but in my experience rust engineers reach for unsafe alot more often than the marketing would imply and I can see why it would be very tempting to do so in a rust analog of the original implementation.

          Edit: just to be extra clear, I’m not arguing against trying to implement things in languages like rust. I’m just skeptical it will be a silver bullet as advertised by a lot of the people calling for it.

          1. 5

            The thing about reaching for unsafe is that:

            a) It’s a big flag for anyone auditing for security issues

            b) You can basically reason about which invariants have to be upheld in a tight local scope (even if maintaining those involves a larger scope)

            In practice I think this leads to issues like this just not being nearly as common. But yeah, no question, people will use unsafe and they will not manage the invariants properly.

            Beyond that, my experience has been that bug density in Rust is so low and mitigation quality tends to be so high that practical exploitation can be very difficult. As an example[0], we took a bug that in many ways is a perfect candidate for exploitation and attempted to exploit it but it wasn’t practical. There were so few primitives to work with, so few areas of unsafe where we might be able to chain things together, and a mitigation that happened to just destroy the major primitive we had. We could have gotten luckier with another bug but this one was definitely pretty severe at a glance.

            [0] https://chompie.rip/Blog+Posts/Attacking+Firecracker+-+AWS'+microVM+Monitor+Written+in+Rust

          2. 4

            in my experience rust engineers reach for unsafe alot

            Although Chrome doesn’t seem to have applied their Rule of Two here at all, I wonder whether the Rule’s requirements on usage of unsafe Rust would have been enough to prevent it from incurring this vulnerability.

          3. 1

            The problem isn’t the allocations, it’s the subsequent indexing, which would only be doable without bounds checks if the decoder logic itself was in an unsafe block.

            Given how much pressure there is to not use unsafe in rust for more than a tiny number of statements I can’t imagine anyone being ok reviewing it with such an approach.

            1. 1

              You’re right, the bug is due to the out-of-bounds write and not the allocations. I mentioned the allocations because most of rust’s approaches to avoid out-of-bounds writes pair bounds-checked memory access with allocations in a single abstraction like Vec. So as you try to handle allocations yourself you also have to do bounds checks yourself.

              Given how much pressure there is to not use unsafe in rust for more than a tiny number of statements I can’t imagine anyone being ok reviewing it with such an approach.

              I understand this argument but it doesn’t match my experience. What I’ve observed is similar to experiments showing aggregate energy usage going down as people become away of energy efficiency improvements. Rust reviewers often overestimate the ameliorative impact of just knowing where the unsafe code is that they become more comfortable with it existing in one or two places. The relative rarity of unsafe blocks can often leave reviewers with a sense that it was unavoidable. However security issues often only need one instance of such behavior to lead to an exploit.

      2. 3

        Rust has unsafe and so is not memory-safe. If you can reproduce the issue in a memory-safe language, then I would be interested in reading your code.

        1. 4

          “unsafe” is used to implement those things that can’t be implemented in the context of the memory safe part of the language.

          All “memory safe” languages have that, either through something explicit like “unsafe” in rust, raw pointers with a load of associated “unsafe” APIs, or the runtime (and its APIs) itself.

          Saying “rust is not memory safe because it has ‘unsafe’” is like saying that there are no memory safe languages. It might be true on a technical basis for a certain set of definitions, but that’s not a useful definition.

          We all know that we lose some guarantees in a rust “unsafe” block, just as we lose safety in Haskell when using the pointer APIs, or swift when using UnsafePointer. What memory safe means is “if you aren’t using the explicitly unsafe parts of the language you aren’t going to get memory safety errors” with a layer of it being aggressively hard to use those corners.

          1. 2

            “All ‘memory safe’ languages” is too bold. My counterexample will be R5RS Scheme, as specified in this PDF. R5RS Scheme does not have raw pointers.

            My original point was that it’s not correct to claim that Rust is immune to memory corruption caused by improper parsing or loading of binary formats. By construction, there do exist systems which are immune, and I think that it is a sort of learned helplessness to act otherwise.

            1. 2

              JavaScript is memory safe in the same sense as Scheme, but it has memory corruption bugs (eg due to type confusion) and Scheme could have the same kind of bugs too.

              The point of safety in Rust is that the “by construction” part is in Rust, it isn’t passed off to some different completely unsafe implementation language.

              1. 1

                You are confusing a language with its implementations. JavaScript itself, as far as I know, does not have invalid memory accesses; type-confusion bugs come from e.g. V8 – written in C++ – or similar implementations.

                1. 3

                  No, I am pointing out that memory safety is a property of the system as a whole. At some layer there must be a part of the system that establishes and maintains the invariants required for memory safety. This TCB has to be unsafe. What makes safety achievable is keeping the TCB contained and auditable. You are arguing that it is necessary to put the TCB inside a safe language’s implementation. The question then is how much of the implementation is in the TCB, and how easy is it to find and audit the TCB. If you use Rust to implement your memory-safe language it will be easy to find the TCB.

                  1. 1

                    This approach fails as soon as the underlying machine doesn’t have memory; e.g. a cyclic tag system or a matrix of natural numbers. Safety is always relative to the framing.

                    1. 3

                      How are machines without memory relevant to memory safety?

                      1. 1

                        They demonstrate that memory-safety is a property of programs within a particular language, not a property of algorithms executed on a particular machine. The confusion arises because our languages are designed to map well to our machines, and also our machines are designed to efficiently encode facets of our languages.

                        Imagine somehow compiling C to The Waterfall Model. This must be possible because The Waterfall Model is Turing-complete. On one hand, the C abstract machine has cases of undefined behavior; on the other hand, The Waterfall Model is totally defined. This means that our compiler must assign weird instructions to any undefined memory accesses, creating a weird machine.

                        This argument proceeds for quite a few Turing-complete systems, as long as they too are totally defined. It is the fault of the C language and the C abstract machine that we get weird machines in such cases. We also – at least in the case of x86 machines – know that weird machines arise in practice, regardless of whether the hardware’s behavior is totally defined. This is ultimately why memory-safety matters: when a language is not memory-safe by default, then arbitrary programs generate weird machines during compilation.

                        1. 1

                          But that says nothing useful about how memory safe languages are implemented on the machines we actually use, nor about my observations of the TCB required for memory safety. Weird machines are often found by breaking abstraction layers, like my example of type confusion in dynamic language runtimes, which is why I am asking about how to make those abstraction layers (the memory safe languages) more solid. But you are telling me to assume they are already perfect, despite the evidence.

            2. 1

              Safety is a continuum.

              1. 1

                Memory-safety is usually phrased as a Boolean: either invalid accesses are possible, or they are not possible.

      3. 2

        This was an out of bounds access and would have been prevented by safe rust. No one owes you a proof.