1. 28

    1. 6

      ROP can be trivially eradicated at effectively no performance cost by using separate call and data stacks. The only problem is that it is not obvious how to do it without breaking binary compatibility (I came up with an exceedingly convoluted scheme for doing it); but, from my understanding, OpenBSD does not value binary compatibility. I wonder why they do not do this.

      1. 2

        you should send a proposal to tech@

        1. 2

          I’m not particularly interested in contributing to OpenBSD, nor in building security mitigations for unsafe languages. Technically speaking, this is completely trivial; there is hardly anything to propose, and I’m sure it’s been thought of already.

          1. 2

            technically speaking, can you elaborate? i’m curious.

            1. 8

              Maintain two stacks, rather than one: a call stack, and a data stack; devote one general-purpose register to each (but elide any ‘frame pointer’, so register pressure is the same). The call stack comprises a sequence of two-word activation records: a return address, and a pointer into the data stack. The data stack contains all other data that would have been stack-allocated (spilt, passed/returned on stack, etc.) in the traditional calling convention; because this is separate in memory from the return addresses, there is no possibility of corrupting the latter through an overflown access to the former.
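
              A minimal sketch of the layout described above, as a toy Python model rather than a real ABI. The `Machine` class, the region sizes, and the upward growth direction are all assumptions made for illustration. It only shows that a contiguous overflow of a data-stack buffer never reaches the activation records.

              ```python
              # Toy model: one flat "memory" list holding two disjoint regions.
              # All names and sizes are hypothetical, chosen only for illustration.
              MEM_SIZE = 64
              CALL_BASE, DATA_BASE = 0, 32  # call stack in [0, 32), data stack in [32, 64)


              class Machine:
                  def __init__(self):
                      self.mem = [0] * MEM_SIZE
                      self.csp = CALL_BASE  # call stack pointer (grows upward in this model)
                      self.dsp = DATA_BASE  # data stack pointer (grows upward in this model)

                  def call(self, return_addr):
                      # Push a two-word activation record:
                      # the return address and the caller's data stack pointer.
                      self.mem[self.csp] = return_addr
                      self.mem[self.csp + 1] = self.dsp
                      self.csp += 2

                  def alloca(self, n):
                      # Reserve n words on the data stack and return the buffer base.
                      base = self.dsp
                      self.dsp += n
                      return base

                  def ret(self):
                      # Pop the record, restore the data stack pointer,
                      # and hand back the return address.
                      self.csp -= 2
                      self.dsp = self.mem[self.csp + 1]
                      return self.mem[self.csp]


              m = Machine()
              m.call(return_addr=0x401000)
              buf = m.alloca(4)

              # Contiguous overflow: write 12 words into a 4-word buffer.
              for i in range(12):
                  m.mem[buf + i] = 0x41414141

              returned_to = m.ret()  # still the original return address
              ```

              The model says nothing about an attacker who can write to an arbitrary address; it covers only the linear-overflow case.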

              Indeed, it is possible to avoid storing any pointers to the call stack in memory at all. In the case of, for example, setjmp/longjmp: align the call stack (to, say, 1 MiB); then store in the jmp_buf only the low bits (say, 20 of them) of the call stack pointer, and when longjmping, restore only those low bits. Consequently, even an attacker with arbitrary read/write primitives would have no good way of finding the call stack.
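
              The low-bits trick can be sketched with plain integer arithmetic. The function names and the example addresses below are hypothetical, and the sketch assumes the call stack pointer always stays strictly inside one 1 MiB-aligned region, so that its high bits never change.

              ```python
              ALIGN_BITS = 20                   # call stack aligned to 1 MiB (2**20 bytes)
              LOW_MASK = (1 << ALIGN_BITS) - 1  # selects the low 20 bits


              def save_low_bits(call_sp):
                  # What a hypothetical setjmp would write into the jmp_buf.
                  return call_sp & LOW_MASK


              def restore(current_call_sp, saved_low):
                  # What a hypothetical longjmp would do: take the high bits from the
                  # live register and only the low bits from memory.
                  return (current_call_sp & ~LOW_MASK) | saved_low


              stack_base = 0x7F3A12300000       # made-up address with the low 20 bits zero
              sp_at_setjmp = stack_base + 0x400
              sp_at_longjmp = stack_base + 0x9C0  # some deeper frame, same region

              saved = save_low_bits(sp_at_setjmp)  # only this value ever reaches memory
              restored = restore(sp_at_longjmp, saved)
              ```

              Here `saved` is just `0x400`, which reveals nothing about where the region sits in the address space.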

              This arrangement also enables much faster stack unwinding, which is desirable independently of any safety concerns. (EDIT: faster only for the purposes of tracing and profiling and such like; nonlocal exits like thrown exceptions will be as hard as ever unless you get rid of callee-saved registers.)

              1. 7

                The SafeStack code in LLVM does this (anything address taken goes on the main stack, anything else [including return addresses] goes on the safe stack), but I think you are overstating the security claims. It prevents stack buffer overflows from being turned into arbitrary code execution vulnerabilities. That’s a win, but it doesn’t prevent ROP. Any code that’s able to get a pointer to the safe stack can corrupt it. If the location of the safe stack is predictable (even probabilistically: if you’re attacking a million machines, a 1% chance of success gives you a nice big botnet) then any pointer-injection attack lets you modify values that are on the safe stack. Speculative side channels make it fairly easy to probe the address space. Worse, it’s often easier to do pointer spraying attacks on the safe stack because it is less sparse: there’s a much higher probability that a write there will hit a return address than anywhere else.

                Intel’s CET works in a similar way but makes the safe stack non-addressable so only explicit pushes and pops modify values on it. This makes it impossible to just take the address of it and overwrite it. For ABI compatibility, CET doesn’t replace the stack: it duplicates the return addresses stored on the main stack and traps if the two copies differ.
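
                As a rough illustration of the duplicate-and-compare idea, here is a toy Python model. It is not Intel’s actual mechanism: the class and exception names are invented, and Python cannot enforce that the shadow copy is unwritable, so that part is only a convention in the sketch.

                ```python
                class ControlProtectionFault(Exception):
                    """Stands in for the trap raised when the two copies disagree."""


                class ShadowStackCPU:
                    def __init__(self):
                        self.main_stack = []  # ordinary stack, writable by software
                        self._shadow = []     # in this model, touched only by call/ret

                    def call(self, return_addr):
                        # The return address is pushed to both stacks.
                        self.main_stack.append(return_addr)
                        self._shadow.append(return_addr)

                    def ret(self):
                        # Pop both copies and trap if they differ.
                        addr = self.main_stack.pop()
                        expected = self._shadow.pop()
                        if addr != expected:
                            raise ControlProtectionFault(
                                f"return to {addr:#x}, shadow stack says {expected:#x}"
                            )
                        return addr


                cpu = ShadowStackCPU()

                # Normal call/return: both copies agree.
                cpu.call(0x401000)
                ok = cpu.ret()

                # Simulated overflow: the attacker rewrites the main-stack copy only.
                cpu.call(0x401000)
                cpu.main_stack[-1] = 0xDEADBEEF
                try:
                    cpu.ret()
                    trapped = False
                except ControlProtectionFault:
                    trapped = True
                ```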

                It’s fairly easy to demonstrate that any CFI scheme is bypassable if you don’t have memory safety. There has been less effort on these schemes in the last few years: they increase the attacker’s work factor, but they are eventually bypassed, the bypasses then become automated, and you are left maintaining the complexity of the defence.

                1. 1

                  I don’t see why the location of the call stack would be at all predictable; we have strong randomness, and if your randomness source is broken, you have far bigger problems. (And I gave the example of a 1 MiB stack because it’s nice and round, but that’s quite generous; you could go for, say, 16 KiB, and have space for 1024 recursive calls while still having 33 bits of entropy to protect the stack.)
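
                  The arithmetic behind those figures, as a quick sketch. It assumes a 47-bit userspace address space, two 8-byte words per activation record, and a call stack placed uniformly at random on a boundary equal to its own power-of-two size; none of these is stated in the comment itself, and the function name is made up.

                  ```python
                  VA_BITS = 47       # assumed userspace virtual address width
                  RECORD_BYTES = 16  # assumed: return address + data stack pointer, 8 bytes each


                  def entropy_bits(stack_bytes, va_bits=VA_BITS):
                      # A region of 2**k bytes aligned to 2**k can start at 2**(va_bits - k)
                      # distinct addresses, which gives va_bits - k bits of entropy.
                      k = stack_bytes.bit_length() - 1  # log2 for power-of-two sizes
                      return va_bits - k


                  small_stack = 16 * 1024                  # 16 KiB
                  max_depth = small_stack // RECORD_BYTES  # activation records that fit
                  small_entropy = entropy_bits(small_stack)
                  large_entropy = entropy_bits(1 << 20)    # the 1 MiB variant
                  ```

                  Under these assumptions the 16 KiB stack holds 1024 records and leaves 33 bits, while the 1 MiB stack leaves 27 bits.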

                  If you can probe the address space, it seems likelier you can find and corrupt a function pointer than the call stack, as the heap will have a regular structure and be a much larger target. Of course you can corrupt the stack (memory safety is no absolute protection either!), but this seems to demote ROP from a significant exploit category whose mitigations are often defeated to basically a curiosity.

                  1. 3

                    > I don’t see why the location of the call stack would be at all predictable; we have strong randomness,

                    Because it’s moderately large and you typically have only a 47-bit VA space for userspace. 33 bits of entropy is not that much to probe. On platforms that just do ASLR, once you leak one pointer you know the random displacement and so any information disclosure tells you where it is. On platforms that do full ASR, you have more probing to do, but you also have accesses to it in function prologues and epilogues and so finding a speculative execution gadget that lets you leak it is fairly easy. The structure on the shadow stack is predictable and so is a great place to inject your gadgets (it’s basically a ROP gadget machine: it’s a pile of values that go into registers and return addresses, tightly packed, so if you can do one arbitrary write somewhere into it then you can trivially build a Turing-complete weird machine).

                    Defences that depend on secrets were never robust, but since Spectre was disclosed the number of techniques for comprehensively breaking them has exploded. The techniques for breaking things like SafeStack are now nicely automated and available to script kiddies.

              2. 1

                hey thanks for this – very clear and concise