1. 4

    I really like Ansible and I’m totally going to see if I can use all or part of this tutorial.

    It bothers me a bit that updating OpenBSD 6.3 -> 6.4, per the official documentation, requires booting the installation media to perform the upgrade. In the world of cloud providers and VMs, I want to put together a guide to attempt to do this semi-in-place with a single reboot.

    I’m glad I read all the release notes before attempting anything though. The OpenSMTPD configuration grammar has changed entirely. I’m going to have to redo all my work in a VM to make sure it all still works.

    1. 4

      The big issue with “in place” upgrades in the OpenBSD world is that there is no guarantee the ABI between X.Y and X.Y+1 will be the same. This can cause all sorts of issues while doing in-place upgrades. For example, tar, once replaced by the updated binary, could segfault on every subsequent call. This would leave the system in an unknown state.

      I wrote an upgrade tool a while back (snap) that could be used to upgrade from release to release. The ABI issue was hit every couple of releases, so I removed the option to upgrade releases.

      I am not saying it’s impossible, just that you will basically have to back up everything prior to doing an install.

      1. 1

        Following -current, an in-place upgrade mostly just works, but I also always keep a new bsd.rd ready; if the in-place upgrade fails, rebooting into bsd.rd and upgrading from there will fix it.

        But I have also switched to a script that downloads sets, patches bsd.rd and reboots. Much less hassle and minimal downtime.

        1. 1

          This. I do the same - download bsd.rd, add an auto_upgrade.conf file to the image, then use that bsd.rd on all my systems to upgrade them. Just copy the patched bsd.rd over /bsd on the target, reboot, wait a few minutes, and the box is back up on the new release. I wrote my own script ages ago, but nowadays the upobsd port can take care of the bsd.rd patching bit.

      2. 2

        “I want to put together a guide to attempt to do this semi-in-place with a single reboot.”

        There is such a guide in the official upgrade notes. The only reason it suggests two reboots is KARL.

        1. 1

          Back in the day, I just had a script that downloaded and extracted sets and installed a new bsd. Then I rebooted and ran another script that took care of /etc changes and new users, and cleaned up old files no longer needed (according to the release notes). The last step was just a pkg_add -U.

          I didn’t run into issues but I was aware of the risks and being on my own when it broke ;-)

        1. 1

          The author says a common class of gadgets uses such-and-such registers, and says to avoid them in favor of other registers. Maybe the gadget type with those registers is common because the registers themselves are common from compiler choices. Switching registers might lead to gadgets just using those registers instead. Or are there x86-specific reasons that using different registers will do entirely different things you can’t gadget?

          Other than that confusion, slides look like great work. Especially on ARM.

          1. 15

            Author here. Thanks for having a look! It was fun to do this talk.

            Yes, there are x86-specific reasons that other registers don’t result in ROP gadgets. If you look at Table 2-2 in the Intel 64 and IA-32 Architectures Software Developer’s Manual you can see all of the ModR/M bytes for each register source/dest pair, and other places in that section describe how to encode the ModR/M bytes for various instructions using all of the possible registers.

            When I surveyed the gadgets in the kernel and identified which intended instructions resulted in C3 bytes that were used as returns in gadgets, there were a large number of gadgets that were terminating on the ModR/M byte encoding the BX series registers. You are correct that these gadgets are common because the compiler frequently chooses to use the BX series registers, and the essence of my change to clang is to encourage the compiler to choose something else.

            By shifting RBX down behind R14, R15, R12 and R13, the compiler will choose these registers before RBX, and therefore reduce the incidence of the use of RBX resulting in a C3 ModR/M byte. We can see that this works because just shifting the BX registers down the list results in fewer unique gadgets.

            To directly answer your inquiry, gadgets arising from using R14, R15, R12, R13 instead (now that they will be more common) are not a problem. The REX prefix is never C3, and we can look at the ModR/M bytes encoding operations using those registers, and none of them will encode to C3. When I look at gadgets that arise from instructions using these registers, they don’t get their C3 bytes from the instruction encoding - they get them from constants where the constant encodes to a C3, so the register used is irrelevant in these cases. So moving RBX down behind R14, R15, R12 and R13 doesn’t result in more gadgets using those registers.

            There are other register pairs that result in a C3 ModR/M byte. Operations between RAX and R11 can result in a C3 ModR/M byte, but these are less common when we survey gadgets in the kernel (~56 in the kernel I have here now). RAX and R11 were already ahead of RBX in the default list anyway, so moving RBX down the list does not result in more gadgets using R11. If you ask why we haven’t moved R11 down next to RBX, the answer is that gadgets using R11 this way are not that numerous, so it hasn’t risen to the top of the heap of most-common-sources-of-gadgets (and therefore has not got my attention). There are many other sources of gadgets that can be fixed and will have a larger impact on overall gadget counts and diversity.
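
            To make the polymorphic-gadget idea above concrete, here is a tiny sketch of my own (not from the talk; it assumes the third-party capstone module). The bytes are the standard encoding of mov rbx, rax, whose ModR/M byte is 0xC3 - the same byte that decodes as a ret when an unaligned decode starts on it.

                from capstone import Cs, CS_ARCH_X86, CS_MODE_64

                md = Cs(CS_ARCH_X86, CS_MODE_64)
                code = b"\x48\x89\xc3"   # standard encoding of: mov rbx, rax

                # Aligned decode: the intended instruction.
                for insn in md.disasm(code, 0):
                    print(insn.mnemonic, insn.op_str)   # mov rbx, rax

                # Unaligned decode starting on the ModR/M byte: a return.
                for insn in md.disasm(code[2:], 2):
                    print(insn.mnemonic, insn.op_str)   # ret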

            I hope this clarifies that part of the talk. :-)

            1. 3

              Thanks everyone for the answers. Thank you in particular for this very detailed answer that clarifies how x86’s oddities are creating the attack vectors.

              The reason I wanted to know is that I planned to design around high-end ARM chips instead of x86 where possible, because I believed we’d see fewer ISA-related attacks. Also, certain constructions for secure code might be easier to do on RISC with less performance hit. Your slides seem to support some of that.

              1. 2

                To be fair, x86 doesn’t create the attack vectors, but does make any bugs much easier to exploit.

                ARM doesn’t have nearly the same problem - you can always ROP into a jump to THUMB code on normal ARM instructions, but these entry points are usually more difficult to find than an 0xc3.

              2. 1

                I’m curious to learn more about ROP. I’d like to look into adding support for another target to ROPgadget.py. So what designates a gadget? Any sequence of instructions ending in a return? How do attackers compose functionality out of gadgets? By hand, or is there some kind of ‘compiler’ for them?

                1. 3

                  You might be interested in the ROP Emporium’s guide. Off the top of my head the only automatic tools I know of are ropper and angrop.
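
                  To make the “any sequence ending in a return” idea concrete, here is a minimal sketch of my own of what such tools search for (it assumes the third-party capstone module and a raw dump of a binary’s text segment in a hypothetical file named text.bin): find every 0xC3 byte, then keep any run of bytes before it that disassembles cleanly and ends exactly on that ret - including runs that start in the middle of an intended instruction.

                      from capstone import Cs, CS_ARCH_X86, CS_MODE_64

                      md = Cs(CS_ARCH_X86, CS_MODE_64)
                      code = open("text.bin", "rb").read()   # hypothetical raw text-segment dump
                      DEPTH = 8   # how many bytes before each ret to consider

                      for ret_off in (i for i, b in enumerate(code) if b == 0xC3):
                          for start in range(max(0, ret_off - DEPTH), ret_off + 1):
                              insns = list(md.disasm(code[start:ret_off + 1], start))
                              # Keep it only if decoding consumed every byte and ended on the ret.
                              if insns and insns[-1].mnemonic == "ret" \
                                      and sum(i.size for i in insns) == ret_off + 1 - start:
                                  print(hex(start), "; ".join(
                                      f"{i.mnemonic} {i.op_str}".strip() for i in insns))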

              3. 5

                “Switching registers might lead to gadgets just using those registers instead. Or are there x86-specific reasons that using different registers will do entirely different things you can’t gadget?”

                If I understand this correctly, it’s because instructions that use the ebx register are encoded with bytes that happen to contain the return opcode, i.e., byte sequences that are useful in ROP. So by avoiding ebx as much as possible, you also avoid creating collateral ROP gadgets with early returns. This issue only happens because x86/amd64 have variable-length instructions.

                1. 4

                  As far as I understand, the register allocation trick is indeed x86-specific. The point is to avoid C3 bytes because these will polymorph into the RET instruction when used in unaligned gadgets. See the “polymorphic gadget” and “register selection” sections in the slide set.

                1. 1

                  This seems really cool. I’d love to have email more under my own control. I also need 100% uptime for email though, so it’s hard to contemplate moving from some large hosted service like Gmail.

                  1. 4

                    If email is that important to you (100% uptime requirement), then what’s your backup plan for a situation where Google locks your account for whatever reason?

                    1. 1

                      Yeah, that’s true. I mean I do have copies of all my email locally, so at least I wouldn’t lose access to old email, but it doesn’t help for new email in that eventuality.

                    2. 3

                      Email does have the nifty feature that (legit) mail servers will keep retrying SMTP connections to you if you’re down for a bit, so you don’t really need 100% uptime.

                      Source: ran a mail server for my business for years on a single EC2 instance; sometimes it went down, but it was never a real problem.

                      1. 1

                        True. I rely on email enough that I’m wary of changing a (more or less) working system. But I could always transition piece by piece.

                      2. 3

                        If you need 100% delivery, then you can just list multiple MX records. If your primary MX goes down (ISP outage, whatever), then your mail will just get delivered to the backup. My DNS registrar / provider offers backup MX service, and I have them configured to just forward everything to gmail. So when my self hosted email is unavailable, email starts showing up via gmail until the primary MX is back online. Provides peace of mind when the power goes out or my ISP has outages, or we’re moving house and everything is torn apart.
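
                        As a rough illustration of the record layout (this assumes the third-party dnspython package and placeholder host names, not my actual records): sending servers try MX hosts from the lowest preference value up, so a backup MX is just a second record with a higher number.

                            import dns.resolver

                            # Print the MX records in the order sending servers will try them.
                            answers = dns.resolver.resolve("example.org", "MX")
                            for rr in sorted(answers, key=lambda r: r.preference):
                                print(rr.preference, rr.exchange)

                            # Example output (placeholder hosts):
                            #   10 mail.example.org.       primary, self-hosted
                            #   20 mx-backup.example.net.  backup at the registrar, forwards onward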

                        1. 1

                          That’s a good system that seems worth looking into.

                        2. 2

                          Note that email resending works. If your server is unreachable, the sending mail server will actually try the secondary MX server, and if both are down, it will retry half an hour later, then a few more times up to 24 hours later, 48 hours if you are lucky. The sender will usually receive a notification if the initial attempts fail (and a second one when the sending server gives up).

                          On the other hand, if your GMail spam filter randomly decides without a good reason that a reply to your email is too dangerous even to put into the spam folder, neither you nor the sender will be notified.

                          1. 1

                            And I have had that issue with GMail, both as a sender and a receiver: mail inexplicably going missing. Not frequently, but it occurs.

                        1. 1

                          What is the overhead of this?

                          I love OpenBSD, though one of my coworkers told me he fundamentally disagrees with defense in depth now that we are starting to get memory-safe languages, and I somewhat agree. Still, while our useful kernels and applications are written in C, it seems like the best thing to do for now.

                          1. 3

                            The overhead is two xor instructions per function call (using a register and the top of the stack). This is cheap.
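
                            For intuition only, here is a toy model of that two-xor mangling in plain Python (not the actual compiler-emitted instructions, and the addresses are made up): the saved return address is xor’d with the stack pointer on function entry and xor’d again before the return, so normal flow is unchanged, but a raw gadget address written over the saved word no longer decodes to what the attacker wrote.

                                def prologue(retaddr, sp):
                                    return retaddr ^ sp   # first xor, on function entry

                                def epilogue(saved, sp):
                                    return saved ^ sp     # second xor, just before the ret

                                sp  = 0x00007F7FFFFFD000     # made-up stack pointer
                                ret = 0xFFFFFFFF81234567     # made-up legitimate return address
                                assert epilogue(prologue(ret, sp), sp) == ret   # normal flow intact

                                gadget = 0xFFFFFFFF8100ABCD  # address an attacker writes directly
                                assert epilogue(gadget, sp) != gadget           # scrambled on the way out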

                            Memory safe languages have their own overhead. Even Rust - which achieves so much at compile time - still has to perform some checks at runtime, and doesn’t enable integer overflow protection by default because it is too expensive. I am a big fan of memory safe languages, but there is still a lot of C/C++ out there.

                          1. 1

                            Why are all the infosec people I follow charitably saying this is theatre at best and doesn’t do anything for any kind of attack?

                            1. 8

                              The most common negative response I have seen is that this can be bypassed if an attacker knows the addresses they will write their ROP chain to. This is true, but it is not the case that all attackers know the addresses where the ROP chain goes. The @grsecurity response is interesting, since they point out that this idea has been seen before (quite some time ago - in 1999 and 2003). If you have heard other specific criticisms, then I’d be interested to hear them.

                              The next iteration of this doesn’t have to use the stack pointer - it can use something stronger. Step 1 is getting the ecosystem working with mangled return addresses. For this, the stack pointer is cheap and easy.

                            1. 5

                              Does OpenBSD have any plans to upstream RETGUARD to llvm?

                              1. 14

                                Definitely. The llvm people I have spoken with have pointed out it might be better to implement this in the prologue / epilogue lowering functions instead of as a pass, so once we prove it works in the ecosystem and have worked out any kinks, then I will do it that way and submit it upstream.

                                1. 2

                                  Very cool! Thank you for the detailed answer.

                                2. 3

                                  Sure. Why not?