1. 4

    I always wonder why MsgPack took off over similar tools like Cap’n Proto.

    1. 21

      they’re different things: msgpack is self-describing and is basically binary JSON, whereas capnproto comes with an IDL, an RPC system, etc.

      1. 4

        That is an excellent question! Could be some kind of worse-is-better thing, due to how very compatible with JSON MessagePack is. But then, why not BSON? I would love to see some analysis of usage trends for these new binary serialization protocols, including Protobufs, CBOR, BSON, and Cap’n Proto.

        1. 9

          I want to like CBOR; it has a Standards Track RFC (7049), after all. But the standard does define some pretty odd behavior, last I looked. There’s a way to do streaming which, effectively, could require you to allocate forever.

          I’ve not used Cap’n Proto, but have definitely used Protobuf. I greatly prefer the workflow of using MsgPack, but do also appreciate the schema enforcement being generated for you with Protobuf; it does get in the way early on in dev, though. :/

          1. 4

            Funny, it’s the other way for me; I prefer to nail down the schema as early as possible. I thought Cap’n Proto was pretty great compared to JSON, but I might not be as enthusiastic if I was trying to port over some sprawling legacy thing that never had a very well-defined schema in the first place.

            1. 4

              I do ad hoc, investigative stuff far more often than I do work that goes into production. That being said, the last few projects I’ve had involvement in have started out with defining a schema in protobuf and moving forward that way. Prematurely, in both cases, probably. :)

            2. 4

              Yeah, but the streaming is optional and can be used as a ready-made framing format for your whole TCP session instead of inventing an ad-hoc one.

              Seriously, stop inventing new protocols. Just use newline-separated JSONs or CBOR. Please? Pretty please?

              1. 5

                  You can stream MsgPack objects back-to-back without any problem at all - it’s just concatenating objects one after the other on the wire, in a file, in a kafka queue, etc. You can do the same thing in CBOR, but CBOR makes it ambiguous whether you should wrap the objects inside an indefinite-length array or something, and then says that some clients might not like that.

                And this encapsulates the problem with CBOR - it defines a bunch of optional features of dubious value (tags, ‘streaming mode’, optional failure modes) that complicate interoperability and bloat the specification. The MsgPack spec is tiny and unambiguous.

                It’s really a shame that CBOR was forked from MsgPack and submitted to the IETF against the will of the original authors. Now we have two definitions of essentially the same thing, but one of them is concise, and the other one is an IETF standard.

                1. 1

                  It might have been better, yes. But you can use strict mode, the standard can be revised and IANA runs a tag registry. In this case, as much as I hate the saying, good is better than better.

                  Without a clear signal “use this” and proper hype some people might consider alternatives. And that will inevitably lead to custom formats. We need an extensible TLV format with approximately JSON semantics to move forward. Not more “key value\n” without escaping or dumping packed data structures to the wire.
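
                  For concreteness, here is a minimal sketch of what such an extensible TLV encoding looks like. This is a made-up one-byte-tag format for illustration (not MsgPack or CBOR); the extensibility comes from the decoder being able to skip records whose tags it does not understand.

                  ```rust
                  // Minimal type-length-value (TLV) sketch: one tag byte, a 4-byte
                  // big-endian length, then the payload. Hypothetical format.

                  fn tlv_encode(tag: u8, value: &[u8]) -> Vec<u8> {
                      let mut out = Vec::with_capacity(5 + value.len());
                      out.push(tag);
                      out.extend_from_slice(&(value.len() as u32).to_be_bytes());
                      out.extend_from_slice(value);
                      out
                  }

                  // Decode one TLV record, returning (tag, value, rest-of-input),
                  // or None if the buffer does not yet hold a complete record.
                  fn tlv_decode(buf: &[u8]) -> Option<(u8, &[u8], &[u8])> {
                      if buf.len() < 5 {
                          return None;
                      }
                      let tag = buf[0];
                      let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
                      if buf.len() < 5 + len {
                          return None; // incomplete record
                      }
                      Some((tag, &buf[5..5 + len], &buf[5 + len..]))
                  }

                  fn main() {
                      // Unknown tags can simply be skipped over, which is what
                      // makes a TLV stream forward-extensible.
                      let mut wire = tlv_encode(0x01, b"hello");
                      wire.extend(tlv_encode(0x7f, b"future extension"));
                      let (tag, val, rest) = tlv_decode(&wire).unwrap();
                      println!("tag={:#04x} value={:?} rest={} bytes", tag, val, rest.len());
                  }
                  ```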

                2. 1

                  ready-made framing format for your whole TCP session instead of inventing an ad-hoc one.

                  Is this actually true? I assumed the streaming allowed for an arbitrary nesting, say:

                  [ {...},
                    {...},
                  ....
                  ]
                  

                  Where you really can’t finish reading until that last ] forcing continued growth in allocations. I suppose your implementation could say “an outer array will emit the inner elements via a callback” … then you don’t have to allocate the world. But what if the inner object also uses streaming? Is that a thing that can happen?

                  Seriously, stop inventing new protocols. Just use newline-separated JSONs or CBOR. Please? Pretty please?

                  Probably don’t want newline delimited CBOR, or MsgPack. Might I suggest you stick a MsgPack integer before your object, decode that, and then read that many bytes more, avoiding delimiters altogether? Sure was nice back when I was using “framed-msgpack-rpc”…
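
                  The length-prefix idea can be sketched in a few lines of Rust. This uses a plain big-endian u32 prefix (one of the options mentioned downthread) instead of a MsgPack integer, just to keep the sketch dependency-free; the principle is the same.

                  ```rust
                  use std::io::{self, Read, Write};

                  // Length-prefixed framing sketch: a 4-byte big-endian length before
                  // each payload, so the reader knows exactly how many bytes to consume
                  // and no delimiter or escaping is needed.

                  fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
                      w.write_all(&(payload.len() as u32).to_be_bytes())?;
                      w.write_all(payload)
                  }

                  fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
                      let mut len_buf = [0u8; 4];
                      r.read_exact(&mut len_buf)?;
                      let len = u32::from_be_bytes(len_buf) as usize;
                      let mut payload = vec![0u8; len];
                      r.read_exact(&mut payload)?;
                      Ok(payload)
                  }

                  fn main() -> io::Result<()> {
                      let mut wire = Vec::new();
                      write_frame(&mut wire, b"first message")?;
                      write_frame(&mut wire, b"second message")?;

                      // Frames concatenate back-to-back; reading stops cleanly at EOF.
                      let mut cursor = io::Cursor::new(wire);
                      while let Ok(frame) = read_frame(&mut cursor) {
                          println!("{}", String::from_utf8_lossy(&frame));
                      }
                      Ok(())
                  }
                  ```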

                  1. 2

                    I believe you can have nested streaming, though I can’t think of a practical use-case outside of continuously streaming “frames” of data. The endless allocation isn’t really endless, it goes until whatever is producing the data marks the end of it. If you’re pipelining your data handling, streaming like this is extremely useful because it allows you to stream large datasets while keeping things stateless/within the same request context. Most implementations don’t support streaming though.

                    1. 1

                      Just use newline-separated JSONs or CBOR. Please? Pretty please?

                      Probably don’t want newline delimited CBOR, or MsgPack.

                      I am not a native English speaker and I thought the comma made it ((newline-separated JSONs) || (CBOR)).

                      1. 2

                        I think you did everything right, English-syntax-wise. I messed up reading your intention. Even with your clarification, though, I still think framing MsgPack with the number of bytes in the message is a great idea. :)

                        1. 1

                          And do you frame it using its compact integer notation, or fully wrap it in a blob, or do you settle for a uint32be?

                3. 8

                  I don’t have usage trends, but I wrote up a compare-and-contrast to all these various things a little while ago: https://wiki.alopex.li/BetterThanJson

                  Long story short, MsgPack is intended to be schema-less, or at least schema-optional, like JSON (and also CBOR). Cap’n Proto, like Protobufs, assumes a schema to begin with, which makes it much more complicated, with more tooling attached, and potentially much faster.

                  Also the more I look at CBOR and MsgPack in terms of encoding details the more similar they look to me; they both seem to very obviously share the same lineage. Take a look at the encoding of the example on the MsgPack website and the CBOR version.

                4. 3

                  Slightly less juvenile name?

                  More seriously though, has MsgPack “taken off”? From what I can see most stuff offered as an API format is JSON. I guess for internal messaging something less fat on the wire is valuable.

                  1. 3

                    It definitely has some level of adoption. I’ve been using Rocket (Rust web framework) and noticed it’s mentioned in the minor version release notes. Which leads me to assume that MsgPack has at least enough interest for issues with it to get fixed in a non-mainstream framework. Not a super strong basis for this conclusion, but I think we’ll continue to see interest in MsgPack growing.

                    1. 6

                      For how old it is, I think it has absolutely not taken off. Also, Rocket is not even Rust-mainstream.

                      1. 7

                        Is any web framework Rust mainstream? 😛

                1. 38

                  Rust.

                  The dev experience is so much nicer than my usual C/C++. After spending a lot of time writing and doing code reviews of C, C++, and Rust, I am pretty convinced that it is much easier to write correct code the first time in Rust than it is in the others, and Rust has equally nice performance properties but is much easier to deploy.

                  I spend most of my day working on high performance network software. I care about safety, correctness, and performance (in that order). The rust compiler pretty much takes care of the first item without any help from me, makes it very easy to achieve the second one, and is just as good as the alternatives for the third.

                  1. 6

                    I’m curious if you’ve ever tried another — non C/C++/Rust — language (anything garbage collected or dynamically typed) for projects where you don’t necessarily care about the fastest runtime? Is that ever relevant, or do you really only work on “high performance network software”?

                    1. 8

                      I work in games, and my experience is very similar to mortimer’s. I would go Rust with no hesitation.

                      I’ve done a lot of C# with Unity, and quite a bit of Go. I’d pick Rust over both of them any day of the week.

                      The big thing with C# in games is that you lack control, and also have to do generally more memory management than even C++; working around the garbage collector is not fun.

                      1. 7

                        Sure, there is some stuff where performance doesn’t matter too much, and for those we’re free to choose something else. Python is pretty popular in this space, though even for these things I’d still consider using Rust instead just because the compiler makes it harder to screw up error handling and such.

                        I did a transparent network proxy in Ruby once, and that was super nice because Ruby is super nice, but if I were to do it again today then I’d pick Rust. Most of the code wasn’t something you’d get from a library, and the vast bulk of bugs I had to handle would have been squashed by a better type system (this thing that is usually a hash is suddenly an array!) and better error handling (this thing you thought would work did not, and now you have a nil object!). Ruby (and Python) just don’t help you at all with these things because they’re dynamically typed and will usually return nil to indicate error (or Python will sometimes throw, which is just offensive). This paradigm where the programmer has to manually identify all the places where errors can happen by reading the documentation, and then actually remember to do the check at runtime, is really failure prone - inevitably someone does not remember to check and then you get mystery failures at runtime in prod. Rust’s Result and Option types force the programmer to deal with things going wrong, and translate the vast bulk of these runtime errors into compile time errors (or super-obvious-at-code-review-time unwrap()s that you can tell them to go handle correctly).
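
                        The Result point can be sketched in a few lines; `port_for` here is a made-up example function, not from any library:

                        ```rust
                        // A fallible lookup returns Result, so the caller cannot silently
                        // ignore the failure case the way a nil return can be ignored.

                        #[derive(Debug, PartialEq)]
                        enum LookupError {
                            NotFound,
                        }

                        fn port_for(service: &str) -> Result<u16, LookupError> {
                            match service {
                                "http" => Ok(80),
                                "https" => Ok(443),
                                _ => Err(LookupError::NotFound),
                            }
                        }

                        fn main() {
                            // The compiler forces a decision at every call site: match,
                            // `?`, unwrap_or, or an explicit (and review-visible) unwrap().
                            match port_for("gopher") {
                                Ok(port) => println!("connect to port {port}"),
                                Err(LookupError::NotFound) => println!("unknown service"),
                            }
                            let port = port_for("http").unwrap_or(8080);
                            println!("falling back to {port}");
                        }
                        ```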

                        I haven’t really done any professional Java dev, but the people I know who do Java dev seem happy with it. They don’t have any complaints about performance - and they deploy in places where performance matters. When they do complain about Java, they complain about the bloat (?) of the ecosystem. FactoryFactoryFactories, 200 line backtraces, needless layers of abstraction, etc. I don’t think they’re looking to change, so they must be happy enough. When I did Java in school I remember lots of NullPointerExceptions though, so I assume the same complaint I have about Ruby / Python / C / C++ error handling would apply to Java.

                        For personal projects, it was usually Ruby (because Ruby is super nice), but lately all the new stuff is Rust because the error handling is so much better and it’s easier to deploy. Even when I don’t care about it being fast I do care about it being correct.

                      2. 1

                        Another reason: Attract good developers!

                        That’s the flipside of all the good technical reasons, plus actually some of the bad - learning curve and newness.

                        There are too few Rust and Haskell jobs, a fair number of C and C++ jobs, and absurdly many Java jobs.

                        1. 11

                          In order to validate the ‘learning curve for newbies’ concern, I actually gave Rust to a new employee (fresh out of uni) to see what would happen. They had a background in Java and hadn’t heard of Rust before then. I gave them a small project and suggested they try Rust, then sat back to see what happened. They were productive in about a week, had finished the project in about two weeks, and that project has been running in production ever since without any additional care or feeding for over a year now. This experience really cemented for me that Rust isn’t that hard to learn, even for newbies. The employee also seemed to enjoy it (this is a bit of an understatement), so if new staff can be both productive and happy then I’m not too concerned about learning curves and stuff.

                          1. 4

                            The vast majority of people who write about Rust online mention fighting the borrow checker. Your new folks didn’t have that problem?

                            1. 8

                              Having helped both a few co-workers and a fresh intern with answering Rust questions as they learned it, I’ve come up with a theory: fighting the borrow checker is a symptom of having internalized manual memory management in some previous language before learning Rust. And especially severe cases of it come from having internalized some aspect of manual memory management wrong. People who don’t have that are much more likely to be open to listening to the compiler than people who “know” they’re already implementing it right and just need to “convince the compiler”.

                              1. 8

                                I find that I can often .clone() my way out of problems for now and still be correct.

                                Sometime later I can revisit the design to get better performance.
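
                                That pattern, sketched with a made-up `shout` function: the naive call would move the vector and fail to compile when it is used again, and the clone is the quick, still-correct fix that can be engineered away later if it ever shows up in a profile.

                                ```rust
                                // `shout` takes ownership of its argument, C++-move style.
                                fn shout(mut words: Vec<String>) -> Vec<String> {
                                    for w in &mut words {
                                        *w = w.to_uppercase();
                                    }
                                    words
                                }

                                fn main() {
                                    let names = vec!["alice".to_string(), "bob".to_string()];
                                    // Without the .clone(), the call would move `names` and the
                                    // println! below would be a compile error, not a runtime bug.
                                    let loud = shout(names.clone());
                                    println!("{:?} -> {:?}", names, loud);
                                }
                                ```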

                                1. 4

                                  Oh yes, new people fight the borrow checker but it just isn’t that bad (at least not in my experience) and they seem to get past it quickly. The compiler emits really excellent error messages so it’s easy to see what’s wrong, and once they get their heads around what kinds of things the borrow checker is concerned about they just adapt and get work done.

                                  1. 3

                                    I felt that I wasn’t fighting it. It was difficult, but the compiler was so helpful that it felt more like the compiler was teaching me.

                                    (That said, I was coming from Clojure, which has terrible compilation errors.)

                                    1. 1

                                      Not sure about his employee’s perspective. But, I’m new to writing in Rust, and I think the frustration with the borrow checker is not understanding (or maybe just not liking?) what it is trying to do. My experience has been that at first I wanted to just try to build something in Rust and work through the documentation as I go. In that case the borrow checker was very frustrating and I wanted to just stop. But, instead I worked my way through the Rust book and the examples. Now I’ve picked up the project again, and it isn’t nearly as frustrating because I understand what the borrow checker and ownership stuff is trying to do. I’m enjoying working on the project now.

                                    2. 2

                                      This experience really cemented for me that Rust isn’t that hard to learn, even for newbies.

                                      Counter-anecdote – we have a team at $job that works entirely in Rust, and common complaints from the team are:

                                      1. The steep learning curve and onboarding time for new team members
                                      2. The Very Slow compile times

                                      We aren’t hiring many folks direct from uni though – so perhaps counter-intuitively, having more experience in other languages may make learning Rust more difficult for some, and not less? Unsure.

                                1. 16

                                  Consider CBOR

                                  1. 9

                                    CBOR is so overlooked; I really wish it had as much attention as MessagePack.

                                    https://tools.ietf.org/html/rfc7049

                                    1. 3

                                      I have done some decent digging on both and they honestly look pretty similar in purpose and design. Do you have any opinions or references for important differences?

                                      1. 1

                                        There’s a comparison in the RFC:

                                        https://tools.ietf.org/html/rfc7049#appendix-E.2

                                        I honestly can’t remember exactly why I chose CBOR over MessagePack, and yes, they are very similar. A big one for me was that CBOR is backed by an RFC, but on the other hand MessagePack’s wider use may make it preferable depending on the situation. There are probably more important things to worry about though!

                                        1. 1

                                          When we had to choose between MessagePack and CBOR we landed on msgpack because it is so much simpler and more straightforward. The spec is trivial to read and understand and there are many libraries available in a variety of languages. CBOR seems ambiguous and complex by comparison, particularly with tags being optional for implementations to interpret and weirdly arbitrary (MIME?).

                                          There is also some history between the msgpack community and the CBOR guy, but that isn’t a technical point.

                                      2. 3

                                        CBOR is an excellent “efficient/binary JSON” but it’s, well, JSON. The main attraction of these protobuf style things is the schema stuff.

                                        I’d love to see a typed functional style schema/interface definition language that uses CBOR for serialization…

                                      1. 2

                                        Take some time to think about having a backup MX record for those times when your own domain is offline and you still want email to get delivered. This mostly applies if you are planning to self host, but even commercial hosting will have hiccups.

                                        I have my DNS registrar listed as a backup MX for my domain, and it is configured to forward everything to my gmail account (which I otherwise never use). If someone is talking to my backup MX, then something is wrong with my primary email domain, so it is impractical to have my registrar try to deliver to my primary domain, and it is also impractical to configure gmail to forward everything to my primary domain (as much of the advice in this thread is recommending).

                                        This was pretty handy when a tornado took out power to my domain host for four days last year. Email addressed to my primary domain simply landed in my gmail inbox and I could get on with things while the power situation got sorted out.

                                        Anyway, I think the advice from artemis is spot on. The only thing I would add is it is handy to give some thought to what happens when mail for your own domain is offline, and configure things so you have some kind of workable failover when that happens.

                                        1. 1

                                          I self host my email (and have done so since 1998), and I used to have a backup MX host. I stopped that years ago, since I found that spammers will target the backup MX, probably with the thought that it might not have as much anti-spam protection in place [1]. I think I removed the backup MX somewhat after I started greylisting on my server, which easily trapped 50% of spam right off the bat (it still does, in fact [2][3]). Given that, and the fact that (again, back in the day) legitimate email servers would queue outgoing email for a couple of days, it wasn’t worth it (for me) to have a backup MX record. I really haven’t had any issues with the lack of one.

                                          [1] Might have been the case years ago—these days, maybe not so much.

                                          [2] I found it to be the cheapest, most effective anti-spam measure

                                          [3] I wrote my own greylist software, and I have the ability to whitelist or blacklist based upon sender’s IP, sender’s domain, sender’s email address, recipient domain or recipient email address. The only feature I wish I had added is the ability to timestamp and leave a comment as to why I added an entry to one of the whitelists/blacklists. Maybe some day.

                                          1. 1

                                            Oh that’s interesting. I haven’t had this problem at all, but then the backup MX just forwards to gmail, and gmail has decent spam filtering. I have been self hosting since 2002.

                                            I used to greylist, but have found that RBL + SpamAssassin + mailproc is very effective and eliminates the annoying delays.

                                            I am not sure I would be happy with senders queueing mail for days, but everyone has different tolerances for these kinds of things.

                                            1. 2

                                              The delay doesn’t bother me, and once the timeout is over (I have it set to 25m, but even 5m is enough to catch about 80% of spammers), the tuple “sender-ip, sender-email, recipient-email” is then whitelisted (in my case) for the next 36 days (and each time an email is received, the timeout is extended).

                                              In normal email operations, if the sending server can’t deliver an email, it will be queued for redelivery after a period of time (sometimes fixed, sometimes with exponential backoff). After a period of unsuccessful delivery attempts, the message is deleted—it’s this value that lasts for up to a couple of days. It’s not an “oops, can’t deliver, try again in two days” type of operation.
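
                                              A sketch of that tuple-plus-delay scheme in Rust (made-up types, plain integer seconds for time, and simplified constants; this is not the poster’s actual software):

                                              ```rust
                                              use std::collections::HashMap;

                                              // Greylisting sketch: a (sender IP, sender, recipient) tuple must wait
                                              // out a delay before being accepted; after that it stays whitelisted
                                              // for a window that is extended on every accepted delivery.

                                              const DELAY_SECS: u64 = 25 * 60; // 25 minute greylist delay
                                              const WHITELIST_SECS: u64 = 36 * 24 * 60 * 60; // 36 day whitelist window

                                              type Tuple = (String, String, String); // (ip, sender, recipient)

                                              struct Greylist {
                                                  first_seen: HashMap<Tuple, u64>,
                                                  whitelisted_until: HashMap<Tuple, u64>,
                                              }

                                              impl Greylist {
                                                  fn new() -> Self {
                                                      Greylist { first_seen: HashMap::new(), whitelisted_until: HashMap::new() }
                                                  }

                                                  /// Returns true if the delivery should be accepted now.
                                                  fn check(&mut self, t: Tuple, now: u64) -> bool {
                                                      if let Some(until) = self.whitelisted_until.get(&t).copied() {
                                                          if now <= until {
                                                              // Already whitelisted; extend the window on each delivery.
                                                              self.whitelisted_until.insert(t, now + WHITELIST_SECS);
                                                              return true;
                                                          }
                                                      }
                                                      let first = *self.first_seen.entry(t.clone()).or_insert(now);
                                                      if now >= first + DELAY_SECS {
                                                          // Sender retried after the delay: legitimate servers do this.
                                                          self.whitelisted_until.insert(t, now + WHITELIST_SECS);
                                                          true
                                                      } else {
                                                          false // temporary failure; much spamware never retries
                                                      }
                                                  }
                                              }

                                              fn main() {
                                                  let mut gl = Greylist::new();
                                                  let t: Tuple = ("192.0.2.1".into(), "a@example.com".into(), "me@example.net".into());
                                                  println!("first attempt accepted: {}", gl.check(t.clone(), 0));
                                                  println!("retry after 30m accepted: {}", gl.check(t, 30 * 60));
                                              }
                                              ```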

                                            2. 1

                                              An issue I have encountered with greylisting or delayed greeting is that some other sites won’t retry the message delivery.
                                              This may not be a problem for a personal server but it can be an issue when there are more users.

                                              1. 2

                                                In my experience with greylisting (over a decade now), it’s very rare for a legitimate email server to give up after one try—I think I’ve only encountered that once, and it was an easy thing to just whitelist the IP addresses [1].

                                                [1] My software supports CIDR format for IP addresses, so adding blocks of addresses is easy.

                                                1. 1

                                                  Do you use delayed greeting?
                                                  It has been years since I administered a mail server but that was much more problematic than greylisting.

                                                  1. 2

                                                    I don’t use delayed greeting.

                                          1. 7

                                            The first half of this article is the important bit. This really is a question of language design: philosophy, design goals, affordances, etc.

                                            Programming in Go feels like programming in a better C. Programming in Rust feels like programming in a better C++.

                                            The rest of the article seems like an attempt to rationalize this feeling, which is a sensible thing to do but - as others have pointed out - not all the rationalizations are really fair.

                                            The initial point remains though: we might want a memory-safe C but Rust is not it. Rust - in design and feel - is a memory-safe C++. Whether that bothers you or not depends on your view of C++.

                                            1. 13

                                              The initial point remains though: we might want a memory-safe C but Rust is not it. Rust - in design and feel - is a memory-safe C++. Whether that bothers you or not depends on your view of C++.

                                              I think Rust and C have more in common than Rust and C++.

                                              • Fundamentally, Rust is a struct oriented language where you define functions that take the struct as an argument - just like C. C++ is an object oriented language with inheritance trees, function overriding (runtime dispatch tables), etc. Traits in Rust make working with structs feel superficially like OO, but in reality it’s more like defining interface implementations for structures so you can use them interchangeably, which is actually very different from OO.
                                              • Neither Rust nor C have exceptions, where C++ does.
                                              • Reasoning about when structures are deallocated in Rust is more like C than C++ (both C and Rust have trivial memory management rules for developers, where C++ has relatively complex rules and norms about how and where to define destructors, and it is easy to screw up deallocation in C++).

                                              As someone who writes mostly C for a living the Rust model looks fairly straightforward, where the C++ model looks relatively complex. Rust is just structures and functions, where C++ is objects and templates and exceptions and abstract virtual base classes and other stuff that makes it non-obvious what’s actually happening when your code runs. To me, Rust feels like a memory and thread safe C (with interfaces), and less like a memory and thread safe C++.
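
                                              A small sketch of that “structs and functions, with traits as interfaces” style (made-up types, nothing from a real codebase):

                                              ```rust
                                              struct Packet {
                                                  payload: Vec<u8>,
                                              }

                                              // A free function taking the struct as an argument, C-style.
                                              fn payload_len(p: &Packet) -> usize {
                                                  p.payload.len()
                                              }

                                              // A trait is an interface a type opts into, not a base class;
                                              // unrelated types can implement it and be used interchangeably.
                                              trait Describe {
                                                  fn describe(&self) -> String;
                                              }

                                              impl Describe for Packet {
                                                  fn describe(&self) -> String {
                                                      format!("packet of {} bytes", self.payload.len())
                                                  }
                                              }

                                              impl Describe for u32 {
                                                  fn describe(&self) -> String {
                                                      format!("the number {}", self)
                                                  }
                                              }

                                              fn main() {
                                                  let p = Packet { payload: vec![1, 2, 3] };
                                                  println!("{}", p.describe());
                                                  println!("len = {}", payload_len(&p));
                                                  println!("{}", 7u32.describe());
                                              }
                                              ```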

                                              1. 2

                                                Rust is definitely preferable to C++. I haven’t programmed in Rust much but what I have done I have enjoyed. Sadly, I have programmed in C++ for a couple of decades, none of which I particularly enjoyed (although C++11 did make some things less painful).

                                                C++ programmers rarely use the whole language, not least because it is impossible for any one person to remember the whole language at once. People tend to find a subset of the language that they think they understand and stick to that. These days, I see mostly functional-style C++. I don’t see much use of inheritance or exceptions. The C++ that I have written and code reviewed in the last five years or so looks quite a lot like Rust.

                                                Rust does it better. It’s like a nice subset of C++ with better ergonomics. Its type system is more pleasant to use than C++ templates. Its support for functional programming is better. And, of course, it has RAII that really works because the compiler ensures that it is safe.

                                                I don’t know what the right term is: aesthetics? style? feel? mindset? Whatever it is, Rust shares it with C++. I don’t think it is an insult to say that it is C++ done right.

                                                I would contrast that with C. Some people may use C because they have to: they need something with manual memory management for example. But manual memory management is not a design goal for C. The core ethos of C, perhaps now diluted by standards bodies and compiler writers, is simplicity. It’s about having a small number of orthogonal building blocks from which bigger things can be made. I think that ethos has been passed on to Go.

                                              2. 11

                                                wat. How is golang in any way like a “better C”? I remember when it came out some touted it that way and I got excited… and then it just isn’t that at all IME. The GC alone disqualifies it.

                                                1. 2

                                                  I agree. I’d express it as: C and Go value simplicity. C++ and Rust do not sacrifice simplicity for nothing, but they are eager to trade it away for almost anything.

                                                1. 13

                                                  I disagree with most of this post, that said I think he brings up some good points, and I suspect that there’s a certain amount of truth in the statement that “Rust is not a language most people who are still C programmers will like”, because most of those developers who would like it have already moved to C++ or other languages.

                                                  I’ll start with a few things I agree with - then a long list of things I disagree with. I’ve tried to avoid duplicating arguments already made in other comments.

                                                  C is the most portable programming language.

                                                  This is true, and is a deciding feature for the few people it applies to. It can also be a significant factor for people it might apply to in the future, in terms of return on time invested in learning and building up a library of code.

                                                  more importantly [the number of features added per year] speaks to their complexity. Over time it rapidly becomes difficult for one to keep an up-to-date mental map of Rust and how to solve your problems idiomatically.

                                                  This is definitely a concern I have for Rust’s future, and C’s lack of features over even Rust’s current complexity is definitely a substantial upside.

                                                  [The number of features added per year] speaks volumes to the stability of these languages

                                                  I don’t think I agree: you could add a million features per year and as long as nothing broke it would be perfectly stable. In particular, the source looking outdated is not an “issue” that means it broke. It would be interesting to see a comparison of breaking changes per year in each - I suspect Rust beats C++ and is on par with C.

                                                  No spec means there’s nothing keeping rustc honest. Any behavior it exhibits could change tomorrow.

                                                  They have made some rather strong backwards compatibility guarantees, and while there is not an official spec there is documentation of every public API. This greatly exaggerates the degree to which rustc could change without breaking promises.

                                                  C and C++ on the other hand have no problem with changing the spec with new versions. If anything rustc has more promises about backwards compatibility than C (or C++).

                                                  Edit: By the same logic there would be nothing keeping the Linux userspace honest - but obviously there is.

                                                  Serial programs have X problems, and parallel programs have X^Y problems, where Y is the amount of parallelism you introduce.

                                                  Certainly the case in C; I think Rust challenges this assumption. In my experience parallelism in idiomatic Rust introduces few problems, and the problem count doesn’t scale much with the amount of parallelism you add beyond the first step.

                                                  However, nearly all programs needn’t be parallel.

                                                  Depends on how you define both “nearly all” and “needn’t”. I posit that most programs weighted by time spent on development benefit substantially from some degree of parallelism (speeding up computation, running IO simultaneously to a user provided script, etc).

                                                  rewriting an entire program from scratch is always going to introduce more bugs than maintaining the C program ever would.

                                                  This is a ridiculously high goalpost; Rust doesn’t have to replace C in existing programs to be a C replacement.

                                                  1. 8

                                                    “Rust is not a language most people who are still C programmers will like”, because most of those developers who would like it have already moved to C++ or other languages.

                                                    I think this is less true than you might expect. C devs who did not already move to another language may very well like Rust.

                                                    My personal experience is that many C devs I know seem generally keen on Rust because it solves real problems they have in C while still being usable in the spaces they care about (bare metal, embedded) and without much of the uncertainty and opacity that comes with C++ (virtual overloads, unexpected copy constructor calls, template shenanigans).

                                                    Bryan Cantrill also gave a talk last year on this very topic - how he likes Rust as a long time C dev.

                                                    1. 3

                                                      I think you are confusing stability and maturity. Rust is stable, but Rust is not mature (yet).

                                                      1. 3

                                                        Assuming you’re replying to the part of my comment in reply to

                                                        [The number of features added per year] speaks volumes to the stability of these languages

                                                        I think if you want to make that argument it’s sircmpwn who is confusing the two.

                                                        I’d add that I think “lack of maturity” isn’t a particularly good argument for “X is not a Y replacement”. To the extent maturity is required it can only happen with time. I don’t think the thesis being argued includes a “yet”.

                                                        1. 2

                                                          As you said, Rust didn’t have time to mature, so it is pretty much impossible for Rust to have maturity now. sircmpwn’s argument is different. According to sircmpwn, C++ had enough time to mature, but decided to remain forever immature, and in this respect, Rust is like C++. Since C is mature, Rust is not a good C replacement for those who value maturity.

                                                          My opinion is that Rust is unlike C++ in this respect and Rust will have maturity once enough time passes.

                                                    1. 4

                                                      I really like Ansible and I’m totally going to see if I can use all or part of this tutorial.

                                                      It bothers me a bit that updating OpenBSD 6.3 -> 6.4, per the official documentation, requires booting the installation media to perform the upgrade. In the world of cloud providers and VMs, I want to put together a guide to attempt to do this semi-in-place with a single reboot.

                                                      I’m glad I read all the release notes before attempting anything though. The OpenSMTPD configuration grammar has changed entirely. I’m going to have to redo all my work in a VM to make sure it all still works.

                                                      1. 4

                                                        The big issue with “in place” upgrades in the OpenBSD world is that there is no guarantee the ABI between X.Y and X.Y+1 will be the same. This can cause all sorts of issues while doing in place upgrades. For example, tar, once replaced by the updated binary could segfault for every subsequent call. This would leave the system in an unknown state.

                                                        I wrote an upgrade tool a while back (snap) that could be used to upgrade from release to release. The ABI issue was hit every couple of releases, so I removed the option to upgrade releases.

                                                        I am not saying it’s impossible; just that you will basically have to back up everything prior to doing an install.

                                                        1. 1

                                                          Following -current, an in-place upgrade mostly just works, but I also always keep a new bsd.rd ready in case the in-place upgrade fails; rebooting into bsd.rd and running the upgrade will fix it.

                                                          But I have also switched to a script that downloads sets, patches bsd.rd and reboots. Much less hassle and minimal downtime.

                                                          1. 1

                                                            This. I do the same - download bsd.rd, add an auto_upgrade.conf file to the image, then use that bsd.rd on all my systems to upgrade them. Just copy the patched bsd.rd over /bsd on the target, reboot, wait a few minutes, and the box is back up on the new release. I wrote my own script ages ago, but nowadays the upobsd port can take care of patching bsd.rd.

                                                        2. 2

                                                          I want to put together a guide to attempt to do this semi-inplace with a single reboot.

                                                          There is such a guide in the official upgrade notes. The only reason it suggests two reboots is KARL.

                                                          1. 1

                                                            Back in the day, I just had a script that downloaded and extracted sets and installed a new bsd. Then I rebooted and ran another script that took care of etc changes and new users, and cleaned up old files no longer needed (according to the release notes). The last step was just a pkg_add -U.

                                                            I didn’t run into issues but I was aware of the risks and being on my own when it broke ;-)

                                                          1. 1

                                                            Author says a common class of gadgets uses such and such registers. Says avoid them in favor of other registers. Maybe the gadget type with those registers is common because the registers themselves are common from compiler choices. Switching registers might lead to gadgets just using those registers instead. Or are there x86-specific reasons that using different registers will do entirely different things you can’t gadget?

                                                            Other than that confusion, slides look like great work. Especially on ARM.

                                                            1. 15

                                                              Author here. Thanks for having a look! It was fun to do this talk.

                                                              Yes, there are x86-specific reasons that other registers don’t result in ROP gadgets. If you look at Table 2-2 in the Intel 64 and IA-32 Architectures Software Developer’s Manual you can see all of the ModR/M bytes for each register source/dest pair, and other places in that section describe how to encode the ModR/M bytes for various instructions using all of the possible registers.

                                                              When I surveyed the gadgets in the kernel and identified which intended instructions resulted in C3 bytes that were used as returns in gadgets, there were a large number of gadgets terminating on the ModR/M byte encoding the BX series registers. You are correct that these gadgets are common because the compiler frequently chooses to use the BX series registers, and the essence of my change to clang is to encourage the compiler to choose something else. By shifting RBX down behind R14, R15, R12 and R13, the compiler will choose these registers before RBX, and therefore reduce the incidence of RBX use resulting in a C3 ModR/M byte. We can see that this works because just shifting the BX registers down the list results in fewer unique gadgets.

                                                              To directly answer your inquiry, gadgets arising from using R14, R15, R12, R13 instead (now that they will be more common) are not a problem. The REX prefix is never C3, and we can look at the ModR/M bytes encoding operations using those registers, and none of them will encode to C3. When I look at gadgets that arise from instructions using these registers, they don’t get their C3 bytes from the instruction encoding - they get them from constants where the constant encodes to a C3, so the register used is irrelevant in these cases. So moving RBX down behind R14, R15, R12 and R13 doesn’t result in more gadgets using those registers.

                                                              There are other register pairs that result in a C3 ModR/M byte. Operations between RAX and R11 can result in a C3 ModR/M byte, but these are less common when we survey gadgets in the kernel (~56 in the kernel I have here now). RAX and R11 were already ahead of RBX in the default list anyway, so moving RBX down the list does not result in more gadgets using R11. If you ask why we haven’t moved R11 down next to RBX, the answer is that gadgets using R11 this way are not that numerous, so it hasn’t risen to the top of the heap of most-common-sources-of-gadgets (and therefore has not got my attention). There are many other sources of gadgets that can be fixed and will have a larger impact on overall gadget counts and diversity.
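                                                              The ModR/M arithmetic above is easy to check by hand. A minimal sketch (register numbers as in the Intel SDM; `modrm` is a helper defined here just for illustration):

                                                              ```python
                                                              # ModR/M byte for register-direct operations (mod = 0b11).
                                                              # Only the low 3 bits of each register number land in the ModR/M
                                                              # byte; the high bit goes into the REX prefix, which is never 0xC3.
                                                              def modrm(reg, rm):
                                                                  return 0xC0 | ((reg & 7) << 3) | (rm & 7)

                                                              RAX, RBX, R11 = 0, 3, 11
                                                              R12, R13, R14, R15 = 12, 13, 14, 15

                                                              # RAX (reg field) against RBX (r/m field) encodes to the RET byte:
                                                              assert modrm(RAX, RBX) == 0xC3   # e.g. mov %rax, %rbx -> 89 C3
                                                              # RAX/R11 hits it too, since R11's low three bits are also 011:
                                                              assert modrm(RAX, R11) == 0xC3   # distinguished only by a REX.B prefix
                                                              # But no pair drawn from R12-R15 can ever produce 0xC3:
                                                              for a in (R12, R13, R14, R15):
                                                                  for b in (R12, R13, R14, R15):
                                                                      assert modrm(a, b) != 0xC3
                                                              ```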

                                                              I hope this clarifies that part of the talk. :-)

                                                              1. 3

                                                                Thanks everyone for the answers. Thank you in particular for this very detailed answer that clarifies how x86’s oddities create the attack vectors.

                                                                The reason I wanted to know is that I planned to design around high-end ARM chips instead of x86 where possible, because I believed we’d see fewer ISA-related attacks. Also, certain constructions for secure code might be easier to do on RISC with less performance hit. Your slides seem to support some of that.

                                                                1. 2

                                                                  To be fair, x86 doesn’t create the attack vectors, but does make any bugs much easier to exploit.

                                                                  ARM doesn’t have nearly the same problem - you can always ROP into a jump to THUMB code from normal ARM instructions, but these entry points are usually more difficult to find than a 0xc3.

                                                                2. 1

                                                                  I’m curious to learn more about ROP. I’d like to examine adding support for another target to ROPgadget.py. So what designates a gadget? Any sequence of instructions ending in a return? How do attackers compose functionality out of gadgets? By hand, or is there some kind of a ‘compiler’ for them?

                                                                  1. 3

                                                                    You might be interested in the ROP Emporium’s guide. Off the top of my head the only automatic tools I know of are ropper and angrop.
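                                                                    On the “what designates a gadget” question: a naive first pass is to find every 0xc3 byte in the executable section - aligned or not - and treat each short byte suffix ending there as a candidate, then disassemble to see which suffixes decode cleanly. A minimal sketch of the scanning half (the disassembly step would need a library such as capstone; the helper name is mine):

                                                                    ```python
                                                                    def ret_terminated_candidates(code, max_len=8):
                                                                        """Yield (offset, bytes) pairs for every byte suffix that ends on
                                                                        a 0xc3 (RET) byte. x86 decoding can begin at any offset, so every
                                                                        suffix is a potential gadget, not just instruction-aligned ones."""
                                                                        for i, b in enumerate(code):
                                                                            if b == 0xC3:
                                                                                for start in range(max(0, i - max_len + 1), i + 1):
                                                                                    yield start, code[start:i + 1]

                                                                    # 90 = nop; 48 89 c3 = mov %rax, %rbx, whose last byte doubles as RET.
                                                                    sample = bytes.fromhex("904889c3")
                                                                    cands = list(ret_terminated_candidates(sample))  # 4 suffixes end in c3
                                                                    ```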

                                                                3. 5

                                                                  Switching registers might lead to gadgets just using those registers instead. Or are there x86-specific reasons that using different registers will do entirely different things you can’t gadget?

                                                                  If I understand this correctly, it’s because instructions that use the ebx register produce machine-code bytes containing the return opcode, i.e., bytes that are useful in ROP. So by avoiding ebx as much as possible, you also avoid creating collateral ROP gadgets with early returns. This issue only arises because x86/amd64 have variable-length instructions.

                                                                  1. 4

                                                                    As far as I understand, the register allocation trick is indeed x86-specific. The point is to avoid C3 bytes because these will polymorph into the RET instruction when used in unaligned gadgets. See the “polymorphic gadget” and ‘register selection’ sections in the slide set.
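                                                                    A concrete instance of this polymorphism, checked in a few lines: the 64-bit encoding of `mov %rax, %rbx` is `48 89 c3`. Read from its first byte it is one harmless move, but a gadget that jumps two bytes into it sees only `c3` - a RET.

                                                                    ```python
                                                                    # 48 89 c3 decodes as 'mov %rax, %rbx' when read from offset 0,
                                                                    # but the trailing ModR/M byte on its own is the RET opcode.
                                                                    mov_rax_rbx = bytes.fromhex("4889c3")
                                                                    RET = 0xC3
                                                                    assert mov_rax_rbx[2] == RET  # control flow landing at offset 2 returns
                                                                    ```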

                                                                  1. 1

                                                                    This seems really cool. I’d love to have email more under my own control. I also need 100% uptime for email though, so it’s hard to contemplate moving from some large hosted service like Gmail.

                                                                    1. 4

                                                                      If email is that important to you (100% uptime requirement), then what’s your backup plan for a situation where Google locks your account for whatever reason?

                                                                      1. 1

                                                                        Yeah, that’s true. I mean I do have copies of all my email locally, so at least I wouldn’t lose access to old email, but it doesn’t help for new email in that eventuality.

                                                                      2. 3

                                                                        Email does have the nifty feature that (legit) mail servers will keep retrying SMTP connections to you if you’re down for a bit, so you don’t really need 100% uptime.

                                                                        Source: ran a mail server for my business for years on a single EC2 instance; sometimes it went down, but it was never a real problem.

                                                                        1. 1

                                                                          True. I rely on email enough that I’m wary of changing a (more or less) working system. But I could always transition piece by piece.

                                                                        2. 3

                                                                          If you need 100% delivery, then you can just list multiple MX records. If your primary MX goes down (ISP outage, whatever), then your mail will just get delivered to the backup. My DNS registrar / provider offers backup MX service, and I have them configured to just forward everything to gmail. So when my self hosted email is unavailable, email starts showing up via gmail until the primary MX is back online. Provides peace of mind when the power goes out or my ISP has outages, or we’re moving house and everything is torn apart.
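                                                                          For reference, the setup described above is just two MX records with different preference values; senders try the lowest number first and fall back to the higher one. A hypothetical zone fragment (hostnames are placeholders, not real infrastructure):

                                                                          ```
                                                                          ; lower preference value = tried first
                                                                          example.com.  IN  MX  10 mail.example.com.      ; self-hosted primary
                                                                          example.com.  IN  MX  20 backupmx.example.net.  ; provider's backup MX
                                                                          ```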

                                                                          1. 1

                                                                            That’s a good system that seems worth looking into.

                                                                          2. 2

                                                                            Note that email resending works. If your server is unreachable, the sending mail server will actually try the secondary MX server, and if both are down, it will retry half an hour later, then a few more times up to 24 hours later - 48 hours if you are lucky. The sender will usually receive a notification if the initial attempts fail (and a second one when the sending server gives up).

                                                                            On the other hand, if your GMail spam filter randomly decides without a good reason that a reply to your email is too dangerous even to put into the spam folder, neither you nor the sender will be notified.

                                                                            1. 1

                                                                              And I have had that issue with GMail, both as a sender and a receiver, of mail inexplicably going missing. Not frequently, but it occurs.

                                                                          1. 1

                                                                            What is the overhead of this?

                                                                            I love OpenBSD, though one of my coworkers told me he fundamentally disagrees with defense in depth now that we are starting to get memory-safe languages, and I somewhat agree. Still, while the kernels and applications we rely on are written in C, it seems like the best thing to do for now.

                                                                            1. 3

                                                                              The overhead is two xor instructions per function call (using a register and the top of the stack). This is cheap.

                                                                              Memory safe languages have their own overhead. Even Rust - which achieves so much at compile time - still has to do bounds checks at runtime in some situations, and disables integer overflow checks in release builds because they are considered too expensive. I am a big fan of memory safe languages, but there is still a lot of C/C++ out there.
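                                                                              A toy model of the mangling scheme described above - not the actual RETGUARD code, just the arithmetic: the prologue XORs the return address with a per-frame value (here, the stack pointer), and the epilogue XORs it again before the ret, so a ROP chain landing mid-function with a mismatched stack restores garbage instead of its intended address.

                                                                              ```python
                                                                              # Toy model of return-address mangling (illustrative only).
                                                                              def prologue(ret_addr, sp):
                                                                                  return ret_addr ^ sp          # store the mangled value on the stack

                                                                              def epilogue(mangled, sp):
                                                                                  return mangled ^ sp           # unmangle just before 'ret'

                                                                              ret_addr, sp = 0xFFFF_8000_1234_5678, 0xFFFF_8000_0000_7F00
                                                                              mangled = prologue(ret_addr, sp)
                                                                              assert epilogue(mangled, sp) == ret_addr          # normal call/return
                                                                              assert epilogue(mangled, sp ^ 0x10) != ret_addr   # shifted stack breaks it
                                                                              ```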

                                                                            1. 1

                                                                              why are all the infosec people I follow saying, at their most charitable, that this is theatre and doesn’t do anything against any kind of attack?

                                                                              1. 8

                                                                                The most common negative response I have seen is that this can be bypassed if an attacker knows the addresses they will write their rop chain to. This is true, but it is not the case that all attackers know the addresses where the rop chain goes. The @grsecurity response is interesting, since they point out that this idea has been seen before (quite some time ago - in 1999 and 2003). If you have heard other specific criticisms, then I’d be interested to hear them.

                                                                                The next iteration of this doesn’t have to use the stack pointer - it can use something stronger. Step 1 is getting the ecosystem working with mangled return addresses. For this, the stack pointer is cheap and easy.

                                                                              1. 5

                                                                                Does OpenBSD have any plans to upstream RETGUARD to llvm?

                                                                                1. 14

                                                                                  Definitely. The llvm people I have spoken with have pointed out it might be better to implement in the prologue / epilogue lowering functions instead of as a pass, so once we prove it works in the ecosystem and have worked out any kinks, then I will do it that way and submit upstream.

                                                                                  1. 2

                                                                                    Very cool! Thank you for the detailed answer.

                                                                                  2. 3

                                                                                    Sure. Why not?