1. 1

    Good read. I would like to know if and how BTRFS guarantees that the data on disk is the data that was written. It would also be interesting to know how much these guarantees impact performance. I would be grateful if somebody who has answers to this could shed some light.

    1. 2

      (You can find the same answer in the comment section on the blog.)

      Let’s start by saying: I’m not an expert in BTRFS.

      That said, first of all: BTRFS uses a B-tree rather than a Merkle tree. They decided not to keep the checksum of a node in the node above it. Instead, they checksum the block’s level and the block number where the block is supposed to live. This allows them to detect misplaced writes/reads on the media.

      Everything that points to a tree block also stores the transaction id (the generation field) it expects that block to have, which allows it to detect phantom writes.

      The difference from ZFS is that ZFS checks the checksum of the block instead of the transaction id.
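      The two checks described above can be sketched in a few lines of C. This is a toy illustration of the idea only, not actual btrfs code; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical on-disk tree-block header. The real btrfs header differs,
 * but it likewise records where the block believes it lives and which
 * transaction wrote it. */
struct block_header {
    uint64_t blocknr;    /* block number this block was written for */
    uint8_t  level;      /* tree level this block belongs to */
    uint64_t generation; /* transaction id that wrote this block */
};

/* Misplaced write/read: the header must match where we actually read
 * the block from and the level the parent expects it to have. */
static int block_is_misplaced(const struct block_header *h,
                              uint64_t read_blocknr, uint8_t expected_level)
{
    return h->blocknr != read_blocknr || h->level != expected_level;
}

/* Phantom write: the parent remembers the generation it expects the
 * child to have; a stale generation means the write never hit the disk. */
static int block_is_stale(const struct block_header *h,
                          uint64_t expected_generation)
{
    return h->generation != expected_generation;
}
```

      On a real read path a checksum of the block contents would be verified first; these two checks then catch blocks that are intact but in the wrong place or from the wrong transaction.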

      Source: https://btrfs.wiki.kernel.org/index.php/Btrfs_design

      1. 1

        Thank you for the reply!

    1. 7

      Off-topic: is it just me, or do all of the game screenshots look like they have a greyscale canvas filter applied?

      1. 1

        Yes. That’s the ‘style’ of my blog. ;)

        1. 8

          Might confuse people when you talk about video games working on a specific OS. To me, it didn’t look like they were working correctly…

          1. 1

            If you’d like to dissuade people from gaming on FreeBSD I would leave it as-is.

          1. 8

            Open-Source games

            You missed the big one here. The Elder Scrolls III: Morrowind! :)

            [emulators] many of them for GameBoy, SNES, NeoGeo and other games consoles

            Much more exciting: Dolphin (GameCube/Wii), PPSSPP (PSP), RPCS3 (PS3) — and if you’re on a Radeon GPU, these can use Vulkan!

            And now, some extra things that aren’t in ports yet but might be interesting:

            1. 2

              Oh right! Added. Thanks! I didn’t dig very deeply into game console emulators. Now I see that I have to :)

              1. 1

                Wow, didn’t know OpenMW worked on FreeBSD. That’s good to know. What an awesome project.

                1. 1

                  I wonder if Dolphin would work with a USB controller on FreeBSD; that would give me a whole new use case for an old Juniper.

                  1. 1

                    It does. It can also directly access a USB Bluetooth dongle, which worked for my (3rd-party knockoff) Wiimote.

                1. 7

                  When I first read about Capsicum back in 2010 I thought it was a very cool idea, much like the later pledge system call in OpenBSD. I especially liked the idea that they introduced Capsicum calls to Google Chromium, as browsers are just piles and piles of code that you just generally have to trust. It’s just really unfortunate that these things are all tied to a specific operating system.

                  I wonder if those Capsicum changes were ever accepted upstream and are still maintained?

                  1. 21

                    It was intended to be a cross-platform concept. Lots of the big companies have Not Invented Here syndrome, which sort of ties into their liking to control and patent anything they depend on. Examples:

                    1. Google’s NaCl was weaker, but faster, than capability security. They just used Java and basic permissions for Android.

                    2. Microsoft Research made a lot of great stuff, but the Windows division applies only the tiniest amount of what they do.

                    3. I saw a paper once about Apple trying to integrate Capsicum into Mac OS. I’m not sure if that went anywhere.

                    4. Linux tried a hodgepodge of things with SELinux containing most malware at one point. It was a weaker version of what MLS and Type Enforcement branches of high security were doing in LOCK project. These days, it’s even more hodgepodge with lots of techniques focused on one kind of protection or issue sprinkled all over the ecosystem. Too hard to evaluate its actual security.

                    5. FreeBSD, under TrustedBSD, was originally doing something like SELinux, too. The Capsicum team gave them capability security. That’s probably a better match for today’s security policies. However, one might be able to combine features from each project for stronger security.

                    6. OpenBSD kept a risky architecture but did ultra-strong focus on code review and mitigations for specific attacks. It’s also hard to evaluate. It should be harder to attack, though, since most attackers focus on coding errors.

                    7. NetBSD and DragonflyBSD. I have no idea what their state of security is. Capsicum might be easy to integrate into NetBSD given they design for portability and easy maintenance.

                    8. High-security kernels. KeyKOS and EROS were all-in on the capability model. Separation kernels usually have capabilities as a memory access and/or communication mechanism, but policies are neutral for various security models. The consensus in high-assurance security is that the above OS’s need to be sandboxed entirely in their own process/VM space since there’s too much risk of them breaking. Security-critical components are to run outside of them on minimal runtimes and/or tiny kernels directly. These setups use separation kernels with VMM’s designed to work with them and something to generate IPC automatically for developer convenience. Capsicum theoretically could be ported to one but they’re easier to use directly.

                    9. I should throw in the IBM i series (formerly AS/400). The early version, System/38, described in this book, was capability-secure at the hardware level. They appear to have ditched hardware protections in favor of software checks. Unless I’m dated on it, it’s still a capability-based architecture at the low levels of the system, with PowerVM used to run Linux side-by-side to get its benefits. That makes it a competitor to Capsicum and the longest-running capability-based product on the market. Whereas the longest-running descriptor architecture, which also ditched full protections in hardware, is the Burroughs 5500, sold by Unisys as ClearPath Libra in its modern form.

                    1. 5

                      Nice listing, thanks! If you haven’t heard of it, and going in a slightly different direction, you may be interested in CheriBSD, which is a port of FreeBSD on top of capability hardware, the CHERI machine. (This makes it pretty much undeployable for now, but it’s interesting research that I expect to pay dividends in many ways.) The core people working on Capsicum are also working on CHERI.

                      1. 4

                        My post was mainly about the software side. On the hardware side, I’m following that really closely, along with research like Criswell’s SVA-OS (FreeBSD-based) and the Hardbound/Watchdog folks. They’re all doing great work making fundamental problems disappear with a minimal performance hit. I was pushing some hardware people to port CHERI to the Rocket RISC-V core. There weren’t any takers. One company ported SAFE to RISC-V as CoreGuard.

                        CHERI is still one of my favorite possibilities, though. I plan to run CheriBSD if I ever get a hold of a FPGA board and the time to make adjustments.

                      2. 3

                        Wow, thank you for the extremely thorough reply (this is the sort of thing I really like about the lobste.rs community)!

                        It makes sense that there are multiple experiments and various OSes having a completely different approach (the hardware protection of System/38 you mentioned sounds particularly interesting), but I was mostly thinking about the POSIX OSes. The Capsicum design fits quite well into the POSIX model of the world.

                        I wonder why Apple did not follow through with Capsicum. They’re not too afraid to take good ideas from other OSes (dtrace comes to mind, and their userland comes mostly from FreeBSD IIRC).

                        1. 3

                          Capsicum might be easy to integrate into NetBSD given they design for portability and easy maintenance

                          There was a port of CloudABI to NetBSD, which kind of “includes” Capsicum (just not for NetBSD-native binaries).

                          one might be able to combine features from each project for stronger security

                          Indeed. Sandboxes protect the world from applications touching things they’re not supposed to. MAC systems like TrustedBSD and SELinux were (at least originally) designed to implement policies at an organizational level, like documents having sensitivity levels (not secret, secret, top secret) and people only having access to levels below some value, etc.
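                          That sensitivity-level read rule (“no read up”) fits in a couple of lines of C; the level names here are illustrative, not from any real MAC implementation:

```c
/* Toy multilevel-security read check: a subject may only read
 * documents at or below its own clearance ("no read up"). */
enum level { NOT_SECRET = 0, SECRET = 1, TOP_SECRET = 2 };

static int may_read(enum level subject_clearance, enum level object_level)
{
    return object_level <= subject_clearance;
}
```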

                          1. 2

                            Re CloudABI. Thanks for the tip.

                            Re the 2nd paragraph: you’re on the right track but missing the overlap. SELinux came from the reference monitor concept, where every subject/object access was denied by default unless a security policy allowed it. So sandboxing, or more properly an isolation architecture done as strongly as possible, was the first layer. If anything, modern sandboxing is weaker at the same goal because it lacks consistent enforcement by a simple mechanism.

                            From there, you’re right that organizational design often influenced the policies. Since the military invented most INFOSEC, their rules, Multilevel Security, became the default, which the commercial sector couldn’t adopt easily. Type Enforcement was more flexible, handling military and some commercial designs. Note you could also do things like Biba to stop malware (deployed in Windows, too), enforce database integrity, or even keep competing companies from sharing resources. The mechanism itself wasn’t rooted in organizational structure. That helped adoption.

                            Eventually they dropped policy enforcement out of the kernel entirely, so it just did separation, with middleware enforcing custom policy. That’s still hotly debated, since it’s the most flexible approach but gives adopters plenty of rope. Hence language-based security coming back with strong type systems, and hardware/software schemes mitigating attacks entirely.

                          2. 2

                            High-security kernels.

                            Just to add: there’s also Coyotos in the EROS family, which gave us BitC, an interesting (if dead) language.

                            Zircon is also working on an object capability model, but I haven’t looked too deeply at it myself.

                            Edit: also, CapLore has some really interesting articles, such as this one on KeyKos…

                            1. 2

                              Yeah, they were interesting. People might find neat ideas looking into them. I left them off because Shapiro got poached by Microsoft before completing them.

                              As far as Zircon goes, someone told me the developers were ex-Be, Danger, Palm, and Apple. None of those companies made high-security projects. The developers may or may not have done so at another company or in their spare time. This is important to me, given that the only successes seem to come from people who learned the real thing from experienced people. Google’s NIH approach seems to consistently dodge using such people, whereas Microsoft and IBM played it wise, hiring experts from high-security projects to run their initiatives, and got results. Google should’ve just hired CompSci folks specialized in this, like the NOVA people, plus some industry folks like those on Zircon, to keep things balanced between ideal architecture and realistic compromise.

                              I’ll still give the final product a fair shake, regardless, though. I look forward to seeing what they come up with.

                              1. 2

                                Totally agreed re: Google; I also have concerns about some of the items I’ve seen, such as this one, which discusses systems within Fuchsia that could be used for adverts, as well as Google’s tendency to do something cool and then drop it.

                                Also, re: Shapiro: I think he’s interesting, but I also (having dealt with him on the mailing lists) wonder about his ability to produce, since Coyotos/EROS/and-so-on were largely embryonic (at best).

                                1. 2

                                  Re Google: they’re an ad company. Assume the worst. I even assumed Android itself would get locked up somehow over time, where we’d lose it, too. Maybe with a technique like this. Well, anything that wasn’t already open. We’re good so long as they open-source enough to build knock-off phones with better privacy and good-enough usability. People wanting best-in-class will be stuck with massive companies without reforms on patent suits and app-store lock-in.

                                  Re Shapiro: he was a professional researcher. Their incentives are sadly about how many papers they publish with new research results. Most don’t build much software at all, much less finish it. He was more focused than most, with the EROS team having a running prototype they demo’d at conferences. Since he’s about research, he started redoing it to fix its flaws instead of turning it into a finished product. They did open-source it in case anyone else wanted to do that. I’m not sure whether these going nowhere says something about him, FOSS developers’ priorities, or both. ;)

                                  1. 2

                                    Completely agreed re: Google. I don’t even disagree re: Shapiro, but I’ll add one comment: I looked at the source code for EROS/Coyotos/BitC, such as they were… it wasn’t something you could just dive into. Describing it as “hairy” and “embryonic” is about as kind as I can be for someone who has been awake since 0300 local.

                                    1. 2

                                      Thanks for the tip. Yeah, that’s another problem common with academics. It’s why I don’t even use stuff with great architecture if they coded it. I tell good coders about it hoping they’ll do something like it with good code. For some reason, the people good at one usually aren’t good at the other. (shrugs) Then you get those rare people like Paul Karger or Dan Bernstein that can do both. Rare.

                                      1. 2

                                        so Bernstein’s father was one of my professors in college; definitely an interesting fellow… I can see at least why he has practical chops, since his father is a very practical (if nitpicky) coder himself.

                                        1. 2

                                          That’s cool. I didn’t know his dad was a programmer. That makes sense.

                            2. 1

                              I never understood why NaCl didn’t take off. I loved that framework.

                              1. 1

                                I was never sure about that myself. A few guesses are:

                                1. It’s hard to get any security tech adopted.

                                2. Chrome was still having vulnerabilities. Might have been seen as ineffective.

                                3. Could’ve been a burden to use.

                                4. Other methods existed and were being developed that might be more effective or usable.

                            3. 3

                              Unfortunately Google never accepted those changes :(

                            1. 1

                              assert() has a bug ;-)

                              1. 1

                                Aghr… Too quick! Thanks.

                              1. 3

                                I’ve never really wanted better looking assertions—just the fact that assert() failed is enough for me to rerun the program under a debugger (or look at the core file if there is one). What would be nice (for my uses) is the output going through syslog() (since I tend to write server software) but I can live without that.

                                Also, his example ASSERT3U(a++, ==, b) is bad, as one MUST NOT (per RFC-2119) use expressions with side effects in assert()—only bad things can happen otherwise.

                                1. 1

                                  Strangely enough, recruiters ask these kinds of questions all the time: passing a value to a function using the increment or decrement operator. It just makes me shudder.

                                  1. 2

                                    That’s not necessarily bad, though. It is bad for assert() because its behavior can change depending on the definition of the macro NDEBUG: if NDEBUG is defined, assert() does nothing, because the expression is not compiled at all.
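                                    The hazard is easy to see in a minimal sketch: the increment inside assert() happens only in debug builds.

```c
#include <assert.h>

/* The side effect inside assert() disappears entirely when this file
 * is compiled with -DNDEBUG, because the whole expression is removed
 * by the preprocessor. */
static int count_with_assert(void)
{
    int a = 0;
    assert(a++ == 0); /* a is incremented only if NDEBUG is NOT defined */
    return a;         /* 1 in a debug build, 0 in a release build */
}
```

                                    The same function thus returns different values in debug and release builds, which is exactly why the MUST NOT above is warranted.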

                                    1. 2

                                      I agree that the chosen example of ASSERT3U(a++, ... is not good. Omitted from this post, though, is the mandatory assertion form we provide via sys/debug.h: VERIFY(). The VERIFY family (including an analogous VERIFY3U, VERIFY3P, etc) is never compiled out, even for release builds. You can use it for critical things like bounds checks on buffers, or checks on critical return values (even from functions which mutate) where failure is believed not possible, etc.

                                      We often use these routines in the operating system kernel. Crash dumps are often very large, sometimes aren’t completely written to disk (e.g., if the disk subsystem itself was the problem), and certainly contain private information that users aren’t always keen to expose to third parties – even for debugging purposes. In those cases, it can be quite helpful to have the panic message contain both the expected and the actual value in addition to the location of the assertion.
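                                      A macro in that spirit might look like the following. This is a hypothetical sketch modeled on the sys/debug.h family described above, not the actual illumos code: it is never compiled out, and its failure message carries both operand values plus the location.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical VERIFY3U-style macro: always active (no NDEBUG guard).
 * On failure it reports the expression, both actual values, and the
 * file/line before aborting, so the panic message alone is useful. */
#define MY_VERIFY3U(left, op, right)                                      \
    do {                                                                  \
        uint64_t l_ = (uint64_t)(left);                                   \
        uint64_t r_ = (uint64_t)(right);                                  \
        if (!(l_ op r_)) {                                                \
            (void)fprintf(stderr,                                         \
                "verify failed: %s %s %s (%llu %s %llu) at %s:%d\n",      \
                #left, #op, #right,                                       \
                (unsigned long long)l_, #op, (unsigned long long)r_,      \
                __FILE__, __LINE__);                                      \
            abort();                                                      \
        }                                                                 \
    } while (0)
```

                                      Because each operand is evaluated exactly once into a local, side effects still happen in every build, unlike with assert(); the earlier point stands, though, that keeping mutation out of the check altogether is clearer.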

                                      1. 2

                                        Agreed. To work around the lack of core files (or the inability to obtain them), I wrote code to log a crash report to syslog, which comes in handy for me. Yes, the code is probably not async-signal-safe, but I tried as best as I could, and so far it has worked fine for me.
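                                        A minimal sketch of that approach might look like this; the names are hypothetical, not the commenter’s actual code:

```c
#include <stdlib.h>
#include <syslog.h>

/* Route assertion failures through syslog() before aborting, so the
 * failure is recorded even when no core file can be obtained. Note
 * that syslog() is not async-signal-safe, so this is best-effort. */
static void log_assert_failure(const char *expr, const char *file, int line)
{
    syslog(LOG_CRIT, "assertion \"%s\" failed at %s:%d", expr, file, line);
    abort();
}

#define LOG_ASSERT(expr) \
    ((expr) ? (void)0 : log_assert_failure(#expr, __FILE__, __LINE__))
```

                                        A server would typically call openlog() once at startup so the entries carry its name and PID.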

                                        1. 1

                                          Does the actual Solaris/Illumos code send the failure message to stdout as shown in the article? This seems like a textbook case of something that should go to stderr…

                                          1. 1

                                            I cannot speak to the Oracle fork, but illumos uses stderr here. Most of the VERIFY and ASSERT macros are wrappers around assfail(), which in the C library has an explicit write to file descriptor 2 (i.e., STDERR_FILENO) in a way that tries to avoid as much of the stdio machinery as possible (it might be broken).

                                            1. 1

                                              Yes I fixed that, thanks.

                                          2. 1

                                            Yeah, you are right. The example could be better. I added a notice about that.