1. 33
    1. 24

      So, basically, “Use FreeBSD or Illumos because jails and ZFS”.

      What if you don’t need virtualization at all and your filesystem needs are as simple as “Mostly holds my data most of the time” ?

      I realize that sounds flippant, but for those of us who work mostly with short lived cloud instances, it’s not clear to me whether these are significant enough advantages to warrant the increased difficulty one might encounter in building infrastructure using these operating systems.

      Illumos and FreeBSD are amazing; it’s just not clear to me that they represent the clear and present broad-based win the article author is touting.

      (I know there are a ton of BSD and Solaris folks here so I’m prepared to be tarred and feathered for my ignorance :)

      1. 5

        warrant the increased difficulty one might encounter in building infrastructure using these operating systems.

        Funny, I use these operating systems (not only) because it’s less difficult than using Linux…

        1. 4

          What are some of the advantages you see in terms of ease of administration and launching of short lived cloud instances?

    2. 22

      To start, the ZFS filesystem combines the typical filesystem with a volume manager. It includes protection against corruption, snapshots and copy-on-write clones, as well as volume manager.

      It continues to baffle me how “mainstream” filesystems like ext4 forgo checksumming of the data they contain. You’d think that combatting bitrot would be a priority for a filesystem.

      Ever wondered where did vi come from? The TCP/IP stack? Your beloved macOS from Apple? All this is coming from the FreeBSD project.

      Technically, vi and the BSD implementations of the TCP/IP stack can be attributed to 4.xBSD at UCB; FreeBSD is not the origin of either.

      1. 10

        It continues to baffle me how “mainstream” filesystems like ext4 forgo checksumming of the data they contain. You’d think that combatting bitrot would be a priority for a filesystem.

        At least ext4 supports metadata checksums:

        https://wiki.archlinux.org/index.php/ext4#Enabling_metadata_checksums

        At any rate Ted Ts’o (the ext[34] maintainer) has said as far back as 2009 that ext4 was meant to be transitional technology:

        Despite the fact that Ext4 adds a number of compelling features to the filesystem, T’so doesn’t see it as a major step forward. He dismisses it as a rehash of outdated “1970s technology” and describes it as a conservative short-term solution. He believes that the way forward is Oracle’s open source Btrfs filesystem, which is designed to deliver significant improvements in scalability, reliability, and ease of management.

        https://arstechnica.com/information-technology/2009/04/linux-collaboration-summit-the-kernel-panel/

        Of course, the real failing here is not ext4, but that btrfs hasn’t been able to move to production use in more than ten years (at least according to some people).

        That said, ZFS works fine on Linux as well and some distributions (e.g. NixOS) support ZFS on root out-of-the-box.

        1. 3

          Of course, the real failing here is not ext4, but that btrfs hasn’t been able to move to production use in more than ten years (at least according to some people).

          I think it’s good to contrast “some people’s” opinion with the one from Facebook:

          it’s safe to say every request you make to Facebook.com is processed by 1 or more machines with a btrfs filesystem.

          Facebook’s open-source site:

          Btrfs has played a role in increasing efficiency and resource utilization in Facebook’s data centers in a number of different applications. Recently, Btrfs helped eliminate priority inversions caused by the journaling behavior of the previous filesystem, when used for I/O control with cgroup2 (described below). Btrfs is the only filesystem implementation that currently works with resource isolation, and it’s now deployed on millions of servers, driving significant efficiency gains.

          But Facebook employs the btrfs project lead.

          There is also the fact that Google is now using BTRFS on Chromebooks with Crostini.

          As for opinions, I’ve seen one that claims that “ZFS is more mature than btrfs ON SOLARIS. It is mostly ok on FreeBSD (with various caveats) and I wouldn’t recommend it on Linux.”

          1. 2

            I wouldn’t recommend it on Linux.

            I’d still say that ZFS is more usable than lvm & linux-softraid. If only due to the more sane administration tooling :)

      2. 9

        Ext4, like most evolutions of existing filesystems, is strongly constrained by what the structure of on-disk data and the existing code allows it to do. Generally there is no space for on-disk checksums, especially for data; sometimes you can smuggle some metadata checksums into unused fields in things like inodes. Filesystems designed from the ground up for checksums build space for checksums into their on-disk data structures and also design their code’s data processing pipelines so there are natural central places to calculate and check checksums. The existing structure of the code matters too because when you’re evolving a filesystem, the last thing you want to do is to totally rewrite and restructure that existing battle-tested code with decade(s) of experience embedded into it; if you’re going to do that, you might as well start from scratch with an entirely new filesystem.

        In short: that ext4 doesn’t have checksums isn’t surprising; it’s a natural result of ext4 being a backwards compatible evolution of ext3, which was an evolution of ext2, and so on.
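
        To make the idea of "space for checksums built into the data structures" concrete, here is a minimal sketch of per-block data checksumming. It is not how ext4 or ZFS lay things out on disk: the block size, the function names and the choice of SHA-256 are all assumptions made for illustration, and the checksum is simply kept next to each block in memory.

        ```python
        import hashlib

        BLOCK_SIZE = 4096  # illustrative block size, not a real on-disk layout


        def write_blocks(data: bytes) -> list:
            """Split data into blocks and keep a SHA-256 checksum with each one."""
            blocks = []
            for offset in range(0, len(data), BLOCK_SIZE):
                block = data[offset:offset + BLOCK_SIZE]
                blocks.append((block, hashlib.sha256(block).digest()))
            return blocks


        def find_corrupt_blocks(blocks: list) -> list:
            """Return indices of blocks whose contents no longer match their checksum."""
            return [
                index
                for index, (block, checksum) in enumerate(blocks)
                if hashlib.sha256(block).digest() != checksum
            ]


        stored = write_blocks(b"A" * 10000)

        # Simulate bitrot: flip one bit in the second block, leave its checksum alone.
        block, checksum = stored[1]
        stored[1] = (bytes([block[0] ^ 0x01]) + block[1:], checksum)

        corrupt = find_corrupt_blocks(stored)
        ```

        The point of the sketch is that every stored block needs somewhere to keep its checksum, and every read path needs a place to verify it. Both are easy to design in from the start and hard to retrofit into an existing format.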

      3. 4

        It continues to baffle me how “mainstream” filesystems like ext4 forgo checksumming of the data they contain. You’d think that combatting bitrot would be a priority for a filesystem.

        Ext4 doesn’t aim to be that type of filesystem. For the average desktop user this is fairly okay, since actual bitrot in data the user cares about is rare (most bitrot occurs in system files, in empty space, or in media files where a single corrupt frame barely matters).

        If you want to check out a more modern alternative, there is bcachefs. I’ve been using it on my laptop for a while (until I stopped but now I’m back on it) and it’s been basically rock solid. The developer is also working on erasure coding and replication in a more solid way than btrfs currently has.

    3. 12

      Is the author not aware that you can use zfs on linux with both lxc and docker? The article doesn’t really go into any details of what goodies I’m missing out on.

      1. 7

        Newer features like encryption and raw sends (likely crucial to some end-to-end encryption features Datto is planning on offering; Tom Caputi, one of their engineers, authored that feature) might take a while to land in FreeBSD and IllumOS.

        At the same time, FreeBSD ZFS has had TRIM for a while, but it’s still being developed on Linux and appears to be in some sort of development hell (cf. https://github.com/zfsonlinux/zfs/issues/598 and https://github.com/zfsonlinux/zfs/issues/5925).

        See http://open-zfs.org/wiki/Features for more details - it seems to be a bit out of date, especially regarding things like encryption which made their way to 0.8.0-pre in Linux.

        So, the result is that a lot of newer pool feature flags are Linux-only for now, and pools created with -pre releases of ZFS can’t be imported on other platforms. Not a problem if you’re only using Linux, but something to keep in mind.

        1. 5

          We’re working on importing the encryption changes, but we’re not going to finish that process until we’re sure they’re ready. There have been some panics and other issues that we’ve been chasing down. It’s the file system, so it’s extremely important to get this stuff right, even when it takes a while.

          1. 2

            Absolutely, and the encryption changes are rather extensive.

            Related: what do you think of channel programs? Running Lua in the kernel scares me a little.

            1. 2

              I’m not sure that I’m scared. I think the Lua interpreter we have is almost certainly a high quality piece of code that has seen a lot of testing and (I believe?) fuzzing in the upstream codebase. I definitely had (and, really, still have) reservations about it – you can read about some of them in comments I made on the pull request.

              That said, because the implementation is still (as far as I know) constrained to use by the super-user – and even then, only in the global zone – I don’t think it’s been the cause of much if any havoc on general purpose or multi-tenant systems. I do appreciate that if you’re building a sealed appliance that happens to sit on top of illumos, it probably does help with certain internal ZFS administration functions that you might want to build.

        2. 3

          Sure, I can accept that there will be version/feature skew, however that’s a long way from the claims the author is making about “doing it wrong”.

    4. 11

      Don’t forget HAMMER on DragonFlyBSD if we’re gonna talk about the right way to do storage management.

      1. 3

        Could you elaborate on this?

    5. 9

      Not a single mention of DTrace? Why? For me personally that would be the biggest selling point; yeah, it comes even before ZFS for me. It totally changed the way I perceived operating systems and monitoring.

    6. 8

      With pkgng, the package management tool used in FreeBSD has almost 27.000 compiled packages for you to use. Almost all software found on any of the important GNU/Linux distros can be found here. Java, Python, C, C++, Clang, GCC, Javascript frameworks, Ruby, PHP, MySQL and the major forks, etc. All this opensource software, and much more, is available at your fingertips.

      But (last time I checked) unfortunately not Racket, which for me is a no-go for putting it on my server.

      Tried using it on the desktop as well, but LibreOffice crashed constantly and Gnumeric gave strange rounding errors. Probably not the fault of the FreeBSD developers, but I think it’s best to be realistic about the trade-offs of using FreeBSD, which is not something that can be said about the documentation or the linked article:

      PS: I haven’t mentioned both softwares, FreeBSD and SmartOS do have a Linux translation layer. This means you can run Linux binaries on them and the program won’t cough at all.

      I just found this not to be the case at all.

      1. 10

        unfortunately not Racket

        huh? lang/racket is now at version 7.1. I even contributed a patch that enabled the build on non-x86 architectures like aarch64 (without JIT) :)

        1. 3

          Yay! Thanks for fixing that.

      2. 2

        Tried using it on the desktop as well, but LibreOffice crashed constantly and Gnumeric gave strange rounding errors.

        Which FreeBSD version was that?

          1. 3

            This is Gnumeric 1.12.43 on FreeBSD:

            https://i.postimg.cc/7Lg4wWgW/vermaden-2018-11-25-11-46-10.png

            Seems to be fixed now.

            I never had any stability issues with either Libreoffice 5/6 or Gnumeric on FreeBSD.

            Used FreeBSD 11.2 and tried 12.0-RC1 recently - also stable.

            … but that’s me.

            Regards.

            1. 1

              I think the bug is/was dependent on the type of processor.

    7. 7

      Or use OpenBSD - simplicity is important if you want security and stability.

    8. 7

      Are FreeBSD jails remotely as usable as Docker for Linux? Last time I checked they seemed rather unusable.

      1. 3

        In technical terms they’re just fine, in my semi-professional experience. What they lack is the ergonomics of Docker.

        1. 5

          I’m not very impressed with the ergonomics of docker, and it’s definitely not obvious to me that BSD jails are an inferior solution to it.

          1. 5

            Ok, so I’m a big fan of BSDs, so I’d be very interested if there’d be a nice (not necessarily identical, but similar) way to do roughly the following things with jails:

            vi Dockerfile # implement your container based on another container
            docker build . # build it
            docker push https://<internal_storage>/money-maker:0.9 # push it to internal repo
            ssh test_machine
            docker run https://<internal_storage_server>/money-maker:0.9 # run the container on the test machine
            
            1. 5

              The obvious equivalent I can think of is:

              • Create a jail
              • Set it up (whether manually or via a Dockerfile-equivalent shell script)
              • Store a tar of its filesystem to https://<internal_storage>/money-maker:0.9
              • Create a jail on the destination machine
              • Untar the stored filesystem
              • Start the jail

              These steps aren’t integrated nicely the way they are with docker, but they are made of small, otherwise-useful parts which compose easily.
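
              The "store a tar" and "untar" steps above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the directory names and the archive name are hypothetical, temporary directories stand in for real jail roots, and creating and starting the jail itself (plus uploading the archive to internal storage) is left out.

              ```python
              import tarfile
              import tempfile
              from pathlib import Path


              def pack_jail_root(jail_root: Path, archive: Path) -> None:
                  """Store a jail's filesystem tree as a gzip-compressed tarball."""
                  with tarfile.open(archive, "w:gz") as tar:
                      tar.add(jail_root, arcname=".")


              def unpack_jail_root(archive: Path, dest_root: Path) -> None:
                  """Unpack a stored tree into the root directory of a new jail."""
                  dest_root.mkdir(parents=True, exist_ok=True)
                  with tarfile.open(archive, "r:gz") as tar:
                      tar.extractall(dest_root)


              with tempfile.TemporaryDirectory() as tmp:
                  # Stand-in for a jail that has already been created and set up.
                  src = Path(tmp) / "src-jail"
                  (src / "etc").mkdir(parents=True)
                  (src / "etc" / "rc.conf").write_text('hostname="money-maker"\n')

                  # Hypothetical archive name; a real setup would upload this file.
                  archive = Path(tmp) / "money-maker-0.9.tar.gz"
                  pack_jail_root(src, archive)

                  # Stand-in for the jail root on the destination machine.
                  unpack_jail_root(archive, Path(tmp) / "dst-jail")
                  restored = (Path(tmp) / "dst-jail" / "etc" / "rc.conf").read_text()
              ```

              Unlike a Docker image, this ships the whole tree every time, with no layering or sharing between archives.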

              1. 4

                Sure. How much work do you think needs to be done to get the benefits of Docker’s layer-based approach to containers? If your containers are based on each other, you get significant space savings that way.

                1. 0

                  ZFS deduplicates stored blocks, so you would still get the space savings. You would still have to get it over the network, though.

                  1. 6

                    ZFS does not dedup by default, and deduping requires so much RAM that I’d not turn it on, for performance reasons. I tried a 20TiB pool with and without it: the speed was about 300k/s with dedup versus something closer to the underlying SSD’s performance without. It was that bad, even after trying to tune the piss out of it.

                    Hardlinks would be faster at that point.
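
                    A toy model shows where the memory goes. The sketch below is not ZFS's dedup table; the class, the block size and the sample "images" are assumptions for illustration. It only demonstrates that block-level dedup needs one table entry per unique block, and that the table is consulted on every write.

                    ```python
                    import hashlib

                    BLOCK_SIZE = 4096  # illustrative block size


                    class DedupStore:
                        """Toy content-addressed block store with an in-memory dedup table."""

                        def __init__(self):
                            # digest -> [block, reference count]; one entry per unique block
                            self.table = {}

                        def write(self, data: bytes) -> list:
                            """Store data block by block and return the list of block digests."""
                            refs = []
                            for offset in range(0, len(data), BLOCK_SIZE):
                                block = data[offset:offset + BLOCK_SIZE]
                                digest = hashlib.sha256(block).digest()
                                entry = self.table.setdefault(digest, [block, 0])
                                entry[1] += 1  # every write has to look up the table
                                refs.append(digest)
                            return refs


                    # Two hypothetical container images sharing a three-block base.
                    base = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3))
                    store = DedupStore()
                    store.write(base + b"a" * BLOCK_SIZE)
                    store.write(base + b"b" * BLOCK_SIZE)

                    unique_blocks = len(store.table)
                    logical_blocks = sum(count for _, count in store.table.values())
                    ```

                    Here eight logical blocks are stored as five unique ones, so the space saving is real, but the table grows with the number of unique blocks in the pool and has to stay quickly accessible for writes to remain fast.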

                  2. 3

                    No no no, ZFS dedup wastes some ridiculous amount of RAM. Do you use it IRL or are you just quoting a feature list?

                    1. 1

                      I use it, but not on anything big, just my home NAS.

            2. 2

              One option is to use a jail-management system built on top of the raw OS functionality. They tend to take an opinionated approach to how management/launching/etc. should work, and enforce a more fleshed-out container model. As a result they’re more ergonomic if what you want to do fits with their opinions. CBSD is probably the most full-featured one, and is actively maintained, but there are a bunch of others too. Some of them (like CBSD) do additional things like providing a unified interface for launching a container as either a jail or a bhyve VM.

    9. 15

      I don’t use Linux, I use GNU, and I don’t use GNU for technical reasons. Picking GNU is like picking a punk band over rock and roll (Windows), pop (Mac), or folk/country (BSD).

      1. 19

        I hurd you.

      2. 7

        I hope that makes OpenBSD (puffer)phish.