1. 10

  2. 1

    What’s the tl;dr on this? Is it using jails?

    1. 6

      It’s trying to reuse the whole of the OCI infrastructure, rather than reinvent something more FreeBSD-like (in contrast to something like pot, which may be more the ‘FreeBSD way’ but which is fighting an entrenched ecosystem that formed while FreeBSD ignored containers for a decade). This means:

      • Containerd is responsible for managing containers. It can talk to distributed orchestration frameworks, such as Kubernetes. It is responsible for things like assembling containers out of individual layers (and already supports ZFS, so has a clean path to working easily on FreeBSD).
      • A containerd shim is responsible for launching containers. On Linux, this is typically runc, which manages the mess of cgroups, namespaces, and seccomp-bpf that gives a vaguely jail-like abstraction. In this stack it will typically be runj, which manages jails. On Linux it can also be something layered on top of KVM; on FreeBSD, I hope that there will eventually be something on top of bhyve.
      • Moby (Docker is the branded version of the Moby open-source project) is mostly responsible for building containers. This is the thing that reads a Dockerfile and does stuff based on what it says.
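
To make the "assembling containers out of individual layers" step concrete, here is a toy Python model of how OCI-style layers stack. It is only a sketch: real containerd delegates this to snapshotters (for example ZFS clones) rather than merging dictionaries, and the function name and sample paths are made up for illustration. The one convention taken from the OCI image spec is that a file named `.wh.<name>` in an upper layer hides `<name>` from the layers below; opaque-directory whiteouts are not handled.

```python
WHITEOUT_PREFIX = ".wh."  # OCI image-spec marker for "delete this entry"


def assemble_rootfs(layers):
    """Merge layers (lowest first) into a single path -> content mapping.

    Each layer is a dict of absolute path -> file content. This is a toy
    stand-in for what a snapshotter does on disk, not containerd's real API.
    """
    rootfs = {}
    for layer in layers:
        # Pass 1: apply whiteouts, which only hide entries from lower layers.
        for path in layer:
            directory, _, name = path.rpartition("/")
            if not name.startswith(WHITEOUT_PREFIX):
                continue
            target = f"{directory}/{name[len(WHITEOUT_PREFIX):]}"
            for existing in list(rootfs):
                # Remove the target itself and, if it was a directory,
                # everything underneath it.
                if existing == target or existing.startswith(target + "/"):
                    del rootfs[existing]
        # Pass 2: add or overwrite the regular files from this layer.
        for path, content in layer.items():
            name = path.rpartition("/")[2]
            if not name.startswith(WHITEOUT_PREFIX):
                rootfs[path] = content
    return rootfs


# Hypothetical layers: a base system, a slimming layer, and an application.
base = {"/bin/sh": "sh v1", "/usr/bin/cc": "toolchain", "/etc/motd": "hello"}
slim = {"/usr/bin/.wh.cc": "", "/etc/motd": "welcome"}
app = {"/app/server": "binary"}

rootfs = assemble_rootfs([base, slim, app])
```

With these sample layers, the toolchain entry is hidden by the whiteout, `/etc/motd` takes the upper layer's content, and the application file is added on top.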

      There are a few things that could be improved in the base system for this to work well:

      • Currently, ifconfig has to be run from inside the jail to set up networking, which means that runj needs to be extended to support running a command in the jail before it runs the real container entry point. This could be fixed if ifconfig could jail_attach itself and configure networks for jails from the host. There were some patches to do that under review; I don’t remember what happened to them. This is also really annoying for running Linux containers because you need to inject a FreeBSD ifconfig binary into the Linux system.
      • There’s no base-system abstraction over the multiple different firewalls. This pulls in a dependency on pf, which seems to be what everyone (including me) uses, but it’s somewhat unfortunate that people using the others are left out. In my ideal world, the project would pick one and provide compat shims that translated rules in either format into a common in-kernel representation.
      • The FreeBSD base system is pretty big. There’s a load of stuff, like the toolchain and svnlite, that is not needed for 99% of containers. Actually, those two are the only things larger than 1 MiB in /usr/bin, and if you remove them you remove about 80% of the size of /usr/bin. Removing the debug symbols and static libraries from /usr/lib eliminates 2.3 GiB. /rescue (statically linked programs for recovery if rtld is broken) adds another 15 MiB and is completely pointless in a container (if the contents of a container are broken, regenerate it). All of these things can be turned off with build flags and in PkgBase are separate packages, so it would be quite easy to provide a minimal container image.
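
If you want to check the size numbers on your own system, something like the following Python sketch would do it. The function name and the 1 MiB default are my own choices, not part of any FreeBSD tool. It counts each inode once, on the assumption that hard-linked names in a directory like /usr/bin share a single copy on disk, and it skips symlinks and subdirectories.

```python
import os

MIB = 1024 * 1024


def large_file_share(directory, threshold=MIB):
    """Return (large, share) for the regular files directly in `directory`.

    `large` maps a file name to its size for every file bigger than
    `threshold` bytes; `share` is the fraction of the directory's total
    size that those files account for. Hard links are counted once.
    """
    sizes = {}
    seen_inodes = set()
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue  # skip symlinks, directories, devices
            info = os.lstat(entry.path)
            inode = (info.st_dev, info.st_ino)
            if inode in seen_inodes:
                continue  # another name for a file we already counted
            seen_inodes.add(inode)
            sizes[entry.name] = info.st_size
    total = sum(sizes.values())
    large = {name: size for name, size in sizes.items() if size > threshold}
    share = sum(large.values()) / total if total else 0.0
    return large, share
```

Calling `large_file_share("/usr/bin")` on a host lists the oversized binaries and how much of the directory they account for. The result depends on the release and build options, so it may not match the figures above.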