1. 13
    1. 3

      I do something similar for testing rust-fuse. The integration tests must run as root and I don’t want to worry about having test state stick around between runs, so those tests are executed under QEMU.

      The implementation is a bit different though – the guest filesystem is minimal (basically just /sbin/init in an initrd) and there’s no filesystem sharing. The test runs are hermetic, deterministic, and reproducible.
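
      In case it’s useful as a reference, the overall shape is roughly: spawn QEMU with the kernel and a tiny initrd whose /sbin/init runs the test suite and prints a marker on the serial console. Below is a minimal sketch of that pattern, not rust-fuse’s actual harness; the paths and the success marker are placeholders.

      ```rust
      // Minimal sketch of a hermetic QEMU-backed integration test: boot a kernel
      // plus a tiny initrd whose /sbin/init runs the tests, prints a marker on the
      // serial console, and powers off. Paths and the marker string are placeholders.
      use std::process::Command;

      #[test]
      fn integration_tests_pass_under_qemu() {
          let output = Command::new("qemu-system-x86_64")
              .args([
                  "-nographic",  // serial console on stdio, no display
                  "-no-reboot",  // let the guest's final reboot/poweroff terminate QEMU
                  "-m", "512M",
                  "-kernel", "testdata/bzImage",         // placeholder kernel
                  "-initrd", "testdata/test-initrd.img", // placeholder initrd with /sbin/init
                  "-append", "console=ttyS0 panic=-1",   // panic=-1: reboot (and thus exit) on panic
              ])
              .output()
              .expect("failed to launch qemu-system-x86_64");

          let console = String::from_utf8_lossy(&output.stdout);
          // The init in the initrd is assumed to print this marker when every test passes.
          assert!(
              console.contains("ALL TESTS PASSED"),
              "guest console output:\n{console}"
          );
      }
      ```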

      From the post, it seems like sharing the host filesystem is an important design goal of vmtest, which doesn’t seem like a benefit to me. Keeping the environments separate allows the guest and host OSes to be decoupled. I can do development on my Linux workstation or macOS laptop, and can test both the Linux and FreeBSD implementations of FUSE in the same test run.

      Being able to test different OSes (and different versions of those OSes) can uncover bugs in unexpected places, for example in the FreeBSD kernel:

      I’m also not really sure what the purpose is of allowing a test to access the terminal or test configuration file. That just seems like more opportunity for non-determinism to slip in.

      1. 2

        Thank you for the perspective! Rootfs sharing is only done for kernel targets; vmtest also supports running standard qcow2 images for more deterministic/hermetic setups. For image targets, only the directory containing vmtest.toml is shared into the guest at /mnt/vmtest (to make it easier to move things in and out).

        I have not tried non-Linux guest OSes with vmtest, but I suspect it could be made to work with a little effort. I’ve sent some patches to qemu-guest-agent before, and from what I can tell they take cross-platform support seriously (e.g. Windows is supported too). I suspect the same might apply to the host OS as well.

      2. 1

        I used something similar to this for testing Linux kernel changes. It was great for rapid iteration. It took 8 seconds to boot the new kernel, with bash as init and my host filesystem mounted read-only as the guest’s root. I could build my tests outside, then run them on both host and guest to make sure bugs were fixed and so on. I definitely wouldn’t want to use something like that for userspace development (including FUSE/CUSE drivers), where I’d want to avoid any host state leaking. Similarly, I wouldn’t want to use it for CI, where having a fully deterministic build environment is crucial.
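
        The comment doesn’t say how the host filesystem was exposed to the guest, but one common way to get this kind of setup is a read-only 9p root. Here is a rough sketch of the idea with the QEMU invocation wrapped in Rust; the kernel path, memory size, and exact 9p options are illustrative and may vary by QEMU version.

        ```rust
        // Sketch: boot a freshly built kernel with bash as init and the host's /
        // shared read-only over 9p as the guest's root filesystem.
        use std::process::Command;

        fn main() -> std::io::Result<()> {
            let append = "root=/dev/root rootfstype=9p \
                          rootflags=trans=virtio,version=9p2000.L ro \
                          init=/bin/bash console=ttyS0";
            let status = Command::new("qemu-system-x86_64")
                .args([
                    "-enable-kvm",
                    "-nographic",
                    "-m", "2G",
                    "-kernel", "arch/x86/boot/bzImage", // placeholder: the kernel under test
                    // Export the host's / as a read-only 9p share tagged "/dev/root"...
                    "-fsdev", "local,id=rootfs,path=/,security_model=none,readonly=on",
                    "-device", "virtio-9p-pci,fsdev=rootfs,mount_tag=/dev/root",
                    // ...and tell the guest kernel to mount that share as its root.
                    "-append", append,
                ])
                .status()?;
            println!("qemu exited with {status}");
            Ok(())
        }
        ```

        The mount_tag of /dev/root is just a convenience so that root=/dev/root on the kernel command line matches the 9p share.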

        1. 1

          Out of curiosity, how would vmtest affect determinism in CI? For GitHub Actions at least, each new job already runs in a fresh VM, so anything that runs inside the top-level VM (vmtest being the nested VM) already possesses a degree of determinism, right? (Ignoring the usual sources of non-determinism like fetching packages.)

    2. 2

      This technique is also something we use with Nix. I’ve noticed that setting up integration tests like this tends to pay off quickly, since they’re easy to execute locally and run pretty fast.

    3. 1

      This gets pretty complicated once there are additional requirements involved:

      • support for macOS :),
      • scaling issues, so this will need another layer, a scheduler, to be built on top of vmtest, and then the “host machine” would be just one host of many,
      • kernel panic handling in case there is kernel development going on,
      • not sure if it’s currently possible to write a test that requires restarting the VM?
      • some tests may require more than running a simple binary, so there should be a way of preparing the environment for the actual test (e.g. VMs that have a particular Python version with certain packages installed, or that have applications or libraries installed which conflict with our software, so we can test against that),
      • an automatic way of creating fresh VMs with a new OS version; for macOS this means a new VM is created for each new update, on Linux it would probably mean a fresh copy of the kernel/distribution we want to test our software on.

      I’ve been maintaining a similar thing at my company for some years, but it only supports macOS and is built on top of VMware Fusion. It works, but it’s not great, because of, e.g., bugs and limitations of Fusion (and it’s hard to find an employee who cares, but that’s a different story).

    4. 1

      I wonder if anyone has used User Mode Linux for tests? It’s basically a way to build the Linux kernel so that it runs on top of its own syscall interface rather than on hardware. The kernel then runs as a user process.

      https://en.wikipedia.org/wiki/User-mode_Linux

      I remember Linode used it like ~20 years ago. Here’s a 2019 guide:

      https://xeiaso.net/blog/howto-usermode-linux-2019-07-07

      https://news.ycombinator.com/item?id=20379063

      I think NetBSD’s rump kernel is similar: https://en.wikipedia.org/wiki/Rump_kernel
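
      For anyone curious to try UML itself: the kernel is built with make ARCH=um, and the build output is an ordinary executable, so “booting” it is just spawning a process. A rough sketch follows; the paths and parameters are illustrative, with hostfs used so a host directory can serve as the guest’s root.

      ```rust
      // Sketch: a UML kernel is a normal executable, so booting it is just
      // spawning a process. Paths and parameters here are illustrative; hostfs
      // mounts a host directory as the guest's root filesystem.
      use std::process::Command;

      fn main() -> std::io::Result<()> {
          let status = Command::new("./linux") // the UML build output
              .args([
                  "mem=256M",
                  "root=/dev/root",
                  "rootfstype=hostfs",
                  "rootflags=/path/to/guest-rootfs", // host directory used as the guest's /
                  "rw",
                  "init=/bin/bash",
              ])
              .status()?;
          println!("UML exited with {status}");
          Ok(())
      }
      ```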


      In theory this should be faster to start than a Linux system under QEMU?

      I’m tired of playing Whac-a-Mole with very useful but non-deterministic tests in https://oilshell.org . The /proc interface is inherently racy.

      So I wonder if we can somehow instrument the kernel state too, for more deterministic assertions. Job control added some more kernel state.

    5. 1

      Looking at vmtest, it appears to be a Rust program that parses a config file, has a little UI, spawns QEMU, and communicates with the QEMU Guest Agent:

      https://github.com/danobi/vmtest/tree/master/src

      https://wiki.qemu.org/Features/GuestAgent

      I wonder if it could just be a shell script? Then you don’t have to worry about distributing binaries.

      The QGA stuff might be hard in a shell script, but there’s probably a way.
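
      To make the QGA part concrete: the guest agent speaks newline-delimited JSON over a virtio-serial channel that QEMU can expose as a Unix socket on the host, so anything that can write JSON to a socket can drive it. Below is a minimal sketch in Rust; the socket path and the QEMU flags in the comment are assumptions, guest-ping and guest-exec are standard QGA commands, and real code would actually parse the replies.

      ```rust
      // Minimal sketch of driving the QEMU Guest Agent from the host.
      // Assumes QEMU was started with something like:
      //   -chardev socket,id=qga0,path=/tmp/qga.sock,server=on,wait=off
      //   -device virtio-serial
      //   -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0
      use std::io::{BufRead, BufReader, Write};
      use std::os::unix::net::UnixStream;

      // Send one JSON request and read back one newline-terminated JSON reply.
      fn qga_call(
          stream: &mut UnixStream,
          reader: &mut impl BufRead,
          request: &str,
      ) -> std::io::Result<String> {
          stream.write_all(request.as_bytes())?;
          stream.write_all(b"\n")?;
          let mut reply = String::new();
          reader.read_line(&mut reply)?;
          Ok(reply)
      }

      fn main() -> std::io::Result<()> {
          let mut stream = UnixStream::connect("/tmp/qga.sock")?; // path is an assumption
          let mut reader = BufReader::new(stream.try_clone()?);

          // Check that the agent inside the guest is responding.
          println!("{}", qga_call(&mut stream, &mut reader, r#"{"execute":"guest-ping"}"#)?);

          // Run a command in the guest; the reply carries a pid that you would
          // poll with guest-exec-status (JSON parsing is left out of this sketch).
          let exec = concat!(
              r#"{"execute":"guest-exec","arguments":"#,
              r#"{"path":"/bin/uname","arg":["-r"],"capture-output":true}}"#
          );
          println!("{}", qga_call(&mut stream, &mut reader, exec)?);

          Ok(())
      }
      ```

      (In a shell script this is the part you’d presumably hand off to something like socat.)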

      In any case, it’s a nice example of how to run QEMU in CI, which is something I’ve been very close to doing.

      1. 3

        Yes, it could be a shell script! That’s how most of the “vmtest” predecessors in the kernel-y space work. But they suffer from fragility and maintenance issues: for example, the triply escaped shell strings you would need to pass to QEMU. A binary is more heavyweight for sure, but there are ways to make it smoother (e.g. cargo-dist).

        Although note that use of QGA in this problem space is new AFAIK.

    6. 1

      Out of curiosity, why use QEMU instead of Firecracker?

      1. 4

        Firecracker only supports Linux hosts, and it requires KVM. Both of those make sense given its design goals, but for use in development and CI they can be limiting.

        QEMU runs pretty much anywhere, and it doesn’t have a hard requirement on host CPU features that might be absent when the host itself is a VM.

      2. 1

        No particular reason other than that QEMU is mature and available everywhere. Firecracker would be a good backend to explore; from skimming the docs, it’s unclear whether it has all the functionality we need.