1. 9
  2. 6

    One of the authors here.

    We built a service that executes arbitrary user-submitted code. It’s the thing you’re not supposed to build, but we had to do it, because it’s a cloud build service.

    Running arbitrary code means containers weren’t a good fit (container breakouts happen), so we spin EC2 instances up and down instead. This means we have actual infrastructure as code (i.e. not just piles of Terraform, but Go code running in a service that spins VMs up and down based on API calls).

    The service spins up and down EC2 instances based on user requests and executes user-submitted build scripts inside them. It’s not the standard web service we were used to building, so we thought we’d write it up and share it with anyone interested.

    One cool thing we learned was how quickly you can Hibernate and wake up x86 EC2 instances. That ended up being a game-changer for us.
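
    A minimal sketch of the hibernate-and-wake idea in Go, for readers who want something concrete. This is not the service's actual code: the `Cloud` interface, the `Pool` type, and `fakeCloud` are hypothetical names invented for illustration. On EC2, `Hibernate` and `Wake` would presumably map to stopping an instance with the hibernate option and starting it again, but that wiring is left out here.

    ```go
    package main

    import (
    	"errors"
    	"strconv"
    )

    // State is the lifecycle state of a build VM as tracked by this sketch.
    type State int

    const (
    	Running State = iota
    	Hibernated
    )

    // Cloud is a hypothetical abstraction over the provider API.
    // It is NOT a real SDK interface.
    type Cloud interface {
    	Launch() (id string, err error)
    	Hibernate(id string) error
    	Wake(id string) error
    }

    // Pool keeps one dedicated instance per customer.
    type Pool struct {
    	cloud     Cloud
    	instances map[string]string // customer -> instance ID
    	states    map[string]State  // instance ID -> state
    }

    func NewPool(c Cloud) *Pool {
    	return &Pool{cloud: c, instances: map[string]string{}, states: map[string]State{}}
    }

    // Acquire returns a running instance for the customer. It wakes the
    // customer's hibernated instance if one exists, otherwise launches one.
    func (p *Pool) Acquire(customer string) (string, error) {
    	if id, ok := p.instances[customer]; ok {
    		if p.states[id] == Hibernated {
    			if err := p.cloud.Wake(id); err != nil {
    				return "", err
    			}
    			p.states[id] = Running
    		}
    		return id, nil
    	}
    	id, err := p.cloud.Launch()
    	if err != nil {
    		return "", err
    	}
    	p.instances[customer] = id
    	p.states[id] = Running
    	return id, nil
    }

    // Release hibernates the customer's instance once its build is done.
    func (p *Pool) Release(customer string) error {
    	id, ok := p.instances[customer]
    	if !ok {
    		return errors.New("no instance for customer " + customer)
    	}
    	if p.states[id] == Hibernated {
    		return nil // already hibernated, nothing to do
    	}
    	if err := p.cloud.Hibernate(id); err != nil {
    		return err
    	}
    	p.states[id] = Hibernated
    	return nil
    }

    // fakeCloud is an in-memory stand-in that only counts calls.
    type fakeCloud struct{ launched, hibernated, woken int }

    func (f *fakeCloud) Launch() (string, error) {
    	f.launched++
    	return "i-fake-" + strconv.Itoa(f.launched), nil
    }
    func (f *fakeCloud) Hibernate(id string) error { f.hibernated++; return nil }
    func (f *fakeCloud) Wake(id string) error      { f.woken++; return nil }
    ```

    The point of the shape: a second `Acquire` for the same customer wakes the existing instance rather than paying for a cold launch, which is where fast hibernate and wake would pay off.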

    Corey and Brandon did the building, I’m mainly just the person who wrote things down, but hopefully, people find this interesting.

    The next iteration of this service, using Firecracker VMs, is already under investigation, but this is working surprisingly well.

    1. 2

      One cool thing we learned was how quickly you can Hibernate and wake up x86 EC2 instances. That ended up being a game-changer for us.

      This was my main “huh, wow” moment reading it. Definitely going to be able to make use of that I think, had never realised it was possible.

    2. 4

      Developer here that worked on this. This system also had no shortage of obscure issues to debug.

      One of my favorite stories is wrestling with container networking several layers deep. Our build system is containerized, and sometimes the user’s workloads can contain more containers (which can themselves contain even more containers!) - it was a bit like the movie “Inception” but with layers of container exec commands.

      1. 0

        doesn’t everyone else in this space just use containers?

        1. 1

          So our build runner is buildkitd, and it runs containerized but needs to run privileged. But I think the answer is no, everyone else doesn’t just let everyone run arbitrary containers on shared infra. AWS uses Firecracker for Lambda isolation, for instance.

          1. 1

            I could see privileged containers being the line drawn in the sand, but the reasoning regarding container breakouts was interesting.

            I guess I feel like container breakouts for unprivileged containers aren’t something that people typically worry about… perhaps as much as they should?

            I guess I need to try out the service but a CI service that doesn’t support customer containers seems constrained to me, maybe I just need to give that some more thought

            1. 1

              Oh, so this is probably bad communication on my part, but we do allow customers to run their own containers. We just don’t run all the containers together on a shared instance, like a shared Kubernetes cluster or something. Instead, each customer is on their own EC2 instance. Containers are fine for packaging; we just have another layer there.

              1. 2

                each customer is on their own EC2 instance

                ah interesting

                yea, the more i read about buildkit and earthfiles the more i was seeing the whole thing together, i think the article might assume a lot of knowledge about what you all already have in place, which might be fine depending on your intended audience

                to me your company is just known as “that company trying to figure out how to get ci pipelines to run the same locally as remotely”, which is a very enticing prospect when you’re an engineer working on devops tooling and you’re trying to figure out how to get things to work the same w/ local dev/build as they do in a gitlab ci runner, which we run for ourselves w/ k8s clusters. but i’m not exactly sure how gitlab.com provides theirs, which this article would be more analogous to… that’s why i started the thread