1. 22
  1.  

  2. 2

    I like the Dockerfile pattern there for Go: copying just the go.mod and go.sum files and downloading the dependencies before copying the rest of the source, to take proper advantage of Docker layer caching when iterating on source during development.

    That’s a definite “duh, why aren’t I doing this?” moment.
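
    A minimal sketch of that pattern (not the article’s actual Dockerfile; the golang:1.22 and alpine bases and the ./cmd/app path are assumptions):

    # Build stage: dependency download gets its own cached layer.
    FROM golang:1.22 AS build
    WORKDIR /src

    # Copy only the module files first; this layer (and the download below)
    # is reused as long as go.mod/go.sum don't change.
    COPY go.mod go.sum ./
    RUN go mod download

    # Copying the rest of the source only invalidates the layers below,
    # so iterating on code doesn't re-download dependencies.
    COPY . .
    RUN CGO_ENABLED=0 go build -o /app ./cmd/app

    # Small runtime image; swap for whatever base you actually need.
    FROM alpine:3.19
    COPY --from=build /app /usr/local/bin/app
    ENTRYPOINT ["/usr/local/bin/app"]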

    1. 4

      Go already has a cache for incremental builds and downloads, and builds are reproducible. Using Docker layers in this case is redundant and is a major slowdown whenever some core input changes (e.g. go.mod and go.sum). That is especially true for large projects with lots of generated code since updating a single dependency will re-download everything and then rebuild the whole project from scratch.

      Instead, with modern Docker releases, you can let Go manage its own caches:

      RUN \
          --mount=type=cache,id=gocache,target=/root/.cache/go-build \
          --mount=type=cache,id=gomodcache,target=/go/pkg/mod \
        go install ./cmd/...
      

      I’d also suggest using gcr.io/distroless/static as a base for the final image unless you need a full-blown distro.
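
      Putting the cache mounts and the distroless base together, a minimal sketch (the golang:1.22 tag and a single hypothetical main package at ./cmd/app, producing a binary named app, are assumptions):

      # syntax=docker/dockerfile:1
      FROM golang:1.22 AS build
      WORKDIR /src
      COPY . .

      # No go.mod/go.sum layer split needed: module downloads and build
      # artifacts live in the cache mounts, which persist across builds.
      RUN \
          --mount=type=cache,id=gocache,target=/root/.cache/go-build \
          --mount=type=cache,id=gomodcache,target=/go/pkg/mod \
        CGO_ENABLED=0 go install ./cmd/...

      # distroless/static has no shell or libc, hence the CGO-free build.
      # Assumes ./cmd/app (hypothetical), so the binary lands at /go/bin/app.
      FROM gcr.io/distroless/static
      COPY --from=build /go/bin/app /app
      ENTRYPOINT ["/app"]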

      1. 2

        Those are some great points, especially the distroless usage.

        With regard to using the cache mount type, I have to manage that cache between build agents to make it useful, whereas with a multi-stage build I can push the intermediate stages after a build and pull them before the next one to get a populated cache.

        I had problems doing this with BuildKit before, but I should revisit it again soon.

      2. 2

        Thanks!

        I spend a lot of time at various companies fixing their Dockerfiles for speed/size/content. I originally did this split with Node.js containers… then I realised I could make my Go ones significantly faster when not changing deps too :)

      3. 1

        I love this! We slurp the Nomad event stream into Loki today just for observability. One thing I’ve noticed is that 1000 events (the default retention) isn’t very many when a job is having problems, and I wonder how far that can be pushed before it causes a problem for the control plane. The other issue is that we’ve tried both Go- and Python-based methods for listening, and something grinds everything to a halt, though not from Nomad’s POV. We have some hilariously large job specs and I’m wondering if that’s it.

        There was also a problem where you couldn’t stream from every namespace, but that’s fixed, yay!