1. 73

  2. 23

    This blog is so consistently good and gives such concrete suggestions. Thanks.

    1. 17

      I’m trying to find a charitable interpretation for the fact that “avoid installing security updates because this distribution tool can’t handle updating in a secure manner” has ever even been considered as a form of best practice. Charitable as in not leaning towards “web developers gonna web develop”, which I would’ve been happy with 15 years ago but I realise perfectly well that’s not the right explanation. I just can’t, for the life of me, figure out the right one.

      Can someone who knows more about Docker and DevOps explain to this old Unix fart why “packages inside parent images can’t upgrade inside an unprivileged container” is an argument for not installing updates, as opposed to throwing Docker into the trash bin, sealing the lid, and setting it on fire?

      1. 13

        This is not a problem with Docker the software. Docker can install system updates and run applications as a non-privileged user. The article demonstrates how, and it’s not some secret technique, it’s just the normal, documented way.

        This is a problem with whoever wrote this document just… making nonsensical statements, and Docker the organization leaving the bad documentation up for years.

        So again, Docker the software has many problems, but inability to install security updates is not one of them.

        1. 1

          Has that method always worked? Or is it a recent addition for unprivileged containers? I’m just curious to understand how this ended up being the Docker project’s official recommendation for so many years that it ended up in linters and OWASP lists and whatnot. I mean none of these cite some random Internet dude saying maybe don’t do that, they all cite the program’s documentation…

          1. 5

            When I (and the documentation in question) say “unprivileged” in this context it means “process uid is not root”.

            There’s also “unprivileged containers” in the sense that Docker isn’t running as root, which is indeed a newer thing, but that is completely orthogonal to this issue.

            1. 1

              Now it sounds even weirder, because the documentation literally says “unprivileged container”, but I think I got your point. Thanks!

        2. 5

          Well, the article did point out that you can upgrade from within Docker. The problem is that the OS running inside Docker can’t assume it has access to certain things. I only skimmed the article, but I think it mentioned an example where updating a Linux distro might cause it to try to (re)start something like systemd or some other system service that probably doesn’t work inside a Docker container.

          However, that really doesn’t address your main point/question. Why was this ever advice? Even back in the day, when some OSes would misbehave inside Docker, the advice should have been “Don’t use that OS inside Docker”, not “Don’t install updates”.

          I think the most charitable explanation is that developers today are expected to do everything and know about everything. I love my current role at my company, but I wear a lot of hats. I work on our mobile app, several backend services in several languages/frameworks, our web site (an ecommerce-style site, PHP + JS), and even a hardware-interfacing tool that I wrote from scratch because the hardware only came with a Windows .exe to communicate with it. I have also had to craft several Dockerfiles and become familiar with actually using/deploying Docker containers, and our CI tool/service.

          It’s just a lot. While I always do my best to make sure everything I do is secure and robust, etc, it does mean that sometimes I end up just leaning on “best practices” because I don’t have the mental bandwidth to be an expert on everything.

          1. 2

            it mentioned an example where updating a Linux distro might cause it to try to (re)start something like systemd or some other system service that probably doesn’t work inside a Docker container.

            That hasn’t been true for years, for most packages. That quote was from an obsolete article from 2014, and was only quoted in order to point out that it’s wrong.

            1. 2

              I didn’t mean to imply that it was! If you read my next paragraph, it might be a little more clear that this isn’t an issue today. But I still wonder aloud why the resulting advice was ever good advice, even when this particular issue was common-ish.

              1. 1

                AFAICT the current version of the best-practices page in the Docker docs was written in 2018 (per the Wayback Machine), by which point that wouldn’t have been an issue. But maybe that’s left over from an older page at a different URL.

          2. 5

            I am not a Docker expert (or even user), but as I understand the OCI model, you shouldn’t upgrade things from the base image because it’s a violation of separation of concerns between layers (in the sense of overlay filesystem layers). If there are security concerns in the base packages, then you should update to a newer version of the image that provides those packages, not add more deltas in the layer that sits on top of it.
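
            If I’ve understood that model right, the difference roughly looks like this (tags are just illustrative, and the dated tag is hypothetical):

              # Option A (what the article recommends): add an upgrade layer on top of the base
              FROM ubuntu:20.04
              RUN apt-get update && apt-get -y upgrade

              # Option B (the layering view described above): no upgrade layer; instead bump
              # the base reference to a rebuilt image that already contains the fixes,
              # and rebuild everything that sits above it
              FROM ubuntu:focal-20210609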

            1. 2

              That makes a lot more sense – I thought it might be something like this, by analogy with e.g. OpenEmbedded/Yocto layers. Thanks!

              1. 1

                This doesn’t hold water and is addressed in the article.

                The way Docker containers work is that they’re built out of multiple, composable layers. Each layer is independent, and the standard separation of concerns is layer-based.

                So after pulling a base image, the next layer that makes sense is to install security updates for the base image. Any subsequent change to the base image will cause the security updates to be re-installed.

                Base images are often updated infrequently, so relying on their security updates just allows security flaws to persist in your application.

                1. 1

                  To me, an outsider who uses Docker for development once in a while but nothing else, a separate layer for security updates doesn’t make much sense. Why would that be treated as a separate concern? It’s not something that is conceptually or operationally independent of the previous layer, something that you could in principle run on top of any base image if you configure it right – it’s a set of changes to packages in the parent layer. Why not have “the right” packages in the parent layer in the first place, then? The fact that base images aren’t updated as often as they ought to be doesn’t make security updates any more independent of the base images that they ought to be applied to. If that’s done strictly as a “real-world optimisation”, i.e. to avoid rebuilding more images than necessary or to deal with slow-moving third parties, that’s fine, but I don’t think we should retrofit a “serious” reason for it.

            2. 3

              Charitable as in not leaning towards “web developers gonna web develop”

              I kind of want to push back on this, because while it’s easy to find examples of “bad” developers in any field of programming, I think it’s actually interesting to point out that many other fields of programming solve this problem by… not solving it. Even for products which are internet-connected by design and thus potentially exploitable remotely if/when the right vulnerability shows up. So while web folks may not be up to your standards, I’d argue that by even being expected to try to solve this problem in the first place, we’re probably ahead of a lot of other groups.

              1. 1

                Yeah, that’s exactly why I was looking for the right explanation :-). There’s a lot of smugness going around that ascribes any bad practice in a given field to “yeah, that’s just how people in that field are”, when the actual explanation is simply a problem that’s not obvious to people outside that field. Best practices guides are particularly susceptible to this because they’re often taken for granted. I apologise if I gave the wrong impression here, web folks are very much up to my standards.

            3. 10

              Most of Docker’s features and best practices are a joke. We operate a large fleet of servers running Docker containers and I can say that Docker is a giant waste of energy at best and a security nightmare at worst. Usability is another question. And then, as always, more abstraction to the rescue! Hello Nomad or Kubernetes. I hope that some alternative is going to take over asap. I am really hopeful for Firecracker or something similar. Dockerfiles are total garbage; it’s impossible to combine different layers in any meaningful way. Anybody working on a truly composable way of creating containers (add Java 8, Python 3.9, and my app) would get my dollars.

              1. 8

                You can use both Nix and Bazel to create Docker images. There are also buildpacks for Docker, but I don’t think they’d solve the combo of Java and Python.

                1. 1

                  Yeah, it does not. Anyways, I think we still need a much better containerization approach that can do these things easily.

                  1. 4
                    1. BSD jails + Ansible?
                    2. LXD containers + https://github.com/rollcat/judo
                    3. https://www.joyent.com/smartos

                    I think it depends on your storage needs, too. Options 1 and 3 provide ZFS out of the box and some level of Linux binary compatibility. Option 2 can provide it, but your preferred distribution chooses how much work that will be. If you do not have exotic backup-friendly storage needs, then Option 2 is the path of least resistance of the three.

                    There are options out there for containerization. You have to get your team to buy into them, though. The convenience of a Dockerfile is so compelling that any other solution will have to offer a similarly low barrier to entry.

                    1. 1

                      If I could choose. :) LXD is probably pretty close.

                    2. 2

                      Like systemd with systemd-nspawn and the like? You do not even need to create an image to make it work, as you can restrict the application’s view of the system to allow access only to a given set of paths and reduce the allowed syscalls to a secure set.
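
                      Something along these lines, I think (the binary path and writable directory are made up; the same properties also work in a unit file):

                        # Run an app confined by systemd's sandboxing knobs, no image needed
                        systemd-run \
                          -p DynamicUser=yes \
                          -p ProtectSystem=strict \
                          -p ProtectHome=yes \
                          -p ReadWritePaths=/var/lib/myapp \
                          -p SystemCallFilter=@system-service \
                          /usr/local/bin/myapp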

                      1. 1

                        Like systemd with systemd-nspawn and the like?

                        Something similar. Too bad we cannot use systemd-nspawn.

                  2. 1

                    NixOS containers seem like they would solve your problem.

                    1. 1

                      Maybe, I need to investigate that. I cannot make the company switch from Nomad to NixOS even if Nix has container orchestration.

                      1. 1

                        Nix doesn’t have container orchestration at all, and you can use Nomad with NixOS without problems.

                  3. 4

                    A lot here is good I think.

                    However, one thing I didn’t see discussed: Sometimes, distributions break things. For example, there were a few hours where Canonical had pushed a new version of Python, then pulled that package without updating the package index. This broke my Docker workflow, which uses apt-get in the Dockerfile. Other times, though they try not to, Canonical will push a new version of a package which does have visible changes and might break existing programs. You introduce a not insignificant level of fragility by relying on Canonical every time you spin up a new container instead of every time you actively decide to upgrade your image. Whether that’s tolerable or not depends entirely on context - it might be OK for CI, it might not be OK when using Docker to automatically scale a fleet of production servers.

                    1. 19

                      You don’t do the security updates “every time you spin up a new container”. You do it during image creation time, with a RUN apt-get update && apt-get -y upgrade. RUN commands are not part of container spinup, they’re image creation commands.
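
                      For instance, a minimal sketch (the base image and tag are just illustrative):

                        FROM debian:bullseye-slim
                        # Runs once, when the image is built, not when containers are started from it
                        RUN apt-get update && apt-get -y upgrade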

                      Since it happens during image creation, it’s in CI/CD, so you can do smoke tests, integration tests, etc., to make sure the image is OK.

                      And then you need to have a process to make sure your images are up-to-date, so typically people will rebuild the image weekly, or nightly, or whenever a security update comes out if you can automate that.

                      1. 4

                        Correct. Use apt-get update without CI/CD at your own risk. Tests help a whole lot.

                        Of course, you use apt-get update on your own machine at your own risk, too.

                      2. 9

                        I feel like this is not a great reason to avoid applying security updates, and that getting pwned due to the lack of a critical update is probably far more common than “upstream pushed a bad apt update”.

                        And I hope you (the hypothetical “you”) aren’t just deploying a container without testing it; hopefully there is some CI that verifies the container actually works and passes a test suite before it’s allowed to get anywhere near production, and ideally pushes the built container to some repository you actually deploy from rather than building containers from scratch every time you need one. All of which will catch the bad apt update without affecting any production system and let you temporarily turn off the update or just hold deploys until upstream fixes.

                        At any rate, the solution is never going to be “don’t apply updates” – it’s going to be “build your infrastructure to stay up-to-date and alert you/be resilient for the cases where an update is a problem”.

                        1. 4

                          There’s an easy fix for that, though: Don’t use Ubuntu for anything important.

                          1. 5

                            I was going to write the same thing but less salty. All distros make mistakes from time to time, choose one which makes the least impactful ones for your personal needs.

                            1. 3

                              Which distros make fewer/less impactful packaging mistakes than Ubuntu? Certainly not Debian after they patched out OpenSSL’s RNG seeder. Maybe Alpine is close to infallible?

                              1. 5

                                Nobody’s infallible. Debian has had a heavy process in place for more than a decade to help them get close to it.

                                1. 1

                                  I’m sure Ubuntu has a heavy process in place too. I’m going to need to see data if you’re claiming that Debian is so much safer than Ubuntu that automatically running updates when you spin up a Debian image is safe, while doing the same with an Ubuntu image is dangerous.

                                2. 2

                                  I see no reason not to use Red Hat’s UBI

                          2. 3

                            this leaves out that the update/upgrade results in image size bloat as well, and things like AWS ECR can charge by the byte.

                            … but I’m not surprised; this same author once wrote against using Alpine because they “once couldn’t do DNS lookups in Alpine images running on minikube (Kubernetes in a VM) when using the WeWork coworking space’s WiFi.”

                            1. 14

                              this leaves out that the update/upgrade results in image size bloat as well

                              The author has published a lot of stuff on Docker, including tips for how to avoid that. And given that a lot of real-world deployments will install at least some system packages beyond the base distro, this isn’t exactly a knock-down argument – they’ll have to grapple with how to do that sooner or later.

                              this same author once wrote against using Alpine because

                              This is a disingenuous cherry-pick. The article in question mentions a couple weird bugs and incompatibilities as a kind of “cherry on top” at the end, but they’re not the main argument against using Alpine images. In fact the argument is specifically against using Alpine for containers that deploy Python applications, and is based on pointing out the big disadvantage: on Alpine you don’t have access to pre-compiled “wheel” packages from the Python Package Index, which both slows down your builds (you have to compile from source any C/other-language extensions used in a Python package) and potentially increases image size (since you need to have a compiler toolchain and any necessary libraries to link against, and it’s harder to do the “all in a single RUN” trick without breaking out a large separate script that the Dockerfile invokes to do all the compilation and installation of compiler dependencies).
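
                              A rough sketch of that trade-off (the package name is a stand-in for anything with C extensions):

                                # Debian-based image: pip can usually grab a pre-built manylinux wheel
                                FROM python:3.9-slim
                                RUN pip install --no-cache-dir some-package-with-c-extensions

                                # Alpine (musl): no compatible wheel, so a compiler toolchain has to be
                                # installed first and the extension gets built from source
                                FROM python:3.9-alpine
                                RUN apk add --no-cache build-base \
                                 && pip install --no-cache-dir some-package-with-c-extensions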

                              1. 12

                                The good news is that there’s a PEP now for musl wheels (https://www.python.org/dev/peps/pep-0656/) so if it’s accepted, it’s possible that the situation on Alpine will improve.

                                1. 7

                                  Multi-stage builds solve the “need to install the compilers and they bloat the image” problem, right? The majority of my Dockerfiles are structured like

                                  FROM base-image AS build
                                  RUN install_some_packages
                                  RUN build_some_code
                                  
                                  FROM base-image
                                  COPY --from=build /some/build/artifacts /final/destination
                                  

                                  One convenient thing about doing it this way is that you don’t need to worry about minimizing layers in the build stage since none of them will end up in the final result anyway. You can do one command per RUN with none of the && shenanigans.

                                  1. 1

                                    Now try the same exercise with Java 8 and Python 3.9 installed from packages. We have 3 images: one needs only Java, one needs only Python, and one needs both. I think it is nice to have a build image, and you can achieve this with most of the CI/CD solutions out there. What is not possible is to combine two images easily. It is only possible in a crude way, by copy-pasting the installation steps between Dockerfiles. So we are back to copy-paste computing.

                                    1. 4

                                      What is not possible is to combine two images easily.

                                      The real issue is that Dockerfiles are not declarative and do not have a conception of packages and their dependencies. As a result you have to rely on fragile sequences of steps, ‘manually’ copy stuff to the final image, and hope that you didn’t miss anything.

                                      Building a complex Docker image is fairly trivial with most declarative package managers, such as Nix, Guix, or Bazel. E.g. with Nix you do not specify the steps, but what a container image should contain. Everything gets built outside containers in the Nix sandbox, then the image is built from the transitive (runtime) closure. You get images that contain all the necessary packages and only the necessary packages. Moreover, the images are far more reproducible, since all the dependencies are explicitly specified [1], down to the sources through fixed-output derivations.

                                      [1] Unless you rely on channels or other impurities, but these can be avoided by pinning dependencies and avoiding certain functions and/or using flakes (beware, beta).

                                      1. 3

                                        There is a slightly more ergonomic way to do this: the builder pattern. You use a set of base Dockerfiles. One will have your compilers and such in it, and then there is another Dockerfile for each image you intend to build. In the subsequent, or dependent, Dockerfiles, you just copy the files out of the first image you built. Wire the whole thing up using a shell script of about 5–10 lines of code.

                                        Yes, this breaks the ability to just docker build -t myimage . but it frees you from having to copy and paste.

                                        In my world, I use this pattern to build a php-fpm backend API server and then put nginx and the static assets in a separate container. It takes three Dockerfiles and a small shell script (or Makefile).
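
                                        Roughly like this, if it helps (file and image names are made up):

                                          # Dockerfile.build: the image that has the compilers and such
                                          #   FROM debian:bullseye
                                          #   RUN apt-get update && apt-get install -y --no-install-recommends build-essential
                                          #   COPY . /src
                                          #   RUN make -C /src
                                          #
                                          # Dockerfile.app: copies only the built files out of that image
                                          #   FROM debian:bullseye-slim
                                          #   COPY --from=myapp-build /src/out/ /app/
                                          #
                                          # build.sh: the small wiring script
                                          docker build -f Dockerfile.build -t myapp-build .
                                          docker build -f Dockerfile.app -t myapp .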

                                  2. 1

                                    It does say “Dockerfiles in this article are not examples of best practices”; perhaps the clean-up from an upgrade is one of the things omitted for clarity. It’s a fairly straightforward chunk of distro-specific boilerplate, an easy problem to solve.
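
                                    The usual Debian/Ubuntu flavour of that boilerplate looks something like the following, kept in a single RUN so the intermediate files never land in a layer:

                                      RUN apt-get update && apt-get -y upgrade \
                                       && apt-get clean \
                                       && rm -rf /var/lib/apt/lists/*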

                                    1. 7

                                      There is actually image size bloat, unfortunately. The bloat isn’t the package index/downloads (in fact, these days the official Debian and Ubuntu images clean those up automatically); it’s the fact that you end up with two copies of the installed packages. So e.g. if you upgraded libzstd1, now your image is storing two copies.

                                      So if the base Ubuntu image (which is created from a tarball) were always up to date with security updates, that would in fact result in slightly smaller images. It isn’t, though, and it’s better to have a slightly larger image than to not have security updates.

                                  3. 1

                                    Interesting, I’d never seen this mentioned as a best practice and I do agree that it’s not. I’ve always been updating my (important) containers because the images are lagging behind so often and you’re probably installing packages anyway.

                                    1. 1

                                      A reminder that you don’t need to start or end with a working OS install in your Docker image. Start from scratch, or remove stuff from your distro and copy to a fresh stage.
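
                                      For example, for a statically linked binary, something along these lines works (the binary name is illustrative):

                                        # No distro at all: the image contains nothing but the binary
                                        FROM scratch
                                        COPY myapp /myapp
                                        ENTRYPOINT ["/myapp"]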