1. 36

I’ve built this Docker image over months of using NGINX in Docker as the reverse proxy for most of my services, and this is the result.

NGINX is pulled, verified, and built from source during the image build; all the necessary libraries are pulled, verified, and included during the build, too.

Once compiled, the binary is put into an empty base image (“FROM scratch”) along with only its necessary runtime dependencies.

The result is a ~13MB image that contains only the files required to run NGINX, and nothing else. No bash, no UNIX toolset, no package manager…
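
The general shape of such a build is a multi-stage Dockerfile like the sketch below. This is an illustration of the pattern, not the repository’s actual Dockerfile; the base image and paths are placeholders:

```dockerfile
# Stage 1: build NGINX from verified sources in a full-featured image
FROM debian:stable-slim AS build
# ...download the nginx + library tarballs, verify their checksums,
# then ./configure && make && make install into /usr/local/nginx...

# Stage 2: start from an empty image and copy in only the runtime files
FROM scratch
COPY --from=build /usr/local/nginx /usr/local/nginx
# also COPY the handful of shared libraries the binary links against,
# unless nginx was linked statically
ENTRYPOINT ["/usr/local/nginx/sbin/nginx", "-g", "daemon off;"]
```

Everything installed in the build stage is discarded; only what is explicitly copied into the scratch stage ends up in the final image.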

  1.  

  2. 10

    Cool, this is how images should be made!

    At work we even go a little further. We strip unnecessary symbols from the binaries and apply extreme compression to files using upx. I believe we’re hitting the 3MB mark. We also define a non-root user and disable as many capabilities as possible to make things even more constrained and secure.
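
    A rough sketch of those extra steps in a build stage (the tool invocations are real, but the path and UID are illustrative):

    ```dockerfile
    # strip symbols, then compress the binary with upx (illustrative path)
    RUN strip --strip-all /build/nginx \
     && upx --best /build/nginx
    # run the process as a non-root user; capabilities can additionally be
    # dropped at run time: docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE
    USER 1000:1000
    ```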

    1. 15

      The symbols are unnecessary right up to the point that your program crashes and you’d like to know why.

      1. 7

        You can still keep the ELF symbols as separate files outside of the image, right? Similar to how dbg packages work with package managers.
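
        A sketch of how that split works with binutils, using a trivial stand-in binary in place of nginx (assumes gcc and objcopy are available):

        ```shell
        printf 'int main(void) { return 0; }\n' > app.c
        gcc -g -o app app.c                          # build with debug info
        objcopy --only-keep-debug app app.debug      # keep the symbols aside
        objcopy --strip-debug --strip-unneeded app   # ship this stripped binary
        objcopy --add-gnu-debuglink=app.debug app    # record where the symbols live
        ```

        The stripped app goes into the image while app.debug stays outside; gdb picks up the debuglink when both are present, just like distro -dbg packages.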

        1. 3

          True. However, for our use cases this tradeoff is fine. We consider nginx to be stable enough, and haven’t had any crashes yet. The container will automatically restart if it does, and if we do need to debug a repeating crash we switch to a version with symbols.

          1. 12

            Which may or may not have the same problem…

            1. 5

              But isn’t it more fun to just watch people discover this on their own?

        2. 3

          Thanks for the kind words!

          That’s maybe too far of a stretch, but it’s not a bad idea.

          Feel free to check out my lighttpd and dnsmasq images, too.

          I’m working on a haproxy image and a blog post about my process for building tiny (and, IMO, correct) images like this one.

          1. 1

            If you want small and secure, check out Lwan. It might fit one of your use-cases. It’s supposed to be useful from embedded to servers.

            1. 1

              I’ve heard wonders about Lwan, but I haven’t had the time to try it out.

          2. 2

            I tried to build OP’s container and it gave errors copying rootfs (I think it has a build/CI process that isn’t in the repo).

            So I hacked it up to always use musl, strip the binary, and upx it. I verified that it builds with -fPIC to produce Position-Independent Code. The final container size is 3.2MB and it builds easily.

            https://github.com/sean-public/nginx-tiny

            1. 1

              OP here, you shouldn’t have any issues building the image.

              I think I know the issue you’re running into. Try the following:

              1. clone the repo and cd into it
              2. run docker build -t nginx:glibc -f glibc/Dockerfile .

              Replace glibc with musl for the musl-based image.

          3. 13

            The result is a ~13MB image that contains only the files required to run NGINX, and nothing else. No bash, no UNIX toolset, no package manager…

            So, an executable running in a process… </sarcasm>

            1. 8

              Just like Docker images should be built, IMO.

              A binary, its runtime dependencies, and your config and files mounted through volumes. Nothing else.

              I roll my eyes every time I see an image bloated with a full-blown Ubuntu or Debian inside.
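
              For illustration, that pattern looks something like this Compose sketch (the image name and paths are placeholders):

              ```yaml
              services:
                nginx:
                  image: my-tiny-nginx        # placeholder image name
                  volumes:
                    - ./nginx.conf:/etc/nginx/nginx.conf:ro   # config mounted read-only
                    - ./html:/var/www/html:ro                 # content mounted read-only
                  ports:
                    - "8080:8080"
              ```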

            2. 4

              For fun, I added the nginx image to the docker-nixpkgs project, it took me 5 minutes. It’s not production-ready or fine-tuned at all :)

              The image is 22MB according to Docker Hub: https://hub.docker.com/r/nixpkgs/nginx/tags . The image only includes busybox, nginx and their dependencies.

              One nice aspect is that when nixpkgs gets a new version, the image will automatically be updated.

              1. 4

                What qualifies something as the “most secure”? People getting started in cryptography development often claim “unbreakable crypto”; surely it’s only unbreakable by their novice skills, and a more experienced attacker will have no problem breaking it. The same thing applies to software security.

                The state of exploit mitigations on Linux is horrid. No W^X, no Cross-DSO CFI, no SafeStack.

                Have you ensured that nginx was compiled as a PIE?

                Are you applying Cross-DSO CFI and Cross-DSO SafeStack to nginx?

                Are you enforcing PaX NOEXEC?
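
                At least the PIE question is easy to check. A minimal sketch, using a throwaway binary in place of nginx; run the same readelf check against the nginx binary in the image:

                ```shell
                printf 'int main(void) { return 0; }\n' > app.c
                gcc -fPIE -pie -o app app.c      # build as a Position-Independent Executable
                readelf -h app | grep 'Type:'    # a PIE reports "DYN", a non-PIE reports "EXEC"
                ```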

                1. 1

                  I’m not claiming it’s unbreakable. All I’m saying is that this image builds NGINX from official sources at build time, and includes no files other than those needed at runtime.

                  No busybox, no shells, no package manager…

                  I see “minimal” NGINX images that ship with a full-blown Alpine Linux inside. With apk, busybox… Those are golden for someone who has broken out of NGINX.

                  1. 3

                    I think where I get hung up is the usage of the phrase “most secure” as I don’t believe that to be accurate. It seems that “stripped down” would be a better fit.

                    1. 2

                      I was wondering what you think about a method of security assessment I’ve been using recently: the payouts at Zerodium or other brokers for specific apps, as indicators of how often they find 0-days. Nginx is currently at $200,000. PHP, Outlook, and Exchange are at $250,000. That’s already disturbing given PHP’s history. IIS and Apache are at $500,000. Windows is at $1 million. This makes me think they see more 0-days in Nginx than in PHP, which is pretty easy to hack. So, extrapolating from those numbers, Apache in a hardened configuration would be safer than Nginx.

                      The most interesting thing on that chart is Dovecot’s position versus its security/coding claims. Some of the pricing might be tied to userbase, though, where little-used software brings in less money. Nginx, PHP, and Apache all have lots of users. That makes me think the comparison is more about how often they find 0-days in each.

                      1. 1

                        Perhaps it’s also indicative of market share.

                        1. 1

                          That’s what I meant by userbase. That’s also why I think Nginx vs PHP vs Apache might be based on difficulty to get 0-days given they each have wide deployment.

                      2. 1

                        Perhaps the packager has enabled only 1 TLS cipher suite, the one they view as “most secure” :)

                        1. 1

                          less attack surface is more secure

                          1. 1

                            Not really, even though I see why you say it. You could have no authentication, no buffer overflow checks, etc. on the world’s smallest web server, with no security as a result. The rule is that less code means less potential for vulnerabilities introduced accidentally, at whatever rate that developer or team introduces them. Then, less privileged or memory-handling code means fewer severe vulnerabilities. You still have to decide how secure something is on a case-by-case basis, with skilled evaluators looking at it. V&V activity is ultimately what decides what’s more or less secure.

                    2. 5

                      Yet nginx runs as root inside the Docker container…

                      1. 5

                        Check out Docker’s user namespaces (https://docs.docker.com/engine/security/userns-remap/)

                        They map the container’s UID 0 to a host’s UID != 0.

                        TL;DR: a container’s root ain’t the host’s root.
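
                        Per the Docker docs, remapping is enabled in the daemon config; /etc/docker/daemon.json with the following makes dockerd create and use a “dockremap” subordinate-UID range:

                        ```json
                        {
                          "userns-remap": "default"
                        }
                        ```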

                        1. 6

                          Security is best in layers. Yes, host root != container root, but changing this image to not run as container root is trivial compared to everything you have already done here.

                          Inside the container, it is root. Also, breaking out of Docker containers is not ridiculously easy, but it’s not known for being very difficult either. I’m sure it’s gotten harder over time, but Docker is still pretty new, and I’m sure there is still plenty of low-hanging fruit. The default permissions Docker gives containers are non-trivial. Adding another layer here, especially one that is completely trivial, is a no-brainer.

                          This is the security person’s perspective: always assume worst-case scenarios and try to minimize them, so that in the best or normal case the bad people have to work super, super hard to do bad things.

                          Never, ever make it easier for them, especially when making it harder, as in this case, is trivial.

                          Another way to put this, from the perspective of a bad person:

                          1. Break out of Nginx
                          2. Break out of Docker
                          3. Free host root / profit!

                          With the trivial change:

                          1. Break out of Nginx
                          2. Break into root
                          3. Break out of Docker
                          4. Free host root / profit!

                          An extra step that becomes non-trivial, since all they have in the container is the nginx binary, for basically zero work on your part, is again a no-brainer to implement.

                          1. 2

                            This would make sense in the traditional scenario, where the attacker would require root privileges to do other stuff, but here there’s nothing to do. The container itself has no special capabilities, its filesystem is as tight as it can be, the network namespace should be very limited, and the attacker has no tools other than the NGINX binary and the runtime deps…

                            I know it is a very easy change, and I might consider changing it to something other than root just because there’s nothing to lose, but it would not add any “security layer” whatsoever.

                            1. 5

                              Keyword: should. Trusting “should” is a terrible idea. You have no control over how people run your Docker container; I promise it will be as bad as you can imagine at least once, with the roots mapped identically for some stupid reason. I’ve seen Docker hosts map the host / into containers. It’s obviously a terrible idea, but that doesn’t stop people from doing stupid things.

                              As for nothing to do: one has ALL of the musl library there for the picking, so it’s not remotely out of the question to write out new binaries once you get past nginx and do pretty much anything you want. Is it more difficult than having a full shell or python lying around? Of course, but that’s the ENTIRE POINT of security: make it HARD for bad people to do things.

                              Another thing you could do: enforce and limit file permissions (read-only by root, only nginx executable, etc.)
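
                              For instance, a sketch of tightening the rootfs before it’s copied into the final image (paths are illustrative):

                              ```shell
                              mkdir -p rootfs/usr/sbin rootfs/etc/nginx
                              : > rootfs/usr/sbin/nginx               # stand-ins for the real files
                              : > rootfs/etc/nginx/nginx.conf
                              chmod 0555 rootfs/usr/sbin/nginx        # executable, but not writable
                              chmod 0444 rootfs/etc/nginx/nginx.conf  # config readable only
                              ```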

                              Anyway, at this point I think we have very different viewpoints. I don’t see you looking at things from the perspective I’m trying to show: that of a bad person.

                              1. 5

                                No! Trust me, I see your point.

                                I’ll consider changing to something other than root in the next few days.

                                Thanks for the feedback!

                                1. 1

                                  I agree, making it run as a low priv user inside docker would improve this already great package!

                                  1. 4

                                    After considering community feedback, including yours, I’ve decided to replace root (UID 0) as the process owner of nginx in the image with a non-root user.

                                    Check out the commit here: https://github.com/ricardbejarano/nginx/commit/77047757b3e01608c93ef17e8aaaf2a526654fa4

                                    It should be live as soon as Docker Hub finishes building the image.

                                    Thanks for your help!
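
                                    For anyone wanting the same in their own images: in a FROM scratch image the user has to be numeric, since there’s no /etc/passwd unless you copy one in. A sketch (the UID and paths are illustrative, not necessarily what the commit uses):

                                    ```dockerfile
                                    FROM scratch
                                    COPY --from=build /nginx /nginx
                                    # numeric UID:GID; scratch has no passwd database to resolve names
                                    USER 10000:10000
                                    # listen on a port >1024, since an unprivileged user can't bind 80/443
                                    ENTRYPOINT ["/nginx", "-g", "daemon off;"]
                                    ```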

                                  2. 1

                                    Thank you for seeing this! Your comment gives the paranoid among us a hope that more people can come to bring secure software to the world.

                                  3. 1

                                    The filesystem can be tight, but by exploiting a vulnerability in nginx, an attacker may be able to upload anything to the filesystem (as nginx is running as root). It’s not like nginx doesn’t already have code to download and store stuff.

                                2. 2

                                  But is it treated like root in the container? If yes, that’s still a huge problem.

                                  Either way, this is a terrible solution.

                                  1. 3

                                    It’s treated like root in that it has rwx on the whole filesystem, that is, the image’s contents and any volume you mount on it.

                                    But any special Linux capabilities like NET_ADMIN are not granted unless the Docker host says so.


                                3. 3

                                  Yes! Why would we want to pull an entire operating system just to run a single binary?

                                  1. 7

                                    What do you do if something breaks? You have zero tools to debug the breakage. Zero performance analysis tools, etc.

                                    1. 5

                                      People who like this solution don’t debug issues, they just destroy and re-deploy at the first sign of trouble.

                                      1. 2

                                        That’s not necessarily true.

                                        Docker images being immutable infrastructure lets you do this to quickly bring the service back up, but Docker keeps the logs and the container filesystem, so you can copy them over to your local machine for debugging.

                                        There’s no point in redeploying the service without debugging; if the root cause is unknown, it’ll fail again.

                                        1. 4

                                          Logs aren’t sufficient for the difficult issues. You need to be able to do real debugging and tracing.

                                          1. 2

                                            Is it not possible from outside the container?

                                            1. 2

                                              Depends on whether the host OS has the tools, or is one of the specialized slimmed-down distros intended for container hosting. It doesn’t fix the lack of symbols, though. There are a lot of variables to consider that are usually ignored until an incident happens.

                                      2. 2

                                        I would access the debugging facilities of the host and read the log files from there.

                                        But that means having access to the host. Are there situations where that’s not the case? Container-only hosting offers, maybe? Then let’s ask the hosting provider to debug the image! :P

                                        1. 1

                                          Docker stores both the logs and the contents of the container that broke, so you can copy it all over to your local machine and debug safely.

                                      3. 3

                                        Nice. FWIW, I did something similar as part of a more ambitious project about five years ago, but I didn’t do any ongoing maintenance of those images. My nginx image included BusyBox, though I don’t think it really needed to.

                                        1. 2

                                          That’s nice! I like seeing other people worried about bloated Docker images.

                                          As you can see, there’s a musl-based image, if you prefer that.

                                          Feel free to check out my other images built this way (lighttpd and dnsmasq). I’m working on a haproxy one, too.

                                        2. 3

                                          I’m not especially familiar with Docker - what’s the benefit of using it instead of a package manager here? It looks pretty similar to a source-based package where it’s a list of dependencies and a build script.

                                          1. 3

                                            Not much of a Docker fan here; I use it in some situations at the moment and provide one of my projects as Docker images for users’ convenience.

                                            I think the main point for using images/containers is when you have large-scale deployments and don’t want to care what runs where, and want additional features like automatic restart, automatic scaling, and blue-green deployment (much of that is not part of Docker itself, but Kubernetes can handle it).

                                            Another advantage is simple distribution of the container to any system. As long as your users have Docker, you don’t have to create packages for different distributions / Windows, and they can still run it with one command.

                                            However, I’d say most people who use containers don’t really need them. I have two servers with several projects running, and this goes quite well with Ansible deployments. Scaling out with Ansible is also no problem, as long as you don’t need dynamic allocation.

                                            But if you can actually profit from Docker, then the article’s method seems quite nice. Most Docker images are around 100MB or more (I think I have one that is almost 1GB, just because I did not optimize anything…), and a few days ago I deleted around 60GB of Docker image data from my hard disk.

                                            1. 2

                                              It’s all about trust.

                                              If you inspect the Dockerfiles, you’ll see each source tarball has a SHA-256 checksum associated with it that gets checked during build. This means source integrity is hard-coded: anyone can download the corresponding tarball and check that it matches the checksum in the Dockerfile.

                                              During the build stage, those tarballs are downloaded and the build only proceeds if integrity is verified.

                                              You don’t get that kind of protection when downloading from a repository, because the binary is already built and trust is placed in the repository maintainer. Now, this means trust is placed in the NGINX/OpenSSL/zlib/etc. maintainers to release trusted tarballs on their websites, but if we can’t trust the devs with that task, that’s a whole different problem.
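
                                              A minimal sketch of that verification step, with a toy file and its checksum standing in for the real nginx tarball and hash:

                                              ```shell
                                              printf 'hello\n' > nginx-1.0.tar.gz    # stand-in for the downloaded tarball
                                              EXPECTED="5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03"
                                              # sha256sum -c exits non-zero on mismatch, aborting the build
                                              echo "${EXPECTED}  nginx-1.0.tar.gz" | sha256sum -c - || exit 1
                                              ```

                                              Note the two spaces between the hash and the filename; sha256sum’s check mode requires that format.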

                                              1. 5

                                                I said source package. Gentoo, Nix, probably most package managers that can build from source include this by default, and it’s a single build command in the unpacking phase if not.

                                              2. 2

                                                I think the goal is to just let the container loose on infrastructure meant to handle ports, logs, hardware, and lifetimes, without ever coming close to treating the install like a pet.

                                                1. 1

                                                  Here’s an intro that covers a few uses.

                                                2. 1

                                                  Still need to do a little more testing, but using a combination of Nix for package generation (with some compiler flags disabled) and for generating the container image, plus upx for some extra packing, I’ve gotten a secure nginx image down to 570KB overall.

                                                  1. 1

                                                    If you want to remove bloat from your Docker image, I suggest getting rid of Docker first and converting the Dockerfile to a shell script. It would be shorter and easier to read, and you wouldn’t have to download a full operating system only to run ./configure; make. Then, once the ./rootfs is built, you start it with

                                                    unshare -fpiuUm --mount-proc env -i chroot ./rootfs ./nginx
                                                    

                                                    This runs with the same amount of isolation as Docker, without the bloat and overhead of the whole Docker ecosystem!

                                                      1. 4

                                                        Is avoiding bloat by default premature optimization? I think not. It’s just a good engineering tradeoff in many situations. A quick example: people might run a bloated stack on a $10-20 VM, while a lean, efficient stack might run on a $2.50 VM with 512MB of RAM. They save money. If the money itself doesn’t matter, they have extra that can go to other capabilities: their next project running in parallel, high availability for an older one, backups for their data, something other than just Netflix, and so on. All because one app was more efficient than another. This concept scales even bigger if it’s a big business making these decisions. :)

                                                        1. 3

                                                          And that’s just avoiding bloat.

                                                          Container startup time has been shown to correlate with image size, too.