1. 17

After sharing my NGINX image with the world, and due to the recent acquisition of NGINX by F5 Networks and the FUD it generated, I decided to give HAProxy a try.

To do so I built this image. Built from source, tight filesystem, secure binary, etc.

The result is a ~11MB image that contains only those files required to run HAProxy, and nothing else. No bash, no UNIX toolset, no package manager…

  1.  

  2. 12

    Excuse my ignorance, but how do we know this is the world’s most secure image?

    1. 6

      Alright, alright… It’s a little baity.

      I said this (and said it with my NGINX image in the past) because every other Docker image I run into either is based on one of the OS images (Ubuntu, Debian, Alpine, etc.), and therefore carries one or more shells, the whole UNIX toolset, a package manager, etc., or ships a binary that lacks the most basic exploit mitigations like RELRO, NX, PIE and SSP, making it somewhat vulnerable to stack overflows should {HAProxy,NGINX} have an unknown vulnerability.
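For what it's worth, those mitigations can be checked from the outside with readelf. A minimal sketch follows; the toy program and compiler flags here are my own illustration, not taken from the image, so a real audit would point the readelf commands at the image's haproxy binary instead (its path varies).

```shell
# Compile a toy binary with all four mitigations so the checks are runnable
# end to end; substitute the real binary's path to audit an actual image.
cat > toy.c <<'EOF'
#include <stdio.h>
int main(void) {
    char buf[32];                      /* local array so SSP emits a canary */
    snprintf(buf, sizeof buf, "ok");
    puts(buf);
    return 0;
}
EOF
cc -fPIE -pie -Wl,-z,relro,-z,now -fstack-protector-strong toy.c -o toy

readelf -h  toy | grep 'Type:'            # PIE: ELF type is DYN, not EXEC
readelf -lW toy | grep GNU_RELRO          # RELRO: read-only relocation segment
readelf -dW toy | grep -E 'BIND_NOW|NOW'  # full RELRO also needs immediate binding
readelf -lW toy | grep GNU_STACK          # NX: stack flags should be RW, not RWE
readelf -sW toy | grep stack_chk          # SSP: canary-check symbol is referenced
```

If any of these greps come up empty, the corresponding mitigation is missing from the binary.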

      1. 2

        Thanks for the clarification! Appreciate the transparency.

        1. 1

          Removing basic tools does not improve security significantly. That approach has been obsolete for more than a decade, as tools like Metasploit inject feature-rich shells and toolsets as the attack payload, and there are many reasons for an attacker to prefer a custom environment over the OS shells and tools anyway.

          Also, legitimate intrusion detection, vulnerability management, access control and logging systems rely on having a “full” OS to run their own daemons, probes and so on.

      2. 7

        I’m surprised that musl is bigger than glibc. Isn’t size and simplicity like the whole point of musl?

        1. 4

          Yup, I was surprised at it too!

          I’ll try to get the musl image to build statically, as that’s what musl was designed for, and in containers there’s no point in dynamic linking. That might reduce the size.

          But yeah, weird.

          Taking a glance with dive I see the musl-based binary takes 14MB while the glibc image has a 3.5MB HAProxy binary. The /lib folder takes 7.9MB in glibc and 5.3MB in musl.

          Weird. I’ll look into it over the weekend. Thanks!

          1. 6
            $ ls -lh /lib/libc-2.29.so /lib/musl/lib/libc.so /lib/libc.a /lib/musl/lib/libc.a
            -rwxr-xr-x 1 root root 2.1M Apr 17 21:11 /lib/libc-2.29.so*
            -rw-r--r-- 1 root root 5.2M Apr 17 21:11 /lib/libc.a
            -rw-r--r-- 1 root root 2.5M Apr 16 13:49 /lib/musl/lib/libc.a
            -rwxr-xr-x 1 root root 595K Apr 16 13:49 /lib/musl/lib/libc.so*
            

            Sounds like an issue with compile flags or whatnot.

            1. 1

              The author doesn’t need the .a, do they? I thought the .a were the static libraries. I don’t think I’ve seen them since I did a Linux From Scratch build over 10 years ago.

              1. 2

                I’m not sure how cc -static works to be honest, I just included it to demonstrate that musl is smaller on my system both as a dynamic library and a static one.

                1. 1

                  cc -static should look into the .a file (which is an archive of .o files) and pick out only the parts actually needed to build the static binary and then you don’t need the .a file anymore.
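A tiny demonstration of that member selection, using a made-up two-member library (all names below are illustrative, not from the project):

```shell
# A .a file is just an ar archive of .o files; the linker extracts only the
# members that resolve an undefined symbol, and the archive isn't needed at
# runtime. Build a library with one used and one unused member:
cat > add.c    <<'EOF'
int add(int a, int b) { return a + b; }
EOF
cat > unused.c <<'EOF'
int unused(void) { return 42; }
EOF
cc -c add.c unused.c
ar rcs libdemo.a add.o unused.o
ar t libdemo.a                      # lists both member objects

cat > main.c <<'EOF'
int add(int, int);
int main(void) { return add(1, 2) == 3 ? 0 : 1; }
EOF
cc main.c -L. -ldemo -o main        # only add.o is pulled from the archive
./main                              # with -static, the same applies to libc.a
```

Inspecting the result with nm shows `add` linked in and `unused` absent, which is exactly the selective extraction described above.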

          2. 1

            I’ve been looking into it during my free time. I’ve managed to get rid of all shared objects other than libc, whose removal causes a segfault. Adding CFLAGS="-static" and LDFLAGS="-static" to the make step doesn’t help.

            It doesn’t reduce binary size either; right now the image is down to 18.6MB and the binary to 17.2MB (with the other objects statically linked, of course).

            See the changes in this branch.

          3. 9

            Wow another “world’s” something or other Docker build…

            1. 5

              So I’ll have to do my own tuning via Nix. But I reckon I can get this down to a cool 1MB at least. I’ll post back with a Nix config in the next day or so.

              1. 3

                You can probably get it much smaller by stripping the binary. The glibc image has 8.8k symbols IIRC, though I’ve yet to find a great reason to strip the binaries besides size.

                You could also replace ZLIB with LIBSLZ, which I believe is smaller and 100% compatible.

                You can also remove liblua and save some extra kBs.

                By smallest I meant size not in bytes but in files: the musl-based image is dynamically linked and has only 10 files. I’ll try to get it to build statically next weekend and remove the whole /lib directory, leaving a total of just 5 files to run full-blown HAProxy.

              2. 7

                What guarantees do we have that it will be kept up to date?

                It might be the most secure image today but without security patches it might not matter that much.

                1. 7

                  The same guarantees as with any other free project that it will be kept up to date.

                  It’s a simple Dockerfile: if a new library version comes out, update the file and rebuild the container.
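Concretely, that update is usually a one-line change. A sketch, where the `HAPROXY_VERSION` ARG, the file name and the version numbers are illustrative assumptions rather than the project's actual layout:

```shell
# The "update the file and rebuild" loop: bump the pinned version in the
# Dockerfile, then rebuild and tag the image.
printf 'ARG HAPROXY_VERSION=1.9.6\n' > Dockerfile.example
sed -i 's/HAPROXY_VERSION=1\.9\.6/HAPROXY_VERSION=1.9.7/' Dockerfile.example
grep HAPROXY_VERSION Dockerfile.example
# docker build -t haproxy:1.9.7 .   # then rebuild and push the new tag
```

In practice you would also bump the tarball checksum alongside the version, so a tampered upstream archive fails the build.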

                  1. 4

                    Yea, but we’re depending on the author here to do so. If the author has a cool CI that checks for dependency updates and attempts to auto-rebuild and get new hashes, that’d be a cool automation step that could be open sourced and applied to a bunch of other stuff.

                    This is a big problem with Docker containers in general. Unless you rebuild them regularly or have tools to scan your containers for out of date libraries/dependencies, security problems can creep up. Sure if they break your app, they’re stuck in the container, but what if you’re running a Kernel with cgroup vulnerabilities? It’s unlikely, and the layers of security do help make breaching more difficult, but it also makes updating all the micro-components more challenging as well.

                    1. 2

                      It’s almost fully automated:

                      1. I have a system set up that checks the software I rely upon (HAProxy, PCRE, NGINX, etc.) for available upgrades, and sends me a report every day at 5AM.

                      2. I read my email first thing in the morning, at around 5:15AM.

                      3. It takes me less than a minute to tick up a version number, change a tarball checksum, commit, tag and push to GitHub.

                      4. I have automatic build rules set up on the Docker Hub to build both variants of the image as soon as a new tag is emitted on the GitHub repo.

                      5. The Docker Hub takes between 0 and 2h to start building my image, and takes about 20m to build it.

                      I tried to put it in code, I think it makes it easier to understand:

                      t_patch  = t_notice + t_fix
                      
                      t_notice = t_alert + t_delay
                      0 <= t_alert <= 1d
                      t_delay = 15m
                      >>> 15m <= t_notice <= 1d15m
                      
                      t_fix = t_push + t_queue + t_build
                      t_push <= 1m
                      0 <= t_queue <= 2h
                      t_build = 20m
                      >>> 21m <= t_fix <= 2h21m
                      
                      Therefore: 36m <= t_patch <= 1d2h36m
                      

                      It (theoretically) takes a minimum of 36m and a maximum of 1d2h36m, assuming there’s no other cause preventing me from pushing the changes and increasing t_delay.

                      If for whatever reason a build fails, I get notified via email, and the loop reiterates.
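The check in step 1 can be as small as a version comparison. A minimal sketch; the function, versions and wiring here are hypothetical, not the author's actual tooling:

```shell
# Compare the version pinned in the Dockerfile against the latest upstream
# one (which the real setup would fetch, e.g. via curl from the project
# site), and emit a report line only when they differ.
check_update() {
  pinned="$1"
  latest="$2"
  if [ "$pinned" != "$latest" ]; then
    echo "haproxy: $pinned -> $latest"   # this line would end up in the 5AM mail
  fi
}

check_update 1.9.6 1.9.7   # newer upstream: prints a report line
check_update 1.9.6 1.9.6   # up to date: prints nothing
```

Run daily from cron with the output piped to mail, this reproduces the alerting half of the loop; the commit, tag and push remain manual by design.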

                      One of the main reasons for having only the files required at runtime inside the image is that I don’t have to worry about vulnerabilities found in other software packages I happen to carry over from the base image. I only care about HAProxy, its dependencies (PCRE, zlib, OpenSSL, Lua and libreadline) and the toolchain (GCC, G++, Perl and Make).

                      The whole process is verified by GitHub (see the commits page and click on the green ticks next to the commit timestamps). All my commits are signed via GPG, too.

                      I don’t want to fully automate it because I like to read the changelogs and manually do some checks before applying the changes. I also can’t, as I have to manually enter the private key’s passphrase before committing.

                    2. 1

                      No, Debian has been keeping my systems patched and up to date since 1995.

                  2. 5

                    This is pretty well done. Great job!

                    1. 2

                      Thank you!

                    2. 3

                      Wondering just now if there is something like a curated list, or curated repository, of OCI container images somewhere out there?