1. 24

Linux containers are enjoying quite some hype these days, especially due to Docker. In my engagements, be it at user groups, conferences, or trainings, I regularly see around 10% of the folks having containers in production.

I’d be interested in why you are using containers. Anything goes, really; it doesn’t have to be Docker, even Solaris zones count ;)

And if you are not using containers, why not (yet)?


  2. 24

    By “are you using containers”, I assume you mean “are you using containers for developing/shipping an application in an isolated environment”, to which the answer is: I don’t. @pushcx did a good job explaining many reasons; I’ll try to put it in my own words:

    I find containers to be semantically broken for that purpose. What developers want is better and easier management of state/configuration, and I don’t see how putting all that unmanaged state into a deeper room is the solution. How about we fix the problem where it exists instead of sweeping it under the rug and calling it a day?

    Here is a crazy idea, which is probably never going to see life:

    Let’s standardize software configuration! One of my favorite examples of existing “configuration management” is the Linux utility visudo, which is used to edit the /etc/sudoers configuration file. While you can just edit the file directly, using visudo ensures that only one user is editing the file at a time and runs syntax checks on the file before it takes effect. Let’s take this a bit further…

    I’d like to see a configuration system (and not dbus or Windows’ registry) which lets me manage configuration files in plain text and statically verifies that they will work: syntax checks, type checks, environment checks (e.g. checking whether the same port is going to be used by multiple applications), and so on. When it catches errors, it should return useful error messages and probably hints on what to fix. We already have amazing static analysis for programming languages; why not for configuration systems, which aren’t even Turing-complete? :)
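    A minimal sketch of what such a checker could look like, in Python (the schema, keys, and file names are all made up for illustration): the stdlib configparser handles the syntax check, a tiny schema handles the type checks, and a shared port registry handles the environment check.

```python
# Hypothetical sketch of a statically verified configuration system.
# Syntax check via the stdlib INI parser, type checks via a tiny schema,
# and an environment check for two apps claiming the same port.
from configparser import ConfigParser, ParsingError

SCHEMA = {"port": int, "workers": int, "debug": bool, "host": str}  # made-up schema

def check_config(name, text, ports_in_use):
    """Return a list of error messages for one config file; empty means OK."""
    parser = ConfigParser()
    try:
        parser.read_string(text)  # syntax check
    except ParsingError as e:
        return [f"{name}: syntax error: {e}"]
    errors = []
    for section in parser.sections():
        for key, value in parser.items(section):
            expected = SCHEMA.get(key)
            if expected is None:
                errors.append(f"{name}: unknown key '{key}'")
            elif expected is int and not value.isdigit():
                errors.append(f"{name}: '{key}' must be an integer, got '{value}'")
            elif expected is bool and value not in ("true", "false"):
                errors.append(f"{name}: '{key}' must be a boolean, got '{value}'")
        # environment check: refuse a port another app already claimed
        if parser.has_option(section, "port") and parser.get(section, "port").isdigit():
            port = int(parser.get(section, "port"))
            owner = ports_in_use.setdefault(port, name)
            if owner != name:
                errors.append(f"{name}: port {port} already claimed by '{owner}'")
    return errors

ports = {}
print(check_config("web.conf", "[app]\nport = 8080\ndebug = true\n", ports))
print(check_config("api.conf", "[app]\nport = 8080\nworkers = many\n", ports))
```

    A real system would want per-application schemas and actual fix-it hints, but even this much catches all three error classes before anything takes effect.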

    1. 47

      I don’t use them because they don’t solve any problems. Rather than address the problem of stateful server deployment and management, they just zip up a stateful system (which probably has crazy side effects like apt-get update) to copy around. The configuration management problem is at best pushed up one level. If I needed stronger process isolation than users, groups, and chroot I’d go right to a hypervisor or separate servers rather than take a huge, complex dependency to get halfway there.

      Besides having no benefits, they have serious drawbacks. Rather than building directly on a distro, there’s another distro in there for a random dev or coworker to add surprises to. There’s gigs of disk space gone. There’s bandwidth hassles in shipping them around. There’s setting up some kind of block storage for them. There’s another moving part to deploy and monitor. There’s a ton of new tools and configs to learn and maintain.

      We have real problems configuring and managing systems. Containers so fundamentally misunderstand the problems that they are fighting fire with napalm.

      1. 4

        I don’t use them because they don’t solve any problems.

        They do solve the problem of limiting resource usage, forcing contained programs to be good citizens of multi-tenant boxes. This is something most people don’t need, and most people don’t use.

        1. 3

          In a significantly better way than just properly sizing your boxes? I don’t see an advantage compared to just provisioning better-sized hosts for applications to run on.

          1. 3

            Limiting resource usage isn’t something you need containers for. You can actually do that at the per-process or per-user level.
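            For example, a plain POSIX rlimit already caps a single process, no container needed. A minimal sketch using Python’s stdlib resource module (the cap and workloads are arbitrary; this assumes Linux, where RLIMIT_AS is enforced):

```python
# Cap a child process's address space with a plain POSIX rlimit:
# a runaway allocation gets a MemoryError instead of starving the box.
import os
import resource

LIMIT = 1024 * 1024 * 1024  # 1 GiB of address space, an arbitrary cap

def run_limited(fn):
    """Fork, apply the rlimit in the child, run fn, return the exit code."""
    pid = os.fork()
    if pid == 0:  # child
        resource.setrlimit(resource.RLIMIT_AS, (LIMIT, LIMIT))
        try:
            fn()
            os._exit(0)
        except MemoryError:
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)

print(run_limited(lambda: bytearray(64 * 1024 * 1024)))        # well under the cap
print(run_limited(lambda: bytearray(4 * 1024 * 1024 * 1024)))  # 4 GiB trips the cap
```

            On systems where RLIMIT_AS isn’t enforced (macOS, for instance) the second call may not fail; ulimit, cgroups, or per-user limits in limits.conf cover the same ground from the shell.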

          2. 3

            I used to think exactly that way, but I’ve come around to thinking that’s an unfair frame. Configuration management on Unix has gotten no better over the course of the last 30 years that I’ve been using it. So you’re really comparing ‘opaque encapsulated configuration management’, with all the negatives that implies, against the existing solution: a thousand different configuration tools and formats and systems, poorly customized to every conceivable working environment, whose downsides staggeringly dwarf the downsides that containerization brings. It’s not like apt-get update doesn’t happen on bare hosts.

            I know of about three good answers to the configuration and dependency management problem today. One of them is containers, one of them is Nix, and the last one is proprietary internal tooling at my current employer. Of those three, only containers are even remotely likely to be widespread in general availability within the next 5-10 years. Everything else is terrible.

            Most of the problems you go on to cite – random surprise distros, gigabyte containers, and fully stateful systems – are generally considered container antipatterns these days. Not that people don’t do them, but there’s a growing consensus around, e.g., Alpine-based containers that do one thing, don’t run an init, and keep state & storage as properly separated concerns. The Kubernetes culture in particular tends to lean in the right direction and is worth a look.

            There are additional benefits to containers at scale besides just making the dev process easier. Change management becomes super simple if you don’t have to worry about the internals of an entire deployed application. Bin-packing and resource allocation become radically better. Unifying a state story, while prickly in current implementation, is way better than the usual method of not having a plan.

            1. 2

              I really appreciate this response, it sounds like you have a lot of experience with containers and the emerging best practices.

              I think you’re right about containers being the approach to most likely succeed, but not in a positive way. I’m glad the community is starting to recognize and address antipatterns, but I think containers are going to succeed because they look most like hacking a snowflake server together. Users will have the glow of new toys without the hard, frustrating work of challenging deep assumptions.

              1. 3

                I get your perspective, but consider the alternatives.

                If you don’t have containers, then you end up with super hacked-up crap like systemd and rvm and running prod apps in a disconnected tmux window – our current world, in which people who have actual urgent real problems but who are unwilling or unable to understand Unix concoct their own tiny amateur mini-operating systems to try to meet their perceived need.

                Containers make it possible for you to have people of that type, but solve their problems of configuration and dependency and discovery and communication in a very severe and clean and uniform and directly comprehensible way. If they have other problems (and really those are 4 of the top 5 concerns in most environments) then they are still safely isolated away from everyone else, limiting the blast radius of any terribly-solved snowflake solutions.

                The last big unsolved problem – #1, state – is still unsolved, and so you still have snowflake nonsense going around and people making the mistake of running postgres in docker swarm and so on. Yet to be figured out, for sure.

                But the ‘hard, frustrating work of challenging deep assumptions’ is abstracted away from the developer in systems like Kubernetes, and concentrated in the people who are designing Kubernetes. That’s a win-win, because most of the people working on Kubernetes have more experience and better analysis skills than your median developer.

                1. 1

                  people who have actual urgent real problems but who are unwilling or unable to understand Unix concoct their own tiny amateur mini-operating systems to try to meet their perceived need

                  In fairness, Unix does nothing to solve any of these problems – in fact, it is, by design, the cause of most of them.

          3. 12

            I’m not strictly “not using containers” (I’m consulting, so sometimes I’m using what the client uses).

            I generally don’t recommend using them, though.

            First of all, I am under the impression that containers worsen the snowflake problem rather than lessen it. Every container maintainer does what they want for providing dependencies, and there’s little to no insight into which ones they actually use without strong auditing. Auditing is more costly than another server.

            Second, many of my clients are very competent at running virtual machine clusters. Implementing container infrastructures on top of them is a huge endeavor with low gains in my opinion. I saw multiple projects failing at getting those infrastructures up and running.

            Thirdly, the ecosystem is immature. For example, I’d expect a project like CoreOS to have some guidelines on how to ship logs from their machines. It’s an uncovered topic with only minimal work done currently (systemd’s log-shipping capabilities are a joke).

            Containers do fill an interesting space when the infrastructure is already in place (like at Amazon or wherever), but I don’t feel like the additional moving parts merit moving away from a well-run VM based infrastructure currently.

            1. 9

              I’ve never understood what containers are supposed to give me.

              The advice for how to use Docker from the people who sound most competent seems to be to avoid network namespaces and process namespaces and just use it to run your app as a, well, process. I can see how it might be useful as a packaging system for ecosystems whose native packaging sucks (cough Python), but I use the JVM, so I already build my apps as a single file (shaded jar) that can be run consistently on any system with the JVM installed (java -jar myapp.jar). I could even do things like restricting filesystem/network access if I needed to, but honestly I’ve never had a problem with multiple processes running on the same server coming into conflict (admittedly the JVM is unusual in already having memory limits built in via -Xmx et al; I can see how an OS-side quota would be useful for runtimes that don’t provide that natively). So why add the complexity/overhead?

              1. 8

                I work with Kubernetes all day so I’m a little biased towards containers. But for new projects, I’ve been using something similar to Tim Hockin’s go-build-template [0] for Dockerized Go builds. It’s great for managing dependencies and having reproducible builds, but comes in really handy when you’re trying to cross-compile binaries that need CGO and architecture-specific C headers for linux-amd64 and darwin-amd64 [1].

                I also use the nvidia-docker plugin[2] for GPU passthrough and to bring up a TensorFlow environment locally.

                [0] https://github.com/thockin/go-build-template

                [1] https://github.com/karalabe/xgo

                [2] https://github.com/NVIDIA/nvidia-docker

                1. 4

                  Ultimately, I just want to execute binaries across a heterogenous set of hosts and have them supervised. Docker is a very heavy solution for that. When possible, I like to just make my jars executable and send them to be executed in a cgroup by Mesos. We’re moving towards Kubernetes though, which perhaps doesn’t make arbitrary binaries as easy unless you bake it into a Docker/rkt image.

                  I think using Docker to capture a stateful system is a decent hack for taking systems with sprawl (e.g. Python/Ruby) and turning them into self-contained binaries. Java/Go/C/C++ make it a little easier with runnable executables that don’t necessarily need a baked image.

                  Tip: you can prepend a shellscript onto a jar to make it a real executable.
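                    That trick works because a jar is a zip archive, and zip archives are indexed from the end of the file, so leading bytes are ignored. A sketch of the gluing step (file names made up; assumes java is on the PATH when you run the result):

```python
# Prepend a shell stub to a jar to make it directly executable.
# The JVM still reads the jar fine because zip archives are parsed
# from the end of the file, so the leading stub bytes are ignored.
import os
import stat

STUB = b'#!/bin/sh\nexec java -jar "$0" "$@"\n'

def make_executable_jar(jar_path, out_path):
    with open(jar_path, "rb") as f:
        jar_bytes = f.read()
    with open(out_path, "wb") as f:
        f.write(STUB + jar_bytes)
    # add the executable bits on top of the existing file mode
    mode = os.stat(out_path).st_mode
    os.chmod(out_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

# Usage (assuming myapp.jar exists): make_executable_jar("myapp.jar", "myapp")
```

                    Afterwards ./myapp runs like any other binary, since the stub re-executes the file itself as a jar.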

                  1. 4

                    The clickbait version of my opinion.

                    The nuanced version of my opinion.

                    I’d like to specifically call out the thing I talk about right near the end of the post. If you already have good automation around building VM images and blue-green deploys, containers probably don’t give you anything worthwhile (caveats: ease of using the same setup for development, machine utilisation).

                    1. 3

                      I am (sometimes) using FreeBSD jails. I mostly use them as a way to structure things. Even though I don’t use subjails yet, they allow me to see things as modules more easily.

                      This means I don’t always use them the same way. For example I might have a WordPress jail containing MySQL/MariaDB, PHP, and nginx (or Apache). Often I want to see this as one unit. It depends a lot on how it’s intended to be used.

                      I also have systems where I have a database for multiple things, then I have a jail only running that.

                      It also allows me to use PF (packet filter) and other FreeBSD functionality to essentially set up a single server like a network.

                      Other than density (so I can rent a big server or two or three to have multiple networks) this allows one to really quickly add and get rid of such modules/units, …

                      I usually start out (on a single server set) by installing poudriere (a package builder), then add some basic services (DNS cache, logging, maybe mailing), then some shared application-related infrastructure like a DB, and then add applications.

                      It’s mostly okay to fail here, cause I can quickly switch and I can more easily get back to where I started. Then I use ZFS (or rsync, if I don’t have that available) to replicate that setup to other physical servers and deal with failover, etc.

                      Something that I really like about this is that interesting things can be done on the FS layer. For example I can set up poudriere running in a jail, and the resulting packages can be mounted over nullfs into all the jails. Similar things can be done with unix sockets. This makes it possible to do things without touching the network.

                      Nowadays (ever since the jail(8) utility came up) I more frequently use single-binary setups.

                      The next thing I do is having some instruction file (think Dockerfile) to more easily handle those. I otherwise would simply use ansible.

                      Another thing I use them for, also locally, is when I don’t trust a service enough or just want to test it. It’s quick to simply start it up. What I really like is being able to specify in a very fine-grained way what is allowed and what isn’t. For example, a normal jail isn’t allowed to use raw sockets on FreeBSD; one has to allow it explicitly (a simple option on the CLI or in the conf file), so no ping, for example.

                      I don’t think containers are the future, for a multitude of reasons. First off, they have been around for ages in Solaris and FreeBSD, so they’re not exactly the future. But also because they simply don’t fit every problem. They can make horizontal scaling hard. In most cases you might not need to deal with the kernel, but when you want performance you sometimes have to change something on the kernel level. If you are big enough, some minor tweak there can mean hundreds of physical servers.

                      There is overhead, and in many cases it’s actually reinventing multi-processing. When processes came up, one goal was to not just have one computer but virtual computers. That’s exactly what we now want again. We are also, to a degree, redoing the same things that we used to do with binaries and processes.

                      A famous use case for containers is solving the problem of libraries needed in multiple versions. Historically speaking, this is a very strange problem and solution. Binaries used to be static, so each binary could simply be run on its own, with its libraries compiled into it, and there was no such problem. Later, dynamically linked shared libraries arrived and this issue came up. Now we decide to keep that, but inside containers. For the most part we remove the benefit by wrapping a layer around it, instead of doing the opposite and removing a layer, making things simpler. The main issue is really how files and libraries are handled; no virtualization is required to solve it.

                      A similar thing is true for the security part. There are capability systems that are way better suited to tell a binary or a process tree (or a user) what is allowed and what isn’t. Containers add an additional way that isn’t nearly as advanced yet and comes with far more overhead. And then again, if you think about history: processes were individual machines, so wrapping a layer around them gives you a lightweight virtual machine (a process) in another lightweight virtual machine (a container), probably on a heavier virtual machine (VirtualBox, QEMU, KVM, Xen, bhyve, …) on a physical machine.

                      On the security aspect: people like to ignore the fact that there are always new issues allowing one to break out of containers and virtual machines (yes, even Xen), and sometimes they don’t really add a lot of benefit anyway. For example, say a process has full access to a DB and that’s the only valuable part there; in most such scenarios that connection means you don’t get a lot of security. I know, if you do it sanely and fine-grained you don’t have the issue, but for some odd reason it feels like containers are (still?) seen as mystery boxes with properties that they cannot possibly provide.

                      I’d argue that a lot of issues should be fixed by advancing POSIX. I think containers are a bit of a hack of our time, and that the general direction of things like CloudABI, pledge, and seccomp will win in the long run. Maybe, and likely, it will be a combination, but I think at some point the benefits of containers won’t outweigh the overhead in many scenarios. Not in all scenarios, but judging by the experience of people who have had containers for longer than the Linux world, it seems the hype will die down again and the question will become whether a container is really a good idea for a specific use case. Currently it often seems too much of a “because we can” decision.

                      Also I have a bit of hope for a rise of small systems, the way Scaleway does it, so we can get much stronger properties by having a physical cloud. That would maybe also make it easier to get rid of defective hardware in a better way. I also hope things like CHERI pick up, resulting in those things becoming as common as VT (even though I think it’s a bigger change).

                      I think the strongest argument for containers (and VMs) is ease of development. Even when the software (stack) won’t be run inside a container in the end.

                      1. 2

                        Haven’t tried using containers in personal projects, but my last job used Vagrant a lot, except the guy who set it all up wasn’t working there anymore and nobody had any idea how it worked, so anytime we had a problem I had a lot of searching to do. I don’t think it’s a very good solution to system configuration management.

                        I like the idea of standardised config management that awal brought up; isn’t NixOS quite close to that? Seems to me like it’s able to manage multiple versions of the same software and libraries at once, and it makes you use its own config language to configure stuff like Apache.

                        1. 2

                          I’ve been playing with Docker containers and have been sort of following the ecosystem since the very early days, but even today I can’t use Docker to solve a problem of deploying multiple related apps in separate security contexts. This is because networking/plumbing between containers is a joke, root on containers is too close to root on the host for comfort, and block storage is still a problem that has to be solved out of band. It’s ridiculous how half baked the whole thing is.

                          For true separation, VMs are the way to go. For easy deploys when you don’t want to ship VMs, a setup script for your project is easy and maintainable thanks to the rbenv/virtualenv/systemd/and-friends ecosystems.

                          Docker sucks and I wish it didn’t :(

                          1. 2

                            I can’t talk about whether or not I use containers, but I’d like to posit the possibility that just because they don’t “solve any problems” for you, doesn’t mean that they don’t solve problems for other people. Maybe not in the most elegant way, maybe Docker can be a real PITA, and maybe there’s also a lot of hype, but before my current job, I found Docker to be super helpful, because unlike other choices (such as Vagrant), which were not suitable for automated deployments onto systems I didn’t have complete control over, I could use Docker’s API to do what I needed.

                            1. 1

                              I can’t talk about whether or not I use containers, but I’d like to posit the possibility that just because they don’t “solve any problems” for you, doesn’t mean that they don’t solve problems for other people.

                              Maybe. But it’s also possible that they really are just marketing hype and there is no there there. I think if they really were useful people would be able to clearly explain the benefits.

                            2. 2

                              We deploy mostly to Heroku, so we technically are using containers at some level, but it is so far removed from us that I don’t feel saying “yes” would fit the spirit of the question.

                              I am happy with the arrangement as well. On one hand we have less control, so-called, over the environment. On the other hand, by restraining our options, it forces us never to kludge things and to always build with the ideas of 12factor in mind.

                              The only caveat is that we don’t have the problem that drives some to turn to Docker and friends for “a Heroku for local development”. We are lucky that all of our devs work in some Linux variant with an Ubuntu LTS base. Our designer likes Windows but is happy to run a VM. Lastly, I acknowledge that this state of affairs might not scale.

                              1. 1

                                Cloudformation, Boto and Puppet solve the problems Docker tries to address much better. Chef, Ansible, Cloud-init and Terraform are also helpful. I don’t see the use case for Docker at all.

                                1. 8

                                  One benefit, that I do in fact make use of at work, is that devs can trivially run docker containers on their local development machines. Configuration management tools like Puppet, Chef or Ansible are less straightforward and Cloudformation is right out. (One of my major complaints about Cloudformation is that it’s impossible to test without actually standing up infrastructure in AWS.)

                                  1. 2

                                    If there’s a real reason not to install the dev tools you need or if there’s no other way of containing them, Docker or somesuch might be ok.

                                    Docker security is a mess, though.

                                    Oh and some types of tests can be run in containers.

                                    1. 1

                                      As a casual contributor to some projects at work, that we deploy on AWS EB with Docker, I really appreciate this. We use docker compose to launch a complete development environment, with all dependencies. (I don’t actually rely on running the application itself in docker.) I appreciate that one could probably do something similar with vagrant.

                                    2. 1

                                      None of those tools actually solve the dependency problem, especially in the presence of global package managers like apt. For example, you may depend on an apt version of $yourprogramminglanguage or libc.

                                    3. 1

                                      Thankfully, I don’t spend a lot of time in the management of servers these days, but to my understanding, Docker et al fail both theoretically and in practice to make things better. In theory, solving the problem of decades of terrible Unix sludge with yet more gigabytes of terrible Unix sludge is simply exchanging one problem for n problems; and in practice, all I ever really hear about Docker is how great things will be after this next breaking change.

                                      1. 1

                                        Why am I not? Well they don’t solve a problem I have. They add more problems too.

                                        For example, you might be running on a newer kernel with newer syscalls available. You compile your app in your $olddockerimage and think you’re fine, but you’ve compiled your app to depend on the syscalls available where your image was built, not on the kernel it actually runs on.

                                        It’s glibc versioning all over again, only worse. Not a fan. Also, I work on kernel modules/filesystems, and containers are entirely pointless there.

                                        I do use containers to build things and save time on rebuilds. But it’s basically a glorified make at that point.