1. 33

  2. 12

    I’m a big fan of having data in files, where you can look at their contents or replace or alter them in a pinch to solve a problem. Where the operating system can be at least somewhat aware of the structure of the data, so that we can observe and instrument an application with OS-level tools without needing to trust whatever debugging facilities are able to run inside the process itself. There are lots of occasions where I want to make use of facilities provided by another process alongside, e.g., cron or rsyslogd. Once you have at least two thoroughly different programs in the mix, the whole “fat binary” approach doesn’t really help anyway.

    I really don’t buy the “everything must be in the one binary!” approach at all, regardless of how trendy it is in Go and Rust at the moment. If it works for you, I suppose that’s great – but some of us will continue along the less windswept and interesting path of incrementally improving the systems that already work very well for us and countless others.

    1. 2

      I could build a FUSE-like filesystem into a “fat binary” and you could integrate with it just the same, so I’d like to understand your objection better. Is it convenience, taste, … or operational need?

      A long time ago I had to move a comprehensive system from OS/360 under MVS, and a variant running under Multics … to an early UNIX system. There were tons of dependencies on different OS features/capabilities not then present on UNIX. I eventually found that all of them were distractions, some quite costly, which I remedied with a “fat binary”, because that was the only thing possible at the time.

      The experience left me wary of arbitrary OS abstractions that in the end did not pass muster. I intentionally left shared libraries out of an OS I did, because the benefit was not enough for the complexity they added.

      1. 11

        Why would I want to reinvent the file system that I already have, which works just fine, inside a program?

        I understand that shared libraries are a minefield of nuance and are difficult to get right, which is why they often get left out of new projects (e.g., Go) in the early years. Even Go seems to be getting some sort of nascent shared library support, though: see plugins.

        On an established system where we’re already able to use them to great effect, I really see no reason to stop. As with the file system: we’ve built it already, and it works, and we’ll keep improving it. I’m really not at all worried about some hypothetical future in which we’re all suddenly forced to throw out the baby, the bath water, and the bath itself.

        1. 2

          Not to mention the security implications. If there’s a security problem in a library, you can update that library. For Rust/Go apps, you need to update the dependency and recompile and redistribute that application.

          There was a researcher at Ruxcon years ago who was studying embedded libraries in C/C++ projects. An excellent example is Firefox, which doesn’t link to a lot of system libraries, but has its own embedded JPEG and PNG decoders. FF is kinda insane because it’s pretty much its own little operating system in a lot of ways.

          1. 2

            It’s a tough balance to strike sometimes. If you’re trying to ship software that will run on lots of systems, but you need to depend on things which don’t really promise a stable interface, sometimes you have no choice but to bundle a private copy.

          2. 1

            You misunderstand. No reinvention is required; one can redirect the kernel to perform the filesystem within an embedded object that is part of an existing container, namely the executable. And this was a “for example” to address your “data in files” callout. Please don’t obsess on the leaves instead of the forest being discussed.

            The direction being argued here is why we have so much crap in the OS/kernel in the first place. Understand that many just make use of what happens to be there; that’s fine, we all need to get shit done.

            But shared objects create contention for multi-threaded, multi-core systems - they add complexity and reduce the benefits of parallelism and fault tolerance. So if one aspires to 100x cores/threads/parallelism … we don’t want to spend resources uselessly on abstractions that subtract for little/no gain.

            So back to what I asked you - what are the objections to “everything in the existing container” other than “that’s not what I do right now”? I don’t find all the overhead for additional containers justified by that.

            1. 10

              I don’t “misunderstand”, I just think you’re talking out of your hat.

              What do you mean shared objects create contention? Once the initial relocation is performed, subsequent calls to functions are generally just calls. It’s not like there’s a mutex in the path of every call to a shared library.
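
              To make that concrete, here’s a minimal sketch (libexample.so and do_work are made-up names; it assumes POSIX dlopen, though lazy PLT binding behaves the same way): the symbol gets resolved once, and every call after that is an ordinary indirect call, with no lock anywhere in the path.

              ```c
              /* Sketch: resolve a symbol from a shared object once, then call it
               * repeatedly. "libexample.so" and "do_work" are hypothetical names.
               * After dlsym() returns, each call in the loop is a plain indirect
               * call -- no mutex, no kernel transition. Lazy PLT binding behaves
               * the same way: only the first call goes through the resolver. */
              #include <dlfcn.h>
              #include <stdio.h>

              int main(void) {
                  void *handle = dlopen("libexample.so", RTLD_NOW);
                  if (handle == NULL) {
                      fprintf(stderr, "dlopen: %s\n", dlerror());
                      return 1;
                  }

                  int (*do_work)(int) = (int (*)(int))dlsym(handle, "do_work");
                  if (do_work == NULL) {
                      fprintf(stderr, "dlsym: %s\n", dlerror());
                      return 1;
                  }

                  /* One-time resolution above; from here on it is just calls. */
                  int sum = 0;
                  for (int i = 0; i < 1000000; i++)
                      sum += do_work(i);

                  printf("%d\n", sum);
                  dlclose(handle);
                  return 0;
              }
              ```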

              Also, the idea that you can just “redirect the kernel” to store all your files inside an ELF object seems like a stretch at best. What if you need to append to or modify one of those files? Why take on the complexity of something like FUSE to paper over the deficiency of requiring everything to be jammed into the executable, when you could have otherwise just called fopen()?

              It’s true that operating systems are larger and more complicated than they used to be, but that’s because they used to be bloody awful. They used to panic instead of failing a system call. They used to need to operate on machines that had a single processor and I/O bus, and a flat memory architecture. Machines now are hugely more complicated themselves, and modern operating systems reflect our best efforts to provide a substrate for a wide variety of applications.

              I’d rather continue to improve on all of the work done already, where we already have some amazing tools and a lot of accumulated experience, rather than pretend that reinventing everything is a good engineering decision.

              1. 2

                At work, I wrote a program in Lua. To simplify the installation of the program, I embedded all the Lua modules (both those written in C and those in Lua) into the executable. All I had to do then was extend the normal Lua method of loading modules to include looking inside the executable for them (not hard to do - Lua has hooks for doing just that). That way, we only have one file to install in production, instead of a dozen or so. I don’t need the ability to modify those files, so in this case, it works.
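
                The hook looks roughly like this (a sketch against the Lua 5.x C API; “mymod” and the embedded source string are stand-ins for the real modules baked into the executable): you drop a loader into package.preload, which require() consults before it ever searches the filesystem.

                ```c
                /* Sketch: register an embedded Lua module so require("mymod") loads it
                 * from data compiled into the executable instead of from disk.
                 * "mymod" and embedded_src are placeholders; assumes the Lua 5.x C API. */
                #include <lua.h>
                #include <lauxlib.h>

                static const char embedded_src[] = "return { answer = 42 }";

                static int load_embedded(lua_State *L) {
                    /* Compile and run the embedded chunk; it returns the module table. */
                    if (luaL_loadbuffer(L, embedded_src, sizeof embedded_src - 1, "mymod") != 0)
                        return lua_error(L);
                    lua_call(L, 0, 1);
                    return 1;
                }

                void register_embedded_modules(lua_State *L) {
                    /* package.preload["mymod"] = load_embedded; require() checks
                     * package.preload before any of the filesystem searchers run. */
                    lua_getglobal(L, "package");
                    lua_getfield(L, -1, "preload");
                    lua_pushcfunction(L, load_embedded);
                    lua_setfield(L, -2, "mymod");
                    lua_pop(L, 2);
                }
                ```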

                You got me about the shared libraries though.

                1. 3

                  Yup, do this with statically linked modules and an interpreter (not Lua but similar). Works great, for all the same reasons.

                2. 1

                  Do you understand the term “shared”? That means multiple processes/threads “share” it. As opposed to a “shared nothing” environment, which is entirely impermeable.

                  In a true shared library system, the same library file is mmap’ed content in all the MMUs of all using processes/threads. Sure, the data is copy-on-modify, but all the support for the shared abstraction involves hardware/software to maintain that abstraction, which isn’t free. And yes, you can have re-entrant situations with libraries, like I/O and event handling, where you do need the code to be able to anticipate these issues.
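
                  The mechanism is easy to observe, for what it’s worth - a Linux-specific sketch (libm is just a convenient stand-in): the library’s text shows up as a read-only mapping of the same file in every process that uses it, while writable data segments are mapped private (copy-on-write).

                  ```c
                  /* Sketch: load a shared library and print its mappings from
                   * /proc/self/maps (Linux-specific; libm is just an example).
                   * Text pages are read-only mappings of the same file in every
                   * process; writable data is mapped private (copy-on-write). */
                  #include <dlfcn.h>
                  #include <stdio.h>
                  #include <string.h>

                  int main(void) {
                      if (dlopen("libm.so.6", RTLD_NOW) == NULL) {
                          fprintf(stderr, "dlopen: %s\n", dlerror());
                          return 1;
                      }

                      FILE *maps = fopen("/proc/self/maps", "r");
                      if (maps == NULL) {
                          perror("fopen");
                          return 1;
                      }

                      char line[512];
                      while (fgets(line, sizeof line, maps) != NULL) {
                          if (strstr(line, "libm") != NULL)
                              fputs(line, stdout);
                      }
                      fclose(maps);
                      return 0;
                  }
                  ```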

                  Many of the early multicore systems had tons of esoteric bugs of this sort, which is why we had global locks to “safe” the programming environment.

                  The kernel has virtual filesystem interfaces that express the semantics of a filesystem implementation, where you can redirect the functionality elsewhere (other hosts via RPC, user processes via library reflection/exception). With it one can embed the content of a filesystem inside an executable container in various ways. And I didn’t say ELF either. (Note that one can shortcut content transfer in various ways so it’s even faster than going through the kernel.)

                  Your argument presumes that something is being “papered over” - it’s actually more efficient, because there is less code for the common case of the subsystem being within the address space for the references needed by the microservice it is implementing. On net, far simpler than what is being done today.

                  Operating systems have a much larger scope of what they have to contend with, that’s why they all tend to “bit rot”, because everyone wants to keep their bit of crap in it, human nature.

                  I didn’t ask what you liked, I asked you to defend the need for something. You act as if I’m stealing your dog or something. I doubt you have a clue about anything I’m talking about, and just want to keep obfuscating the discussion so as to hide your lack of understanding of why you use containers, because … you just use them.

                  Fine. Use them. But you don’t know why, and you don’t want to do anything but distract from this fact by focusing on a red herring that was served up as a means to start a discussion. Die on that hill if you must, but you still aren’t responsive to my inquiry.

                  1. 15

                    Hello person relatively new to lobste.rs, just a comment on how you communicate:

                    In this thread you seem to make vague and poorly specified claims, like putting a FUSE filesystem in your binary, or the claim that because shared objects are by name “shared”, things have to be more complicated and error prone. I believe that your vague comments do not help the discussion. While I do not think @jclulow is doing himself a service by responding to you, I too assumed you meant an ELF binary with some crazy FUSE setup.

                    It’s really hard for me to tell if you have a good idea or are just being contrarian, given that you aren’t being more specific in your proposal. We’re left making assumptions about what you mean (which is actually our fault), but rather than clarify how our assumptions are incorrect, you seem to use them as a way to be smug. In particular:

                    Do you understand the term “shared”? That means multiple processes/threads “share” it. As opposed to a “shared nothing” environment, which is entirely impermeable.

                    and

                    Please don’t obsess on the leaves instead of the forest being discussed.

                    I think this discussion would be much more productive if you could specify what you are proposing. I’m quite interested in finding out.

                    Feel free to completely disregard this comment if you think it’s full of shit.

                    1. 3

                      Even if there is less code running using this method, that does not make the code less potentially buggy. Any OS you’ll likely deploy on will have more tested and proven code than what you or your small team can produce.

                      Less dependence on third parties can be good, but only up to a certain point.

                      I believe, by the way, that Docker at least shares the filesystem layers between similar containers. Not sure how well that works, and if then binaries are also still shared in memory.

                      1. -1

                        But then we have dependencies between containers? Don’t we want containers to be idempotent?

                3. 0

                  Why would I want to reinvent the file system that I already have, which works just fine, inside a program?

                  Why don’t we put sqlite into the kernel?

                  Invention is not the issue. You would use well known libraries in your program instead of developing stuff yourself.

                  I picked sqlite as an example because it considers itself a competitor to the open syscall and is thus related to file systems. Similarly, compression and encryption are file-system related. Why can’t my kernel treat zip files as directories? Instead, some GUIs reinvented that, while bash/zsh cannot do it.

                  1. 2

                    Could you elaborate on what your response means? The proposed solution is to build a fat binary that includes a file system in it and to use FUSE to interact with this fat binary. What does that have to do with sqlite in the kernel? To me, the FUSE suggestion seems like a huge hack to get around using kernel primitives for no particularly good reason other than one can, as far as I can tell. I’m not even really sure what it would mean to start editing the binary given one’s usual expectations on binaries.

                    1. 0

                      It is a balancing act, what should be put where. Sometimes it makes sense to put the file system into the binary. Sometimes to put the file system into the kernel. Sometimes to put the file system into a daemon (microkernels). For example I have heard that some databases run directly on block devices because using the file system is slower.

                      Jclulow is “a big fan of having data in files” because the default tools of the operating system can then be used to inspect and change the data. To pursue that path means we should extend the capabilities of the OS, for example by putting sqlite into the kernel; then the default tools of the operating system could be used to inspect and change the data. I now think that zip file support is actually the better example. You could use ls, grep, find, and awk on zip file contents. It seems to be available via FUSE. It does not seem to be a popular option though. Why? Genuine question, but I guess there are good reasons.
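
                      The library-level half is certainly easy - a sketch assuming libzip is installed (link with -lzip); the missing piece is exposing it so ls, grep, find, and awk can see inside, which is where FUSE or the kernel would have to come in.

                      ```c
                      /* Sketch: list the entries of a zip archive with libzip
                       * (assumes libzip; link with -lzip). Programs get
                       * zip-as-directory access this way, but ls/grep/find only
                       * get it if FUSE or the kernel exposes the archive as a
                       * filesystem. */
                      #include <stdio.h>
                      #include <zip.h>

                      int main(int argc, char **argv) {
                          if (argc != 2) {
                              fprintf(stderr, "usage: %s archive.zip\n", argv[0]);
                              return 1;
                          }

                          int err = 0;
                          zip_t *za = zip_open(argv[1], ZIP_RDONLY, &err);
                          if (za == NULL) {
                              fprintf(stderr, "zip_open failed (error %d)\n", err);
                              return 1;
                          }

                          zip_int64_t n = zip_get_num_entries(za, 0);
                          for (zip_int64_t i = 0; i < n; i++)
                              printf("%s\n", zip_get_name(za, (zip_uint64_t)i, 0));

                          zip_close(za);
                          return 0;
                      }
                      ```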

                      I do not consider the Unix philosophy that great and it seems to be underlying this discussion (Just use simple files!). Unix likes to compose via files, pipes, and processes (leading to containers). Others prefer to compose via data types, functions, and libraries (leading to fat binaries). I do not see inherent advantages in either of the approaches.

                      1. 3

                        I don’t see the connection to the Unix philosophy here. In Windows it’s common for executables and their configurations to be separate as well. I’m trying to understand exactly what is being proposed but struggling. Are you saying that when I build nginx, the nginx executable has all of the resources it will use including config, and any code for dynamic content it will generate?

                        1. 1

                          Please describe what you mean by “UNIX philosophy”?

              2. 11

                Some people want easy access to the benefits of containerization such as: resource limits, network isolation, privsep, capabilities, etc. Docker is one system that makes that all relatively easy to configure, and utilize.

                1. 4

                  Docker is one system that makes me wish Solaris Zones took off, which had all of that, but without the VM.

                  1. 15

                    What VM? Docker only requires a VM if you run it “non-natively”, on OS X or Windows.

                    1. 1

                      Docker isn’t running in a VM on Linux machines. It uses LXC.

                      1. 10

                        Docker hasn’t used LXC on Linux in a while. It uses its own libcontainer which sets up the Linux namespaces and cgroups.
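
                        A minimal sketch of the underlying primitive (Linux-specific; assumes unprivileged user namespaces are enabled): the namespaces are kernel features, and libcontainer is orchestration on top of them - it adds mount/pid/net namespaces, cgroups, capabilities, and so on.

                        ```c
                        /* Sketch: put this process in new user + UTS namespaces and
                         * change the hostname there without affecting the host.
                         * Linux-specific; needs unprivileged user namespaces enabled. */
                        #define _GNU_SOURCE
                        #include <sched.h>
                        #include <stdio.h>
                        #include <string.h>
                        #include <unistd.h>

                        int main(void) {
                            if (unshare(CLONE_NEWUSER | CLONE_NEWUTS) == -1) {
                                perror("unshare");
                                return 1;
                            }

                            /* We hold full capabilities inside the new user namespace,
                             * so this succeeds -- but only the namespaced hostname
                             * changes, not the host's. */
                            const char *name = "sandbox";
                            if (sethostname(name, strlen(name)) == -1) {
                                perror("sethostname");
                                return 1;
                            }

                            char buf[64];
                            gethostname(buf, sizeof buf);
                            printf("hostname inside namespace: %s\n", buf);
                            return 0;
                        }
                        ```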

                    2. 1

                      This is the correct answer. It’s a silly question. Docker has nothing to do with fat binaries. It’s all about creating containers for security purposes. That’s it. It’s about security. You can’t have security with a bunch of fat binaries unless you use a custom jail, and jails are complicated to configure. You have to do it manually for each one. Containers just work.

                      1. 9

                        security

                        That is definitely not why I use it. I use it for managing many projects (go, python, php, rails, emberjs, etc) with many different dependencies. Docker makes managing all this in development very easy and organized.

                        I don’t use it thinking I’m getting any added security.

                        1. 3

                          I don’t use it thinking I’m getting any added security.

                          The question was “Why would anyone choose Docker over fat binaries?”

                          You could use fat binaries of the AppImage variety to get the same, and probably better organization.

                          Maybe if AppImages could be automatically restricted with firejail-type stuff they would be equivalent. I just haven’t seen many developers making their apps that way. Containers let you deal with apps that don’t create AppImages.

                          1. 1

                            Interesting. So in effect you wish to “scope” portions for “protected” or “limited” use in a “fat binary”. As opposed to the wide-open scope implicit in static linking?

                            So we have symbol resolution by simply satisfying an external, resolution by explicit dynamic binding (dynload call), or chains of these connected together? These are all the cases, right?

                            We’d get the static cases handled via the linker, and the dynamic cases through either the dynamic loading functions or possibly wrapping the mmap calls they use.

                          2. 1

                            That sounds genuine.

                            So I get that it’s one place, already working, to put all the parts. I buy that.

                            So in this case, it’s not so much Docker for its own sake as it is a means to an end. This answers my question well, thank you. Any arguments to the contrary with this? Please?

                            1. 5

                              This answers my question well, thank you. Any arguments to the contrary with this? Please?

                              While I think @adamrt is genuine, I’m interested in seeing how it pans out over the long run. My, limited, experience with Docker has been:

                              • It’s always changing and hard to keep up, and sometimes changing in backwards breaking ways.
                              • Most container builds I come across are not reproducible, depending on HEAD of a bunch of deps, which makes a lot of things more challenging.
                              • Nobody really knows what’s in these containers or how they were built, so they are big black boxes. One can open it up and poke around but it’s really hard to tell what was put in it and why.

                              I suspect the last point is going to lead to many “we have this thing that runs but don’t know how to make it again so just don’t touch it and let’s invest in not touching” situations. People that are thoughtful and make conscious decisions will love containers. People inheriting someone’s lack of thoughtfulness are going to be miserable. But time will tell.

                              1. 1

                                Well these aren’t arguments to the contrary but accurate issues with Docker that I can confirm as well. Thank you for detailing them.

                          3. 5

                            I think there’s something more to it than that. On Solaris and SmartOS, you can have security/isolation with either approach. Individual binaries have privileges, or you can use Zones (a container technology). Isolating a fat binary using ppriv is if anything less complicated to configure than Zones. Yet people still use Zones…

                            1. 4

                              I thought it was about better managing infrastructure. Docker itself runs on binary blobs of privileged or kernel code IIRC (don’t use it). When I pointed out its TCB, most people talking about it on HN told me they really used it for management and deployment benefits. There was also a slideshow a year or two ago showing security issues in lots of deployments.

                              What’s the current state in security versus VM’s on something like Xen or a separation kernel like LynxSecure or INTEGRITY-178B?

                              1. 5

                                Correct. It is unclear what the compartmentalization aspect of containers specifically contributes to security.

                                I’ve implemented TCSEC Orange Book Class B2/B3 systems with labelling, and worked with Class A hardware systems that had provable security at the memory-cycle level. Even these had intrusion evaluations that didn’t close, but at least the models drew a bright line around where the actual value of the security was delivered, as opposed to the loose, vague concept of security being offered here as a defense.

                                FWIW, the actual objective of the framers of that security model was a program-verifiable, object-oriented programming model to limit information leakage, in programming environments that would otherwise let programs “leak” trusted information to untrusted channels.

                                You can embed crypto objects inside an executable container, and that would deliver a better security model without additional containers, because then you deal with key distribution without the additional leakage from the extra inter-container references that would otherwise be necessary.

                                So again I’m looking for where’s the beef, instead of the existing marketing buzz that makes people feel good/secure because they use the stuff that’s cool at the moment. I’m all ears for a good argument for all these things, I really am, … but I’m not hearing it yet.

                                1. 1

                                  Thanks to Lobsters, I already met people that worked in capability companies such as that behind KeyKOS and E. Then, heard from one from SecureWare who had eye opening information. Now, someone that worked on the MLS systems I’ve been studying a long time. I wonder if it was SCOMP/STOP, GEMSOS, or LOCK since your memory cycle statement is ambiguous. I’m thinking STOP at least once since you said B3. Do send me an email to address in my profile as I rarely meet folks knowledgeable about high-assurance security period much less that worked on systems I’ve studied for a long time at a distance. I stay overloaded but I’ll try to squeeze some time in my schedule for those discussions esp on old versus current.

                                2. 2

                                  thought it was about better managing infrastructure.

                                  I mean, yes, it does that as well, and you’re right, a lot of people use it just for that purpose.

                                  However, you can also manage infrastructure quite well without containers by using something like Ansible to manage and deploy your services without overhead.

                                  So what’s the benefit of Docker over that approach? Well… I think it’s security through isolation, and not much else.

                                  Docker itself runs on binary blobs of privileged or kernel code IIRC (don’t use it).

                                  Yes, but that’s where capabilities kick in. In Docker you can run a process as root and still restrict its abilities.

                                  Edit: if you’re referring to the dockerd daemon which runs as root, well, yes, that is a concern, and some people, like Jessie Frazelle, hack together stuff to get “rootless container” setups.

                                  When I pointed out its TCB, most people talking about it on HN told me they really used it for management and deployment benefits. There was also a slideshow a year or two ago showing security issues in lots of deployments.

                                  Like any security tool, there’s ways of misusing it / doing it wrong, I’m sure.

                                3. 4

                                  According to Jessie Frazelle, Linux containers are not designed to be secure: https://blog.jessfraz.com/post/containers-zones-jails-vms/

                                  Secure container solutions existed long before Linux containers, such as Solaris Zones and FreeBSD Jails yet there wasn’t a container revolution.

                                  If you believe @bcantrill, he claims that the container revolution is driven by developers being faster, not necessarily more secure.

                                  1. 2

                                    According to Jessie Frazelle, Linux containers are not designed to be secure:

                                    Out of context it sounds to me like you’re saying “containers are not secure”, which is not what Jessie was saying.

                                    In context, to someone who read the entire post, it was more like, “Linux containers are not all-in-one solutions like FreeBSD jails, and because they consist of components that must be properly put together, it is possible that they can be put together incorrectly in an insecure manner.”

                                    Oh sure, I agree with that.

                                    Secure container solutions existed long before Linux containers, such as Solaris Zones and FreeBSD Jails yet there wasn’t a container revolution.

                                    That has exactly nothing (?) to do with the conversation? Ask FreeBSD why people aren’t using it as much as linux, but leave that convo for a different thread.

                                    1. 1

                                      That has exactly nothing (?) to do with the conversation?

                                      I’m not sure how the secure part has nothing to do with the conversation, since the comment this is responding to is you saying that security is the reason people use containers/Docker on Linux. I understood that as you implying that security was the game changer. My experience is that it has nothing to do with security; it’s about developer experience. I pointed to FreeBSD and Solaris as examples of technologies that had secure containers long ago, but they did not have a great developer story. So I think your belief that security is the driver for adoption is incorrect.

                                      1. -1

                                        Yes. Agreed not to discuss more on this thread, … but … jails are both too powerful and not powerful enough at the same time.

                                      2. 2

                                        Generally when you add complexity to any system, you decrease its scope of security, because you’ve increased the footprint that can be attacked.

                                  2. 7

                                    Docker (and common practices of its uses) mix several concerns: virtualization (including virtual network), dynamic linking, dependency management, versioning, build system, init-like process supervising. This results in complexity and bloat.

                                    Many people really need only dependency management for dynamically linked libraries. Another alternative to static linking and fat binaries is Nix, which only manages dependencies (including external binaries like sed and awk, mentioned in the post) and is like bundler and virtualenv (also mentioned in the post), but system-level. Unfortunately, it’s somewhat hard to use and has a strange configuration language.

                                    1. 1

                                      Yes, I’ve noticed this as well. And it’s sensible to use a less cryptic tool. Mostly I’m interested not in use cases but in the underlying architecture, so I’m not a critic of Docker so much as an iconoclast about unnecessary complexity in already overly conflicted computer systems architecture.

                                    2. 6

                                      I mean, as far as I can tell Docker only exists because it’s impossible to link to glibc statically, so it’s virtually impossible to make Linux binaries that are even vaguely portable. Except now Go and Rust make it very easy to compile static Linux binaries that don’t depend on glibc, and even cross-compile them easily.

                                      This is a quote within the article, so I’m not sure if the author is explicitly endorsing it, but isn’t this wrong? Static linking has always been possible, but the LGPL makes static linking potentially legally problematic, as I understood it. You could always link statically using an alternative libc.

                                      1. 7

                                        If you’re using it entirely within an organization, which is the usual case for Docker-style deployment, neither the GPL nor LGPL impose any particular requirements.

                                        1. 4

                                          The LGPL is, I think, the more important reason, but the technical one is still there: parts of glibc, like the DNS resolver, must be linked dynamically.
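
                                          As a concrete illustration (a sketch; just the standard call, nothing glibc-internal): any program doing name resolution goes through something like this, and with glibc that path loads NSS plugins such as libnss_dns at runtime, even if the rest of the binary was linked statically.

                                          ```c
                                          /* Sketch: ordinary name resolution via getaddrinfo().
                                           * With glibc this path loads NSS modules (libnss_dns,
                                           * libnss_files, ...) dynamically at runtime, which is
                                           * why a "fully static" glibc binary is not quite what
                                           * it appears to be. Alternative libcs such as musl
                                           * avoid this. */
                                          #include <arpa/inet.h>
                                          #include <netdb.h>
                                          #include <stdio.h>
                                          #include <string.h>
                                          #include <sys/socket.h>

                                          int main(void) {
                                              struct addrinfo hints, *res;
                                              memset(&hints, 0, sizeof hints);
                                              hints.ai_family = AF_INET;
                                              hints.ai_socktype = SOCK_STREAM;

                                              int rc = getaddrinfo("example.com", "80", &hints, &res);
                                              if (rc != 0) {
                                                  fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
                                                  return 1;
                                              }

                                              char ip[INET_ADDRSTRLEN];
                                              struct sockaddr_in *sa = (struct sockaddr_in *)res->ai_addr;
                                              inet_ntop(AF_INET, &sa->sin_addr, ip, sizeof ip);
                                              printf("example.com -> %s\n", ip);

                                              freeaddrinfo(res);
                                              return 0;
                                          }
                                          ```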

                                        2. 3

                                          Often a running app isn’t just a single binary but also a bunch of static files; containers are nice for these situations.

                                          1. 3

                                            System library bindings or dependency bindings aren’t the only kind of application dependencies. Many (most?) applications also have system level dependencies, e.g. sendmail, specific versions of a language runtime, specific versions of imagemagick, organisation specific custom applications, legacy configuration details etc. It’s not clear to me what the author’s answer to this problem is.

                                            He also glosses over what I consider to be one of Docker’s biggest advantages: simplifying development environment dependencies. Docker as a simpler and faster development environment tool is unmatched. After all, at some point you have to build that static binary from its constituent parts, most often in different environments (CI, development machines).

                                            1. 1

                                              Specifically what dependencies? If I go and build, say, separate images of each of these as isolated intermediate files, then link them (sort of like busybox), what is the net benefit that makes creating it with Docker “simpler and faster”?

                                            2. 1

                                              It’s because for 10+ years we were told that dynamic linking is the only sensible way. But we need static linking because everything is so much easier that way. Docker is just a way to appease this dissonance, plus some features (you gotta have features!) on top.

                                              1. 1

                                                I don’t buy this. At Sun, when they (among others) did shared libraries, it was done to allow the growth of libraries/APIs at a time of extreme memory constraints - the idea being that, with less redundant content in memory, more processes could stay resident (not swapped) with access to the same content - kind of a hamburger helper for virtual memory.

                                                At Tandem Computers, with shared-nothing clusters adapted to an SMP UNIX variant, the fragility of this arrangement was huge, as the rest of the industry found. It’s easy to end up with a mess of conflicting versions and nested dependencies that are hard to sort out with dynamic linking but are found immediately with static binaries - i.e., dynamic linking postpones too much of the semantics of all the bindings; at least the scope of the problem is smaller with static binaries. This didn’t help fault tolerance, but it could be worked around.

                                                In using Docker, one creates a “static world” for dynamic things to live within, so it helpfully bounds the problem. However, when you try to track down all the dependencies/library names/paths/configuration files … trying to make the smallest Docker containers with the most resilience, you end up with a larger NP-complete issue than before.

                                                Now I know everyone just likes to make things work that they are comfortable with, so one accepts this as a norm. That’s how we get into “normalization of deviance” - it becomes the new normal as accepted.

                                                Having been down this path with dozens of OSes that were excellent for their time: they’re like tribbles eating grain; the excess bulk builds up until it kills them. Entropy, of a sort, enters, grows, … and never leaves.

                                                When I saw UNIX the first time, I couldn’t understand it at first because it was so dense, with so few shibboleths, where I’d gotten used to them. It was too concentrated/useful, because they didn’t have any “space” to tolerate things that didn’t have a definite role. Where Multics tried to be “all things to all people”, UNIX could only be “one thing to one person”. So I get the fear of taking away someone’s toy.

                                                But if we are “staticfying” a container again, to bound it … why not simplify the contents, so that we might more easily prove the deterministic scope/size/security of a smaller, simpler thing to begin with?

                                                And also to be fair … perhaps these additional mechanisms I’m challenging still have a benefit above the reductionism I’ve described. I’m open to that. If so, what is it other than “I’m used to it, don’t want my world to change in any way, stop scaring me and making me feel small about the tools I already love”?

                                                Sorry if I piss you off.

                                              2. 1

                                                You put your “fat binary” in Docker to control the resources that the process is allowed to make use of. Docker isn’t a replacement for binaries. The question here is akin to asking, “Why would anyone choose Docker over chroot?”
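
                                                For contrast, the classic pre-container version of “control what the process can see” is roughly this sketch (requires root to start; /srv/app-root, /app, and the uid/gid are hypothetical) - Docker layers namespaces, cgroups, capabilities, and seccomp on top of, or instead of, this:

                                                ```c
                                                /* Sketch of the old-school isolation step Docker gets
                                                 * compared to: chroot into a directory and drop root.
                                                 * Paths and ids below are hypothetical placeholders. */
                                                #define _DEFAULT_SOURCE
                                                #include <stdio.h>
                                                #include <unistd.h>

                                                int main(void) {
                                                    const char *new_root = "/srv/app-root";

                                                    if (chroot(new_root) == -1) { perror("chroot"); return 1; }
                                                    if (chdir("/") == -1)       { perror("chdir");  return 1; }

                                                    /* Drop privileges: group first, then user. */
                                                    if (setgid(1000) == -1) { perror("setgid"); return 1; }
                                                    if (setuid(1000) == -1) { perror("setuid"); return 1; }

                                                    /* From here the process sees only the chroot and runs
                                                     * unprivileged. */
                                                    execl("/app", "app", (char *)NULL);
                                                    perror("execl");
                                                    return 1;
                                                }
                                                ```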

                                                1. 1

                                                  Because I can build a durable object that can be singularly managed instead of a collection that doesn’t have a well defined boundary.

                                                  That’s why I build something in a container: it is one thing to contend with. But I’ve also done this with static binaries, on systems without cgroups and the like. Is it a lack of cgroups/tools that currently causes static binaries to fail at the same thing?

                                                2. 1

                                                  I think for microservices and small programs, fat binaries are arguably better (or at least no worse) than using a container with shared libraries, extra files, etc. But like @jclulow said, once you move past small programs/microservices, having it all shoved into a single binary will only cause pain - especially if you have/need multiple programs for some reason, say syslog, linkerd, a watcher process to restart things, or some other helper app. Then suddenly containers start to be arguably better than fat binaries. Getting multiple programs into a binary is doable - things like busybox do it - but it definitely complicates things unnecessarily when there is little to no need. If your program has lots of data files, a database, or other static data, shoving that into a binary starts to seem… less than wise. We have a perfectly good OS and filesystem that work reliably.

                                                  Is Docker all that and a box of chocolates? Definitely not, but it has its use cases. Shoving Go (or other fat binary) into Docker makes little sense, of that I agree.

                                                  1. 1

                                                    How specifically does it “cause pain”? How specifically are “containers arguably better”? Just because someone says it’s so? What happens when they say something different/conflicting?

                                                    Feels vs reals?

                                                    1. 3

                                                      Using Python as a specific example, but other VM based languages tend to have similar pains in my experience.

                                                      Pain: Things like Python, for example, tend not to do well when shoved into fat binaries; other VM-based languages are the same. PyInstaller (the tool that shoves things into fat binaries), for example, still doesn’t support Python 3.6. Plus, my other examples I think are fairly specific - what part would you like more explanation on?

                                                      Containers are arguably better because they are a lot easier to reason about and to get all dependencies together. Yes, things like virtualenv exist, but building C libs into virtualenvs is not the easiest. It’s much easier to use system package managers/libraries for C libraries and venvs for Python code. Or just use a container and shove all the various dependencies into that, so you get to contain everything you need with the app while also being lazy and using system libraries, Python libs, etc. When using a container you don’t need to worry about the venv / C-library-building headaches; you can shove that responsibility off to your system package manager, and still contain one application from another. The alternative is something like Nix/Guix, but last I played with them they were not really ready for prime time.

                                                      I started my entire thing with ‘I think’, so clearly it’s opinion, not fact. If you decided to take it as fact, I worry about your reading comprehension. I’d also worry if you decided to take the linked article as fact.

                                                      As for different/conflicting views, I welcome them! It helps me learn and re-think my attitudes and opinions. How about you?

                                                      How specifically are containers arguably worse (which I assume is the position you are taking)?

                                                      1. 0

                                                        Excellent response, thank you. (Python is also important to me, and I understand the difficulties in doing a “pip install” into a static binary without needing PyInstaller - I already have a nested filesystem. Consider this not to be a problem for this discussion.) Please explain further any of your other examples you decide need greater scope than what I’ve described here, as I’d like to hear them.

                                                        (I’m beginning to think that it’s just poor support for doing useful things with static binaries that might be at the heart of creating new containers - one adds to entropy because it’s easier to do a “clean sheet” that way, without regard for messing with people’s dependence on the past.)

                                                        I’ve used virtual environments to encompass multiple development environments, with limited effect. You’re right, C is too messy to fit that model, although for pure Python development it’s good enough. Package managers always seem to be “works in progress”, where things mostly work, but then you trip across something undone, underdone, or flat-out wrong, so you spend too much time having to debug someone else’s poorly documented code. Yes, I didn’t care much for Nix either. I guess the problem with all of these is that you have to rely on others to maintain what you’ll depend on, and so if it isn’t closely related to your own tool base / shell utils, it’s just too much pain for too little gain. Is that about right?

                                                        Opinions aren’t bad, they just help more if there’s some collateral to justify them. I realize that takes effort, but I do appreciate it when you take the effort. (Also, it helps when it doesn’t appear to bruise egos, as my remarks seem to do somewhat - that’s not what I’m after in contributing to this community by challenging opinions.)

                                                        Haven’t taken anything as fact from the linked article. Like many articles, it’s a bit conclusory and absurd, but it does “edge onto” an interesting area. (BTW, I am no fan of Go and I think Rob Pike should have his head examined.) My “agenda” is more compact, less fragile, more obvious, deployable distributed Python applications where N > 10,000, and where I can change the OS/kernel to do this the best way without any involvement of anyone else. I like my stuff.

                                                        Thank you for your inclusive mindset; I share that aim, and I’d like to encourage your trust in your genuine expression. If what I’m speaking to doesn’t work for you, I’d just like to understand it better, because I’m sure what you’re after is what I’m after too, once I understand it. Don’t want to waste anyone’s time with noise.

                                                        Not taking the position that containers are arguably worse. Just pushing back with “wait, is this really doing what I want, what baggage is it bringing along, and why do I beat my head on this thing when I didn’t before?” So just some casual skepticism, where I’m willing to explore other approaches industriously to check out a hypothesis. (Like building a filesystem into a static executable container just to see that I can do a pip install inside it.)

                                                        So when I make Docker containers, I find that they are difficult to bound with content required by various packages. You’ll end up with things that mostly work, but the exceptions/omissions are often hard to find. If one builds a regression framework to prove a container’s scope of use/function/capability, one seems to spend as much time maintaining the regression framework as one does the container itself. (With the static binary, the issue becomes more of the scope of path names, for that you can chroot/jail and catch the exception and do a fixup.)

                                                        Docker containers thus get rebuilt a lot, which is overly complex compared to a static executable. Also, it’s easier to trace/profile a static executable to get a map of where time/memory is used in a fine-grained way - with containers it’s much more hit or miss, and most of the containers I’ve inherited from others seem to contain many unused portions as well as obscure additions that are left in “just because they seem to be needed somehow”. These may be insignificant … but then how do you know the scope of what the container will do?

                                                        Then, there’s the sudden spikes in memory/storage usage that exceeds the container’s size/resources. For a static binary, one can more easily backtrack memory allocations to find the determinism of “why?”

                                                        Finally, when you want to change code to dynamically shift relocation addresses to foil injection attacks, it’s really simple to do so by relinking the static binary as a single, atomic operation. Doing such with Docker is fraught with surprises, as sometimes you discover dependencies within the libraries that are in part set off by the artifacts of how the libraries are dynamically linked i.e. ordering/assignment. Not to mention debugging this to find these surprises.

                                                        Hope this isn’t TL;DR.

                                                        1. 2

                                                          Containers (i.e., Docker) are very well defined, so I don’t see these issues you speak of. Perhaps they come hither and yonder when doing funky things like requiring external disks, etc.

                                                          Why try to push a static filesystem into a static binary when chroot and/or containers do that for you? That’s sort of the whole point.

                                                          As for memory usage of Docker, Docker does have a horrible case of no resource limits by default, but you can definitely force them on. Hashicorp Nomad for example does this by default with Docker containers. If Kubernetes/Marathon/etc don’t do this, that’s kind of sad.

                                                          There are of course surprises when dependency management hits you on the head, but if you limit yourself as much as possible to OS-level dependencies (i.e., dpkg/yum/rpm), especially for C-level code, then you shoot yourself in the foot a lot less when dealing with these problems.

                                                          We haven’t really covered security here, and Docker IN THEORY gives you better security, but it’s sort of laughable now to claim that it actually does give you better security, especially with Docker set to default values. Jails and Zones definitely give you better security, and I’d like to think Docker will get there.. eventually, but I’m not holding my breath. It’s hard to bolt security on after the fact.

                                                    2. 1

                                                      Shoving Go (or other fat binary) into Docker makes little sense, of that I agree.

                                                      There are use cases for this. There is a base Go docker image that you can pull into your CI for building/distributing your application. If you use some type of scheduling system (DC/OS with Marathon or Kubernetes), you can then easily cluster that Go app.

                                                      There are lots of different use cases for containers and they can be used to solve a lot of problems … and introduce new ones, like not having security update checks for libraries within the containers.

                                                      1. 2

                                                        Using Docker to build Go apps makes some sense; you want your build environments to be well defined, and that’s something containers give you. If you are of the mindset to deploy everything via Marathon/Kubernetes, I could see a use case for hiding it in a Docker image just to make deployment easier. I’d argue that’s one of the good parts of Hashicorp Nomad: it supports exec as well as Docker, so you can run anything, not only Docker containers.