1. 1

    I like to think in terms of layers of stability: application logic at the top, with a stack of libraries underneath, where a library is defined as an API that provides a stability guarantee. Often it’s third-party libraries, and then they have their own tests. But sometimes it’s an internal library that’s only used in that one application; then the stability guarantees can be much looser, but you likely still want to test it.

    App logic
    ----- library interface ---
    unstable implementation goop
    ----- library interface ---
    unstable implementation goop
    

    Which is to say, you might still need tests for some internal APIs, but definitely not all due to the costs mentioned in the article.

    1. 2

      it’s an internal library that’s only used in that one application, and then the stability guarantees can be much looser, but you likely still want to test it.

      I’d say it might be fine to just test it via the application, but yeah, if it is enough of a library to have some sort of an interface, it’s better to test that interface. Basically, treat the library like a layer from the layers section of the post.

      I’ve just realized that I have an appropriate war story to share. In rust-analyzer, we originally kept the syntax tree library in-tree. Then, at some point, we extracted it into a stand-alone rowan package. One problem with that, though, is that all the tests are still in the rust-analyzer repo. This is actually rather OK for me: I can easily build the rust-analyzer test suite against a custom rowan and see if it breaks. It does make contributing to the library rather finicky for external contributors, though, as the testing workflow becomes rather non-traditional.

      1. 1

        I’ve made the mistake of testing all layers instead of only the layers where stability was meaningful. So the key, I think, is “where is stability a useful property?”. It’s a generalized version of “only test public APIs”, I think, since that’s not as meaningful in larger applications.

        1. 1

          Hm, I don’t think the mistake is necessarily the subject of testing. It might just be the way the tests are written.

          Here’s an example test which tests a very internal layer of rust-analyzer, but without stability implications:

          https://github.com/rust-analyzer/rust-analyzer/blob/5193728e1d650e6732a42ac5afd57609ff82e407/crates/hir_ty/src/tests/simple.rs#L91-L110

          It tests the type-inference engine and prints, for each expression, its type. Notably, this operates at the level of a completely unstable and changing internal representation. However, because the input is Rust code and the output is an automatically updatable expectation, the test is actually independent of the specifics of type inference.
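
          For readers unfamiliar with the pattern, here is a minimal sketch of such an “automatically updatable expectation” test using the expect-test crate (which rust-analyzer uses); the render function is a stand-in for “run the engine and pretty-print its output”, not the actual type-inference code:

          use expect_test::expect;

          // Stand-in for "run the engine and pretty-print its internal state".
          fn render(input: &[i32]) -> String {
              input.iter().map(|n| format!("{} -> {}\n", n, n * n)).collect()
          }

          #[test]
          fn snapshot_style_test() {
              let actual = render(&[1, 2, 3]);
              // Running the test with UPDATE_EXPECT=1 rewrites this literal in
              // place, so the test survives changes to the internal representation.
              expect![[r#"
                  1 -> 1
                  2 -> 4
                  3 -> 9
              "#]]
              .assert_eq(&actual);
          }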

    1. 1

      Won’t you lose your root image, though, if spot.io makes a bad prediction?

      1. 1

        See the Root Volume Persistence section. They create an AMI based on your root image (plus root volume snapshots). That’s not to say you shouldn’t have backups, using something like litestream, mysql-backup, etc.

        1. 1

          I read that, but my assumption was that they did it on migration. Maybe I misunderstood and they do it regularly?

          1. 1

            It seems to be both.

        2. 1

          At least for Azure, you have a choice of VMs being destroyed or deallocated. If they’re deallocated, the disks persist (and you keep paying for them) and you can redeploy the VM. You can opt in to a notification that lets you do a graceful shutdown, as long as you can do so within 30 seconds. I’m tempted to try one of the 72-core machines for package building and have it throw the built packages at an NFS share from a file storage thing and use the local SSD (which is thrown away if the VM is deallocated) for all intermediate build products.

          1. 1

            I believe spot.io supports Azure and other providers, but I haven’t tried it myself.

        1. 2

          For the implementation, I used a simple API server built with libreactor, an event-driven application framework written in C

          Sounded interesting up till this part. Kind of an unrealistic scenario for me. Hardly anything will be so in need of speed that people will forgo a) what they already use and b) what makes sense. A C web framework I’ve never heard of? Hard pass.

          1. 1

            Most of the optimizations he does are likely doable for other servers, and many of those servers had comparable performance in the initial pre-optimized comparison.

          1. 6

            The fact that PyStone showed a significant difference is surprising to me because PyStone doesn’t do any I/O and makes a small fixed number of syscalls, with the number of syscalls not changing at all when I change the number of iterations I ask it to do. For me it does exactly 525 syscalls regardless of whether I ask for 50k iterations or 100k iterations.

            Are you sure the python3.9 inside and outside the container are identical?

            The speed reported by PyStone doesn’t change when I run it under strace, which I believe makes syscalls slower by far more than seccomp does. The time to start the benchmark changes from about 20ms to about 140ms, but the rate at which benchmark iterations run didn’t noticeably change for me under strace. I see roughly 3% variation in the reported speed between repeated runs.

            (Disclaimer: I tested PyStone 1.1 with a tiny patch to call time.clock_gettime(time.CLOCK_MONOTONIC) instead of time.clock(), so my pystone is slightly different from yours. Still, I don’t think the inside of the benchmark loop was changed.)

            1. 3
              1. Yes, it’s weird to me too.
              2. Notice that when I run with --privileged, the performance difference goes away. So it’s not the Python version.
              1. 2

                At the end of the article, the author compared the same container with and without --privileged. The GitHub issue linked in the article seems to narrow it down to seccomp being the likely culprit (and/or the possibility that enabling seccomp triggers additional Meltdown/Spectre mitigations).

                1. 3

                  The Meltdown/Spectre mitigations explanation seems plausible; see https://github.com/docker-library/python/issues/575#issuecomment-840737977

                  1. 1

                    I also read the article. The thing is, you’d expect seccomp to only affect syscalls, so it’s still surprising.

                  2. 1

                    The benchmark result looks weird to me as well, since it’s a pure-computation workload with a constant number of syscalls.

                  1. 1

                    I wonder which Docker images this was tested on. I seem to remember that the official python images are slower than ubuntu because of some detail of how they’re compiled. No link handy, though.

                      1. 1

                        You got it, thanks.

                      2. 1

                        fedora:33, in order to match the host operating system and remove that as a factor.

                      1. 1

                        Always love seeing people use LD_PRELOAD to do terrible-yet-useful things.

                        1. 1

                          This comes across as somewhat vain, but he’s not wrong. Meyer has contributed far more to the world of software engineering than he is given credit for. Eiffel is a beautiful language that I wish were more widely used.

                          (I remember that it was reading Object-Oriented Software Construction sometime in the 90’s or early 2000’s when I finally “got” generic data types. It was revelatory.)

                          (Which is not to say that Meyer invented generic types. That’s just when I first understood them.)

                          1. 2

                            Honestly, this makes me suspicious of the contributions I did attribute to him. It looks like he’s claiming he invented repeated undo? Seriously?

                            unless someone can point to an earlier reference, then anytime anyone anywhere using an interactive system enters a few “CTRL-Z” to undo commands, possibly followed by some “CTRL-Y” to redo them (or uses other UI conventions to achieve these goals), the software most likely relying on a technique that I first described in the place mentioned above.

                            1. 3

                              Per wikipedia, https://en.wikipedia.org/wiki/CygnusEd had multi-level undo in 1987, before OOSC1 came out in 1988.

                                1. 1

                                  Good old CygnusEd. I still used ED for all my editing needs, but I remember playing with a coverdisk (maybe Aminet? This was 20+ years ago…) version of CygnusEd. I liked TurboText too.

                                  The oldest one I can remember (other than ED itself) was TxEd. That was beautiful in its simplicity.

                            1. 3

                              I am currently fighting a bug in my memory profiler for Python (github.com/pythonspeed/filprofiler/issues/149). Basically the profiler tracks every allocation, and keeps a hashmap from address to allocation size + callstack. Somehow, an allocation is disappearing.

                              After a lot of thinking and debugging, I found one cause (a race condition; Rust doesn’t help with “my cached copy of this data got out of sync with the original data” bugs). But there’s more… and plausibly it’s the reentrancy prevention code. If the tracking code does a malloc() you can end up in an infinite loop as you track the allocations made by the tracking code itself, so there are reentrancy flags to prevent that… but doing it wrong can result in memory disappearing. I’m currently trying to figure out ways to log just enough to get the info I need, without spewing massive amounts of data.
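
                              The reentrancy guard is roughly this shape (a minimal sketch of the pattern, not the actual filprofiler code):

                              use std::cell::Cell;

                              thread_local! {
                                  // Set while we're inside our own tracking code, so allocations
                                  // made by the tracker itself don't get tracked recursively.
                                  static IN_TRACKER: Cell<bool> = Cell::new(false);
                              }

                              fn record_allocation(address: usize, size: usize) {
                                  IN_TRACKER.with(|flag| {
                                      if flag.get() {
                                          // Already inside the tracker: bail out to avoid the infinite
                                          // loop. Bail out too eagerly, though, and a real allocation
                                          // silently goes unrecorded, i.e. "memory disappears".
                                          return;
                                      }
                                      flag.set(true);
                                      // ... insert (address, size, callstack) into the map here; this
                                      // may itself allocate, which is why the flag must be set ...
                                      let _ = (address, size);
                                      flag.set(false);
                                  });
                              }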

                              It doesn’t help that the reproducer is a complex Python program that doesn’t always reproduce the problem.

                              1. 5

                                I’d love to have a website search solution that works on static sites but also doesn’t use a lot of bandwidth. This is almost that, but it looks like the WASM file is still pretty hefty (400KB), which balances out the efficient retrieval of results.

                                1. 2

                                  There’s https://endler.dev/2019/tinysearch/, for instance.

                                  1. 2

                                    Yeah, the search there is pretty limited, and (unlike this) it downloads the whole index. And stork downloads the whole index and has large WASM code.

                                  2. 1

                                    The WASM file is pretty hefty, but it isn’t data dependent. That means you can probably avoid fetching it until the user needs it and can set its cache policy to something huge so that it’s a 400KB download that’s amortized over repeated visits in a year. It would be nice if this could be stored somewhere public and shared between multiple sites.

                                  1. 12

                                    And either the Rust standard library or possibly the Rust compiler–I’m not sure which–are smart enough to use a (slightly different) fixed-time calculation.

                                    That’s LLVM. It actually is surprisingly smart. If you have a sum for i from a to b, where the summation term is a polynomial of degree n, there exists a closed-form expression for the summation, which is a polynomial of degree n+1. LLVM can figure this out, so sums of squares, cubes, etc. get a nice O(1) formula.
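
                                    For example, a loop like the following typically compiles down to a closed-form computation rather than an actual loop (a sketch; whether it happens depends on the LLVM version and optimization level):

                                    // Sum of i^2 for i in 0..n, written as a plain loop. With optimizations on,
                                    // LLVM's scalar evolution analysis can replace the loop with the degree-3
                                    // closed form (n-1)*n*(2n-1)/6, so the function runs in O(1).
                                    pub fn sum_of_squares(n: u64) -> u64 {
                                        let mut total: u64 = 0;
                                        for i in 0..n {
                                            total = total.wrapping_add(i.wrapping_mul(i));
                                        }
                                        total
                                    }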

                                    Another good story with the same theme is https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation. “You can always rewrite hot spots in a faster, lower level language” isn’t generally true if you combine two arbitrary languages.

                                    1. 1

                                      Thanks for the explanation, I’ll update the article.

                                      1. 1

                                        A fairer comparison would thus mean using LLVM on the Python code too (see: Numba). Given the example domain, I’d further be interested to know the speed difference between Rust on the CPU and Python on the GPU.

                                        1. 1

                                          These are all toy examples; the point was never that Rust is faster. As I mention, Rust can actually be slower than Cython. The updated article points out that the default compiler on macOS is clang, so you might get that optimization with Cython too.

                                    1. 2

                                      It feels a bit wrong to test a bunch of Cython code and say this is all endemic to C extensions. I’m not 100% fresh on all this, but the serialization/deserialization problem only exists if you do that in the first place! You could choose not to if you were interfacing with the raw CPython interface.

                                      I wonder what mypyc would give as a result here as well… probably worse but its codegen tries to rely heavily on branch prediction to make unwrapping cheap.

                                      1. 3

                                        Cython is a highly optimized use of the raw CPython interface. PyO3 uses the CPython interface too and it has much higher overhead since it’s not been optimized as much yet.

                                        And yes, you can just do math using Python integers without serialization/deserialization… and then it won’t be any faster than Python’s slow math (internally Python has to deserialize in order to do the underlying CPU addition instructions, and then reserialize into a Python integer, for example).
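
                                        As a concrete illustration of where that cost lives, here’s a hypothetical PyO3 extension function (a sketch with made-up names, not code from the article): the conversions happen at the call boundary, and everything between them is native.

                                        use pyo3::prelude::*;

                                        // PyO3 converts the incoming Python ints into native i64 values, does the
                                        // addition as a plain CPU instruction, and then boxes the result back up
                                        // into a new Python int. The boundary conversions are the
                                        // serialization/deserialization cost; the arithmetic in the middle is cheap.
                                        #[pyfunction]
                                        fn add(a: i64, b: i64) -> i64 {
                                            a + b
                                        }

                                        #[pymodule]
                                        fn example(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
                                            m.add_function(wrap_pyfunction!(add, m)?)?;
                                            Ok(())
                                        }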

                                        You can’t get away from the cost given the way CPython is implemented. PyPy is a whole different thing.

                                      1. 5

                                        Great article, thanks for posting it! The date on it is 2013, not 2016; I wonder what parts of it are out of date by now.

                                        Unlike a desktop, wherein virtual memory can “spill over” to external storage, the same does not hold true [on ios] (largely due to limitations of flash memory)

                                        I wonder about that. Nowadays desktop systems are almost all flash-based too, and I assume the same type of flash (iOS storage is about as fast as typical desktop SSDs), so it seems unlikely that limitations of flash are to blame. And current iOS devices come with tons of storage, more than enough for VM swap space.

                                        I think the reason iOS kills apps rather than swapping them out is that the UI only displays one app at a time (maybe sometimes two, if you’re that rare person who actually knows how to use iPad multitasking). Since iOS apps tend to launch quickly and restore their state, it makes sense to swap them out on a process level rather than a memory-page level. It’s usually barely noticeable when you switch back to an app and it has to relaunch.

                                        1. 3

                                          The iOS policy is also a little bit about garbage collection. The fastest GC that you can build on a *NIX system is exit. This collects all of your memory in a tiny amount of time. Because Objective-C is a superset of C, and C makes an object’s address available as a convenient identifier for hashes / comparisons, you can’t add a copying GC to Objective-C and so you’re likely to see fragmentation over time. If you kill the process, collect all objects, and then recreate the ones that you actually need, it improves locality.

                                          macOS also uses the sudden-termination support that was added for iOS. Applications notify the OS that they’ve saved all important state (and a lot of GUI state is saved by Cocoa automatically) and they will be quietly killed if the user doesn’t switch to them. The window server keeps the contents of their windows as textures and when the user does try to switch to any of them again they’re displayed immediately and the application is restarted in the background and resumes ownership of the windows. You can sometimes see a jerk when this happens. For example, Preview used to store which page of a PDF you were reading, but not where and so if you switched away, used a load of memory, and switched back, you’d see a short pause and then the view would jerk to the top of the page.

                                          1. 2

                                            Yes, I saw a video of someone swiping (killing) apps on an iPhone to ‘clean up’ the memory. I don’t think they realised that this is done automatically as memory is needed for ‘in use’ apps, and that most of what they swiped was just screenshots.

                                            1. 2

                                              Android people do it too. I had to stop family members from doing so after they decided they didn’t want too much “background stuff” and wanted a “clean system”. Which obviously ended up making the system slower.

                                              1. 2

                                                This is apparently a meme that was probably started back in the OG iPhone days.

                                                Some people might just like to keep things “neat” though. It’s a bit like a digital fidget spinner.

                                              2. 1

                                                At the bottom it mentions an update in 2016. But yeah, it’s unclear how much has changed since then.

                                                On macOS I’ve noticed that swapping seems more aggressive than on Linux; it tries to keep more RAM available. But that’s based on limited situations/configurations.

                                              1. 3

                                                Since the original write-up, BuildKit has stabilized and is now the preferred way to handle build secrets, so this covers that as well as other mechanisms like expiring access tokens.

                                                1. 8

                                                  This has -march=x86-64-v[234], which is a nice step forward for getting more common baseline usage of modern CPU features. See e.g. https://www.phoronix.com/scan.php?page=news_item&px=GCC-11-x86-64-Feature-Levels for details.

                                                  1. 3

                                                    This comes in real handy for speeding up container image builds. Package managers like dpkg, rpm, and the like tend to be really finicky about making sure the files that make up your system are actually safely on the disk. So when your build system (effectively) gives you transactional updates anyway, all those syncs do is slow things down for little benefit.

                                                    1. 2

                                                      I believe the “official” Debian (and probably Ubuntu) images are now configured to disable fsync()ing for stuff installed with dpkg, at least:

                                                      https://github.com/debuerreotype/debuerreotype/blob/d29dd5e030525d9a5d9bd925030d1c11a163380c/scripts/debuerreotype-minimizing-config#L45

                                                      1. 1

                                                        Presumably you could just use a tmpfs for the backing file system, though? Or, if using ZFS, you could set sync=disabled, which effectively makes things unsafe but fast.

                                                        1. 1

                                                          It’s hard to use tmpfs or ZFS when you use dpkg/rpm to update the system or build a Docker image.

                                                          1. 1

                                                            As someone who doesn’t use dpkg/rpm/docker, what makes tmpfs or zfs difficult in those cases?

                                                      1. 1

                                                        What are common use cases, besides machine learning, that need loading SQL data into memory?

                                                        1. 2

                                                          I suspect there’s a whole lot of business processes that do this: generating reports, loading data into another system, etc.

                                                          1. 1

                                                            Why not rely on SQL queries? Why would we prefer to fetch the rows and query inside the application?

                                                        1. 1

                                                          My family’s (admittedly refurbished) ThinkPads ended up having enough problems that we’ve given up on them. Some were clearly hardware, some maybe hardware, maybe Linux. When the latest one died we got a System76 instead; at least it’s designed to run Linux.

                                                          1. 15

                                                            It looks like megacorps are starting to take Bitcoin seriously. What happened to corporate social responsibility? Oh that’s right. It only applies when it doesn’t affect the bottom line.

                                                            In the immortal words of Pink Floyd, “ha ha, charade you are”.

                                                            1. 4

                                                              I would not assume they’re doing this to make money. In large organizations individual incentives are often quite divorced from making money for the organization. Instead, incentives might be “creating a splashy product will get me promoted” or “everyone is doing this, if it happens to turn out to be a big thing I’ll look stupid if I didn’t have a project in this area”.

                                                              1. 10

                                                                I’m frankly worried by an uptick in bitcoin adoption by well-known companies and “nerd-celebrities” over the last several months. Here is a selection of links.

                                                                Last but not least, we have the height of hypocrisy: people can buy a Tesla with Bitcoin. (@skyfaller already called Tesla out upthread).

                                                                Herd instinct appears to be taking its course.

                                                                1. 2

                                                                  I’ve only read a small amount about Microsoft’s incentives here. But according to product lead Daniel Buchner (https://github.com/csuwildcat), Microsoft gave him this opportunity after years of toiling away on standards and working at Mozilla. So someone at Microsoft with some influence really pursued the talent and the money to put this together.

                                                                2. 4

                                                                  social responsibility

                                                                  There is a social benefit to decentralized technology (of which blockchain is one implementation mechanism) as well, which is mainly to do with circumventing centralized censorship and thereby enabling various subcultures to co-exist on the internet (as it used to be before Big Tech began controlling narratives) without compromising on localized moderation[1] of them.

                                                                  [1] cf. ‘decentralized moderation’, eg: https://matrix.org/blog/2020/10/19/combating-abuse-in-matrix-without-backdoors

                                                                  1. 5

                                                                    If they wanted a decentralized system, they could have used one that wasn’t so egregiously wasteful, or invested in bringing more efficient options like proof-of-stake to fruition instead of latching onto bitcoin.

                                                                    1. 1

                                                                      Yeah, I’m not sure what’s going on here. From their 2020 docs:

                                                                      Currently, we’re developing support for the following ledgers: Bitcoin, Ethereum, via uPort, Sovrin

                                                                      Our intention is to be chain agnostic, enabling users to choose a DID variant that runs on their preferred ledger.

                                                                    I’ve attempted to play with the API here, but it seems like it has been deprecated. At some point they must have decided to go all in on Bitcoin. Maybe they’re next going to uphold the promise to develop on other ledgers.

                                                                  2. 2

                                                                  Well, sure: it’s right there in the articles of incorporation. For better or for worse, social responsibility isn’t part of the material of operating a business.

                                                                    It’s interesting that Microsoft sees a place to profit here.

                                                                    1. 1

                                                                      It’s never been a genuine thing, and can’t really be.

                                                                    1. 17

                                                                      I’m trying to find a charitable interpretation for the fact that “avoid installing security updates because this distribution tool can’t handle updating in a secure manner” has ever even been considered as a form of best practice. Charitable as in not leaning towards “web developers gonna web develop”, which I would’ve been happy with 15 years ago but I realise perfectly well that’s not the right explanation. I just can’t, for the life of me, figure out the right one.

                                                                    Can someone who knows more about Docker and DevOps explain to this old Unix fart why “packages inside parent images can’t upgrade inside an unprivileged container” is an argument for not installing updates, as opposed to throwing Docker into the trash bin, sealing the lid, and setting it on fire?

                                                                      1. 13

                                                                      This is not a problem with Docker the software. Docker can install system updates and run applications as a non-privileged user. The article demonstrates how; it’s not some secret technique, it’s just the normal, documented way.

                                                                        This is a problem with whoever wrote this document just… making nonsensical statements, and Docker the organization leaving the bad documentation up for years.

                                                                        So again, Docker the software has many problems, but inability to install security updates is not one of them.

                                                                        1. 1

                                                                          Has that method always worked? Or is it a recent addition for unprivileged containers? I’m just curious to understand how this ended up being the Docker project’s official recommendation for so many years that it ended up in linters and OWASP lists and whatnot. I mean none of these cite some random Internet dude saying maybe don’t do that, they all cite the program’s documentation…

                                                                          1. 5

                                                                          When I (and the documentation in question) say “unprivileged” in this context, it means “the process UID is not root”.

                                                                          There are also “unprivileged containers” in the sense that Docker itself isn’t running as root, which is indeed a new thing but is completely orthogonal to this issue.

                                                                            1. 1

                                                                              Now it sounds even weirder, because the documentation literally says “unprivileged container”, but I think I got your point. Thanks!

                                                                        2. 5

                                                                        Well, the article did point out that you can upgrade from within Docker. The problem is that the OS running inside Docker can’t assume it has access to certain things. I only skimmed the article, but I think it mentioned an example where updating a Linux distro might cause it to try to (re)start something like systemd or some other system service that probably doesn’t work inside a Docker container.

                                                                          However, that really doesn’t address your main point/question. Why was this ever advice? Even back in the day, when some OSes would misbehave inside Docker, the advice should have been “Don’t use that OS inside Docker”, not “Don’t install updates”.

                                                                        I think the most charitable explanation is that developers today are expected to do everything and know about everything. I love my current role at my company, but I wear a lot of hats. I work on our mobile app, several backend services in several languages/frameworks, our web site (an ecommerce-style site, PHP + JS), and even a hardware-interfacing tool that I wrote from scratch because it only came with a Windows .exe to communicate with it. I have also had to craft several Dockerfiles and become familiar with actually using/deploying Docker containers, and our CI tool/service.

                                                                          It’s just a lot. While I always do my best to make sure everything I do is secure and robust, etc, it does mean that sometimes I end up just leaning on “best practices” because I don’t have the mental bandwidth to be an expert on everything.

                                                                          1. 2

                                                                            it mentioned an example where updating an Linux distro might cause it to try to (re)start something like systemd or some other system service that probably doesn’t work inside a Docker container.

                                                                          That hasn’t been true for years, for most packages. That quote was from an obsolete 2014 article, and was only quoted in order to point out that it’s wrong.

                                                                            1. 2

                                                                            I didn’t mean to imply that it was! If you read my next paragraph, it might be a little more clear that this isn’t an issue today. But I still wonder aloud why the resulting advice was ever good advice, even when this particular issue was common-ish.

                                                                              1. 1

                                                                              AFAICT the current version of the best practices page in the Docker docs was written in 2018 (per the Wayback Machine), by which point that wouldn’t have been an issue. But maybe it’s left over from an older page at a different URL.

                                                                          2. 5

                                                                            I am not a Docker expert (or even user), but as I understand the OCI model you shouldn’t upgrade things from the base image because it’s a violation of separation of concerns between layers (in the sense of overlay filesystem layers). If there are security concerns in the base packages then you should update to a newer version of the image that provides those packages, not add more deltas in the layer that sits on top of it.

                                                                            1. 2

                                                                              That makes a lot more sense – I thought it might be something like this, by analogy with e.g. OpenEmbedded/Yocto layers. Thanks!

                                                                              1. 1

                                                                                This doesn’t hold water and is addressed in the article.

                                                                            The way Docker containers work is that they’re built out of multiple, composable layers. Each layer is independent, and the standard separation of concerns is layer-based.

                                                                                So after pulling a base container, the next layer that makes sense is to install security updates for the base image. Any subsequent changes to the base image will re-install security updates.

                                                                            Often base images are updated infrequently, so relying on their security updates just allows security flaws to persist in your application.

                                                                                1. 1

                                                                                  To me, an outsider who uses Docker for development once in a while but nothing else, a separate layer for security updates doesn’t make much sense. Why would that be treated as a separate concern? It’s not something that is conceptually or operationally independent of the previous layer, something that you could in principle run on top of any base image if you configure it right – it’s a set of changes to packages in the parent layer. Why not have “the right” packages in the parent layer in the first place, then? The fact that base images aren’t updated as often as they ought to be doesn’t make security updates any more independent of the base images that they ought to be applied to. If that’s done strictly as a “real-world optimisation”, i.e. to avoid rebuilding more images than necessary or to deal with slow-moving third parties, that’s fine, but I don’t think we should retrofit a “serious” reason for it.

                                                                            2. 3

                                                                              Charitable as in not leaning towards “web developers gonna web develop”

                                                                              I kind of want to push back on this, because while it’s easy to find examples of “bad” developers in any field of programming, I think it’s actually interesting to point out that many other fields of programming solve this problem by… not solving it. Even for products which are internet-connected by design and thus potentially exploitable remotely if/when the right vulnerability shows up. So while web folks may not be up to your standards, I’d argue that by even being expected to try to solve this problem in the first place, we’re probably ahead of a lot of other groups.

                                                                              1. 1

                                                                            Yeah, that’s exactly why I was looking for the right explanation :-). There’s a lot of smugness going around that ascribes any bad practice in a given field to “yeah, that’s just how people in that field are”, when the actual explanation is simply a problem that’s not obvious to people outside that field.

                                                                            1. 3

                                                                          This leaves out that the update/upgrade results in image size bloat as well, and things like AWS ECR can charge by the byte.

                                                                          … but I’m not surprised; this same author once wrote against using alpine because they “once couldn’t do DNS lookups in Alpine images running on minikube (Kubernetes in a VM) when using the WeWork coworking space’s WiFi.”

                                                                              1. 14

                                                                                this leaves out the update/upgrade results in image size bloat as well

                                                                                The author has published a lot of stuff on Docker, including tips for how to avoid that. And given that a lot of real-world deployments will install at least some system packages beyond the base distro, this isn’t exactly a knock-down argument – they’ll have to grapple with how to do that sooner or later.

                                                                                this same author once wrote against using alpine because

                                                                                This is a disingenuous cherry-pick. The article in question mentions a couple weird bugs and incompatibilities as a kind of “cherry on top” at the end, but they’re not the main argument against using Alpine images. In fact the argument is specifically against using Alpine for containers that deploy Python applications, and is based on pointing out the big disadvantage: on Alpine you don’t have access to pre-compiled “wheel” packages from the Python Package Index, which both slows down your builds (you have to compile from source any C/other-language extensions used in a Python package) and potentially increases image size (since you need to have a compiler toolchain and any necessary libraries to link against, and it’s harder to do the “all in a single RUN” trick without breaking out a large separate script that the Dockerfile invokes to do all the compilation and installation of compiler dependencies).

                                                                                1. 12

                                                                                  The good news is that there’s a PEP now for musl wheels (https://www.python.org/dev/peps/pep-0656/) so if it’s accepted, it’s possible that the situation on Alpine will improve.

                                                                                  1. 7

                                                                                    Multi-stage builds solve the “need to install the compilers and they bloat the image” problem, right? The majority of my Dockerfiles are structured like

                                                                                    FROM base-image AS build
                                                                                    RUN install_some_packages
                                                                                    RUN build_some_code
                                                                                    
                                                                                    FROM base-image
                                                                                    COPY --from=build /some/build/artifacts /final/destination
                                                                                    

                                                                                    One convenient thing about doing it this way is that you don’t need to worry about minimizing layers in the build stage since none of them will end up in the final result anyway. You can do one command per RUN with none of the && shenanigans.

                                                                                    1. 1

                                                                                If you do, try the same exercise with Java 8 and Python 3.9 installed from packages. We have three images: one needs only Java, one needs only Python, and one needs both. I think it is nice to have a build image, and you can achieve this with most of the CI/CD solutions out there. What is not possible is to easily have a combination of two images. It is only possible in a crude way, by copy-pasting the installation steps between Dockerfiles. So we are back to copy-paste computing.

                                                                                      1. 4

                                                                                  What is not possible is to easily have a combination of two images.

                                                                                  The real issue is that Dockerfiles are not declarative and have no conception of packages and their dependencies. As a result you have to rely on fragile sequences of steps, ‘manually’ copy stuff to the final image, and hope that you didn’t miss anything.

                                                                                  Building a complex Docker image is fairly trivial with most declarative package managers, such as Nix, Guix, or Bazel. E.g. with Nix you do not specify the steps, but what a container image should contain. Everything gets built outside containers in the Nix sandbox, then the image is built from the transitive (runtime) closure. You get images that contain all the necessary packages and only the necessary packages. Moreover, the images are far more reproducible, since all the dependencies are explicitly specified [1], down to the sources through fixed-output derivations.

                                                                                        [1] Unless you rely on channels or other impurities, but these can be avoided by pinning dependencies and avoiding certain functions and/or using flakes (beware, beta).

                                                                                        1. 3

                                                                                  There is a slightly more ergonomic way to do this: the builder pattern. You use a set of base Dockerfiles: one has your compilers and such in it, and then there is another Dockerfile for each image you intend to build. In the subsequent, or dependent, Dockerfiles, you just copy the files out of the first image you built. Wire the whole thing up using a shell script of about 5-10 lines of code.

                                                                                  Yes, this breaks the ability to just run docker build -t myimage ., but it means you don’t have to copy and paste.

                                                                                          In my world, I use this pattern to build a php-fpm backend API server and then have a container with nginx and the static assets in a separate container. It takes three Dockerfiles and a small shell script (or Makefile).

                                                                                    2. 1

                                                                                It does say “Dockerfiles in this article are not examples of best practices”; perhaps the clean-up from an upgrade is one of the things omitted for clarity. It’s a fairly straightforward chunk of distro-specific boilerplate, an easy problem to solve.

                                                                                      1. 7

                                                                                  There is actually image size bloat, unfortunately. The package index/download cleanup isn’t the bloat (in fact, these days the official Debian and Ubuntu images clean that up automatically); it’s the fact that you end up with two copies of the upgraded packages. So e.g. if you upgraded libzstd1, your image now stores two copies of it.

                                                                                  So if the base Ubuntu image (which is created from a tarball) were always up-to-date with security updates, that would in fact result in slightly smaller images. It isn’t, though, and it’s better to have a slightly larger image than to not have security updates.