1. 35
  2. 12

    There are many good counterarguments in this thread, but I still like the main premise of the article: infrastructure should be boring as hell, without huge changes every so often. How that is done depends on many factors (I’m not commenting on the article’s recommendations). From what I’ve noticed in practice, old and boring stuff works for me: there are fewer dark corners, it is easier to find someone else who had the same problem (and, hopefully, a solution), and it is often simpler to reason about, which I value when things go south.

    1. 11

      I really don’t want to spend my free time tracking down how the latest kernel pulls in additional functionality from systemd that promptly breaks stuff that hasn’t changed in a decade, or how needing an openssl update ends up in a cascading infinitely expanding vortex of doom that desperately wants to be the first software-defined black hole and demonstrates this by requiring all the packages on my system to be upgraded to versions that haven’t been tested with my application.

      I find it impossible to continue reading after this. Nobody is forced to run Gentoo or Arch Linux on a production server, or whatever the hipster distribution of the day is. There are CentOS and Debian when some years of stability are required. More than any of the BSDs offer.

      1. 3

        Well, the rest also mentions apt-hell with debian and package upgrading.

        Can you elaborate on the last sentence?

        1. 10

          Well, the rest also mentions apt-hell with debian and package upgrading.

          I read that section now… it seems to imply you are forced to update Debian every year to the latest version otherwise you don’t get security updates. Does the author even know Debian? apt-hell? Details are missing. I’m sure you can get into all kinds of trouble when you fiddle with (non official) repositories and/or try to mix&match packages from different releases. To attempt this in production is kinda silly. Nobody does that, I hope :-P

          Can you elaborate on the last sentence?

          I’m not aware of any BSD offering 10 year (security) support for a released version, I’m sure OpenBSD does not, for good reason, mind you. It is not fair to claim updates need to be installed “all the time” as the poster implies and will result in destroying your system or ending up in “apt-hell”. Also, I’m sure BSD updates can go wrong occasionally as well!

          I’m happy the author is not maintaining my servers on whatever OS…

          1. 18

            I read that section now… it seems to imply you are forced to update Debian every year to the latest version otherwise you don’t get security updates.

            We have many thousands of Debian hosts, and the cadence of reimaging older ones as they EOL is painful but IMO, necessary. We just about wrapped up getting rid of Squeeze, some Wheezy hosts still run some critical shit. Jessie’s EOL is coming soon and that one is going to hurt and require all hands on deck.

            Maybe CVEs still get patched on Wheezy, but I think the pain of upgrading will come sooner or later (if not for security updates, then for performance, stability, features, etc.).

            As an ops team it’s better to tackle upgrades head on, than to one day realize how fucked you are, and you’re forced to upgrade but you’ve never had practice at it, and then you’re supremely fucked.

            And, yes, every time I discover that systemd is doing a new weird thing, like overwriting pam/limit.d with its own notion of limits, I get a bit of acid reflux, but it’s par for the course now, apparently.

            1. 3

              This is a great comment! Thanks for a real-world story about Debian ops!

              1. 5

                I have more stories if you’re interested.

                1. 3

                  yes please. I think it’s extremely interesting to compare with other folks’ experiences.

                  1. 7

                    So, here’s one that I’m primarily guilty for.

                    I wasn’t used to working at a Debian shop, and the existing tooling when I joined was written as Debian packages. That means that to deploy anything (a Go binary e.g. Prometheus, a Python Flask REST server), you’d need to write a Debian package for it, with all the goodness of pbuilder, debhelper, etc.

                    Now, I didn’t like that - and, I won’t pretend that I was instrumental in getting rid of it, but I preferred to deploy things quicker, without needing to learn the ins and outs of Debian packaging. In fact, the worst manifestation of my hubris is in an open source project, where I actually prefer to create an RPM, and then use alien to convert it to a deb, than to natively package a .deb file (https://github.com/sevagh/goat/blob/master/Dockerfile.build#L27) - that’s how much I’ve maneuvered to avoid learning Debian packaging.

                    After writing lots of Ansible deployment scripts for code, binaries, Python Flask apps with virtualenvs, etc., I’ve learned the doomsday warnings of the Debian packaging diehards.

                    1. dpkg -S lets you find out what files belong to a package. Without that, there’s a lot of “hey, who does /etc/stupidshit.yml belong to?” all the time. The “fix” of putting {% managed by ansible %} on top is a start, I guess.
                    2. Debian packages clean up after themselves. You can’t undo an Ansible playbook, you need to write an inverse Playbook. Doing apt-get remove horrendous-diarrhea-thing will remove all of the diarrhea.
                    3. Doing upgrades is much easier. I’ve needed to write lots of duplicated Ansible code to do things like stat: /path/to/binary, command: /path/to/binary --version, register: binary_version, get_url: url/to/new/binary when: {{ binary_version }} < {{ desired_version}}. With a Debian package, you just fucking install it and it does the right thing.

                    The best of both worlds is to write most packages as Debian packages, and then use Ansible with the apt: module to do upgrades, etc. I think I did more harm than good by going too far down the Ansible path.
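
                    A minimal Python sketch of the check that item 3 keeps reimplementing by hand. The function names and version strings here are made up for illustration. The one thing it is meant to show is that versions have to be compared as numbers, because a plain string comparison ranks 2.10.0 below 2.9.1.

                    ```python
                    def parse_version(text):
                        """Turn a dotted version such as '2.10.0' into a tuple of ints."""
                        return tuple(int(part) for part in text.strip().lstrip("v").split("."))


                    def needs_upgrade(installed, desired):
                        """Return True when the installed version is older than the desired one."""
                        return parse_version(installed) < parse_version(desired)


                    # Hypothetical versions, for illustration only.
                    print(needs_upgrade("2.9.1", "2.10.0"))   # numeric comparison: upgrade needed
                    print("2.9.1" < "2.10.0")                 # naive string comparison gets it wrong
                    ```

                    A package manager already carries this logic, which is the point of item 3: the sketch only covers plain dotted numeric versions and would need more work for suffixes such as release candidates.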

                    1. 1

                      Yeah, this is exactly my experience. Creating Debian packages, correctly, is very complicated. Making RPM packages is quite easy as there’s extensive documentation on packaging software written in various languages, from PHP to Go. On Debian there is basically no documentation, except for packaging software written in C that is not more complicated than hello_world.c. And there are 20 ways of doing something; I still don’t know what the “right” way is to build packages that works similarly to e.g. mock on CentOS/Fedora. Aptly seems to work somewhat, but I didn’t manage to get it working on Buster yet… and of course it still doesn’t do “scratch” builds in a clean “mock” environment. All “solutions” for Debian I found so far are extremely complicated, no idea where to start…

                      1. 1

                        FreeBSD’s ports system creates packages via pkg(8), which has a really simple format. I have lost many months of my life maintaining debian packages, and pkg is in most ways superior to .deb. My path to being a freebsd committer was submitting new and updated packages; the acceptance rate and the help in sorting out my contributions were so much more pleasurable than the torturous process that I underwent for debian packages. Obviously everybody’s experience is different, and I’m sure there are those who have been burned by *BSD ports zealots too.

                        Anyway it’s great to see other people who also feel that 50% of sysadmin work could be alleviated by better use of packages & containers. If you’re interested in pkg, https://hackmd.io/@dch/HkwIhv6x7 is notes from a talk I gave a while back.

        2. 1

          I’ve been using the same apps on Ubuntu for years. They occasionally do dumb things with the interface, package manager, etc. Not much to manage, though. Mostly seamless, just using icons, search, and the package manager.

        3. 11

          I think my favorite part is when they start off by complaining about all those newfangled features that Linux has, only to then sing the praises of filesystem-level snapshots in ZFS.

          Second place would be pointing to FreeBSD having one way of doing things (there aren’t enough FreeBSD developers left to maintain two ways) as great design, followed by being irritated that different Linux distributions have adopted one way of doing things via systemd.

          Ultimately it feels like they are bewildered by the ability of the Linux world to maintain multiple distributions, each with different audiences and functionality. They prop up the argument that this somehow means a given user would need to be knowledgeable of and interacting with all distributions at any given time, rather than just picking one that has the qualities they need and following its conventions. Turns out when you have lots of users they come with lots of use cases, that’s where diversity across distributions really shines.

          Also if I never saw another BSD Makefile again it’d be too soon.

          1. 10

            The article isn’t very good, but neither is this counterargument. File system snapshots don’t break existing scripts or workflows. Replacing ‘ifconfig’ with ‘ip’, switching to systemd, and so on does.

            1. 3

              The article isn’t very good, but neither is this counterargument. File system snapshots don’t break existing scripts or workflows.

              Of course they can. If your data sets are not fine-grained enough, then reverting to an older snapshot will result in a loss of data since the snapshot. If you use fine-grained snapshots, you can snapshot the system separately from data (e.g. /var). However, this can have other bad side-effects, e.g. a newer version may update the format of an application’s data files, alter the database schema of an application that you are hosting, etc. If you revert to an older snapshot, applications may stop working, because the data is incompatible with the older application.

              Replacing ‘ifconfig’ with ‘ip’, switching to systemd, and so on does.

              That tired argument again. First of all, iproute2 (of which ip is part) has been around since 2001. It’s almost 20 years old! Secondly, most Linux systems still have net-tools (the package that contains ifconfig). E.g. on my machine:

              % readlink $(which ifconfig)
              /nix/store/3km31zw50hh5madank3ja4dvrq6rgvcl-net-tools-1.60_p20170221182432/bin/ifconfig
              
              1. 1

                Of course they can. If your data sets are not fine-grained enough, then reverting to an older snapshot will result in a loss of data since the snapshot

                If my workflow doesn’t involve snapshots, how are they getting reverted? Are you implying that there’s something that’s automatically reverting things without a change in my behavior?

            2. 2

              Sorry nickbp, you seem to have read far more into my rant than I wrote. I’m not “bewildered by linux” as you put it, I’m just bored by wasting my time (and customer money) on needlessly fiddling with the carpet that we run our business on. I frequently work with small orgs whose entire tech team is under ten people, and they simply don’t want to bother with “the stuff below the app”. zfs & jails are old tech, and even in 2016 were old tech for FreeBSD. At that time, docker and containers outside the lxc world were evolving very very fast. The company had a choice between spending substantial time and effort keeping up to date with docker, or shipping valuable features to customers and the business. If you’re working for a company that has 200 people working on supporting devs, then that’s a totally different proposition.

            3. 5

              author of said post here, there’s a fair bit of context that I had to elide at the time of writing but as I’ve since left that org and it’s subsequently been sold, enough water has passed under the bridge to put a few more cards on the table…BTW lobsters has a rant tag, it should be used for this. So enjoy my use of florid language and gloss over the lack of perfection in the article’s flow.

              I’ll endeavour to answer some of the comments as best as I can below. BTW I’m impressed at how much people have been able to assume from this short article about my general experience and competence. Nobody is perfect, and the older we get the more we appreciate the scale of our ignorance… live a little folks. It’s ok to rant on the internet about tech.

              This post was written in 2016 after the experience of ~ 2 years of continuing severe openssl bugs & CVEs https://www.openssl.org/news/vulnerabilities.html - a massive increase in reported & critical OpenSSL library vulnerabilities - more than 20 each year, and while we could avoid some of them, or mitigate by tweaking ciphers, not all were avoidable and we needed to upgrade both kernel and userland regularly. The dev team was 3 people, with no CI, no code reviews, no test suite, no design docs or ops documentation, and no ethos of care in the dev team. Some people would yolo their patches in and disappear off for a few days, unreachable. Shortly after I started, the remaining original person disappeared, and the founder again went on a 3 month holiday “call me if you need anything”. In short it was a complete mess, and the infrastructure reflected that.

              For some reason debian unstable had been chosen as the base os, apparently “coz security”. This meant it was almost impossible to do reproducible builds, and any software deployed last month would end up with a different set of packages than today. The organisation’s main app was written in perl, and relied on a mix of OS-provided packages, and CPAN & bespoke ones. This meant that every deploy to a fresh system would be a slightly different mix than the last time. Because of OpenSSL’s rate of change, pinning wasn’t appropriate either.
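
              To make the drift concrete, here is a rough Python sketch of comparing the package set from one deploy with the next. The package names and versions are invented for illustration and are not the real lists from that environment; `package_drift` is a made-up helper, not part of any tool.

              ```python
              def package_drift(previous, current):
                  """Compare two {package: version} mappings and report what changed."""
                  added = {name: ver for name, ver in current.items() if name not in previous}
                  removed = {name: ver for name, ver in previous.items() if name not in current}
                  changed = {
                      name: (previous[name], current[name])
                      for name in previous.keys() & current.keys()
                      if previous[name] != current[name]
                  }
                  return {"added": added, "removed": removed, "changed": changed}


              # Hypothetical package lists from two deploys a month apart.
              last_month = {"openssl": "1.0.2g", "perl": "5.22.1", "libfoo": "1.4"}
              today = {"openssl": "1.0.2h", "perl": "5.24.0-rc1", "libbar": "0.9"}

              drift = package_drift(last_month, today)
              for kind, entries in drift.items():
                  print(kind, entries)
              ```

              On a rolling base like unstable, a report like this would rarely come back empty, which is what makes a deploy hard to reproduce.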

              The last straw that broke the camel’s back was another kernel update to pull in a critical OpenSSL fix (I forget which one now), which also pulled in a perl RC, including a bug in the HTTP socket layer that caused any transfer that was exactly 1024 bytes long to hang. I recall that took about 6 hours of serious debugging to get down from the top-level “our app isn’t working sometimes with 3rd party APIs” to the specific change, and the corresponding 1 line fix.
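
              For what it’s worth, a size-specific hang like that is the kind of thing a small boundary test can surface. The Python sketch below is only an illustration under assumptions: it pushes payloads through a local socket pair rather than the perl HTTP layer from the story, and `round_trip` is a made-up helper. Only the 1024 figure comes from the incident.

              ```python
              import socket


              def round_trip(size, timeout=2.0):
                  """Send `size` bytes over a local socket pair and read them all back."""
                  sender, receiver = socket.socketpair()
                  try:
                      receiver.settimeout(timeout)  # a hang becomes a timeout error instead
                      payload = b"x" * size
                      sender.sendall(payload)
                      sender.shutdown(socket.SHUT_WR)  # signal end of data to the reader
                      received = bytearray()
                      while True:
                          chunk = receiver.recv(1024)
                          if not chunk:
                              break
                          received.extend(chunk)
                      return bytes(received) == payload
                  finally:
                      sender.close()
                      receiver.close()


              # Probe the sizes on either side of the boundary from the story.
              for size in (1023, 1024, 1025):
                  print(size, round_trip(size))
              ```

              The timeout is the important design choice: it turns “hangs sometimes” into a failure that shows up in a test run instead of in production.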

              The obvious thing in this case is to simply roll back to the previous app version, and figure this out in the morning, but there was no simple way of doing that in the current environment. At the time it was decided to push on through and try to track down the bug (the yolo hero culture), when in hindsight we should have backed the truck up and examined our poor life choices, including not having sufficient test suites.

              This, more or less, was the mindset that I wrote the rant in. IIRC the previous year included moving to the debian-unstable-with-systemd changes, which was a complete balls-up. I would have gladly stayed at that point and switched to jessie, and waited for the dust to settle, but that wasn’t my call.

              2016 was an interesting time - docker was bringing lxc to the masses, and looked interesting but was evolving rapidly – too rapidly to rely on in a small dev team, and container performance was noticeably sucky at that time. For contrast, zfs & jails in FreeBSD had been available for a long time, long enough to be very stable. Boot Environments and snapshots were a killer feature, and IMO still are.

              i’ve commented where it helps below, if you feel the need to troll, please show us your l33t p30p1e sk1llz.

              1. 5

                I have been a huge fan of the OSX project called homebrew. […] [FreeBSD ports] is based on standard BSD makefiles, which while not as nice as homebrew’s ruby-based DSL, are very powerful.

                I find it hard to swallow an article telling me that my infrastructure should be boring, and then praising a Makefile remake in Ruby as “nice”. It’s too bad, because I do agree with the gist of the article.

                1. 2

                  Are you well versed in homebrew or is “makefile remake in ruby” your take from the outside?

                  1. 1

                    I noticed the author’s comparison between BSD Makefiles and “homebrew’s ruby-based DSL”, which I assume means that Homebrew uses its own Ruby-based DSL instead of Makefiles. I’d be interested to know if that is not correct, and what the author actually meant.

                    1. 1

                      DSLs are subjective. Instead of you assuming what the author meant you could take a look at the homebrew DSL and make your own judgement. For example, here’s the erlang one:

                      I know which one I prefer working with, but I also know which one I would rely on for reproducible builds.

                      1. 1

                        I’m not commenting on Homebrew’s DSL vs Makefile. I’m not commenting on the DSL at all. I’m commenting on Ruby vs Makefile. Ruby is big and unwieldy. Makefiles are boring. I think I’d rather have a boring dependency for a build system.

                        If you use Ruby (or Python for that matter) as a dependency for your build system, your operating system must ship with this scripting language. It may need to ship with a specific version of that scripting language. Special care must be taken when there is need for another version of the scripting language in user land.

                        Which reiterates my point: Makefiles are boring build dependencies. Ruby, albeit practical or “nice”, is not boring. Use boring build dependencies for build systems.

                      2. 1

                        Brew does indeed have a ruby eDSL as the descriptive language for its packages. This encapsulates how to fetch the software, how to configure it, what patches to apply (if any), what configure flags, what make flags, and a bunch of other metadata things for the integration on the brew side of things. It’s a consistent “interface” to describe a brew package, whilst the software itself might use make, cmake, automake, autoconf, ninja, or who knows what else.

                        It would be possible to do all of that in Make, I imagine; just like we don’t really need automake/autoconf, or cmake, or the wrapping around makefiles in distribution packaging (RPM spec files, Debian’s rules file, itself, normally a makefile, etc.) — possible, but possibly herculean effort required and/or quixotic to attempt.

                        I once submitted a patch to a brew package for which I was an upstream maintainer. I changed a sha1-commit so they built from a different point in my upstream git repo; I think I removed a patch that was fixed or applied since whoever had added my package to brew had done so. I was able to basically type “brew edit ” and then something similar to submit my change. I was really impressed with the experience.

                  2. 2

                    I dislike these articles. The outcome is completely predictable: the BSD folks will argue how terrible Linux is, how Linux developers break things all the time. systemd, Wayland, etc. will be brought up. The Linux folks will argue how the BSDs are stuck in the stone age, how Linux won, and that the BSDs barely have users/contributors anymore. People get entrenched in their viewpoints even more than before.

                    The world is not black and white. Yes, your infrastructure should be predictable (boring). Newer technology, on the other hand, can also increase productivity or make systems more robust. Find a balance between being a Luddite and ‘move fast and break things’.

                    1. 1

                      This has got to be PR, right? This guy completely glosses over the existence of Docker.

                      1. 3

                        We don’t need docker, we’ve had jails since 1999

                        1. 3

                          There is no Docker for FreeBSD.

                          1. 2

                            It’s in progress, as far as I know: https://reviews.freebsd.org/D21570

                            1. 2

                              I meant like, he’s complaining about how you can’t use Linux to have reproducible environments, and he completely glosses over Docker

                              1. 1

                                Was Docker already widely-used in 2016? (Genuine question)

                                1. 2

                                  I don’t have as good of an eye on the space as others, but from what I remember, it started picking up steam in 2014-2015

                            2. 2

                              Early in 2016 somebody tried out kubernetes, got a long way on having it up & running, but couldn’t keep up with the maintenance burden vs moving the environment onto it. At the time docker was a too-fast-moving target for a 3 person dev team to pick up the burden of maintaining this. Now, it would be a different kettle of fish, with mainstream stable distros providing this. LXC would have been an option at the time, though.

                              Don’t get me wrong, I’m violently in favour of containers, but violently against FLOSS using a commercially controlled service as the main way of building and consuming software. Feel free to enlighten me if Docker’s moved on now and isn’t beholden to the VC behemoths for a sweet exit & cash windfall?

                              1. 1

                                To be clear, I have never used Docker (other than messing around with it for an hour a few years back). I have no place to say if it’s good software or not.

                                I just find it fishy that Docker didn’t get a mention at all.