1. 15

    I recently discovered how horribly complicated traditional init scripts are whilst using Alpine Linux. OpenRC might be modern, but it’s still complicated.

    Runit seems to be the nicest I’ve come across. It asks the question “why do we need to do all of this anyway? What’s the point?”

    It rejects the idea of forking and instead requires everything to run in the foreground:

    /etc/sv/nginx/run:

    #!/bin/sh
    exec nginx -g 'daemon off;'
    

    /etc/sv/smbd/run:

    #!/bin/sh
    mkdir -p /run/samba
    exec smbd -F -S
    

    /etc/sv/murmur/run:

    #!/bin/sh
    exec murmurd -ini /etc/murmur.ini -fg 2>&1
    

    Waiting for other services to load first does not require special features in the init system itself. Instead you can write the dependency directly into the service file in the form of a “start this service” request:

    /etc/sv/cron/run:

    #!/bin/sh
    sv start socklog-unix || exit 1
    exec cron -f
    

    Where my implementation of runit (Void Linux) seems to fall flat on its face is logging. I hoped it would do something nice like redirect stdout and stderr of these supervised processes by default. Instead you manually have to create a new file and folder for each service that explicitly runs its own copy of the logger. Annoying. I hope I’ve been missing something.
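
    For what it’s worth, the per-service setup is small, just repetitive. A minimal sketch of a log service (the /var/log/nginx directory is my assumption and must already exist):

    /etc/sv/nginx/log/run:

    #!/bin/sh
    # runsv pipes the stdout of ../run into this process;
    # add "exec 2>&1" to ../run to also capture stderr.
    # svlogd(8) writes and rotates logs in the given directory.
    exec svlogd -tt /var/log/nginx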

    The only other feature I can think of is “reloading” a service, which Aker does in the article via this line:

    ExecReload=kill -HUP $MAINPID

    I’d make the argument that in all circumstances where you need this you could probably run the command yourself. Thoughts?

    1. 6

      Where my implementation of runit (Void Linux) seems to fall flat on its face is logging. I hoped it would do something nice like redirect stdout and stderr of these supervised processes by default. Instead you manually have to create a new file and folder for each service that explicitly runs its own copy of the logger. Annoying. I hope I’ve been missing something.

      The logging mechanism works this way to be stable: logs are only lost if both runsv and the log service die. Another thing about separate logging services is that stdout/stderr are not necessarily tagged; adding all this stuff to runsv would just bloat it.

      There is definitely room for improvement, as logger(1) has been broken for some time in the way Void uses it at the moment (you can blame systemd for that). My idea for simplifying logging services by centralizing how logging is done can be found here: https://github.com/voidlinux/void-runit/pull/65. For me, the ability to exec svlogd(8) from vlogger(8) to get a more lossless logging mechanism is more important than the main functionality of replacing logger(1).

      1. 1

        Ooh, thank you, having a look :)

      2. 6

        Instead you can write the dependency directly into the service file in the form of a “start this service” request

        But that solves neither starting daemons in parallel, nor starting them at all if they are run in the ‘wrong’ order. Depending on the network being set up, for example, brings complexity into each of those shell scripts.

        I’m of the opinion that a DSL of whitelisted items (systemd) is much nicer to handle than writing shell scripts, along with standardized commands instead of having to know which services accept ‘reload’ vs ‘restart’ or some other variation in commands. Those kinds of niceties are gone when each shell script is its own interface.

        1. 6

          The runit/daemontools philosophy is to just keep trying until something finally runs. So if the order is wrong, presumably the service dies when a service it depends on is not running, in which case it’ll just get restarted. So eventually things progress towards a functioning state. IMO, given that a service needs to handle the services it depends on crashing at any time anyway to ensure correct behaviour, I don’t feel there is significant value in encoding this in an init system. A dependency could also be moved to another machine, where this encoding would not work anyway.
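
          For example, a run script can simply bail out until its dependency answers, and runsv will re-run it after its pause (a hypothetical sketch; the service names are made up):

          #!/bin/sh
          # exit immediately if the database service isn't up yet;
          # runsv will retry this script after its one-second pause
          sv check postgresql || exit 1
          exec webapp --foreground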

          1. 3

            It’s the same philosophy as network-level dependencies. A web app that depends on a mail service for some operations is not going to shut down or wait to boot if the mail service is down. Each dependency should have tunable retry logic, usually with an exponential backoff.

          2. 4

            But that solves neither starting daemons in parallel, nor starting them at all if they are run in the ‘wrong’ order.

            That was my initial thought, but it turns out the opposite is true. The services are retried until they work. And things are definitely parallelized: these scripts exec into the daemon and never return, so there is no way to run them in a linear (non-parallel) order.

            Ignoring the theory: Void’s runit provides the second-fastest init boot I’ve ever had. The only thing that beats it is a custom init I wrote, but that was very hardware- (ARM Chromebook) and user-specific.

          3. 5

            Dependency resolution at the daemon-manager level is very important so that it can kill/restart dependent services.

            runit and s6 also don’t support cgroups, which can be very useful.

            1. 5

              Dependency resolution at the daemon-manager level is very important so that it can kill/restart dependent services

              Why? The runit/daemontools philosophy is just to try to keep something running forever, so if something dies, just restart it. If one restarts a service, then either those that depend on it will die or they will handle it fine and continue with their lives.

              1. 4

                either those that depend on it will die or they will handle it fine

                If they die, and are configured to restart, they will keep bouncing up and down while the dependency is down? I think having dependency resolution is definitely better than that. Restart the dependency, then the dependent.

                1. 4

                  Yes they will. But what’s wrong with that?

                  1. 2

                    Wasted cycles, wasted time, not nearly as clean?

                    1. 10

                      It’s a computer, it’s meant to do dumb things over and over again. And presumably that faulty component will be fixed pretty quickly anyways, right?

                      1. 5

                        It’s a computer, it’s meant to do dumb things over and over again

                        I would rather have my computer do less dumb things over and over personally.

                        And presumably that faulty component will be fixed pretty quickly anyways, right?

                        Maybe; it depends on what went wrong precisely, how easy it is to fix, etc. We’re not necessarily just talking about standard daemons - plenty of places run their own custom services (web apps, microservices, whatever). The dependency tree can be complicated. Ideally once something is fixed everything that depends on it can restart immediately, rather than waiting for the next automatic attempt which could (with the exponential backoff that proponents typically propose) take quite a while. And personally I’d rather have my logs show only a single failure rather than several for one incident.

                        But, there are merits to having a super-simple system too, I can see that. It depends on your needs and preferences. I think both ways of handling things are valid; I prefer dependency management, but I’m not a fan of Systemd.

                        1. 4

                          I would rather have my computer do less dumb things over and over personally.

                          Why, though? What’s the technical argument? daemontools (and I assume runit) sleep 1 second between retries, which for a computer is basically equivalent to being entirely idle. It seems to me that a lot of people just get a bad feeling about running something that will immediately crash.

                          Maybe; it depends on what went wrong precisely, how easy it is to fix, etc. We’re not necessarily just talking about standard daemons - plenty of places run their own custom services (web apps, microservices, whatever).

                          What’s the distinction here? Also, with microservices the dependency graph in the init system almost certainly doesn’t represent the dependency graph of the microservice as it’s likely talking to services on other machines.

                          I think both ways of handling things are valid

                          Yeah, I cannot provide an objective argument as to why one should prefer one to the other. I do think this is a nice little example of the slow creep of complexity in systems. Adding a pinch of dependency management here because it feels right, and a teaspoon of plugin system there because we want things to be extensible, and a deciliter of proxies everywhere because of microservices. I think it’s worth taking a moment every now and again to step back and consider where we want to spend our complexity budget. I, personally, don’t want to spend it on the init system, so I like the simple approach here (especially since with microservices the init dependency graph doesn’t reflect the reality of the service anymore). But as you point out, positions may vary.

                          1. 2

                            Why, though? What’s the technical argument

                            Unnecessary wakeup, power use (especially for a laptop), noise in the logs from restarts that were always bound to fail, unnecessary delay before restart when restart actually does become possible. None of these arguments are particularly strong, but they’re not completely invalid either.

                            We’re not necessarily just talking about standard daemons …

                            What’s the distinction here?

                            I was trying to point out that we shouldn’t make too many generalisations about how services might behave when they have a dependency missing, nor assume that it is always ok just to let them fail (edit:) or that they will be easy to fix. There could be exceptions.

                        2. 2

                          Perhaps wandering off topic, but this is a good way to trigger even worse cascade failures.

                          e.g., an RSS reader that falls back to polling every second if it gets something other than a 200. I retire a URL, and now a million clients start pounding my server with a flood of traffic.

                          There are a number of local services (time, dns) which probably make some noise upon startup. It may not annoy you to have one computer misbehave, but the recipient of that noise may disagree.

                          In short, dumb systems are irresponsible.

                          1. 2

                            But what is someone supposed to do? I cannot force a million people using my RSS tool not to retry every second on failure. This is just the reality of running services. Not to mention all the other issues that come up with not being in a controlled environment and running something loose on the internet such as being DDoS’d.

                            1. 2

                              I think you are responsible if you are the one who puts the dumb loop in your code. If end users do something dumb, then that’s on them, but especially, especially, for failure cases where the user may not know or observe what happens until it’s too late, do not ship dangerous defaults. Most users will not change them.

                              1. 1

                                In this case we’re talking about init systems like daemontools and runit. I’m having trouble connecting what you’re saying to that.

                        3. 2

                          If those thing bother you, why run Linux at all? :P

                      2. 2

                        N.B. bouncing up and down ~= polling. Polling always intrinsically seems inferior to event based systems, but in practice much of your computer runs on polling perfectly fine and doesn’t eat your CPU. Example: USB keyboards and mice.

                        1. 2

                          USB keyboard/mouse polling doesn’t eat CPU because it isn’t done by the CPU. IIUC the USB controller generates an interrupt when data is received. I feel like this analogy isn’t a good one (regardless). Checking a USB device for a few bytes of data is nothing like (for example) starting a Java VM to host a web service which takes some time to read its config and load its caches only to then fall over because some dependency isn’t running.

                        2. 1

                          Sleep 1 and restart is the default. It is possible to get different behavior by adding a ./finish script next to the ./run script.
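
                          A sketch of what that can look like (the five-second back-off is an arbitrary choice of mine):

                          #!/bin/sh
                          # ./finish runs after ./run exits; $1 is ./run's exit code
                          # (-1 if it was killed by a signal). On failure, back off
                          # longer than runsv's default one-second pause.
                          [ "$1" != 0 ] && sleep 5
                          exit 0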

                      3. 2

                        I really like runit on Void. I do like the simplicity of systemd unit files from a package manager perspective, but I don’t like how systemd tries to do everything (consolekit/logind, mounting, xinetd, etc.).

                        I wish it just did services and dependencies. Then it’d be easier to write other systemd implementations, with better tooling (I’m not a fan of systemctl or journalctl’s interfaces).

                        1. 1

                          You might like my own dinit (https://github.com/davmac314/dinit). It somewhat aims for that - handle services and dependencies, leave everything else to the pre-existing toolchain. It’s not quite finished but it’s becoming quite usable and I’ve been booting my system with it for some time now.

                      4. 4

                        I’d make the argument that in all circumstances where you need this you could probably run the command yourself. Thoughts?

                        It’s nice to be able to reload a well-written service without having to look up what mechanism it offers, if any.

                        1. 5

                          Runit’s sv(8) has the reload command, which sends SIGHUP by default. The default behavior (for each control command) can be changed in runit by creating a small script under $service_name/control/$control_code.

                          https://man.voidlinux.eu/runsv#CUSTOMIZE_CONTROL
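
                          For example (a hypothetical sketch), to make ‘sv reload nginx’ run nginx’s own reload logic instead of a bare SIGHUP:

                          /etc/sv/nginx/control/h:

                          #!/bin/sh
                          # runsv runs this instead of sending SIGHUP for sv hup/reload;
                          # exiting 0 tells runsv not to send the signal itself
                          exec nginx -s reload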

                          1. 1

                            I was thinking of the difference between ‘restart’ and ‘reload’.

                            Reload is only useful when:

                            • You can’t afford to lose a few seconds of service uptime (OR the service is ridiculously slow to load)
                            • AND the daemon supports an on-line reload functionality.

                            I have not been in environments where this is necessary, restart has always done me well. I assume that the primary use cases are high-uptime webservers and databases.

                            My thoughts were along the lines of: if you’re running a high-uptime service, you probably don’t mind the extra effort of writing ‘killall -HUP nginx’ rather than ‘systemctl reload nginx’. In fact I’d prefer to do that rather than take the risk of the init system re-interpreting a reload to be something else, like reloading other services too, and bringing down my uptime.

                          2. 3

                            I hoped it would do something nice like redirect stdout and stderr of these supervised processes by default. Instead you manually have to create a new file and folder for each service that explicitly runs its own copy of the logger. Annoying. I hope I’ve been missing something.

                            I used to use something like logexec for that, to “wrap” the program inside the runit script and send output to syslog. I agree it would be nice if it were built in.
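
                            Something in this direction works as a rough sketch (‘mydaemon’ is a placeholder; note that with the pipeline, runsv supervises the shell rather than the daemon, so signals from sv won’t reach the daemon directly):

                            #!/bin/sh
                            # forward the daemon's stdout/stderr to syslog via logger(1)
                            mydaemon --foreground 2>&1 | logger -t mydaemon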

                          1. 1

                            I used to use pass, but I never liked gpg nor the leaking of metadata. So I wrote a simple symmetric file encryption tool (based on Monocypher, with Argon2 password hashing), and I now store my passwords in a single file where even lines are sites and odd lines are the matching passwords. I keep this password file in my (public) dotfiles for convenience too.
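
                            With that layout, lookup can be a one-liner (a hypothetical sketch, assuming the decrypted file arrives on stdin and each site line is immediately followed by its password line):

                            #!/bin/sh
                            # print the line that follows the matching site name
                            awk -v site="$1" 'hit { print; exit } $0 == site { hit = 1 }'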

                            1. 1

                              What a pile of ugly hacks. The only real problem I see is in sharing the clipboard to/from ssh sessions; the rest is just self-inflicted by using programs with poor design. I could understand why a text editor would keep its own clipboard, but I don’t see a good reason why tmux and zsh need to have one as well.

                              Also the clipboard attack (and defense) seems overblown. Either you trust the source and you can paste code happily, or you don’t and then you better be extremely careful about what you’re executing from them. Bracketed paste is just not going to cut it, it’s trivial to hide malicious commands in a shell script.

                              1. 7

                                What a pile of ugly hacks.

                                Isn’t this modern computing in a nutshell? ;)

                                The “poor design” is a historical artifact of using terminals. tmux and zsh have their own because there is no guarantee they are going to be used in an integrated environment: they may be that integrated environment. (Consider: running a machine without a GUI at all.)

                              1. 13

                                Some of the ‘alternatives’ are a bit more iffy than others. For any service that you don’t have the source to or can’t self-host (Telegram, ProtonMail, DuckDuckGo, Mega, macOS, Siri, to name a few), you’re essentially trusting them to uphold their privacy policy and to respect your data (now, but also hopefully in the future).

                                And in some cases it seems to me that it’s little more than fancy marketing capitalizing on privacy-conscious users.

                                1. 18

                                  Telegram group messages aren’t even e2e encrypted, Telegram has access to full message content. The only thing Telegram is good at is marketing, because they’ve somehow convinced people they’re a secure messenger.

                                  1. 6

                                    To be fair, they at least had the following going for them:

                                    • no need to use a phone client, as compared to WhatsApp, which deletes your account if you access it with an unofficial client. You can just buy a pay-as-you-go SIM card and receive your PIN with a normal cell phone
                                    • they had an option for e2e encrypted chats, with self-deleting messages (there was this whole fuss with the creator offering a million dollars (?) if anyone could find a loophole)
                                    • their clients were open source, and anyone could implement their API

                                    Maybe there was more, but these were the arguments I could think of on the spot. I agree that it isn’t enough, but it’s not like their claim was unsubstantiated. It just so happened that other services started adopting some of Telegram’s features, making them lose their edge over the competition.

                                    1. 4

                                      Also the client UX is pretty solid imho. Bells and whistles are not too intrusive, and stuff works as you’d expect.

                                      Regarding its security: the FAQ discusses which security model they offer in each chat mode.

                                    2. 6

                                      I’m much less worried about the source code than I am the incentives of the organization behind the software. YMMV, of course.

                                      1. 2

                                        Even if you have source code, it’s difficult to verify a service or piece of software (binary) matches that source code.

                                        1. 2

                                          Yes, but then if anything feels wrong, it becomes possible to find an alternative provider for the same software.

                                          Still… Hard to beat the privacy of a hard drive at home accessed through SFTP.

                                        2. 2

                                          I was checking email SaaS providers last weekend, as the privacy policy changes at my current provider urge me not to renew my subscription when it ends. I found mostly the same offers, and to be honest none of them seemed convincing to me.

                                          For example, the Tutanota offer seemed questionable: they keep me so “secure” that the email account can only be accessed by their own client; no free/open protocol is available. Only their mail client can be used, and they use a proprietary encryption scheme for my own benefit… OK, it is open sourced, but come on: I cannot export my data in a meaningful way to change providers. And what kind of encryption scheme is it? RSA-2048+AES, not the GPG/PGP “standards”, hosted in Germany, pretty much a surveillance state… This makes their claims questionable at least.

                                        1. 4

                                          I know this thread is already filled with alternative workflows for this particular task, but this is one of the things I love the most about elvish. With ctrl-l you can bring up an fzf-like ‘location’ menu that contains the directories you’ve been to, sorted by score (similar to firefox’s frecency, which I love). The best part is that it’s essentially ‘zero-conf’, since you just have to use the shell to build the results, and in my experience it works very well.

                                          Some will say that this is outside the scope of a shell, but it’s hard to reach this level of integration by combining external tools.

                                          1. 2

                                            Elvish is my favorite shell at the moment, and its surprisingly efficient directory navigation is only one of the reasons. A weighted directory history is kept automatically by the shell, so over time, your most-used directories float to the top and are easily accessible in location mode. In this sense it’s not too different from AutoJump, but because it’s visual, you can see your selection live as you type. These days, it doesn’t take more than Alt-L (I have remapped from the default Ctrl-L that @nomto mentions) and a couple of keystrokes to select the directory I want. It works great.