1. 11

      Neat! You could easily eliminate the Redis dependency by using files.

      The post-commit hook could add a build job by creating e.g. jobs/pending/UUID.txt with the job spec. Each build runner would watch that directory for changes, and claim the job with the earliest last-modified time by moving it to e.g. jobs/running/UUID.txt. It would run the job spec, and finish the job by moving the file to jobs/{success,failed}/UUID.txt.
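
A minimal Python sketch of the move-to-claim idea above, assuming the whole jobs/ tree lives on one POSIX filesystem so that rename is atomic. The function names (`enqueue`, `claim_oldest`, `finish`) and the temp-file naming are made up for illustration and are not from the original post.

```python
import os
from pathlib import Path

STATES = ("pending", "running", "success", "failed")


def enqueue(root: Path, job_id: str, spec: str) -> Path:
    """Create jobs/pending/<job_id>.txt without exposing a half-written spec."""
    for state in STATES:
        (root / state).mkdir(parents=True, exist_ok=True)
    tmp = root / f".{job_id}.tmp"  # hypothetical temp name, same volume as the tree
    tmp.write_text(spec)
    dest = root / "pending" / f"{job_id}.txt"
    os.rename(tmp, dest)  # atomic within a single filesystem
    return dest


def _mtime(path: Path) -> float:
    """Last-modified time, or +inf if another runner already moved the file."""
    try:
        return path.stat().st_mtime
    except FileNotFoundError:
        return float("inf")


def claim_oldest(root: Path):
    """Move the oldest pending job to running/; return its new path or None."""
    for job in sorted((root / "pending").glob("*.txt"), key=_mtime):
        target = root / "running" / job.name
        try:
            os.rename(job, target)  # only one runner can win this rename
        except FileNotFoundError:
            continue  # lost the race, try the next job
        return target
    return None


def finish(root: Path, running: Path, ok: bool) -> Path:
    """Move a running job file to success/ or failed/."""
    dest = root / ("success" if ok else "failed") / running.name
    os.rename(running, dest)
    return dest
```

A runner would loop on `claim_oldest(Path("jobs"))`, run the spec it gets back, and call `finish` with the exit status. Watching the directory for changes (e.g. via inotify) is left out here; polling is the simplest stand-in.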

      If you’re reasonably careful, this would support arbitrarily many build runners. Build status would be find jobs/ -name UUID.txt. A job in the pending directory has a queue wait time equal to the delta between its last-modified time and the current time. You’d get easy job output streaming: each build runner would pipe all of its commands through tee -a UUID.txt, and watchers would tail -F UUID.txt.
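
The status lookup and queue wait time described above could look roughly like this in Python. It is a sketch that assumes the jobs/{pending,running,success,failed}/UUID.txt layout from this comment; `job_status` and `queue_wait_seconds` are hypothetical helper names.

```python
import time
from pathlib import Path


def job_status(root: Path, job_id: str):
    """Rough equivalent of `find jobs/ -name UUID.txt`: the directory is the state."""
    matches = list(root.glob(f"*/{job_id}.txt"))
    return matches[0].parent.name if matches else None


def queue_wait_seconds(root: Path, job_id: str, now=None) -> float:
    """Seconds a pending job has waited, based on its last-modified time."""
    path = root / "pending" / f"{job_id}.txt"
    current = time.time() if now is None else now
    return current - path.stat().st_mtime
```

The `now` parameter exists only to make the calculation easy to test; normal callers would leave it out.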

      Simple job success/fail notifications could be implemented via a trap in the job spec. More sophisticated notifications could be implemented by watching the job output stream, and triggering events on e.g. regexp matches.

      A reaper could watch for any jobs in the running directory that hadn’t been modified in a while, and weren’t open by any active process, and move them to the failed directory, or maybe back into the queue.
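
A reaper along these lines might be sketched as follows. This version only checks the last-modified time; the "not open by any active process" check (something like lsof or fuser) is deliberately left out, and `reap_stale` and its parameters are illustrative names rather than anything from the original post.

```python
import os
import time
from pathlib import Path


def reap_stale(root: Path, max_idle_seconds: float, requeue: bool = False, now=None):
    """Move running jobs that have not been modified recently out of running/.

    Stale jobs go to failed/ by default, or back to pending/ if requeue is set.
    Returns the sorted file names that were moved.
    """
    current = time.time() if now is None else now
    dest_dir = root / ("pending" if requeue else "failed")
    reaped = []
    for job in (root / "running").glob("*.txt"):
        try:
            if current - job.stat().st_mtime <= max_idle_seconds:
                continue  # output was appended recently, assume still alive
            os.rename(job, dest_dir / job.name)
        except FileNotFoundError:
            continue  # the job finished (or was reaped) in the meantime
        reaped.append(job.name)
    return sorted(reaped)
```

Because runners append their output to the job file, an active job keeps refreshing its own last-modified time, which is what makes the idle check meaningful.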

      1. 8

        My first thought was “why not use files?” as well. My biggest concern is that file locking and atomicity are kind of hard on Unix systems, and something very few people fully understand in all their nuance (I sure don’t). You need to be very careful with this, especially once you start implementing some of the things in the “Possible further improvements” list.

        All in all, it seems that Redis is simple and lightweight enough, and it eliminates a source of bugs that’s usually confusing and hard to track down. I’m all for avoiding dependencies, but not to the point of making things stupid light.

        1. 7

          What I suggested doesn’t require locking, just atomic mv, so keep the jobs/ tree on a single volume and you’re golden.

          Redis is lightweight compared to other stateful servers, but it’s extremely heavyweight compared to a filesystem, and opens the system up to entirely new classes of risk.

          1. 1

            Off-topic but I appreciated your “stupid light” link. I’m moving to Colorado tomorrow and it helped me think through / validate some gear choices I’m making.

          2. 3

            Another approach I’ve used in a small project with infrequent pushes is to use the batch command to queue the job to run later, when load is low. An upside is built-in notification through the standard mail facility.

            1. 1

              Very nice! Thanks for mentioning it, this totally fits the pattern I was going for.

            2. 3

              In my mind going file-per-job was my first approach, alluded to in “strictly speaking this can be achieved without additional software”. But Redis felt like a very reasonable choice, especially in terms of thinking about scaling this to more than one worker machine or just building on a machine that’s distinct from the git server.

              Your design is clean and straightforward, of course. And the point about build output streaming is very good, as is the point about adding a reaper for stale builds. I guess it’s just a fun problem space to think of, so many interesting things to solve.

              1. 5

                Build farms would be easy: have the runners on the git server run all their job specs via ssh to a random host in the farm. Everything else works transparently.

                It’s always good to avoid dependencies, if you can. Especially runtime dependencies. In this case Redis is serving as a source of truth for key-value data with a few simple access patterns, which is more or less the definition of a filesystem ;)

                1. 5

                  I like your ssh idea, simple and elegant.

                  Instead of plain files, which others have pointed out can have surprising edge cases or race conditions, why not SQLite? After all, SQLite competes with fopen(). And it’s relatively easy to use from shell scripts, much like the redis-cli examples from the original article.

                  1. 6

                    SQLite would be a good choice in terms of taking care of any locking / data race issues and because some sort of results backend is needed anyway.

                    1. 1

                      Why introduce a large dependency if you can avoid it? There are no edge cases or race conditions in the system I’ve described.

                      1. 4

                        You would need to take extreme care to never create a job file outside of the initial push hook, only move and append to an existing file. Accidentally duplicating a job into multiple states would be bad. And guarantee a single writer at all times. Interleaving output from multiple workers would corrupt your state files.

                        Shells don’t make these things easy to accomplish in scripts, not with usual output redirection anyway. Even if you’re not using a shell, but a full featured programming language, you’d always need to be careful about such consistency details whenever interacting with your file tree.

                        SQLite has primary keys, foreign keys, and transactions, i.e. robust tools for managing consistency. It’s hardly a large dependency, and does not require a persistent server.
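
For comparison, here is a rough sketch of the same queue in SQLite using Python's standard sqlite3 module. The table and column names are invented for illustration. The primary key rejects duplicate jobs, and `BEGIN IMMEDIATE` takes the write lock up front so two workers cannot claim the same row.

```python
import sqlite3


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT statements below control the transactions.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS jobs (
            id      TEXT PRIMARY KEY,
            spec    TEXT NOT NULL,
            state   TEXT NOT NULL DEFAULT 'pending'
                    CHECK (state IN ('pending', 'running', 'success', 'failed')),
            created REAL NOT NULL DEFAULT (julianday('now'))
        )
        """
    )
    return conn


def enqueue(conn: sqlite3.Connection, job_id: str, spec: str) -> None:
    # Raises sqlite3.IntegrityError if the job id already exists.
    conn.execute("INSERT INTO jobs (id, spec) VALUES (?, ?)", (job_id, spec))


def claim_oldest(conn: sqlite3.Connection):
    """Atomically mark the oldest pending job as running; return (id, spec) or None."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, spec FROM jobs WHERE state = 'pending' "
            "ORDER BY created, rowid LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("UPDATE jobs SET state = 'running' WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return row
```

With a file path instead of `:memory:`, several worker processes could share the database; each would call `claim_oldest` in a loop and update the row's state when the job ends.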

                        1. 1

                          perhaps managing the job files in their own git repo would address this…

                        2. 2

                          Not really an edge case or race condition, but if your script does the BLPOP and then does not complete (machine dies, script crashes, network glitches, etc.), that job is lost forever. There’s no way for another run to pick it up as incomplete, which there would be with a more persistent store like SQLite.

                          1. 2

                            Couldn’t you catch this by inspecting the contents of jobs/running? It’s not clear to me how SQLite would give you more information than queued/running/completed.

                            I agree SQLite would be a reasonable way to store this information though. It might be more portable than relying on filesystem metadata and would certainly make it easier to run analytics.

                            1. 1

                              My suggestion involves no persistent processes like Redis.

                              1. 1

                                Yeah, I mis-replied.

                        3. 1

                          I once built something similar that was file-based and relied on NFS for distribution. I don’t remember anymore what happened to it. The Kerberos token timeout was ugly, but it worked.

                      2. 3

                        IMO a better architecture would be to use a named pipe: the post-commit hook writes to it, and the job server reads from it and spawns a job in response.

                        1. 1

                          You wouldn’t want the hook to block waiting for a runner; can you make a named pipe with a buffer, so you could write without a reader?
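
On the blocking question: on POSIX systems, opening a FIFO for writing with O_NONBLOCK fails with ENXIO when no reader has it open, so a hook can at least detect "no runner listening" instead of hanging. The data is not buffered for a future reader, though. A small Python sketch, where `try_notify` is a hypothetical helper name:

```python
import errno
import os


def try_notify(fifo_path: str, message: bytes) -> bool:
    """Write message to a FIFO without blocking; return False if nobody is reading."""
    try:
        fd = os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return False  # no process has the FIFO open for reading
        raise
    try:
        os.write(fd, message)
    finally:
        os.close(fd)
    return True
```

If `try_notify` returns False, the hook would need some fallback (for example, dropping a file for the runner to find later), since the message is otherwise lost.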

                          1. 1

                            You could use systemd socket activation with Accept=yes on a TCP socket. This would automatically handle all incoming messages and provide proper isolation primitives (given a properly configured service). Seems like a win-win to me, with just 3 files needed to handle everything (ci.socket, ci.service, and a shell script for running tasks).

                        2. 1

                          Unix fifos would be even cooler here.

                        1. 2

                          Hopefully making some long-planned optimizations to synsh.dev.

                          1. 3

                            Skate, weather permitting, and finish a rewrite of the frontend for synsh.dev, a tool for creating shell pipelines from example inputs and outputs.

                            1. 2

                              Do a kick flip!