1. 12
    1. 3

      I wrote a TLA+ model that finds the same bug: https://gist.github.com/hwayne/a5ca39d062235cd1b3bf410c96b155f0

      You can turn on and off the bug with the BUGMODE constant. It took me about an hour to write, partially because I misinterpreted the requirement that the queuer assigns to workers, workers don’t pick events themselves. Uses a few tricks I don’t think are that accessible to a beginner, though. Spec’s in pluscal and I didn’t clean it up at all.

      1. 3

        I spent a large part of my career working with message queues and fixing other people’s software using them. As I was reading I began thinking of all the common ways I’ve seen software using them fail, make mistakes, or have incorrect assumptions: duplicate messages, truncated messages, bad encoding/wrong code page, “poison” messages of various kinds (often the preceding), XA/two-phase (or more!) commit, monotonicity, clock skew, priority, selectors, multiple readers, multiple writers, incorrect durability. I don’t miss debugging it but I enjoyed reading your gist :)

        1. 2

          Seems like a good addition to https://github.com/danluu/debugging-stories if you want to send a PR.