You can turn on and off the bug with the BUGMODE constant. It took me about an hour to write, partially because I misinterpreted the requirement that the queuer assigns to workers, workers don’t pick events themselves. Uses a few tricks I don’t think are that accessible to a beginner, though. Spec’s in pluscal and I didn’t clean it up at all.
I spent a large part of my career working with message queues and fixing other people’s software using them. As I was reading I began thinking of all the common ways I’ve seen software using them fail, make mistakes, or have incorrect assumptions: duplicate messages, truncated messages, bad encoding/wrong code page, “poison” messages of various kinds (often the preceding), XA/two-phase (or more!) commit, monotonicity, clock skew, priority, selectors, multiple readers, multiple writers, incorrect durability. I don’t miss debugging it but I enjoyed reading your gist :)
I wrote a TLA+ model that finds the same bug: https://gist.github.com/hwayne/a5ca39d062235cd1b3bf410c96b155f0
You can turn on and off the bug with the BUGMODE constant. It took me about an hour to write, partially because I misinterpreted the requirement that the queuer assigns to workers, workers don’t pick events themselves. Uses a few tricks I don’t think are that accessible to a beginner, though. Spec’s in pluscal and I didn’t clean it up at all.
I spent a large part of my career working with message queues and fixing other people’s software using them. As I was reading I began thinking of all the common ways I’ve seen software using them fail, make mistakes, or have incorrect assumptions: duplicate messages, truncated messages, bad encoding/wrong code page, “poison” messages of various kinds (often the preceding), XA/two-phase (or more!) commit, monotonicity, clock skew, priority, selectors, multiple readers, multiple writers, incorrect durability. I don’t miss debugging it but I enjoyed reading your gist :)
Seems like a good addition to https://github.com/danluu/debugging-stories if you want to send a PR.