1. 3

    Your goal is to take control of how threads are scheduled, which is why the paper is looking to be able to replace all concurrency primitives with their own versions in a seamless way.

    To reproduce this with ptrace, your steps would be to:

    • place a breakpoint at the entry of all concurrency primitives (e.g. pthread lock calls)
    • Start executing one thread. When it hits a breakpoint, resume a random thread.
    • Use the same seed to run the same interleaving of threads again.

    I might suggest looking at the rr project for a full implementation of making an execution deterministic.

    1. 2

      I think I’ll do something very much like what you’re describing for an initial PoC, maybe just using a python lldb script or something to get off the ground fast, and take advantage of the DWARF support. Eventually I’d like to run through the space of all possible interleavings in a more rigorous way:

      • create a generator for all important interleavings, t1 runs to completion then t2 does the same, t1 and t2 alternate in all possible ways across the specified atomic loads and stores, finally t2 runs to completion followed by t1
      • for cases where this will use more computational resources than the exhaustive test is worth, use a similar approach to Déjà Fu for improving diversity

      The author of Déjà Fu just sent me this interesting paper that may be a big performance improvement over the bounded partial-order reduction used for generating relevant interleavings: https://parasol.tamu.edu/~jeff/academic/mcr.pdf

    1. 1

      Looks pretty interesting. Some of the claims (e.g. adding nodes -> increased speed) seem eyebrow raising under paxos, but that’s what independent benchmarks are for, right?

      1. 4

        It reads to me like data is sharded out via hashing, and then updates are paxos’d around. Transactions appear to be OCC. I think the claim is that if you add more servers, your shards get smaller, and you could process more transactions per second on smaller shards.

        That said, I would have loved to have seen a paragraph or two that actually talked in more detail about the design and what tradeoffs were taken.