1. 14
    1. 4

      You can actually prevent deadlocks in multi-threaded code by writing code that keeps a directed graph of lock acquisitions. During acquisition, if a potential cycle is detected, you throw an error. You can optionally only do this in DEBUG mode.

      In the case in the article, both being on the main thread and trying to switch to it would be a node in the DAG, even if it’s not actually an explicit mutex lock in the code. The Lock(main) -> Lock(cache) nodes and edge get created. Then when the other thread does a Lock(cache) operation, it cannot Lock(main) without causing an error.

      The only trick would be enforcing that all operations that implicitly lock main consult the DAG code.

      1. 3

        That’s indeed a simple, yet powerful mechanism for detecting deadlocks. This is basically how TSAN (thread sanitizer) works for C++ and it doesn’t even require writing any “special” code.

        Unfortunately doing this analysis statically is generally impossible, so it can only work in runtime under real workload. So it doesn’t really “prevent” deadlocks, but rather helps to detect and debug them (as it can provide detailed information about which threads took which locks).

        1. 2

          Yes, it’s definitely good to call that out. What I described is more like asserting a constraint at runtime to try and catch a problem that might otherwise be silently swallowed.

          Although, it actually can be used to prevent some deadlock (if not all), through one nuance I didn’t mention: you persist the DAG. You never remove nodes. When the cache is unlocked, you do NOT remove the node from the graph.

          Con: you introduce a constraint on your code that mutexes are only locked in a certain order. This contraint might be unnecessary sometimes.

          Pro: if you Thread 5 locks A and then B, and then unlocks both, and Thread 7 locks B and then A, your code will detect this and throw. Even though Thread 5 released B so there is no potential for deadlock this time, you might have a set of circumstances in the future where Thread 5 and 7 try to acquire their locks at the same time causing the deadlock, but only under extremely rare conditions that you don’t normally hit during development. Deadlock averted.

      2. 3

        Linux kernel has an implementation of this called lockdep.

        1. 4

          FreeBSD also has one (which predates the Linux one by a couple of decades) called WITNESS. It’s on by default in pre-release builds, which is why FreeBSD developers get so annoyed about Phoeonix articles comparing the performance of FreeBSD betas to other systems.