1. 27
  1.  

  2. 7

    Hey my research is kind of relevant here: Iris is a great logic for reasoning about lock free algorithms if you want to try playing around with them. See for instance Clausen’s lock-free hashtable or the lock-free concurrent stack with helping that I helped verify :)

    1. 5

      I highly recommend this paper on futex-based mutex implementation (using atomic instructions) from Ulrich Drepper, also known for his paper on computer memory if you are serious about this kind of things.

      1. 3

        Would you be willing to link a TLA+ model of a lock-free algorithm? I think it’d be really cool to see how an “obviously correct” system has a subtle bug.

        1. 3

          I actually set out to write an article about correctness tooling, but then I realized I had 4 blog posts worth of stuff, so I split away this (probably already too long) part giving some background on lock-free algorithms! I linked to your TLA+ manual in the section about responsible construction, but I agree that it could be nice to demonstrate an obviously correct system that TLA+ shows to have a subtle bug. Do you have a particular subtly broken yet simple algorithm in mind? I worry that this is already too long, but it could be nice to demonstrate just how subtle some bugs can be.

          1. 3

            Do you have a particular subtly broken yet simple algorithm in mind?

            I’m not familiar with lock free algorithms and after reading your essay I never want to be D:

            If there’s a broken algorithm you know of I’d be happy to write the TLA+ spec, though!

          2. 3

            I did some quick Googling on that. Here’s one for Rust who is working through a large number of algorithms:

            https://github.com/spacejam/tla-rust

            EDIT: Wait, it’s the same guy. I probably got that bookmark from him on Lobsters lol. Well, stay tuned for periodic updates there anyway haha.

            Most of the research uses either custom tools, SPIN, or provers. The top teams recently shifted to augmenting the normal provers to handle this stuff to get more assurance. Obviously, not much useful for the masses. The few I found with model-checking were paywalled and of unknown quality. These non-paywalled papers were representative of a few angles I saw in my search:

            https://www.cs.rice.edu/~sc40/pubs/cav10.pdf

            http://www.mi.fu-berlin.de/inf/groups/ag-tech/intern/19548-V-Model-Checking/Hui_Gao.pdf

            https://www-old.cs.uni-paderborn.de/uploads/tx_sibibtex/article.pdf

            I bring them up as some of the properties or analyses driving them might be ported to TLA+. Note that I threw that last one in there to give you a taste of the SPIN work that I say is often similar to TLA+ in application. They have good, code examples in it on top of discussions about model-checking linearizability for systems with out-of-order execution.

          3. 3

            Small correction: 128-bit CAS is supported on all(?) x64 processors. I used it for a lock-free hash table one time, since 16 character keys should be enough for anybody :).

            1. 1

              Thanks for the correction!

            2. 1

              Programming is hard. Concurrency is easy to screw up. Single thread of execution with callbacks is a nightmare. Mutexes can be sources of deadlock or fail to protect critical regions or both, if not used carefully. Lock free is much more conceptually simple than claimed. The earliest lock free data structures - which are still in wide use - are flip buffers. What you need for flip buffers to work is some relatively weak memory ordering from the processor - so that “transitive visibility of stores – i.e. stores that are causally related appear to execute in an order consistent with the causal relation. “ - or a memory barrier instruction that will do the same thing. Set all the data structures in the buffer to EMPTY. The single producer is the only thread allowed to write FULL and it will only do so if it reads EMPTY. The single consumer thread is the only one that can write EMPTY and it will only do so if it reads FULL. Efficient, simple, and good programming discipline. If you need multiple consumers/producers, probably you’d be better off with a server architecture but one mutex on each end solves the problem too. If you don’t design to avoid contention, your program will run badly.