1. 39
    1. 12

      This post only shows that specifications written in temporal logic don’t compose well. Fair enough.

      However, this is far from the only kind of specification that could be used for formal methods. In particular, nowadays some form of concurrent separation logic seems to be the way to go; specifically, Iris https://iris-project.org/, a project which, among other achievements, has several papers at POPL each year. The main advancements made by Iris (using commutative monoids to represent resources, which are great for representing specs because they compose so well, and invariants to represent shared state) seem like something that could be applied to the example in the post with great success, producing a composable specification (assuming, possibly, the presence of a shared clock that drives the blinkers. EDIT: looking at the post more closely, it seems that it is okay if the blinkers blink at unpredictable rates? If so, there is no need for a shared clock at all, and everything simplifies a lot).

      The way this could be done is by modelling a blinker in two parts: first, an opaque blob that says “this is a blinker with such-and-such properties”, and second, a unique, non-shareable resource, one instance of which exists for each instance of a blinker; this non-shareable resource is parameterized by the blinker’s current state. This “state” is not necessarily the state of the variables the blinker uses; it can be any data structure at all (for example, when modelling some concurrent linearizable queue, the state could be just the list of its elements, regardless of the inner workings). Then the opaque blobs, which contain the heavy logic behind the blinkers, never need to be inspected in depth, and to compose several blinkers, one just has to describe the relations between their user-visible states.
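      As a toy illustration of this two-part scheme (a Python sketch with hypothetical names; a real Iris spec would be written in Coq, and the “resource” would be a separation-logic assertion, not an object): the implementation is an opaque object whose internals the composed spec never inspects, and composition only relates the user-visible states.

```python
# Toy Python model of the two-part scheme (hypothetical names; a real
# Iris spec lives in Coq and uses separation-logic resources).
#
# Part 1: the "opaque blob" -- a blinker whose internals are hidden.
# Part 2: a user-visible state that specs are allowed to mention.

class Blinker:
    """Opaque implementation: composed specs may only read `.visible`."""

    def __init__(self):
        self._internal_counter = 0   # hidden implementation detail

    def step(self):
        self._internal_counter += 1  # inner workings stay private
        return self.visible

    @property
    def visible(self):
        # The user-visible state: just on/off, not the counter.
        return self._internal_counter % 2 == 1

def composed_invariant(a: Blinker, b: Blinker) -> bool:
    """Compose two blinkers by relating only their visible states,
    e.g. "they are never both on" (an arbitrary example relation)."""
    return not (a.visible and b.visible)

# The composed spec never inspects _internal_counter:
a, b = Blinker(), Blinker()
assert composed_invariant(a, b)   # both off initially
a.step()
assert composed_invariant(a, b)   # a on, b off -- relation still holds
```

      The point of the sketch is only the shape: the heavy logic stays behind `step`, and composition is stated purely over `visible`.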

      If you want to know more, be sure to check the website!

      Disclaimer: I spent at least half a year doing formal proofs of a complex concurrent data structure in Iris, though I’m not affiliated with the Iris project.

      1. 3

        I’m reading one of the papers now, but my outside-view prior is caution. Not because of anything I see in the logic, but simply because Iris came out in 2014 and there don’t yet seem to be many examples of composing independent “real-world” Iris specs. The blinker is just an instructive example to showcase frame problems. Even if a specification language makes that specific case easy, it could still run into problems composing larger, dependent specifications.

        1. 5

          I have some perspective on why Iris hasn’t been widely used.

          • The paper on Iris was published in 2015. However, that was only the logic itself, with no tooling to accompany it. The paper introducing the Coq library for working with Iris (“Iris Proof Mode”) was published in 2017. It is unlikely that anyone aside from professional logicians would sit down in the meantime to draw circles, squares, and triangles (the symbols Iris uses for its various modalities) with pen and paper.
          • If I recall correctly (I haven’t had extensive hands-on experience with TLA+, so I could be mistaken), TLA+ allows writing arbitrary specifications, but nothing is there to check if they are sensible and relate to code in any way. This makes writing specs easy, but at the expense of any certainty that the code fits the spec. The scope of Iris is much broader, however. You could just write some specs, compose them, and hope that the code fulfills them, but that is not how Iris is marketed. Instead, it is positioned as a tool for proving the correctness of mind-boggling concurrent data structures. And let me tell you, this is hard! Each non-trivial line of code (that is, each line that affects the global state of the system) takes anywhere from two hours to several days to prove. In the half-year I worked with Iris, I proved a single lock-free data structure; it consisted of multiple interacting components, but that is still a long time. I could have just written the specs for each component and been done with it (this is in fact what I did in the first week or so), manually ensuring that the specs made sense, but then Iris would not have been helping much: the challenge in our case was not the specs but the interleavings of lock-free code.
          • Iris itself is great; I never had a problem with it. However, the buggy and obscure mess that is Coq was the bane of my existence. Besides having to learn Coq (which, let’s just say, few people do) and having to learn the Iris framework, one must also be prepared to learn what to do when Coq refuses to find a type class instance, fails to unify X with X, or loses the ability to automatically reason about arithmetic in the middle of a proof, so that one has to prove things like a + (b + c) = (a + b) + c manually. This is a huge barrier to entry, so it’s no surprise that few people use Coq with a framework as complex as Iris. Unfortunately, I don’t know of any other language that is more production-ready than Coq and still able to host Iris.

          To answer your doubts about Iris scaling well: in my experience, the specs turn out exactly as intrusive or as opaque as they need to be to convey what a component does. I was proving real, production code, and I was able to split my proof along the same boundaries as the original code, providing a separate specification for each class, without any changes to the code to accommodate the proofs, and then composing those specifications to prove the correctness of the encompassing classes.

          So, I think the slow adoption of Iris is a technological problem and not some fundamental issue with composing specifications.

          1. 1

            Just wanted to echo this:

            “…TLA+ allows writing arbitrary specifications, but nothing is there to check if they are sensible and relate to code in any way…”

            I looked for something like this (for Java, C++, or JavaScript code bases at least), but could not find anything. I think this is a significant impediment to wider adoption of TLA+.

            1. 4

              I really need to write about why it is this way and some of the stuff the community is doing to change that. tl;dr refinement is a hard problem

              (I agree it’s an adoption barrier, but it’s not the most “efficient” adoption barrier rn.)

    2. 4

      Some (interesting!) comments here push back on the blog post from the angle of “it’s not a problem with specs in general, it’s a problem with LTL”. I don’t disagree, but I think another interesting angle is this: the form of composition that you want (in the last part of the post) is concurrency with shared mutable state, which is known to be very hard. (And it’s not really about specifications: implementing such systems is also very hard.) In this light, the suggestion to try Iris, or separation logic in general, makes a lot of sense to me.

    3. 3

      I’m not sure I got a one-sentence answer to the question. It’s because we’re composing subtheories according to the composition rules of the full theory, right? In the examples, the full theory assumes a sort of continuous time variable, but the subtheories are cut-out events isolated from that continuous time; the composition of those cut-outs also needs to compose the ad-hoc isolation that specifies each subtheory.
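      A toy sketch of that “ad-hoc isolation” (a hypothetical Python encoding, not the post’s actual specs): if each cut-out subtheory was written as if its step were the only thing happening, conjoining two such cut-outs composes that implicit isolation too, and the blinkers are forced into lockstep.

```python
# Toy illustration (hypothetical encoding): each sub-spec is cut out of
# continuous time, written as if its own step is the only event.
# Specs are predicates over (pre-state, post-state) pairs.

def blinker_a_step(pre, post):
    # A's isolated spec: A's light toggles on A's step.
    return post["a"] != pre["a"]

def blinker_b_step(pre, post):
    # B's isolated spec, written the same way.
    return post["b"] != pre["b"]

def naive_composition(pre, post):
    # Conjoining the cut-outs demands that BOTH toggle on every step:
    # the composed system can only ever blink in lockstep.
    return blinker_a_step(pre, post) and blinker_b_step(pre, post)

# A step where only A blinks is fine for A alone, but the naive
# composition rejects it:
pre    = {"a": False, "b": False}
only_a = {"a": True,  "b": False}
assert blinker_a_step(pre, only_a)
assert not naive_composition(pre, only_a)
```

      This is only a caricature of the frame problem from the post, but it shows how the isolation baked into each cut-out survives, and distorts, the composition.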

    4. 3

      The issue with the two blinker example looks more like a limitation of this particular specification language (TLA+?) than a fundamental issue with formal specifications in general. While implementation hiding doesn’t necessarily work (or even make sense) with formal methods, I see no reason that the two blinker example couldn’t “compose” using linear logic as the specification language. Of course there’s new math needed in the two blinker case, e.g. update reordering. More is, after all, different. But I’m pretty sure that new math should generalize to larger numbers of blinkers.

      1. 2

        The issue with the two blinker example looks more like a limitation of this particular specification language (TLA+?)

        LTL! TLA+ adds a lot of extra machinery to make composing and refining specs easier that I didn’t want to get into in this demo.

        I see no reason that the two blinker example couldn’t “compose” using linear logic as the specification language.

        Please share an example!

        But I’m pretty sure that new math should be generalizable to larger numbers of blinkers.

        That’s a limitation of how I explained this: I chose blinkers because it’s a very simple spec that showcases the frame problem and dependent specs well. Most specifications aren’t anywhere near this simple. Math that handles a blinker spec might be overfitted and completely break down when you instead have multiple readers and writers with a shared queue.

    5. 1

      I found this very enlightening as to some of the difficulties of composing different specifications. I’ve often leaned on specification approaches that model processes and communication. I would imagine that if you defined the blinkers as independent processes (à la PlusCal), the composition would be closer to a developer’s intuition.

      1. 3

        There’s a tricky subtlety here. PlusCal is a DSL that compiles to TLA+. If you look at the compiled TLA+, you’ll see that the different processes are represented with a global pc state that tracks the current label of each. At no point do we have two specifications; we have just one specification with two subcomponents.

        The equivalent “composition” here would be putting two processes in separate files, compiling both to TLA+, and then writing a third spec that instantiates both files and composes A!Spec and B!Spec.
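        A rough analogy in Python (hypothetical and much simplified; real PlusCal compiles to TLA+, not Python) for what that flattening looks like: the compiled artifact is a single next-state relation over one state that carries a pc entry per process, with the processes interleaved inside it.

```python
# Rough Python analogy (not real PlusCal/TLA+): both processes are
# flattened into ONE state machine whose state carries a `pc` entry
# per process -- there is never a second, separate spec.

def next_state(state):
    """Nondeterministic next-state relation of the flattened spec:
    yields every successor in which exactly one process takes a step."""
    for proc in ("p1", "p2"):
        if state["pc"][proc] == "blink":
            succ = {
                "lights": dict(state["lights"]),
                "pc": dict(state["pc"]),
            }
            succ["lights"][proc] = not succ["lights"][proc]
            # A blinker loops forever at the same label.
            yield succ

init = {"lights": {"p1": False, "p2": False},
        "pc": {"p1": "blink", "p2": "blink"}}

# One global spec, two interleaved subcomponents: from the initial
# state, either process may move next.
successors = list(next_state(init))
assert len(successors) == 2
```

        The pc bookkeeping is exactly the part a user never writes by hand in PlusCal, which is why the result feels like composed processes while actually being one monolithic spec.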

        1. 1

          That makes perfect sense. I guess what I’m trying to say is that PlusCal is designed to represent processes and the communication between them, and that composes well, without the difficulties described, because it avoids the problem of composing specifications in the first place. The DSL could be used to compose a larger spec via careful concatenation, though there isn’t really tooling to support that, so it’s not a real solution to the problem. Still, I find it interesting that re-framing the concept in an existing DSL may provide a route to composition that matches a larger set of developers’ mental models.