1. 42
  1.  

  2. 13

    I absolutely love this blog post. Not only does it address a longstanding false dichotomy, but does so in a very interesting and entertaining way, and each step along the way was pretty clear even to someone without experience in the domains in question.

    Kudos!

    1. 3

      Could a language like Coq be used for similar purposes than TLA while being able to extract useful programs?

      1. 1

        Yes. If your specification language of choice (and/or a framework built on top of it) can do synthesis and code generation, it’d be able to derive correct implementations of your spec.

      2. 2

        The non-syncing of spec code with implementation code really feels like the big barrier to making this usable in general.

        One idea I had to tackle this issue in a language like Python would be to allow for executable doc-strings within the code that could let you write specs inline, and have those be parsed out (but by default it would use the actual in-code implementation)

        That way you could write simplifying specs for certain parts of the code (say, the result of input will be any string instead of waiting on stdin when checking), while still avoiding duplication because most code is straightforward

        Though to be honest this might be very hard to get right. I feel like it’s a bit like the ORM/Type System issue, where type systems are usually rigid and don’t give much “type-check-time” flexibility, but ORMs are usually defined dynamically (relative to the type system)

        1. 6

          This is why I ended up spending less time with TLA after learning it. However, learning it was an incredibly useful exercise that has dramatically informed the way I build systems. It made me start to ask why I can’t write TLA style invariants and check executions of concurrent and distributed algorithms I build in general purpose languages.

          I realized I actually can get similar results on real code if I build systems carefully: schedule multithreaded interleavings at cross-thread communication points, simulate distributed clusters with buggy networks in a single process at accelerated speed a la discrete event simulation, things that use files are communicating with future instances of themselves and you can record logs of file operations and arbitrarily truncate them and ensure invariants hold after restart.

          my main project right now is trying to make the above ideas into nice libraries that let people run their code in more realistic ways before opening pull requests, and integrating those tools into the construction process of the sled database.

          1. 3

            An idle thought which can go on my list of side projects to start “one day” (probably right after the bus accident): probably symbolic execution can be used to demonstrate, if not enforce, the synchronisation of TLA+-type models with code. A symbolic executor can show the different cases a program will execute based on its input and the outputs that result; those can be compared with the cases discovered by the model-checking tool.

            Hooray, I’m not the first person to have that idea! You can combine formal methods with symbolic execution and meet in the middle.

            1. 2

              One idea I had to tackle this issue in a language like Python would be to allow for executable doc-strings within the code that could let you write specs inline, and have those be parsed out (but by default it would use the actual in-code implementation)

              While this example was pretty close to the code implementation, TLA+ (and most specification languages) are too flexible to allow easy embedding. Here the processes were actual threads, but they could just as easily be servers, or human agents, or an abstracted day/night cycle. In one spec I wrote, one process represented two separate interacting systems that, at that level of detail, were assumed to have perfect communication.

            2. 1

              Great article!

              I’m a bit late to the party because I wanted to try this one myself by just eyeballing and refactoring.

              Java concurrency

              The first thing that confused me was the concurrency model. The methods are supposed to be atomic because of synchonized but wait obviously make it non-atomic (the operation is broken between before the wait call and after). Then what does notify do? Does it pause the thread that called notify? Or is the woken up thread paused but next to execute?

              Looking at the first few Google results didn’t get me a clear answer to this.

              I assumed the former and this could make a lot of take thread wake each other up, block on notify and then all continue pushing occupied into the negatives and return invalid values.

              I think your spec assumes the latter.

              Concurrency and testing
              if random.randint(0, 10000) == 1234:
                  cause_error
              

              would also be hard to test, need many runs and still have a chance of missing the error. This feels like a missing feature of the testing tool. For any non-deterministic part, the test should be allowed to pick the “random” values. In this case, the test should be allowed to pick which thread notify wakes up. And which thread runs next if there is a choice.

              This is not saying anything about the difficulty in implementing such a testing tool but it also seems necessary if there’s going to be non-deterministic parts.

              1. 3

                Then what does notify do? Does it pause the thread that called notify? Or is the woken up thread paused but next to execute?

                I assumed the former and this could make a lot of take thread wake each other up, block on notify and then all continue pushing occupied into the negatives and return invalid values.

                I think your spec assumes the latter.

                To my understanding it’s neither: notify wakes up a thread but does not execute it, nor does it guarantee it’ll be executed next. It just adds the woken thread into the scheduled pool.

              2. 0

                positively beautiful use of Formal Methods, but the story about XP doesn’t ring true: if someone comes to you with Java code that uses notify(), you fire them on the spot and rewrite everything they ever touched. :P

                1. 1

                  But how fast is the turnover? Was that common knowledge at the time? What else have we yet to realize?

                  1. 1

                    the recommendation was intended as more of a sarcastic commentary on the pitfalls of some Java language features. whilst I appreciate the beauty of a tool to find bugs in such a rough problem domain, I think the wise architect advises to avoid such risks at a whole other level of design.

                    1. 1

                      The knowledge not to use notify() seems kind of arbitrary to me—I’ve certainly never heard of it—and I’m not sure what level of design would be required to have that realization. But I think that’s an important principle in general.