1. 10
    1. 3

      A similar idea of testing based on the notion of refinement (I am one of the authors):

      https://arxiv.org/pdf/1703.05317.pdf

      1. 1

        I’ll definitely give this a read, thanks for sharing.

        It seems like testing for refinement has a couple of nice properties for practical application: (1) it’s easier to generate tests if refinement is your correctness statement, and (2) testing for refinement is much easier than proving it.
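
        Roughly, a toy version of what I mean by both points (all names here are made up for illustration): the correctness statement is just “every implementation step, viewed through an abstraction function, lands in the state the model reaches,” which is easy to drive with randomly generated actions.

        ```python
        import random

        # Toy spec: a set of ids. Toy implementation: a list that must not hold duplicates.

        class AbstractStore:                 # the model / spec
            def __init__(self):
                self.items = set()
            def add(self, x):
                self.items.add(x)
            def remove(self, x):
                self.items.discard(x)

        class ListStore:                     # the handwritten implementation under test
            def __init__(self):
                self.items = []
            def add(self, x):
                if x not in self.items:      # dropping this check would be a refinement bug
                    self.items.append(x)
            def remove(self, x):
                if x in self.items:
                    self.items.remove(x)

        def abstraction(impl):
            """Refinement mapping: project the implementation state onto the model state."""
            return set(impl.items)

        def test_refinement(steps=1000, seed=0):
            rng = random.Random(seed)
            model, impl = AbstractStore(), ListStore()
            for _ in range(steps):
                action = rng.choice(["add", "remove"])
                x = rng.randrange(5)
                getattr(model, action)(x)    # take the same step at both levels...
                getattr(impl, action)(x)
                # ...and check the implementation still maps onto the model's state
                assert abstraction(impl) == model.items

        test_refinement()
        ```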

        1. 1
          1. That is an interesting thought. I would love to hear your idea. I do not have much experience generating tests for refinement. The paper used generic test cases to expose typical bugs in the design.
          2. It can be harder to prove; it requires a global invariant. The upside is that there is only one property to prove.
          1. 1

            For #1 (generating a refinement test), my idea there is model transformation: you write the abstract spec and use the structure of the model to generate a refinement test. Because it’s a test and performance matters, you’ll likely want to generate a test similar to the one in this post, i.e. using refinement mapping / simulation rather than actually checking for behavior sequences.
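
            A rough sketch of what that generation could look like (purely hypothetical names, much cruder than a real transformation language): the available actions are read off the structure of the spec, and a step-simulation test is generated from them, rather than a comparison of recorded behavior sequences.

            ```python
            import inspect
            import random

            # Hypothetical sketch of deriving a refinement test from the model's structure.
            # generate_refinement_test, model_state, gen_arg, etc. are made-up names.

            def generate_refinement_test(model_cls, impl_cls, abstraction, model_state,
                                         gen_arg, steps=500, seed=0):
                # "model transformation" in miniature: the action set is read off the spec
                actions = [name for name, fn in inspect.getmembers(model_cls, inspect.isfunction)
                           if not name.startswith("_")]

                def test():
                    rng = random.Random(seed)
                    model, impl = model_cls(), impl_cls()
                    for _ in range(steps):
                        name = rng.choice(actions)   # an action declared by the spec
                        arg = gen_arg(rng)           # data setup, also derivable from the spec
                        getattr(model, name)(arg)
                        getattr(impl, name)(arg)
                        # simulation check via the refinement mapping, not a full trace comparison
                        assert abstraction(impl) == model_state(model), name
                return test

            # e.g. with the toy set/list stores from the sketch further up the thread:
            # run = generate_refinement_test(AbstractStore, ListStore,
            #                                abstraction=lambda i: set(i.items),
            #                                model_state=lambda m: m.items,
            #                                gen_arg=lambda rng: rng.randrange(5))
            # run()
            ```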

            I’m playing around with that idea in a model transformation language based on metaprogramming. Here’s a simple spec for some CRUD-like behavior. And here’s a “transformation script” which generates the necessary data setup boilerplate for each action. This isn’t a complete example yet, but it’s pretty close to generating the test in this post.

            There’s also Cogent, which is a language with a certifying compiler, where each compiler run generates a proof that the generated C code refines the more abstract semantics of the source language.

            I initially wanted to make a similar certifying compiler, but since I’m targeting web applications, it’s a lot more difficult to build a general-purpose compiler for that (though I am still giving it a go as a long shot). It’s a lot easier to focus on generating only the test suite from a model and leaving the implementation handwritten.

            1. 1

              That is an interesting viewpoint and use of “compiler” to transform a test from the high-level semantics to the low-level semantics. Am I correct that, to begin with, the test is generated for a property written in the abstract model?

              1. 1

                Well, what I’ve been trying to get across is more like the following:

                • We write a spec / model in the high-level semantics
                • We hand-write an implementation
                • A refinement test that compares the model and the implementation is generated from the model

                It sounds like you might be talking about checking properties about the model. That’s definitely possible, and a workflow I think makes a lot of sense, since if refinement holds, then properties about the model are also true of the implementation. But I haven’t done anything to aid that workflow yet; I’m just focusing on getting to the point where generating a refinement test has good UX.
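
                As a tiny illustration of that last point (made-up toy code again, not my actual tooling): the property is stated and checked only at the model level, and the refinement assertion is what carries it over to the handwritten implementation.

                ```python
                import random

                # Property: "the store never holds more than 5 items." It is enforced and
                # checked on the model only; the refinement check transfers it to the impl.

                class Model:
                    def __init__(self):
                        self.items = set()
                    def add(self, x):
                        if len(self.items) < 5:      # the spec enforces the bound
                            self.items.add(x)
                    def remove(self, x):
                        self.items.discard(x)

                class Impl:
                    def __init__(self):
                        self.items = []
                    def add(self, x):
                        if x not in self.items and len(self.items) < 5:
                            self.items.append(x)
                    def remove(self, x):
                        if x in self.items:
                            self.items.remove(x)

                def test(steps=1000, seed=0):
                    rng = random.Random(seed)
                    m, i = Model(), Impl()
                    for _ in range(steps):
                        a, x = rng.choice(["add", "remove"]), rng.randrange(10)
                        getattr(m, a)(x)
                        getattr(i, a)(x)
                        assert len(m.items) <= 5        # property checked on the model only
                        assert set(i.items) == m.items  # refinement carries it to the impl

                test()
                ```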

    2. 2

      Great article. How would you say this compares with TLA/TLA+? Are you basically walking through the steps from first principles that lead to something like TLA?

      1. 3

        I don’t have the confidence to create a logic like TLA+, but I’ve been hugely impacted by learning it. My actual goal is to have a generative testing workflow that I can use at work and recommend to other people.

        One big issue with formal methods is that it is very hard to scale to the implementation level. Leslie Lamport even says this all the time; paraphrasing, but something like: in practice it would be very rare to prove a refinement all the way down to the implementation level. seL4 did do this, but on a ~8K LOC codebase, and it took them about 3 years and multiple new PhDs. My codebase at work is ~500K LOC.

        I think a happy medium is “lightweight formal methods,” where we can borrow ideas from amazing tools like TLA+, but apply them to executable code via generative testing. Amazon has done this on a component in S3 for example. I think there’s an opportunity to “connect” code like this to a TLA+ spec too, similar to how Cogent code can be embedded in Isabelle/HOL for proof.
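
        To make “generative testing” concrete, here’s roughly the shape I have in mind, sketched with Hypothesis’s stateful testing (the key-value store is a made-up stand-in, not how the S3 work was actually structured):

        ```python
        from hypothesis import strategies as st
        from hypothesis.stateful import RuleBasedStateMachine, invariant, rule

        class KVImpl:
            """Stand-in for the handwritten implementation under test."""
            def __init__(self):
                self._data = {}
            def put(self, k, v):
                self._data[k] = v
            def delete(self, k):
                self._data.pop(k, None)

        class KVRefinement(RuleBasedStateMachine):
            def __init__(self):
                super().__init__()
                self.model = {}          # the abstract spec: just a dict
                self.impl = KVImpl()     # the executable code under test

            @rule(k=st.integers(0, 9), v=st.text())
            def put(self, k, v):
                self.model[k] = v
                self.impl.put(k, v)

            @rule(k=st.integers(0, 9))
            def delete(self, k):
                self.model.pop(k, None)
                self.impl.delete(k)

            @invariant()
            def refines(self):
                # trivial refinement mapping: the impl's dict projects onto the model
                assert self.impl._data == self.model

        # Hypothesis generates and shrinks action sequences; run via pytest/unittest:
        TestKVRefinement = KVRefinement.TestCase
        ```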
