Separation logic approaches like Dafny, my favorites, are both short and simple as they are quite declarative.

Obviously, this is just one problem and many factors might make it hard to extract general conclusions. Dafny has been used to verify IronFleet, which is a pretty large and complicated distributed system. There are similar success stories for Isabelle (seL4), Coq (CompCert) and F* (Everest).

In practice, are constructive tools simpler for verifying software?

In the general case, no. In a specific case, it depends. There are certainly things that dependent types make much simpler. There are also things that it does not. Check out this dependently typed quicksort in Idris vs. a proof of quicksort in Isabelle. The Idris version captures the correctness criteria in a dependent-type, whereas the Isabelle version just implements a simple version of quicksort and proves it separately. The Isabelle version is much simpler.

Dependent types are a “niche” feature with a couple of convenient uses: …
In all other cases, I believe it’s just more effective to write plain ML/Haskell-style data structures and functions, and separately state and prove properties about them.

This is generally how I feel, but I am not at the forefront of math academia. What I am most definitely opposed to, though, is the religion around constructivism, and that it’s the only acceptable way to reason about programs. For example, take TLA+ which is based purely on untyped set theory. It’s been used to verify many real-world systems, and its utility and effectiveness is pretty inarguable at this point. It is based on a firmly classical logic, and doesn’t present many practical problems. In fact, it makes TLA+ easy to pick up, easy to use, and easy to get results with. Quickly.

I dare you to try and prove something about a distributed algorithm in Coq.

Constructive vs. classical logics will unfortunately always be a flame war, like Emacs vs. Vim, or static vs. dynamic types. They each really do have benefits, and at the end of the day it just boils down to which tool you know better sometimes. I currently know Isabelle better, and I very much appreciate its simplicity over something like Coq. I have proven things in Isabelle, and never thought to myself “this proof isn’t as valid, because it’s not constructive.”

I’m not a Coq hater, just reiterating that. I was commenting on the steepness of the learning curve of Coq, meaning if you don’t have any experience in theorem proving or verification, how long will it take you to get productive in Coq?

There are also things that it does not. Check out this dependently typed quicksort in Idris vs. a proof of quicksort in Isabelle. The Idris version captures the correctness criteria in a dependent-type, whereas the Isabelle version just implements a simple version of quicksort and proves it separately. The Isabelle version is much simpler.

I don’t think that’s proof a dependent-typed proof of quicksort is necessarily more complex than the Isabelle proof. Notice how the Idris proof proves element-equivalence by literally encoding the properties of a permutation, while the Isabelle proof instead proves it via multisets. IME the Isabelle approach is much smarter— maybe if the Idris approach would be much simpler if it used that too.

For example, take TLA+ which is based purely on untyped set theory. It’s been used to verify many real-world systems, and its utility and effectiveness is pretty inarguable at this point. It is based on a firmly classical logic, and doesn’t present many practical problems. In fact, it makes TLA+ easy to pick up, easy to use, and easy to get results with. Quickly.

Industrial programmers don’t really use TLA+ to “prove” things in the same way you do Isabelle. TLA+ “proves” things by model checking a finite state space. So you can use it to prove that quicksort works on lists up length N, with numbers in the set a..b, but not that all finite lists with any range of numbers will quicksort properly. You can do this with TLAPS, but it’s much harder and more in line with the difficulty of theorem provers. Apalache might also be able to do it, but I don’t know for certain. I think it’s still limited to a given list length bound, but can prove quicksort for any such list with that bound (a..b = ℤ).

I don’t think that’s proof a dependent-typed proof of quicksort is necessarily more complex than the Isabelle proof.

They’re definitely different approaches, so not really apples to apples, that’s fair. But the dependent-type machinery itself is what I feel is more complicated. If I were to sum up my feeling on dependent types it would be: “just because types are propositions doesn’t mean that all of your propositions should be types.”

It just seems easier to make a separate proposition about your program vs. encoding all of the propositions as types within your program. This may just be a syntactic thing today? Who knows.

Industrial programmers don’t really use TLA+ to “prove” things in the same way you do Isabelle.

It’s not so much about proving for me as much as formal reasoning. And TLA+ is amazing for formal reasoning! Which is where I agree with the post - treating constructivism as a goal in itself doesn’t make sense, because we have so many non-constructive tools out there that we get real results with. Those results shouldn’t be disqualified.

For context, Paulson did foundational work in ML and invented the Isabelle theorem prover.

In practice, are constructive tools simpler for verifying software?

In @hwayne cool benchmark https://github.com/hwayne/lets-prove-leftpad, dependent-type based systems (Lean, Agda, F* and Idris) look quite compact. Coq looks a bit more verbose, though.

On the other hand, a classical approach like Isabelle looks pretty massive: https://github.com/hwayne/lets-prove-leftpad/blob/master/isabelle/Leftpad.thy.

Separation logic approaches like Dafny, my favorites, are both short and simple as they are quite declarative.

Obviously, this is just one problem and many factors might make it hard to extract general conclusions. Dafny has been used to verify IronFleet, which is a pretty large and complicated distributed system. There are similar success stories for Isabelle (seL4), Coq (CompCert) and F* (Everest).

In the general case, no. In a specific case, it depends. There are certainly things that dependent types make much simpler. There are also things that it does not. Check out this dependently typed quicksort in Idris vs. a proof of quicksort in Isabelle. The Idris version captures the correctness criteria in a dependent-type, whereas the Isabelle version just implements a simple version of quicksort and proves it separately. The Isabelle version is much simpler.

In Programming with Dependent Types: Passing Fad or Useful Tool?, Xavier Leroy has this to say:

This is generally how I feel, but I am not at the forefront of math academia. What I am most definitely opposed to, though, is the religion around constructivism, and that it’s the only acceptable way to reason about programs. For example, take TLA+ which is based purely on untyped set theory. It’s been used to verify many real-world systems, and its utility and effectiveness is pretty inarguable at this point. It is based on a firmly classical logic, and doesn’t present many practical problems. In fact, it makes TLA+ easy to pick up, easy to use, and easy to get results with. Quickly.

I dare you to try and prove something about a distributed algorithm in Coq.

Constructive vs. classical logics will unfortunately always be a flame war, like Emacs vs. Vim, or static vs. dynamic types. They each really do have benefits, and at the end of the day it just boils down to which tool you know better sometimes. I currently know Isabelle better, and I very much appreciate its simplicity over something like Coq. I have proven things in Isabelle, and never thought to myself “this proof isn’t as valid, because it’s not constructive.”

You get the proof, and you move on.

Isn’t this, like, the point of Aneris and Iris? https://github.com/logsem/aneris

I’m not a Coq hater, just reiterating that. I was commenting on the steepness of the learning curve of Coq, meaning if you don’t have any experience in theorem proving or verification, how long will it take you to get productive in Coq?

Huh, I thought Iris had LEM, but turns out it doesn’t (footnote 2). That’s pretty neat!

I don’t think that’s proof a dependent-typed proof of quicksort is necessarily more complex than the Isabelle proof. Notice how the Idris proof proves element-equivalence by literally encoding the properties of a permutation, while the Isabelle proof instead proves it via multisets. IME the Isabelle approach is much smarter— maybe if the Idris approach would be much simpler if it used that too.

Industrial programmers don’t really use TLA+ to “prove” things in the same way you do Isabelle. TLA+ “proves” things by model checking a finite state space. So you can use it to prove that quicksort works on lists up length N, with numbers in the set

`a..b`

, but not thatallfinite lists withanyrange of numbers will quicksort properly. You can do this with TLAPS, but it’s much harder and more in line with the difficulty of theorem provers. Apalache might also be able to do it, but I don’t know for certain. Ithinkit’s still limited to a given list length bound, but can prove quicksort for any such list with that bound (`a..b = ℤ`

).They’re definitely different approaches, so not really apples to apples, that’s fair. But the dependent-type machinery itself is what I feel is more complicated. If I were to sum up my feeling on dependent types it would be: “just because types

arepropositions doesn’t mean that all of your propositions should be types.”It just seems easier to make a separate proposition about your program vs. encoding all of the propositions as types within your program. This may just be a syntactic thing today? Who knows.

It’s not so much about proving for me as much as formal reasoning. And TLA+ is amazing for formal reasoning! Which is where I agree with the post - treating constructivism as a goal in itself doesn’t make sense, because we have so many non-constructive tools out there that we get real results with. Those results shouldn’t be disqualified.