This is interesting. What occurs to me is that the author seems to massively underestimate the expressiveness of the technical English used in the last proof. That last proof really directs the reader to understand and combine two other proofs. Similarly, “without loss of generality” is a cut operator asserting that substituting any of the other available terms will achieve the same result.

The other thing is that the chessboard example is not obvious to me at all. It seems there’s a disconnect here between what computers and the uninitiated can do (be led through these things step by painful step) and what a mathematician does when they perceive something that could be proven but haven’t yet proven it in a form that a computer or mathematical idiot (such as myself) cannot contradict.

Each tile must cover one white and one blue square, but there are 32 white squares and only 30 blue squares.
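The counting argument can be checked mechanically. Here’s a quick Python sketch (the coordinate-parity coloring is an assumption, but any proper 2-coloring of the board behaves the same way):

```python
# 8x8 board with two opposite corners removed (the mutilated chessboard).
board = {(r, c) for r in range(8) for c in range(8)} - {(0, 0), (7, 7)}

# Opposite corners share a parity, so removing them unbalances the colors.
blue = sum((r + c) % 2 == 0 for r, c in board)   # the removed corners' color
white = len(board) - blue
print(white, blue)  # 32 white, 30 blue: a domino always covers one of each
```

Since every domino covers exactly one square of each color, 31 dominoes would need 31 squares of each, which the counts rule out.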

It’s not actually that obvious, though, if you don’t know the trick. The reason Paulson called it obvious is that the Mutilated Chessboard is part of math lore, so most mathematicians would have encountered it before.

When you put it that way, it seems like the mutilated chessboard is an instance of a pigeonhole problem! Such problems are interesting because they are connected to NP. Quoting Aaronson (p48, commenting on Theorem 46):

Haken’s proof formalized the intuition that any resolution proof of the Pigeonhole Principle will ultimately be stuck “reasoning locally”: “let’s see, if I put this pigeon there, and that one there … darn, it still doesn’t work!” Such a proof has no ability to engage in higher-level reasoning about the total number of pigeons.

In a certain sense, the formalized proofs shown in the article are dependent upon some sort of global counting (the number of chessboard tiles with each color, the number of partitions of a set, etc.) and maybe this is a systemic weakness with the chosen approach to formalization.

It is similar to pigeonhole problems, in particular in terms of the underlying combinatorics.

Pigeonhole problems are known to be very difficult for SAT solvers. Intuitively, this is because, despite the symmetry of the problem, a SAT solver has to rule out every possible arrangement before it can conclude that the problem is unsatisfiable. There are preprocessing tricks (“symmetry breaking”) that make the problem easier. However, recent advances in SAT solving put even the mutilated chessboard, unprocessed, within reach.
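To make the SAT framing concrete, here’s a small Python sketch generating the standard CNF encoding of the pigeonhole principle (n+1 pigeons, n holes) and confirming unsatisfiability by brute force. The variable numbering is an arbitrary choice, and a real solver would replace the exhaustive loop:

```python
from itertools import combinations, product

def php_clauses(n):
    """CNF for the pigeonhole principle with n + 1 pigeons and n holes.
    Variable var(p, h) is true iff pigeon p sits in hole h."""
    var = lambda p, h: p * n + h + 1
    clauses = []
    for p in range(n + 1):                      # every pigeon gets a hole
        clauses.append([var(p, h) for h in range(n)])
    for h in range(n):                          # no hole gets two pigeons
        for p1, p2 in combinations(range(n + 1), 2):
            clauses.append([-var(p1, h), -var(p2, h)])
    return clauses

def brute_force_sat(clauses, nvars):
    """Exhaustively try all assignments -- stands in for a SAT solver."""
    for bits in product([False, True], repeat=nvars):
        assign = dict(enumerate(bits, start=1))
        if all(any(assign[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

print(brute_force_sat(php_clauses(2), 6))  # False: 3 pigeons don't fit in 2 holes
```

Resolution-based reasoning on these clauses is exactly the “local” case analysis Aaronson describes; the global count of pigeons never appears in the encoding.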

I tend to agree with the author. I have tried many times to prove even simple arithmetic facts in Coq or Agda, and it is really difficult. The calculus proof in the post is frightening for someone who is not an expert in Isabelle.

One benefit of Isabelle is that you can write “structured” proofs, i.e. what they’re doing here by creating intermediate states with `obtain` and `have`, etc. So what’s mostly going on in that proof is math, not Isabelle. The Isabelle parts are the proof methods, e.g. `by (metis add_diff_cancel_right' dist_diff(1))`, which, to be fair, are pretty cryptic even when you know what they are. Also to be fair, the intermediate propositions being shown here wouldn’t be necessary on pen and paper, where a lot is glossed over.
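For anyone who hasn’t seen the Isar style, here’s a minimal hypothetical example of what a structured proof looks like (a toy lemma, not from the article): intermediate facts are stated in mathematical notation, and only the `by ...` justifications are Isabelle-specific:

```isabelle
lemma
  fixes n :: nat
  assumes "0 < n"
  shows "∃m. n = Suc m"
proof -
  obtain m where "n = Suc m"
    using assms by (cases n) auto
  then show ?thesis ..
qed
```

The skeleton (`proof`, `obtain`, `show`, `qed`) mirrors how one would narrate the argument on paper; the cryptic part is confined to the method calls.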

Formalized proof is a whole different beast, that’s for sure.

Proof assistants can definitely be frustrating for this reason. Sometimes really complicated things can be proven opaquely in one line, while seemingly simple things require far more effort. An example of something complicated being proven automatically: Cantor’s theorem.
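For the curious, the Cantor case in Isabelle: the statement that no function from a type to its powerset is surjective can be dispatched by the built-in best-first tableau prover. Something along these lines (adapted from the classic Isabelle examples; the exact form may differ by version):

```isabelle
theorem Cantor: "∄f :: 'a ⇒ 'a set. ∀A. ∃x. f x = A"
  by best
```

One line for a genuinely nontrivial theorem, while routine arithmetic rearrangements can take a dozen.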

They feel pretty fickle in general, but like anything else, you can build an intuition for what they’re doing under the hood over time.

The tripartite triangle problem is really blowing my mind.

Here’s a Wikipedia article summarizing the mutilated chessboard problem: https://en.wikipedia.org/wiki/Mutilated_chessboard_problem
