1. 6
    1. 6

      I will admit to being leery of SMT proofs in my own work because of how “magic” they feel. With interactive theorem provers you can at least step through the proof and even break it down into further subproofs if you’d like, so you only have to trust the solver on each individual step. If an SMT solver just returns unsat, though, there isn’t much else you can do with that. I do know there is ongoing work to get SMT solvers to generate evidence from unsat results that can be fed into other, independent theorem provers for additional certainty.

      1. 6

        I will admit to being leery of SMT proofs in my own work because of how “magic” they feel.

        I understand how you feel, but this is not just about feelings: there’s no reason to believe that current SMT solvers aren’t just as buggy as any other software. My anecdotal experience confirms this, as I’ve run into soundness bugs in multiple SMT solvers, including Z3 (the most popular one), and I’ve never even been a heavy SMT solver user.

        I do know there is ongoing work to get SMT solvers to generate evidence from unsat results that can be fed into other, independent theorem provers for additional certainty.

        Yes, I hope we go in this direction too, at least. As far as I know, CVC5 is both one of the most competitive SMT solvers and one of the most diligent in producing justifications for its reasoning, but IIRC this effort is still incomplete.

        Interestingly, I think Z3 also has support for proof generation, but it’s so buggy that one of the HOL4 maintainers refused to implement support for a modern version of Z3.

        As alluded to in the sibling comment, I think Isabelle is the ITP with by far the best support for finding/generating proofs using SMT solvers but still using its own kernel for proof verification.

      2. 2

        I feel the same way about SMT. Interestingly, Isabelle has tactics which can use SMT under the hood, but it feels much more organized to be able to apply them to one subgoal instead of the whole proof. You can optimize for clarity or ease during the proof, and you at least have the chance to debug incrementally if something goes wrong. SMT today feels too much like an opaque box.
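
        A rough Lean 4 analogue of that workflow, just as a sketch, with the omega tactic standing in for the opaque automation (this is not Isabelle’s actual mechanism):

        example (a b : Nat) : a + b = b + a ∧ 0 ≤ a := by
          constructor
          · omega                 -- opaque automation, scoped to this one subgoal
          · exact Nat.zero_le a   -- the other subgoal stays a readable manual step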

    2. 1

      This is a strangely anti-formalist perspective for somebody working with formal methods. If a proof is syntactically valid, then isn’t any quibble with the proof actually a struggle within the mind of the verifier?

      1. 5

        This might be pedantic, but in any proof language that exists it’s very easy to write a syntactically valid proof that doesn’t pass the proof checker. When it comes to enormous proofs that can’t be grasped by a human, their validity just reduces to our faith that the checker is free of soundness bugs that the proof could take advantage of.
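
        For instance, this Lean 4 one-liner is syntactically well-formed, yet the checker rejects it:

        example : 1 = 2 := rfl   -- error: rfl cannot prove 1 = 2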

        1. 3

          When it comes to enormous proofs that can’t be grasped by a human, their validity just reduces to our faith that the checker is free of soundness bugs that the proof could take advantage of.

          Exactly, I think any rational reluctance in accepting computer-verified proofs basically reduces to worries about soundness (which I think is a solvable, if not already-solved, problem in some systems) and to another point which I think was barely mentioned in the article: errors in correctly defining/specifying theorems.

          As for the latter, it’s not a problem exclusive to formal methods but I think there are some significant differences, with both advantages and disadvantages:

          1. The advantages being that formal methods are completely ruthless in both forcing you to clearly specify your theorem definitions down to the last detail, and also catching mistakes/flaws in them, when they exist.
          2. The disadvantage being that they are also completely ruthless in taking your definitions at face value rather than your intended meaning, which means that a tiny mistake in your theorem definitions can completely invalidate all of your proofs in some cases.

          I wonder if there are any studies or knowledge of systematic ways to make proofs more resilient to issues related to this latter point, apart from the popular idea of additionally proving theorems that are not strictly necessary to make your proof work (which is sort of the analogue of adding tests in software engineering).
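
          As a tiny Lean 4 sketch of that last idea (double and the theorem names are purely illustrative):

          def double (n : Nat) : Nat := n + n

          -- Redundant concrete instances, playing the role of unit tests.
          -- A typo such as `n * n` in the definition would still satisfy
          -- `double 2 = 4`, but `double 3 = 6` would expose it.
          theorem double_two   : double 2 = 4 := rfl
          theorem double_three : double 3 = 6 := rfl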

          1. 3

            The second one is interesting, especially because of how “spooky” dependent types are. All you need to accidentally define the natural numbers in a dependently-typed language is a recursive datatype like:

            inductive Bread : Type where
              | unicorns : Bread          -- plays the role of zero
              | peaches : Bread → Bread   -- plays the role of succ

            So you have one constructor that takes zero parameters and another that takes one parameter of the same type. Any theorems you prove about this type will also hold for the natural numbers. Renaming Bread to Nat, unicorns to zero, and peaches to succ doesn’t change anything about the definition beyond how humans perceive it as being about the natural numbers now.

            At least in dependently typed languages you can actually exercise these type definitions to do computations at the value level, which can additionally function as unit tests beyond proving related theorems at the type level.
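
            For example, a minimal Lean 4 sketch (Bread.add and Bread.toNat are illustrative names, not from any library):

            -- "Addition" on Bread, mirroring Nat.add, with peaches in the role of succ.
            def Bread.add : Bread → Bread → Bread
              | b, .unicorns   => b
              | b, .peaches b' => .peaches (Bread.add b b')

            -- Interpreting Bread back into Nat lets us run value-level checks.
            def Bread.toNat : Bread → Nat
              | .unicorns  => 0
              | .peaches b => b.toNat + 1

            #eval (Bread.add (.peaches (.peaches .unicorns)) (.peaches .unicorns)).toNat  -- 3, i.e. “2 + 1 = 3”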

        2. 2

          That’s a good point. I suppose I’ve gotten used to Metamath, which is fundamentally unlikely to have soundness issues, owing to the lack of a kernel. Instead, the focus would be on the axioms, which would need to be accepted by each verifying mathematician.

      2. 3

        I took it as an attempt to convince others that we should trust computer-assisted proofs, not the other way around. He calls out a lot of real concerns and fears, but where was the anti-formalism?

        1. 1

          Well, a computer and a human ratify the same collection of proofs (for a chosen formal system); the computer just does it faster. So it’s not the syntax of the proofs themselves, nor the proof system, which is difficult for human mathematicians to accept.

          I suppose that the title is too open. When is a computer proof not a proof? When humans aren’t convinced, presumably. But then those humans can be shown to diverge from computers during the ratification of the proof.

          1. 5

            The way I read the article, it essentially touches on the “but can we verify anything when the proof is massive?” question. I mean, with generated proofs, you need to know you’re generating the right thing. Nobody can really verify that a 1TB dump of logic proves the right thing, even if it’s correct. (Unless you meta-prove the generator, I guess, but then we’re stacking potential human mistakes again.) Maybe it reduces to some trivial tautology by accident? And if you can’t verify you’re proving the right thing, then is this actually a proof? (Specifically, is it a proof of the thing you think it’s of?)

            1. 4

              That latter question is the one which still puzzles me. If I define addition on the natural numbers, but call the operator “subtraction”, then am I verifying the right thing? Is that a bug in the proof (making it invalid) or a bug in the human-readable annotations for the proof?
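
              As a Lean 4 caricature of that naming question (the names are deliberately misleading and purely illustrative):

              -- The definition is addition, whatever we call it; every theorem we
              -- prove is a true fact about addition under a misleading name.
              def subtraction (a b : Nat) : Nat := a + b

              theorem subtraction_comm (a b : Nat) :
                  subtraction a b = subtraction b a :=
                Nat.add_comm a b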

              Check out Metamath Zero for a serious attempt at making a trivially-correct meta-prover. Along the way, the author had to verify the correctness of a subset of x86 machine code; “trivial” turns out to be relative.

              1. 2

                If I define addition on the natural numbers, but call the operator “subtraction”, then am I verifying the right thing? Is that a bug in the proof (making it invalid) or a bug in the human-readable annotations for the proof?

                You’ve just given an illustrative example but I think I can give you a different one: suppose that when I am defining a theorem, I do everything right except I make an inadvertent typo in the definition which happens to be syntactically valid (say, I type the digit 0 instead of 8) and also happens to go entirely unnoticed during the whole proof effort (perhaps because of proof automation, or because False becomes provable).

                I’d argue this would be really hard to justify as being a bug in the annotation rather than being a bug in the “proof” (proof in the general sense, as in including the theorem definitions), especially if you consider the root cause of the discrepancy.
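
                A hypothetical Lean 4 version of that typo, made visible side by side (the theorem names are illustrative):

                -- Intended statement: ∀ n, n + 8 ≥ 8. A one-character typo (0 for 8)
                -- yields a different, much weaker theorem that still checks, so
                -- nothing flags the mistake.
                theorem intended (n : Nat) : n + 8 ≥ 8 := Nat.le_add_left 8 n
                theorem typoed   (n : Nat) : n + 0 ≥ 0 := Nat.zero_le _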

                1. 1

                  Sure. And this is part of why I like Metamath so much: at worst, a typo in a proof either invalidates the proof, or creates a new valid proof object which is as correct as the intended proof. A typo in the axioms is a problem, of course.

                  1. 2

                    Unfortunately, the valid proof object that you are referring to can be completely useless, or even worse: it can be highly misleading and can lead you and others into thinking that you have proved something that you haven’t, or even that you have proved the opposite of what you actually proved.

                    You seemed to imply that, since whatever you proved ended up being a valid proof object, the problem in this case is in how we annotate the proof or how we interpret it, but I was arguing that the problem may actually lie in how you erroneously specified the theorems rather than in how you annotated them.

                    That’s because in most cases when this happens you clearly intended to prove one thing but you may end up accidentally proving something else instead (sometimes, something completely useless, in fact), and this may happen in some cases for reasons that are completely orthogonal to your reasoning (as when you make a typo).

                    And furthermore, this may actually be something really difficult to notice, because formal definitions can be enormous and complex and, in extreme cases, the error may be a single erroneous character in a deeply nested definition that may be very easy to overlook, even for attentive reviewers.

                    This is not even mentioning the problems with the axioms, which in many formal proof systems have the advantage of being very few (and thus, easier to review), although they also have another set of issues, sometimes even philosophical ones that are also hard to solve.

                    1. 1

                      Okay, but I’m talking about Metamath. Check out an example proof. Metamath manipulates syntax directly and resolves every intermediate step; what you see is what you get, and so there’s no question for the proof author about whether or not the object on their screen is the object ratified by the computer. Additionally, the proof object is also syntactic; this is what I meant at the top of the thread when I said “syntactically valid”.

                      1. 2

                        Yes, I was already familiar with Metamath before this conversation. It’s not different from other systems with respect to what we’re talking about.

                        It’s not “what you see is what you get” like you say because, like other systems, the assertions that are being proved depend on the meaning of the symbols. If that weren’t the case, i.e. if the symbols were meaningless, then the assertions would also be meaningless.

                        The proof that you just linked to is a really basic expression, so this is hard to see, but even here you are still trusting that the ℕ, ≺ and ℝ symbols are representing what you think they are (although in this case, the ℕ and ℝ symbols are so well-known and commonly used that this is not usually a problem).

                        But I think you could imagine a different Metamath system, which is identical to the existing one except that the ℕ and ℝ symbols are swapped. In this imaginary system I could prove the assertion ⊢ ℝ ≺ ℕ, which is highly misleading. So if you really want to be sure that a proof is proving what you think it is, you really have to dig down and look at how all the symbols that the assertion depends on are being defined, including all the transitive dependencies.

                        In basic math this is not usually a big problem, because the symbols are relatively easy to understand and commonly used, but I suspect this becomes a bigger issue in more advanced math and it can certainly become an issue if you are trying to prove large, complex software systems to be correct.

                        In the proof you linked you can also see the description, which is what I meant by “annotations”. These descriptions can also be misleading unless they faithfully represent what is being proved, but of course, since they are written in prose they are almost guaranteed not to do that in all cases (or at least, not without ambiguities).

                        And it’s not like you can get rid of the descriptions, otherwise Metamath would become an unusable mess of symbols that almost nobody would understand (unless there was something else to replace their role, thus having the same problem).

                        1. 1

                          It’s not “what you see is what you get” like you say because, like other systems, the assertions that are being proved depend on the meaning of the symbols. If that weren’t the case, i.e. if the symbols were meaningless, then the assertions would also be meaningless.

                          You are exhibiting the same anti-formalism that I pointed out at the top of the thread. The assertion being proved does not depend on any semantic assignment of meaning to the symbols; it is syntactically proven by rewriting lines of syntax according to rules which only know about the symbols and not any semantics.

                          This is what I mean by struggles in the minds of humans, rather than difficulties with computers. Many humans insist on assigning some semantics, and there’s nothing wrong with that, but the proof object is valid even without such assignments.

                          For posterity: That syntax-only process is distinct from the typical type-driven proof generated by a system like Mizar or Isabelle, the systems endorsed by the article’s author. In those systems, a proof is interpreted within some kernel calculus, computing a value which corresponds to the syntactic proof. The choice of interpreting kernel is a choice of semantics.

                          1. 3

                            The assertion being proved does not depend on any semantic assignment of meaning to the symbols; it is syntactically proven by rewriting lines of syntax according to rules which only know about the symbols and not any semantics.

                            Corbin, I think we are speaking past each other but I’m not sure exactly why.

                            In one of your previous replies you gave me this link to an example of a proof in Metamath.

                            That link is a computer-checked proof that the expression ⊢ ℕ ≺ ℝ is a valid proof object. Now, regardless of how Metamath works, there must be some mechanism in it that currently allows you to construct a valid proof object of ⊢ ℕ ≺ ℝ but not ⊢ ℝ ≺ ℕ, otherwise Metamath would just be a book of unchecked proofs rather than a real proof checker.

                            If it’s currently not possible to prove ⊢ ℝ ≺ ℕ in Metamath, then it means that the ℕ and the ℝ symbols are not interchangeable. And if they are not interchangeable, there must be a reason why, i.e. there must be a context associated with ℕ that is different from the context associated with ℝ, whether it’s because of the syntax rules you mention, or some definition or something else (I don’t know exactly what it is because I’ve never used Metamath apart from superficially reading some proofs).

                            The reason why these associated contexts must be different is the same reason why the natural numbers and the real numbers are not the same, i.e. they are different sets, and therefore they must be different objects and must use different symbols, so that we can refer to each of them separately.

                            It’s also the same reason why we attribute different semantic meanings to the ℕ and ℝ symbols in our minds. We must attribute different meanings, otherwise there would be no hope at all of understanding what we are doing!

                            For the entire system to work correctly, that is, for us to say that we proved ⊢ ℕ ≺ ℝ and to be sure that this is exactly what we proved, you can’t just rely on Metamath, you also have to be sure that the ℕ and the ℝ symbols have an associated context in Metamath that precisely reflects the semantic meaning that you have of these symbols.

                            In other words, in addition to what exists in Metamath, you must also be sure that there is a 1-to-1 mapping between what is specified in Metamath and the semantic meaning that we attribute to those symbols and rules. That is, you must be sure that the ℕ symbol truly represents the natural numbers in Metamath and that the ℝ symbol truly represents the real numbers in Metamath if you want to understand what the ⊢ ℕ ≺ ℝ expression in Metamath means!

                            As far as I can see, there is absolutely nothing special in Metamath regarding all of this, this is exactly what happens in any kind of formal (i.e. computerized) proof system.

                            This is what I mean by struggles in the minds of humans, rather than difficulties with computers. Many humans insist on assigning some semantics, and there’s nothing wrong with that, but the proof object is valid even without such assignments.

                            You seem to be implying that we “assign” semantics to what is in Metamath, i.e. that we read what is in Metamath and afterwards we assign meaning to it, but although this can also be a source of errors, this is not how it always works, especially for someone who is writing a proof.

                            Often, for someone who is writing a proof, the semantic meaning already exists in their mind, and the struggle is to map the meaning in their minds into the language of Metamath. This is also another source of errors.

                            Regardless, what matters is that in the end, if the semantic meaning that we have in our minds does not exactly correspond to what is in Metamath, then the entire system has failed.

                            You can say that this is the fault of humans and their struggles and not the fault of Metamath, but that’s just deflecting blame. In the end, you have failed in your mission anyway.

                            If you want your formal proof system to be useful for humans, then it needs to be usable, and this includes both being understandable and trying to eliminate these sources of errors. And if you don’t want the system to be useful for humans, then what’s the point of having it?

                            I mean, how useful is a formal proof system that tells you nonsensical facts like “1 = 2” or “5 is a negative integer”? You can say “you don’t understand, the proof checker is right, 5 is a negative integer, look at the rules!!! It’s you who are wrong!!!” but this really does not help us at all!

                            I suspect that the Metamath creators understood this and I also suspect this is why it’s possible to add descriptions to the proofs, i.e. to aid us in establishing a correct mapping between what is specified in Metamath and the semantic meanings that we have in our heads.

                            The more this mapping is correct, the more the system is useful.

                            1. 1

                              And if they are not interchangeable, there must be a reason why, i.e. there must be a context associated with ℕ that is different from the context associated with ℝ, whether it’s because of the syntax rules you mention, or some definition or something else…

                              Yes, it’s a definition of the natural numbers. Note that this definition is syntactic: you may replace ℕ with the right-hand side of the definition, which is a string of symbols, and use those symbols in place of ℕ. ℕ is also a symbol. The definition has no semantic clue about what the natural numbers are; it only has a formal syntactic explanation.

                              This is analogous to calling addition “subtraction” earlier in the thread. Attach the “wrong” symbol, and now the proof is “wrong”. Of course, syntactically, the proof object is still valid; however, humans might not be able to assign the “right” semantics in their mind. “wrong” and “right” are fully relativized here; they’re not absolutes.

                              [I]f the semantic meaning that we have in our minds does not exactly correspond to what is in Metamath, then the entire system has failed.

                              Nah, that’s anti-formalism. Instead, think of Metamath as providing an initial model, a model which validates all of the standard behaviors that are implied by the axioms, and each human as having a model in their minds which is not necessarily initial. Then, by basic abstract nonsense, there’s a functor from Metamath’s syntax-only proofs to each human’s internalized semantics-laden beliefs about those proofs.

                              1. 4

                                The definition has no semantic clue about what the natural numbers are

                                What would it even mean for the definition to have a semantic clue about what the natural numbers are, in a formal (i.e. computerized) system?

                                Of course, syntactically, the proof object is still valid; however, humans might not be able to assign the “right” semantics in their mind.

                                And my point is that it doesn’t really matter if your formal system considers this proof to be valid (i.e. this proof in which you’ve called addition “subtraction”), because a formal system is completely useless in isolation. If your formal system validates a proof in a forest and no one is around to care about it, does it matter?

                                In fact, if your formal proof system has any users at all, it may matter in the sense that your formal system is now worse than useless, because it’s leading your users into believing things that are not true, which is the exact opposite of what we want formal proof systems to do, as the tools that they are.

                                You seem to be arguing that such a proof is valid and the formal system is useful in itself because it perfectly follows the axioms, regardless of the meaning we assign to the proof, and that’s all that matters.

                                Well, that’s kind of tautological… it’s obvious that your formal system follows a set of rules because it’s a deterministic program being executed by a computer. My “hello world”-like program that always prints “valid proof!” also follows a set of “axioms”. Maybe your program is a bit more flexible and doesn’t have a hardcoded set of axioms like my “hello world” program, but rather allows a human to input their own axioms into it, causing it to behave slightly differently depending on its input. There are many programs that also accept inputs and produce outputs according to a set of rules, such as grep or wc, but they do not qualify as useful formal proof systems.

                                Ultimately, formal proof systems must have a set of axioms, definitions and proofs that are connected to the real world and to things that we care about in order to be useful.

                                That’s why all formal proof systems have similar definitions for similar concepts. It’s the reason why the “ℕ” symbol in Metamath is defined the way it is and not some other completely random way. Otherwise Metamath would just be a database of random symbols and random facts (well, not strictly random because they would have to follow some rules, i.e. the axioms, but random in every other way) and it would literally look like incomprehensible gibberish.

                                I really don’t understand how you can give me a URL and say: “look, this is a Metamath proof that the real numbers are uncountable!” and yet argue that the proof in fact is just a set of symbols that follow some rules and the meaning of those symbols doesn’t matter, or that it’s a secondary concern and not really the main focus of a proof. To me it’s analogous to you saying that you are a fluent English speaker and yet you don’t know or care about what any of the English words mean.

      3. 3

        You may be interested in: “Mathematics, Morally”.

        1. 1