1. 16

Author argues that math and formal logic are empirical since they’re derived and reinforced with empirical methods. If so, mathematical proofs can’t be rejected by empiricists as non-empirical evidence.


  2. 7

    Ok, there’s been some debate here about mathematical vs empirical evidence. Some people highly believe in formal proof for reasons that probably vary. @derek-jones is main opponent saying those methods are nonsense advocating for strictly empirical approach. I’m somewhere in the middle where I want any formal method to have evidence it works in an empirical sense but I don’t see the concepts as that different. I think we believe in specific types of math/logic because they’re empirically proven. As I formulated those points & dug up related work, I found this submission that argued a lot of what I was going to argue plus some other points (esp on intuition) I wasn’t thinking about. So, enjoy!

    If you’ve read it at this point, I’ll weaken that claim to mine that certain maths/logics are empirically valid by being relied on in hundreds of thousands to billions of uses. If they were wrong, we’d see massive amount of evidence in the field that the fundamental logical methods didn’t work. Matter of fact, some of these failing would make it impossible for Derek to even write his rebuttals online given propositional logic and calculus are how mixed-signal ASIC’s work. ;)

    So, I’m going to list some specific types of math with widespread, empirical confirmation as a start. Anyone else feel free to add to the list if it’s math/logic used in formal proofs that has massive level of real-world use with diverse inputs or environmental setups.

    1. Arithmetic on integers and real numbers. Computers can be boiled down to glorified calculators. Arithmetic can help bootstrap trust in other things, too.

    2. Algebraic laws. Logics like Maude build on these.

    3. Calculus like integration and derivatives. Heavily used in engineering. Also, building blocks of analog circuits and computers implement these primitives.

    4. Propositional and Boolean logic. Digital designs are synthesized, verified, and tested in this form. With all CPU’s shipped, it’s probably the most well-tested logic after arithmetic and algebra.

    5. Some subset of first-order logic. There’s been a lot of experimental confirmation it works in form of Prolog and Datalog programs that do what the logic says they should. I’ll note the older Prolog work was a bit weaker since it was associated with lots of zealotry that led to AI winter. We can focus on working applications, though, of which there are many in commercial and FOSS. Although it has a niche, Prolog was pushed way out of it to implement diverse array of programs and environments. The SMT solvers also catch problems they’re supposed to on a regular basis. All problems I’ve seen investigated in these were implementation errors that didn’t invalidate the logic itself.

    6. Set theory. Experiments with the principles seem to always work long as problems and solutions fit the simple modeling style. It’s pretty intuitive for people. It’s been used to model many problems, esp theoretical, whose applications and rules were reviewed by humans for accuracy. There is some use of it in programming in data structures and composition. Deployments haven’t invalidated the basic operators.

    7. Basic forms of geometry. You can get really far in making stuff happen predictably if it’s modeled in or built with simple shapes. See Legos and construction in general.

    8. Basic probability theory. There’s a ton of use of this in empirical research. They all seem to trust the basic concepts after their observations of the field.

    One can do all kinds of hardware/software verification using 1-6 alone. I’ve seen 7 used periodically in applications such as digital synthesis and collision avoidance for guidance systems. I’ve seen 8 and similar logics used when the problem domain is imprecise. If those logics and their basic rules are empirically-proven, then the applications of them have high, empirical confidence if they’re using the rules correctly. At that point, most if not all problems would show up in the extra items with less or no empirical verification such as formal specifications of the problem and implementations/checkers for those logics. Formal methodists always note things like checkers are in TCB keeping them simple and small. Verified checkers also exist which we can further test if we want. Regardless, looking at these logics and their rules as theories with massive, empirical validation means we’d trust or watch out for the exact things the formal verification papers say we should. Everything in the TCB except the logical system and rules if it uses one of above.

    The mathematical concepts themselves are as empirical as anything else. Probably more so given they’ve had more validation than techniques many empiricists use in experiments. The statisticians and experimenters arguing among themselves about the validity of certain techniques probably don’t have slightest doubt in basic arithmetic or algebra. It’s clear given they rely on math in their experiments to justify their own claims. Gotta wonder why they doubt these maths work only when the inputs are numbers representing programs.

    1. 4

      A few quick comments:

      1. One serious problem with empiricism (or any claim about “reality”) is that its always conditioned (and usually unconsciously) by cultural values, politics, language, and more subtle aspects of human cognition. If you look, it’s easy to find plenty of historical and contemporary examples of people claiming empirical proofs of “facts” which are (to our way of thinking) absurd, repugnant, or both. Note that this is a separate concern from what philosophers call the problem of empiricism, which is basically that the decision to draw conclusions from any finite amount of data is an arbitrary one and thus the conclusions incomplete. That’s like what formal methods people say about unit tests.
      2. Mathematical practice predates the “Enlightenment” and what historians call “modern science” by some millennia. When Gauss called mathematics the “Queen of the sciences” he meant the older sense of the word “science” as an organized body of human knowledge, without any specific constraints on organizing principles. I would say that modern mathematics is somewhat unique in rejecting empiricism and insisting on symbolic, axiomatic reasoning.
      3. What’s empirically durable are the relations between formal systems of symbols and various kinds of measurements of more concrete phenomena. We should certainly hope that these would apply to our artificial computing machines at least as well as messier “natural” phenomena, since we intentionally build them to behave according to your systems 1-4 (and 5 to some extent).
      1. 2

        What I have been saying, and will continue to say, is that many of the claims made in papers using formal methods are intellectually dishonest and are something that soap powder manufacturers would not be allowed to claim (by the advertising standards authority).

        Rather than respond to what I am saying, you appear to be branding me as being anti-formalism in general. This is an effective technique for getting others to ignore what I am saying and is as intellectually dishonest as much for published formal methods work in software engineering.

        1. 2

          Rather than respond to what I am saying, you appear to be branding me as being anti-formalism in general. This is an effective technique for getting others to ignore what I am saying and is as intellectually dishonest as much for published formal methods work in software engineering.

          Whoa, now! I can only go on what you’re saying in the comments and the articles you link. Overstatements are just one thing you’ve said. I’ve never seen you approve of a single thing in formal methods as doing what it claims to do. You have dismissed quite a few of them. You said KCC was more accurate on C semantics but that could mean less of a screwup or waste to you. You also state a preference for empirical methods over them when I’ve argued some of their foundations are themselves empirical with massive sample size. Given what you say, it’s reasonable to think you’re anti-formal methods instead of just trying to help them improve their accuracy or trustworthiness. It was only recently that I saw you use more narrow complaints like cost that mean you might accept some form of formal methods. I had to look at that in light of your other comments here, though, when trying to interpret them.

          So, which formal methods techniques or projects in software have you liked, would build on empirically, and why? Which logically-deduced claims in empirically-verified logics will you trust over experiments using less-verified methods that claim the same things? If I’m misinterpreting your position, those would be a nice start of a better interpretation.

          1. 1

            You are a true believer and membership of this tribe invokes a response to those who criticize what you believe in. Intellectual dishonesty is not the appropriate term to apply to your approach (but it it was the rift I was work for formal methods), drinking the cool aid, tinted glasses, blinkered view, something along these lines. Even when I say something positive (e.g., KCC), you attribute negative intentions.

            Anyway, you are honestly searching for answers and reading what I have to say (which is a cut above most). But, I am not inclined to debate a true believer.

            1. 1

              re true believer and part of a tribe. Let’s test us empirically. :)

              An empiricist will collect data for and against various methods out there. There’s been more empirical studies done without formal methods. I’d expect you to have more data on them. You do. There’s also been lots of case studies in individual projects and on piles of code for static analyzers, model-checkers, and provers. These collectively show formal methods work across a lot of situations for a lot of properties. They’ve been pretty consistent, too, on what kinds of problems they catch. You don’t report or account for any positive findings in these conversations or write-ups that make Lobste.rs or your paper collection on your website that I could tell. You do describe the negatives in great detail. Your mention of KCC matched that established pattern or theory having almost nothing in terms of supporting or differentiating detail like your takedowns do. So, my interpretation of an empiricist discussing formal methods without ever crediting positive results in experiments is heavy bias against those methods. You also may only count specific types of experiments as evidence which filters out most results in formal methods.

              Now let’s look at what I’m doing. I did two write-ups arguing for applying every method of assurance because we can’t trust any one to do it all. I’m heavily in favor of high-assurance certifications that combine them. I submit esoteric methods for assurance that people might miss which are mostly formal methods since they’re missed most in these forums. I also submit clever informal methods like this, this, and this that you may not see due to lower votes. On skeptical side, I also called for tiny checkers to find errors in proof terms, exhaustive testing of both formal specs and provers, checking program logics with empirically-vetted ones (eg set theory), avoiding overselling to not repeat AI Winter for formal methods, and even warn newcomers that maybe they shouldn’t try it at all. That’s more Skeptic Magazine than true believer talk.

              What I do have from my background is a bias in favor of them that I counter by involving skeptics like you on specific submissions likely to have errors my bias will miss if I’m in a hurry. That’s worked out. My consistent recommendations on this forum are empirically-backed safe languages and automated testing combined with lightweight, formal methods like Design-by-Contract for interface checks or TLA+ for concurrency. A true believer would recommend Coq and Verdi but the others have better cost-effectiveness based on prior experiments. I’d say the true believer hypothesis gets the null result here. More like “biased with false positives also inviting constant peer review.”

              “Anyway, you are honestly searching for answers and reading what I have to say (which is a cut above most). But, I am not inclined to debate a true believer.”

              I didn’t actually want a debate. I wanted two things:

              1. In this submission, I pointed out that specific maths have so much empirical evidence behind them that anything using them as they’re used successfully in the field should have more weight than most empirical experiments with smaller samples or error-prone math. Given the logic is stronger empirically, I wanted to know when you’d trust a logical result as much as or more than experimental evidence for a method for software correctness. I’m not even talking in general where you have to factor in things like requirements errors. I’m talking about logically modeled properties that provers say a piece of software has with a tiny TCB checking the analysis in a logic with massive empirical proof of accuracy.

              2. Since you seemed surprised in the tangent, I asked what specific formal methods you believe in from reading empirical data on their effectiveness. I’ve seen and posted it for requirements, models, code, testing, repair, and so on. If you’re not anti-formal methods, you’ll probably have a list of methods with corresponding papers/articles you think well of that researchers like you might agree with. If there’s mutual agreement, leveraging such things might get support from your crowd in form of studies corroborating or rejecting the methods with review and/or experiments. One can’t begin to know what you’d work with or build on until you tell us which ones you found experimentally justified and why. My list so far has you liking KCC models’ realistic semantics for C with no comments on assurance of its tooling (esp compiler). That’s it.

          2. 2

            Would it be reasonable to say that while there’s some evidence that code verified with formal methods is more likely, based on our empirical observations, to be more correct than code not verified with it, all other things being equal? It doesn’t mean that it will be correct, or that we get the same benefit with all formal methods, but it means that we can be enthusiastic about FM without being blind to the limits.

            I believe in this for a few reasons:

            • We’ve seen mathematically elegant ideas in CS crash and burn before. The main one that comes to mind is N-version programming, which seemed wonderful in theory but failed in practice.
            • We’ve seen some evidence that code using formal methods still has bugs, but fewer bugs than similar systems without it.
            • It sidesteps the whole “is math abstract or empirical”, which appeals to the Wittgenstein fanboy in me.
            1. 2

              If you put effort into verifying/checking code, it will be more reliable that if your don’t. The question is, which technique is most cost effective, i.e., bang for the buck in reducing fault experiences.

              N-version program has many benefits, it is the idea of it being a magic bullet that crashed and burned.

              1. 1

                The main one that comes to mind is N-version programming, which seemed wonderful in theory but failed in practice.

                I’m still not sure it failed unless it’s the naive, original form. When researching security via diversity, I remember one paper pointed out that people were doing multiple implementations that often had the same algorithms, dependencies, and so on. They were too similar even when different people wrote them. So, they tried to do an automatic approach to swapping out functions with implementations that were structurally different but equivalent. Then, flaws due to the structure in one might not affect the other.

                Multiple versions behind a similar interface often don’t have same problems in hardware and OS’s probably for this reason. Sometimes they do but often don’t. I think ensuring implementation is diverse structurally and/or temporally is at least worth some further research in multiple areas of implementation before I’ll call N-version concept as a whole failed. I think the initial version of the idea just missed a key requirement as proven by those experiments.

                “We’ve seen some evidence that code using formal methods still has bugs, but fewer bugs than similar systems without it.”

                Yep. I’ll add that most of the case studies also say they caught bugs that review and testing missed. That’s usually specifications and model-checking. Then, the proofs caught the deepest bugs but required the most effort with simplifications. That should become the baseline belief about them given that’s been pretty consistent going back decades. What changes is tooling applicability to platform/language X, labor cost, time-to-run, effectiveness, and so on. Mostly getting better over time but still has limitations.

            2. 2

              Yeah that all sounds pretty sensible to me. Just because you can describe something mathematically doesn’t mean that’s how the real world behaves. basically you can prove something is logically inconsistent, or you can evidence that something is logically consistent, that doesn’t mean your computer is say even fast enough to do it. Theory and practice both exist, and I think it’s reasonable to assume they both exist for a reason because they’re both useful. I think certain domains also lend themselves more to formal verification. As I understand it, it’s because the cost of testing is high or the cost of failure is high, or both. If a program costs you a million dollars to run if it fails you better believe formal methods are going to be a priority.

              1. 3

                That last point also ties into discussion of whether there should be software liability. The reason formal verification got popular in hardware probably traces back to the Intel FDIV bug that cost them a pile of money. That’s because they might have to do recalls or something on hardware. With software, they try to maintain EULA’s that say they’re not responsible for their own work. If that changed, we might see an uptick in use of both formal verification and other assurance techniques that covers at least what regulations or court rulings required them to cover.

                This isn’t hypothetical, though, since market responded with activities such as formal verification, model-driven development, and static analysis in response to both TCSEC and DO-178B (now DO-178C) requirements. They’ll definitely do it if failures cost a fortune. Due to basic economics, the tooling will improve and get cheaper as demand and volume go up. EiffelStudio and Ada/SPARK with contract-based techniques are probably the highlights here where they can knock out tons of errors but don’t cost well into five digits a seat. That is, if I’m recalling pricing correctly.

                1. 1

                  With software, they try to maintain EULA’s that say they’re not responsible for their own work

                  This is probably going to change with the widespread adoption of tools driven by artificial intelligence.

                  Somebody must be accountable of their errors. That actually are always reconductable to humans.
                  Even when we will be able to create an artificial general intelligence.

                  1. 1

                    That’s unlikely. It’s a political problem like a lot of the “technical” problems. The software companies in the U.S. pay bribes to politicians to pass laws that benefit them or not pass laws that hurt them. They’ve fought both acquisition policies demanding high security and liability laws making them responsible for their work. An AGI can notice anything it wants but corrupt politicians will just ignore it.

                    In other areas, though, AI’s could be helpful like this one that checks contracts.

                    1. 2

                      It’s a political problem like a lot of the “technical” problems.

                      Just like the first death caused by a self driving car.

                      Since you cannot punish an software, someone else need to pay for these deaths.

                      Maybe the CEO, or the stock holders.
                      And I guess that all politicians that now vote to protect the industry, will have an hard time to face the parents of a victim and politically survive.

                      AI is a great technology, but it’s not yet ready to decide over humans.
                      And we should not be fooled by marketing and hype: we should fight business-aided ignorance by sharing knowledge of what AI is and how it works.

                      There is always at least one person accountable. He cannot hide behind a black box.

                      1. 2

                        That gets into a bigger problem, though. Companies might be held accountable when things don’t work out. If it’s a big company, they’ll loose some money instead of conspirators doing prison time. Often, the conspirators will be rewarded by future, high-paying jobs if they were acting in interests of their company when they made the gamble. It’s those that misled the company itself that might get severe punishment unless they had golden parachute built into the deal. The result may get those self-driving cars off the streets, force improvements on them with people still dying/suing, and/or get companies to invest in more lobbying or defense attorneys reducing liability.

                        So, it’s a possibility that depends a lot on what companies are involved, the nature of the accidents, and so on. Personally, I agree that the unknowns in self-driving cars make them a risk that has high likelihood of negative consequences for users or suppliers. I can neither be sure nor make a general claim about technological harm, though, given all the corruption and successful defenses by predatory companies in the past. I will give you this excellent article that talks about the problem from a different side than usual. I love the example in it of how people will react to autopilot disengaging when a bus is coming at them. Scary shit. ;)

                        1. 2

                          WONDERFUL READ.

                          I just added a link to it in my article about AGI: I didn’t know that it was called “the paradox of automation”, but the concept was already there as the only true threat from an artificial general intelligence to humanity.

            3. 3

              Seems overly philosophical/abstract. If the right people accepted OP’s argument, what would the world do differently?

              (Shower thought: Math is empirical if there’s such a thing as the Platonic universe.)

              1. 3

                Where Mathematics Comes From makes some similar arguments. I haven’t read much on the philosophy of mathematics, but this book changed my perspective greatly.

                1. 2

                  My position is that, math, software , formal methods , and empirical data are all just modeling. Underlying them is a fundamental property our brains are really good at understanding and thinking about , and that is geometry. Since all geometry can be explained via symmetry and symmetry breaking , modeling is essentially recognizing symmetry and symmetry breaking.

                  If you disconnect from reality, like in a sensory deprivation chamber, or take drugs, or even sleep, you models start drifting from what the senses tell you (keep in mind, your senses are also modeling) When you are awake , your model of whats happening is constantly being course corrected by your senses (another model) In other words, formalism and empiricism are two sides of the same coin. Enpirical data are models that helps other models course correct.

                  Formal methods are just writing down exactly what your model is. In other words, another model, and therefore ultimately not complete. This might be a very Kantian position.

                  1. 1

                    That they’re all modeling makes sense. They are far as I can tell. Prior evidence also indicates the rational and intuitive aspects of the mind can correct each other. There’s some balancing act happening with them. The geometry angle is interesting. It’s certainly possible given we have innate skill in that area. Just not sure it reduces down to that instead of that plus some other things. As I think about it, many things like counting that I’d think of as symbolic or something else can be derived from geometrical observation. People even talk of the “shape” of algebraic equations on paper and data structures in computers.

                    I’ll just have to let that one float around in my mind over time as I think on this stuff. :)

                    1. 1

                      To me it goes beyond just an innate skill, but something fundamental to the universe and existence. The fact that there is any matter at all comes down to symmetry breaking during the big bang. Where did all the anti-matter go? There is a reason we can “feel” the shape and size of a program in our heads. A lot of our thinking is visual and not symbolic. To me I feel there is something deeper here than just an innate skill.

                      For example, take the revolution we are having with neural nets now that we have enough computer power to make them valuable. Fundamentally they compute distances on hyperspheres, which become new values on a new hyperspheres on the next layer. It’s all geometry here. And if you haven’t heard, hyperspheres are spikey things. Obviously neutral nets are a model of what happens in our heads and as we learn more about the actual biological mechanisms, these models will improve. But isn’t it interesting that it’s possible each neuron in our heads is discerning geometric structures in multi dimensional spaces? And we have something like 80 billion of neurons, and millions working at a time using only 20 watts of energy.

                      1. 3

                        Your comment reminded me of a study which suggests that real neurons form multi-dimensional structures when responding to stimuli: https://www.frontiersin.org/articles/10.3389/fncom.2017.00048/full

                        1. 1

                          I didn’t have that one either. Thanks.

                          1. 1

                            I’ve never seen this, thanks!

                          2. 1

                            I’m getting the visual connections. I’m just not clear if they’re what’s really going on or an implementation detail for some brain functions. There’s clearly a lot tied to geometric processing.

                            I’d be careful with the neural net claims since most experts on brains and ANN’s tell me they’re not how the brain works. I hear a lot of different things on that topic depending on who is talking. I was told some are modeled after how people think some sets of neurons work. For instance, some ANN’s were mimicking visual processing which of course would make them look more visual or geometric in how they function. There were other styles in the brain, though. Additional research shows things like neurons themselves maybe NN’s more like analog CA’s or that glial cells with calcium flows maybe be doing much thinking. I can’t remember what the folks modeling neocortexes said about them to test whether that was more symbolic or geometric in representation. I just don’t believe ANN’s tell us how the brain works by far until they really model the brain’s features.

                            “And we have something like 80 billion of neurons, and millions working at a time using only 20 watts of energy.”

                            And that mindblowing capability is exactly why I support so much money being poured into studying and experimenting with neural architectures for problem solving. Especially combo’s of styles with feedback networks since that’s probably critical. Alternatively, hybrid AI architectures that simulate combos of imprecise planning with real-time, intuitive responses. AI researchers tend to go all in on one method whereas both evolution and its brain diversify.