This is a really interesting and thought-provoking idea. In writing httpdito, and in particular in putting a sort of tiny peephole optimizer into it as gas macros, it occurred to me that (some) Forth systems are awfully close to being macro assemblers, and that really the only reason you’d use a higher-level language rather than a macro assembler is that error handling in macro assemblers, and in Forth, is terrible verging on nonexistent. Dynamically-typed languages give you high assurance of one fairly weak property, namely that your program won’t crash with a segfault (writing the Ur-Scheme compiler very rarely required me to debug generated machine code), while strongly-statically-typed languages give you similarly weak assurances with compile-time checks. But it seems clear that the programmer needs to be able to verify that their program is also free of bugs like memory leaks, infinite loops, SQL injection, XSS, and CSRF, or for that matter “billion-laughs”-like denial-of-service vulnerabilities, let alone more prosaic problems like the Flexcoin bankruptcy due to using a non-transactional data store.
Against this background, we find that a huge fraction of our day-to-day software (X11, Firefox, Emacs, Apache, the kernel, and notoriously OpenSSL) is written in languages like C and C++ that lack even these rudimentary memory-safety properties, let alone safety against the trickier bogeymen mentioned above. The benefit of C++ is that it allows us to factor considerations like XSS and loops into finite, checkable modules; the benefit of C is that we can tell pretty much what the machine is going to do. But, as compiler optimizations exploit increasingly recondite properties of the programming language definition, we find ourselves having to program as if the compiler were our ex-wife’s or ex-husband’s divorce lawyer, lest it introduce security bugs into our kernels, as happened with FreeBSD a couple of years back with a function erroneously annotated as noreturn, and as is happening now with bounds checks depending on signed overflow behavior.
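To make the signed-overflow point concrete, here is a minimal sketch in C of the kind of bounds check I have in mind. The function names are my own invention for illustration, not taken from any real kernel. The first version tries to detect overflow after the fact; because signed overflow is undefined behavior, an optimizer may assume it never happens and drop the test. The second version rearranges the comparison so the overflowing addition is never performed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Fragile: detects overflow by letting it happen. Signed overflow is
   undefined behavior in C, so for len >= 0 an optimizer is entitled to
   treat this comparison as always false and remove the check. */
bool end_overflows_fragile(int offset, int len) {
    return offset + len < offset;
}

/* Sturdier: compare against INT_MAX before adding, so the addition that
   could overflow is never performed. Both arguments must be non-negative. */
bool end_overflows_checked(int offset, int len) {
    assert(offset >= 0 && len >= 0);
    return len > INT_MAX - offset;
}
```

With GCC or Clang, compiling with `-fwrapv` is another way out, since it makes signed overflow wrap, but then the code is only correct under that flag.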
We could characterize the first of these two approaches as “performing on the trapeze with a net”: successful execution is pleasant, if unexpected, but we expect that even a failure will not crash our program with a segfault — bankrupt Flexcoin, perhaps, and permit CSRF and XSS, but at least we don’t have to debug a core dump! The second approach involves performing without a net, and so the tricks attempted are necessarily less ambitious; we rely on compiler warnings and errors, careful program inspection, and testing to uncover any defects, with results such as Heartbleed, the Toyota accelerator failure, the THERAC-25 fatalities, the destruction of Ariane 5, and the Mars Pathfinder priority-inversion lockups.
So onto this dismal scene sweep Kennedy, Benton, Jensen, and Dagand, with a novel approach: rather than merely factoring our loops and HTML-escaping into algorithm templates and type conversion operators, thus giving us the opportunity to get them right once and for all (but no way to tell if we have done so), we can instead program in a language rich enough to express our correctness constraints, our compiler optimizations (§4.1), and the proofs of correctness of those optimizations. Instead of merely packaging up algorithms (which we believe to be correct after careful inspection) into libraries, we can package up correctness criteria and proof tactics into libraries.
This is a really interesting and novel alternative to the with-a-net and the without-a-net approaches described earlier, and it goes far beyond previous efforts to prove programs correct. Rather than attempting to prove your program correct before you compile it, or looking for bugs in the object code emitted, it attempts, Forth-like, to replace the ecosystem of compilers and interpreters with a set of libraries for your proof assistant: libraries which could eventually allow you to program at as high a level as you wish, but grounded in machine code and machine-checkable proofs. Will it turn out to be practical? I don’t know.
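As a toy illustration of what "an optimization packaged with its proof" can look like, here is a sketch in Lean 4 rather than Coq, which is what the paper uses. Everything in it (`Instr`, `exec`, `peephole`) is made up for this example and has nothing to do with the paper's actual x86 model: a one-instruction machine, a peephole rule that turns "add 0" into a no-op, and a machine-checked theorem that the rule preserves behavior.

```lean
-- A toy instruction set acting on a single accumulator.
inductive Instr where
  | addImm (k : Nat)   -- acc := acc + k
  | nop                -- acc unchanged

-- Semantics: what one instruction does to the accumulator.
def exec : Instr → Nat → Nat
  | .addImm k, acc => acc + k
  | .nop, acc => acc

-- A peephole rule: adding zero is replaced by a no-op.
def peephole : Instr → Instr
  | .addImm 0 => .nop
  | i => i

-- The correctness criterion, checked by the proof assistant.
theorem peephole_correct (i : Instr) (acc : Nat) :
    exec (peephole i) acc = exec i acc := by
  cases i with
  | nop => rfl
  | addImm k =>
    cases k with
    | zero => rfl
    | succ _ => rfl
```

The point is only that the optimizer and the statement of its correctness live in the same file and the same language, so a library can ship both.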
What you say has been a dream of PL researchers for a long time now. Ada [1] is one of many languages designed with strong guarantees and static analysis in mind. ACL2 [2] was used to verify floating-point division in the AMD K5 after the FDIV bug was exposed in Pentium processors. However, as we know, verification of certain safety properties is an NP-complete problem. Lighter approaches have been considered. The Eiffel [3] programming language is a strong proponent of design by contract, where the pre- and postconditions of a function are given in its definition. The Alloy [4] model checker lets you verify abstract models that you design as specifications. Unfortunately, most attempts at model checking, designing by contract, or engaging in strict static analysis have been met with failure or disuse. A discussion of Coverity’s [5] static analysis tool exposes a lot of the issues that formal verification faces in industry.
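For readers who have not seen design by contract, here is a rough sketch of the idea. Eiffel writes contracts with `require` and `ensure` clauses as part of the routine; C has nothing of the sort, so this emulation uses plain `assert`, and the function `int_sqrt` is a made-up example. Note that these checks only fire at run time, which is much weaker than the static verification discussed above.

```c
#include <assert.h>

/* Integer square root with its contract spelled out as assertions.
   Precondition:  n >= 0
   Postcondition: r*r <= n < (r+1)*(r+1) */
int int_sqrt(int n) {
    assert(n >= 0);                            /* precondition */
    int r = 0;
    /* Widen to long long so the squares cannot overflow int. */
    while ((long long)(r + 1) * (r + 1) <= n)
        r++;
    assert((long long)r * r <= n);             /* postcondition, lower bound */
    assert((long long)(r + 1) * (r + 1) > n);  /* postcondition, upper bound */
    return r;
}
```

A caller that passes a negative number trips the precondition, which puts the blame on the caller; a failed postcondition would put it on `int_sqrt` itself.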
These are good links; thank you. I should have included some of them in my essay. One nitpick, though: most safety properties are not merely NP-complete, but actually uncomputable. It’s easy to construct programs where being able to verify them would amount to solving the halting problem.
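A sketch of the construction, in C. The Collatz iteration here is only a stand-in for "an arbitrary computation"; the general argument puts any program at all in place of the loop. The function name `reaches_bad_store` is my own. The program in the comment is memory-safe for a given `n` exactly when its loop never exits, so a verifier that could always decide memory safety could always decide termination. The runnable part shows the best that testing can do: a bounded simulation can confirm that the bad store is reached, but running out of steps proves nothing.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* The program under analysis, shown as a comment so nothing here crashes:
 *
 *     while (n != 1) n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
 *     *(volatile int *)0 = 0;    // reached only if the loop exits
 *
 * It is memory-safe for a given n exactly when the loop never exits.
 */

/* Bounded simulation: true if the loop would exit within max_steps
   iterations, false if the step budget runs out first. */
bool reaches_bad_store(unsigned long long n, unsigned max_steps) {
    for (unsigned i = 0; i < max_steps && n != 1; i++) {
        if (n % 2 == 0) {
            n /= 2;
        } else {
            assert(n <= (ULLONG_MAX - 1) / 3);  /* guard against wraparound */
            n = 3 * n + 1;
        }
    }
    return n == 1;
}
```

Starting from 6 the loop exits after 8 steps, so `reaches_bad_store(6, 8)` is true while `reaches_bad_store(6, 7)` is false. Starting from 0 the loop spins forever, but no finite budget can tell you that.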