  2. 25

    This is just like how when we write int sum (int, int); we say sum() is a function, but that function only holds true for inputs of type (int, int). See the analogy? This is where we stole types from.

    I’m not sure programming types originally came from type theory. The earliest versions of FORTRAN had data types, too, like int and real. Types were originally introduced as a solution to a hardware problem: how do you know what a sequence of binary digits is supposed to represent? Only later did we start thinking of types in the mathematical way. We’re really lucky the two uses of “type” were unifiable.
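
    To make that hardware problem concrete, here’s a small Python sketch (my own illustration, using only the standard struct module): the same 32 bits mean two completely different things depending on the type you read them with.

    ```python
    import struct

    # The same four bytes, read with two different types.
    bits = struct.pack("<I", 0x42280000)

    as_int = struct.unpack("<i", bits)[0]    # as a signed integer
    as_float = struct.unpack("<f", bits)[0]  # as an IEEE 754 float

    print(as_int)    # 1109917696
    print(as_float)  # 42.0
    ```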

    Is there a language with a type system powerful enough to express that the function adheres to the constraint: for every input the result will be the square of the input? […] If such an arrangement exists, then it’s safe to assume that a language in which no erroneous programs will exist is possible.

    Nobody in formal verification believes this. We can only prove that the program conforms to the spec, not that the spec is correct or covers everything we want. Most specification languages are very limited and can’t encode things like memory use, hyperproperties, security, etc.

    (There are some exceptions, but each only adds a little extra specification power at the cost of a lot of complexity.)
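
    Here’s a toy illustration of that gap, a Lean sketch of my own (assuming Mathlib’s List.Sorted): a “sort” that provably satisfies its spec, because the spec only demands sorted output and forgot to require a permutation of the input.

    ```lean
    import Mathlib.Data.List.Sort

    -- A "sort" that is obviously wrong...
    def badSort (xs : List Nat) : List Nat := []

    -- ...yet fully verified against an incomplete spec ("the output is
    -- sorted"), because the empty list is trivially sorted. The spec,
    -- not the proof, is what's broken: it never requires the output to
    -- be a permutation of the input.
    theorem badSort_sorted (xs : List Nat) :
        List.Sorted (· ≤ ·) (badSort xs) :=
      List.sorted_nil
    ```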

    So it is possible to statically verify that a function always returns the square of its input, it just needs a proof (aka its type).

    We have to be careful here. A static verification system can prove that some particular functions are square and that others are not, but no system can decide, for an arbitrary function, whether it is square. Rice’s Theorem applies here.

    This is a problem with formal verification in practice: there are plenty of properties we might know (or strongly believe) are true but be unable to prove.
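
    To be concrete about the part that does work, here’s a minimal Lean sketch (my own illustration, not from the article): proving that one specific function meets the “square” spec is easy; what Rice’s Theorem rules out is deciding that property for arbitrary code.

    ```lean
    def square (n : Nat) : Nat := n * n

    -- For this particular definition the spec holds by computation:
    -- `square n` unfolds to `n * n`, so `rfl` closes the goal. Rice's
    -- Theorem only says we can't decide this property for arbitrary
    -- functions; proving it for a specific one is often trivial.
    theorem square_spec : ∀ n : Nat, square n = n * n :=
      fun _ => rfl
    ```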

    As expected, there are programming languages out there, that use this Curry-Howard isomorphism to statically verify its correctness. Languages like CoQ [sic], Agda, Idris. Folks use these languages to build mission-critical software for rockets and airplanes, but the rocket-sciency nature of these languages is stopping the general business world from adopting these languages.

    None of these languages are used in mission-critical software. Rocket software has hard real-time requirements: every operation must complete within a fixed time budget. That restricts you to things like C, C++, and Ada. NASA occasionally uses Coq as a theorem prover, but they certainly aren’t using Gallina as a programming language. They’re using Coq to verify C code. Agda isn’t really used outside academia. Idris is still highly experimental and shouldn’t be allowed anywhere near an airplane.

    1. 17

      Folks use these languages to build mission-critical software for rockets and airplanes, but the rocket-sciency nature of these languages is stopping the general business world from adopting these languages.

      sigh and facepalm at this part. It’s weird that people want to have such strong opinions about software, when it’s blindingly obvious not only that they’ve not worked in those domains, but also not anywhere near them.

      Reminds me of this thread, which I don’t want to dignify too much… I responded with the bit about browsers, but the rest of the thread was also another sigh.

      https://news.ycombinator.com/item?id=21859505

      I guess the real problem is that random opinions about programming languages based on no experience show up on lobste.rs and HN a lot.


      At some point I’ll write a blog post about my (so far limited) experience with a formal model of shell called Smoosh (written in Lem and OCaml):

      Executable formal semantics for the POSIX shell (Oil is cited in this paper)

      There are two parts to the project: the model and the test suite. The test suite was super useful and found a lot of bugs, some of which I fixed, and some of which are outstanding:

      https://github.com/oilshell/oil/issues/331

      The model found zero bugs. As far as I understand, it will continue to find zero bugs. The model is not short – it’s 10K lines of Lem and OCaml code.

      When Oil could run significant POSIX shell scripts, it was 14K physical lines of code, and probably around 7-8K SLOC.

      http://www.oilshell.org/blog/2017/09/09.html (follow links to source code metrics)

      If you have a program and a model that are supposed to do the same thing, as far as I know there’s no automated way to reconcile them. And I don’t think there is even any “help”, since they are both expressed in powerful metalanguages. I would be very interested in a correction on that if I’m wrong. But it seems to run into Rice’s theorem, as you say.

      The best way is to run a test suite. In other words, given the choice of a formal model or a test suite, I would take the test suite. And yes, these two things are intertwined, so it’s not really an either-or, but I think it’s a useful observation.
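
      To make “run a test suite” concrete, a differential-testing harness can be as simple as this Python sketch (my own toy; the shell names and test cases are placeholders, not Oil’s actual harness):

      ```python
      import subprocess

      def run(shell: str, script: str):
          """Run `script` under `shell`; return its observable behavior."""
          p = subprocess.run([shell, "-c", script],
                             capture_output=True, text=True, timeout=10)
          return p.returncode, p.stdout, p.stderr

      def check(script: str, reference: str = "smoosh", subject: str = "osh"):
          """Report any disagreement between the two implementations."""
          if run(reference, script) != run(subject, script):
              print(f"DISAGREE: {script!r}")

      for case in ["echo $((1 + 2))", "x=5; echo $x", "false; echo $?"]:
          check(case)
      ```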

      I’m sure it’s not news to people who develop the models, but apparently there are randoms on the internet who think that “brute force” and often manual testing isn’t how 99% of the mission-critical systems you use are developed. (In my first job I learned how famously reliable console games are actually written – with really bad C and lots of manual testing.)


      Also, most of the models can’t be translated into production code that implements the spec (to say nothing of determining whether the spec is correct).

      I’ve made significant progress on that front in Oil, translating 10K lines of pretty abstract Python code to 10K lines of production-quality C++. It’s as fast as C++ (as measured by benchmarks I created over 2 years ago), although it probably uses more memory.

      http://www.oilshell.org/blog/2019/12/09.html#oil-native-shows-how-well-optimize-oil

      As far as I remember, NASA also writes some code this way: they use Python to metaprogram C. I think it’s pretty much like Oil. I use state machines, algebraic data types, and other “rigorous but informal” reasoning. It’s very, very, very different from how bash is written. But it’s also not a formal proof.
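
      The “Python metaprograms C” idea in toy form (purely illustrative, not Oil’s or NASA’s actual codegen): describe the repetitive part as Python data, then emit the C.

      ```python
      # Each (name, C operator) pair becomes a generated C function.
      OPS = [("add", "+"), ("sub", "-"), ("mul", "*")]

      def gen_c(ops) -> str:
          lines = [f"int op_{name}(int a, int b) {{ return a {sym} b; }}"
                   for name, sym in ops]
          return "\n".join(lines)

      print(gen_c(OPS))
      # int op_add(int a, int b) { return a + b; }
      # int op_sub(int a, int b) { return a - b; }
      # int op_mul(int a, int b) { return a * b; }
      ```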

      At some point I will write about this. The other thing I haven’t done yet is run Oil’s test suite against Smoosh, which is possible because Smoosh is executable (small-step semantics). I run it against 5 other shells:

      https://www.oilshell.org/release/0.7.pre10/test/spec.wwz/osh.html

      Smoosh’s tests found several bugs in Oil, and I’m sure Oil’s tests will find bugs in Smoosh. Again, this is not news to the authors, but some people seem to think that formal methods are some kind of magic that makes software correct. To me the Oil/Smoosh comparison shows to a great extent that you’re punting the problem back to the specification, and determining what the specification is is the hard part, and it is done through TESTING.

      1. 5

        sigh and facepalm at this part. It’s weird that people want to have such strong opinions about software, when it’s blindingly obvious not only that they’ve not worked in those domains, but also not anywhere near them.

        I wish I could upvote this twice. this article had basically no substance and 0 empirical support.

      2. 1

        Types were originally introduced as a solution to a hardware problem

        Another nice example here is that early versions of C didn’t actually check types.

        1. 1

          C didn’t check types because BCPL didn’t. BCPL was arguably the first language with “weak typing”; all its contemporaries with type systems enforced type safety.

      3. 6

        If such an arrangement exists, then it’s safe to assume that a language in which no erroneous programs will exist is possible.

        I think Gödel would have things to say about this statement…

        1. 2

          Gödel would say it’s a fine statement, as long as you don’t want to run all programs. The trivial case, of course, is the language that accepts no programs: no erroneous programs will ever exist in it. I think the real monster hides behind the slippery nature of defining “error” here.

        2. 6

          The type of the proposition is the proof of the proposition.

          No. The type is the proposition.

          An inhabitant of the type is equivalent to a proof.
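
          In Lean terms (my own sketch): the proposition is a type, and any term inhabiting that type is a proof of it.

          ```lean
          -- The proposition "P implies P" is the type P → P; the identity
          -- function inhabits that type, so the function is the proof.
          example (P : Prop) : P → P := fun hp => hp

          -- Modus ponens, same idea: the term is the proof.
          example (P Q : Prop) : (P → Q) → P → Q := fun f hp => f hp
          ```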

          1. 3

            Might need a little more research before a writeup is warranted. Also check out F*; it’s a good example of a Coq-like language with production usage.

            1. [Comment from banned user removed]

              1. 5

                I don’t think it’s that new, really. Java has the same thing. You could argue that Scheme wanted to go in that direction too, since only #f is truly false, and you’ll probably generate #f with a boolean predicate.

                1. 1

                  First, you know exactly what the code is doing.

                  I don’t get this reasoning at all. You know what the code is doing in the first case: it’s an if statement.

                  1. 1

                    It’s a conditional, but a number has been provided. What passes the condition? Any nonzero value? What about boolean true? Does boolean false fail the condition? What about nil?

                    1. 1

                      This is a total non-problem; you know how an if statement works.

                      1. 2

                        It seems like you don’t realize that different languages handle this differently. For example, in Ruby only nil and false fail a condition, which means that 0 would be “true”.
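
                        Compare Python (a quick illustration of my own), where 0, “” and [] are all falsy, unlike in Ruby:

                        ```python
                        for value in (0, 1, "", None, [], False):
                            print(repr(value), "truthy" if value else "falsy")
                        # In Python 0, "", None, [] and False are all falsy;
                        # in Ruby only nil and false are, so 0 is truthy there.
                        ```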

                        1. 0

                          I do understand that; you’re missing my point entirely, don’t worry.

                          1. 0

                            Perhaps if you made your point clearly I would understand. Currently you’re just making easily refutable statements.