The boolean blindness problem got me thinking of a more general painful case: the problem of having an inappropriate range for a function, a terrible legacy from C. The convention for many POSIX system calls (which either succeed or fail) is to return some sentinel value and stash the details in a global piece of shared state called errno. This is a crude hack to keep the range/type of the return value from colliding with the error handling. C needs algebraic data types.
In the case of C#, we see people using nullables: a nullable bool might be true or false, with null in the case of error. Haskell goes one step further and traps the boolean in a Maybe monad, so that you can’t escape the possibility of a Nothing at any point in the code. Still, a specific ADT that provides an appropriate range to describe every outcome would be ideal.
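As a sketch of what such a purpose-built ADT might look like in Haskell (the type and constructor names here are hypothetical, not from any library):

```haskell
-- A hypothetical check that can pass, fail, or error out.
-- Instead of a nullable Bool (where null silently encodes "error"),
-- give each outcome its own constructor.
data CheckResult
  = Passed
  | Failed
  | CheckError String  -- carries a description of what went wrong
  deriving (Show, Eq)

-- Callers are forced to handle every case explicitly;
-- there is no way to confuse "failed" with "couldn't run".
describe :: CheckResult -> String
describe Passed         = "check passed"
describe Failed         = "check failed"
describe (CheckError e) = "could not run check: " ++ e
```

The point is that the range of the function now says exactly what can happen, and the compiler's exhaustiveness checking enforces that every caller considers the error case.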
In C, what I generally do is define an enumeration and return that. Any output values are filled in via a pointer. For example, read in POSIX looks like:
ssize_t read(int fildes, void *buf, size_t nbyte)
I would define that something like this (although specifics may vary):
read_ret read(int fildes, void *buf, size_t nbyte, size_t *nbyte_read);
and read_ret would look something like this (the specific cases are hypothetical):

typedef enum {
    READ_OK,
    READ_EOF,
    READ_INTERRUPTED,
    READ_ERROR
} read_ret;
And read_ret might actually be generalized across all the POSIX I/O functions. But the important part is not to abuse the return value as anything more than an indicator of success or failure.
With ADTs, the read_ret type would contain a pointer to the data read iff it’s the success case; indeed, that would be the definition of a successful return.
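In a language with ADTs, the success constructor can carry the data itself, so "succeeded" and "has data" cannot come apart. A hypothetical Haskell rendering (the names are mine, not POSIX’s):

```haskell
-- Hypothetical result type for a read-like operation: the data
-- exists exactly when the operation succeeded, by construction.
data ReadResult
  = ReadOk String   -- success *is* having the data (String stands in for bytes)
  | ReadEof         -- end of input: no data, but not an error
  | ReadErr String  -- failure, carrying a reason rather than an errno code

-- Callers cannot touch the data without going through the success case:
bytesRead :: ReadResult -> Int
bytesRead (ReadOk s) = length s
bytesRead _          = 0
```

Compare this with the C version, where nothing stops a caller from reading the buffer after a failed call.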
It’s interesting to see how the names of things lose their importance as they begin to take on types and structure related to their meaning. You could redefine the Maybe type
data Maybe (A : Set) : Set where
  notHere : A -> Maybe A
  here : Maybe A
And all it would do is cause confusion, instead of fundamentally changing how the type behaves or is used. This ought to be the case anyway, as a consequence of alpha conversion.
This is one of the trickier things to learn when picking up a DT language. You’ll often get stuck because you accidentally defined a function that returns a Boolean, throwing away something meaningful that you learned in that computation. Usually the problem surfaces when a later proof obligation needs to know what you learned when you computed, say, that equality judgement.
To compare, here’s a Boolean
data Boolean : Set where
  tt : Boolean
  ff : Boolean
It’s sometimes called Two because it has only two inhabitants and is trivially isomorphic to any other two-inhabitant type. By contrast, here’s a typical propositional form called decidable equality. First we define the notion of a proof of equality
data Eq {A : Set} (a : A) : A -> Set where
  refl : Eq a a
and a notion of refutation
Not : Set -> Set
I’ll elide the details but simply note that you can only construct a value of type P if P is a true proposition. You can construct a value of Not P only if you have a refutation of P, a counterexample.
So finally we can define decidability
data Dec (P : Set) : Set where
  yes : P -> Dec P
  no : Not P -> Dec P
which is similar to Boolean but loaded up with proof terms related to the exact proposition at hand. It means that we can construct a value of Dec P only if we can either prove or refute P. Thus a decidable equality on natural numbers might read
eqNat : (n m : Nat) -> Dec (Eq n m)
And calling it will provide you with either a proof that n equals m or a refutation of such. This might be very useful for building more complex proofs.
On the other hand, Boolean equality of Naturals
boolEq : (n m : Nat) -> Boolean
will have basically the same function body but will result in a mere Boolean, throwing away a lot of work that was done.
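Both signatures can also be rendered in Haskell with GADTs, where Data.Type.Equality’s :~: plays the role of Eq and a function into Void plays the role of Not; Dec is modeled here as an Either. This is a sketch assuming a singleton encoding of Nat (all names are mine):

```haskell
{-# LANGUAGE DataKinds, GADTs, TypeOperators, EmptyCase, LambdaCase #-}

import Data.Type.Equality ((:~:)(Refl))
import Data.Void (Void)

data Nat = Z | S Nat

-- Singleton naturals: the runtime value mirrors the type index.
data SNat n where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- Dec P: either a refutation (P -> Void) or a proof of P.
type Dec p = Either (p -> Void) p

-- Decidable equality: the caller gets a usable proof or refutation,
-- not a bare Boolean.
eqNat :: SNat n -> SNat m -> Dec (n :~: m)
eqNat SZ     SZ     = Right Refl
eqNat SZ     (SS _) = Left (\case {})   -- 'Z :~: 'S m is uninhabited
eqNat (SS _) SZ     = Left (\case {})
eqNat (SS n) (SS m) = case eqNat n m of
  Right Refl -> Right Refl
  Left neq   -> Left (\case Refl -> neq Refl)  -- peel one S off the refutation

-- The Boolean version has essentially the same body,
-- but discards the proof term it just constructed:
boolEq :: SNat n -> SNat m -> Bool
boolEq n m = either (const False) (const True) (eqNat n m)
```

Note how boolEq is literally eqNat with the evidence thrown away, which is exactly the loss being described.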
A common thread in DT languages is building types that adequately capture everything learned during a computation; you usually get stuck precisely when your types are not expressive enough to have captured it.
What I take from Robert Harper’s article and your addition is that this all goes back to Hoare triples: for any given function call we have preconditions required of the input (for example, pred n requires the precondition n != 0) and we also get post-conditions on the output. The problem with booleans is that the post-condition exists only in the mind of the programmer, not as part of the language, so it cannot be checked statically.
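The pred example can be made concrete: with a GADT-indexed natural number, the n != 0 precondition becomes part of pred’s type, so the compiler checks it statically. A sketch in Haskell (this mirrors the dependent-types technique under my own names, not Harper’s exact code):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

-- Singleton naturals: the runtime value mirrors the type-level index.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- The precondition "n /= 0" is the type index 'S n: it is a type
-- error to call spred on SZ, so no runtime check (and no errno,
-- and no Boolean flag) is needed.
spred :: SNat ('S n) -> SNat n
spred (SS n) = n
```

The precondition has moved from a comment in the programmer’s head into the function’s type, where the checker can see it.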
Direct pattern matching let him rewrite the function without calling any functions that impose conditions on their input. The technique you demonstrated with dependent types lets you actually reify conditions into the type system, and then pass proofs that they hold as first-class values. Another direction I’ve seen is to put the conditions in annotations and have the compiler use a constraint solver to check that they all match up.
You could encode Hoare triples in a DT type (probably using a monad for context). The two are basically equivalent, but Hoare triples require a lot more formalism than some of the more specific dependent types you can use.
I’m not completely sure how to compare Liquid Types and DTs, but they’re in the same solution space. Liquid Types are weaker than full DTs, but checking them is decidable. As far as I can tell, Liquid Types are just slightly weaker forms of what are called Sigma types in the dependently typed literature, though that’s only my guess.
Actually, I hate booleans for a subtly different, subtly related reason.
Ultimately, when you create a boolean variable, you know that somewhere, someplace, you are going to branch on it.
I challenge you to contemplate this refactoring.
Move the branch to the place where you created the boolean.
Every time I have done that, my code improved.
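A sketch of that refactoring in Haskell (the validation example is mine): instead of computing a Boolean and branching on it later, branch where the fact is established, and return something that carries what you learned.

```haskell
-- Before: compute a Boolean here, branch on it somewhere else.
-- The branch site has to re-discover what isNonEmpty already knew.
isNonEmpty :: String -> Bool
isNonEmpty s = not (null s)

greetBefore :: String -> String
greetBefore name =
  if isNonEmpty name then "hello, " ++ name else "hello, stranger"

-- After: the branch lives where the knowledge is created, and the
-- result type (Maybe) carries that knowledge to every later use.
nonEmpty :: String -> Maybe String
nonEmpty "" = Nothing
nonEmpty s  = Just s

greetAfter :: String -> String
greetAfter name = case nonEmpty name of
  Just n  -> "hello, " ++ n
  Nothing -> "hello, stranger"
```

The boolean variable disappears entirely: the branch happens once, at the point where the fact was established.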
Weirdly, the email for this was misformatted, so that one of the comments was sent as the body text. For people who mostly just read the emails (like me), the article is more interesting than the text of the email.