This is such a great post – making this stuff accessible to more people is so important. I think this is the best introductory post I’ve seen yet, but here are some older ones that might complement it (I always find it can be handy to have a few explanations when learning something new):
This same notation is also used to describe language semantics, i.e. the description of what a program in a language evaluates to / how it executes. I know it’s not everyone’s cup of tea, but learning the basic syntax of natural deduction, predicate logic, and set theory has been a major win for me in terms of understanding PLT in general.
Like it or not, this is the notation used in papers going back to the 40s, and being able to actually read them is very useful.
I would add something which was required for me to go from “I sorta understand typing relations” to “I understand typing relations”.
First, what we are defining here is indeed a relation (a subset of a Cartesian product of sets). So, Γ⊢x:τ is a ternary relation R(Γ, x, τ). The funny ⊢ : syntax is confusing, but it’s just an atomic name for a relation which we write using weird and confusing mixfix notation. ⊢ and : don’t have a meaning by themselves.
Second, the way we define our relation is by writing inductive rules. We essentially say “if X and Y belong to R, then Z belongs to R”.
Third, the actual result is the smallest relation that satisfies all our rules, the least fixed point.
I feel like it’s weird to say that : doesn’t mean anything in itself when in many grammars, we say things like (λx:τ1.e): (i.e. we use : in the “common” way of labeling a type).
The turnstile is also its own thing in logic, with its own variety of meanings that all align to “given X, then Y”.
This sort of reading makes the lambda introduction rule much clearer as well.
Γ,x:τ1⊢e:τ2
------------------
Γ⊢(λx:τ1.e):τ1→τ2
the top says “given a context where x has type τ1, we have e : τ2” (the point being that x may be referred to somewhere in e)
the bottom part says “given any context, (λx:τ1.e) has type τ1→τ2”. Notably you no longer say anything about x in the context, since you’ve captured that inside the lambda itself. But in the top part of the rule you had been able to get the type of e when x has type τ1.
You’re very explicitly working within contexts, with a set of facts on the left of the turnstile (tracking your free variables), and being able to shrink that context down. It just feels like really absorbing the turnstile as a description of something in itself is important.
Saying that Γ⊢x:τ is a ternary relation is, of course, true. I just think it’s OK to try and grok the turnstile as its own symbol, given how often it shows up, and type signatures are a thing we see and we can absorb that detail as such.
This made me think - is it true that in order for a type system to be implementable, the typing relation has to be a function? Or at least expressible as a function.
I’m going back into Types and Programming Languages, and it has this paragraph:
The inversion lemma is sometimes called the generation lemma for the typing relation, since, given a valid typing statement, it shows how a proof of this statement could have been generated. The inversion lemma leads directly to a recursive algorithm for calculating the types of terms, since it tells us, for a term of each syntactic form, how to calculate its type (if it has one) from the types of its subterms.
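To make that concrete, here is a sketch (in OCaml, not TAPL’s actual code) of such a recursive algorithm for a TAPL-style toy language of booleans and numbers; all names are made up for illustration:

type term = TTrue | TFalse | TZero | TSucc of term | TIf of term * term * term
type ty = TyBool | TyNat

(* For each syntactic form, compute the term's type (if it has one)
   from the types of its subterms, exactly as the inversion lemma suggests. *)
let rec type_of (t : term) : ty option =
  match t with
  | TTrue | TFalse -> Some TyBool
  | TZero -> Some TyNat
  | TSucc t1 ->
      (match type_of t1 with Some TyNat -> Some TyNat | _ -> None)
  | TIf (c, t1, t2) ->
      (match type_of c, type_of t1, type_of t2 with
       | Some TyBool, Some ty1, Some ty2 when ty1 = ty2 -> Some ty1
       | _ -> None)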
This also makes me think about how relations are so common in math, but are pretty much nonexistent in programming - that I know of.
This made me think - is it true that in order for a type system to be implementable, the typing relation has to be a function? Or at least expressible as a function.
Non-algorithmic presentations of type systems take a bit more work to translate into type checking algorithms, but that doesn’t necessarily prevent that work from being done. That said, some sets of rules may not be expressible with decidable or efficient algorithms. Sometimes restricting them a bit can help make this easier. That’s part of the challenge of type system design!
For example, for a long time, people wanted to fit Martin-Löf Type Theory (i.e. dependent type theory) into type systems with type inference that could be checked on computers… but it turns out that in the general case that is undecidable. Over the years people figured out that by using a combination of bidirectional type checking and pattern unification you can restrict the programs people are able to write to make type inference tractable. A similar pattern played out with type checking for higher-rank types.
Yeah I think this is my main question in the comment below - https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_xbvbwg
That said, some sets of rules may not be expressible with decidable or efficient algorithms. Sometimes restricting them a bit can help make this easier. That’s part of the challenge of type system design!
The way I think of it is that math is “generative” – you have rules that build up huge, infinite sets, and then you reason about the sets
While programming is “recognition-based” – you write an algorithm to try to determine if a sequence of symbols or a (program, type) pair is in the set [1]
So I think you can start from either end. You can just “write the algorithm”, but then people can discover years later that the type system is unsound:
https://hackernoon.com/java-is-unsound-28c84cb2b3f
Also it seems that people discover years later that the type system is Turing-complete? i.e. with C++ templates – I believe they were originally intended as a way to avoid duplicating code, and then people discovered you can do crazy stuff like write regex/CFG interpreters at compile time with them.
There seem to have been multiple similar “discoveries” about Rust - https://news.ycombinator.com/item?id=26445332
I guess the problem with Turing-completeness is that you never know if the type checker terminates … and also from a practical perspective, I think those type checkers tend to be slow, and you have more trouble reasoning about their performance. (I think Java types before generics must have been a more or less linear algorithm? There is the Featherweight Java subset in TAPL. Probably Go was also designed to type check more or less linearly?)
Or you can start with the math, but here it is not clear to me how exactly you arrive at the implementation. When I mentioned that Rust core devs used Datalog to prototype the ownership system, I think what they are doing is trying to find a balance between the speed of type checking / decidability and correctness?
Then after they decide on the rules, they just implement them by hand, “proving it in their heads”? (Sorta like you write a recursive descent parser – you will end up with some minor divergences from the grammar, but in practice it’s good enough)
[1] I took these words from the original PEG paper (Ford 2004), which I think is a bit ingenious because it does away with the thing everyone has problems with (ambiguity). BUT it also retains the “declarative” nature of grammars, in that there are at least 2 separate algorithms to implement PEGs (backtracking and packrat parsing):
This made me think - is it true that in order for a type system to be implementable, the typing relation has to be a function? Or at least expressible as a function.
Yes. It has to be a function to be efficiently implementable. But no, it doesn’t have to be a function to be (possibly inefficiently) implementable. We can always implement a given type system by trying all possible proofs/derivations in parallel (i.e. by breadth-first search on the tree of all possible proofs). Such a search does not necessarily terminate. But if a proof exists, then it terminates.
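As a sketch of that idea (in OCaml; all names are made up): represent a rule system as a function from a goal to the premise lists of the rules that apply, and search by iterative deepening, which has the same effect as breadth-first search over all derivations:

(* A goal is a fact we want to derive, e.g. "term x has type t". *)
type goal = HasType of string * string

(* rules g returns one list of premises per rule whose conclusion matches g;
   an axiom contributes the empty premise list. *)
let provable (rules : goal -> goal list list) (g : goal) : bool =
  (* Is there a proof tree of depth at most d? *)
  let rec prove d g =
    d > 0 && List.exists (List.for_all (prove (d - 1))) (rules g)
  in
  (* Iterative deepening: terminates if a proof exists, but may loop
     forever when none does, exactly as described above. *)
  let rec deepen d = prove d g || deepen (d + 1) in
  deepen 1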
Hindley-Milner is efficiently implementable, but its typing relation is not a function, as terms can have different types (there’s a most general, principal, type among those types)
I’m pretty sure it depends on how the rules are presented. You can have algorithmic presentations and non-algorithmic presentations of the same type system.
If one typing relation is a function and another one is not, they are different typing relations! The smaller one can be more useful though, if the bigger one is a transitive closure of the smaller one.
Less abstractly, both of these are true:
for non-trivial type systems (anything with subtyping or parametric polymorphism), there are actually multiple distinct ways to assign types to subexpressions of a program such that everything checks out
there exists a most general typing, such that any other typing that works is just a specialization
This also makes me think about how relations are so common in math, but are pretty much nonexistent in programming - that I know of
There is a programming language which somewhat implements derivation of relations. It is called Prolog. And see also its various variants, such as Mercury. But I’m not sure whether Prolog implementations search across all possible proofs. And I have a secret plan to create such a Prolog implementation, one which will search truly all proofs.
In HM and in more powerful type systems, id has type ∀x . x -> x. In general, in PLT we mostly consider type systems which have the principal type property, i.e. terms have only one type.
which have the principal type property, i.e. terms have only one type.
That’s not correct. “Principal type” means that there exists a most general type such that any other type would be a specialization of this most general type. It doesn’t say that the principal type is the only type.
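For example, in OCaml:

(* id gets the principal type 'a -> 'a; every type it can be used at
   is a specialization of that most general type. *)
let id x = x
let n = id 42     (* here id is specialized to int -> int *)
let s = id "hi"   (* here id is specialized to string -> string *)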
One thing that was mentioned in passing but might help with some elaboration is that the inference rules can be seen as defining a way to construct proof trees. Rules with premises are the branches, and axioms are the leaves. In the case of a type system, this would let you construct a proof that a program conforms to a given type.
To build a proof tree you can stick the rules together a bit like lego (with care to make sure that the premises match appropriately with the conclusions). Some example proof trees can be found in “Bidirectional Typing Rules: A Tutorial” by David Christiansen.
Some insights that I found helpful as a result of this perspective:
You can think of an algorithm for a type checker as an algorithm for traversing these proof trees.
You can exploit this in languages like Coq or Agda to encode the inference rules as an inductive datatype. This can let you prove properties about the type system itself, and prove that the implementation of a type checker matches the specification of the type system.
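A rough OCaml approximation of that last point (Coq/Agda can index derivations by the term as well; this sketch indexes only by the resulting type, and the rules are made up):

(* Each constructor is an inference rule; a value of type 'a deriv is a
   proof tree showing that some term has type 'a. *)
type _ deriv =
  | DTrue : bool deriv                                  (* axiom: true : bool *)
  | DZero : int deriv                                   (* axiom: 0 : int *)
  | DIsZero : int deriv -> bool deriv                   (* one premise above the line *)
  | DIf : bool deriv * 'a deriv * 'a deriv -> 'a deriv  (* three premises *)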
Yeah I think that makes sense – I would also make an analogy to parsing (which I know better)
leaves : axioms in type checking (true has type bool) / tokens aka terminals in grammars/parsing
edges/branches : inference rules in type checking / productions in grammars
path through the tree: proof that a term has a given type (the term and type are in the typing relation) / derivation showing that a sequence conforms to the grammar (it’s a member of the set)
you start at the axioms, apply a bunch of rules, and eventually end up at the program (either its type, or its membership in the set)
Now a couple of questions:
algorithm for finding a path through the tree : ??? / YACC finds a derivation for the LALR subset of CFGs; the parse tree “recapitulates” the derivation
there is more than one path through the tree: ??? / the grammar is “ambiguous”, I think YACC can pick an arbitrary interpretation
I guess Datalog is like the equivalent of YACC? I experimented with Prolog a couple years ago to write a type checker, without that much success. I did find somebody who said “Prolog is not programming, and it’s not math”, and I think I agree. (The over-dependence on ordering and cuts is a reason it’s not math, but the difficulty of giving good error messages and reasoning about performance is a reason it’s not programming)
I haven’t tried Datalog, but I know that Rust core devs used the Souffle datalog compiler to prototype their type system.
I think of Datalog as something that can perform a massive search through the tree for SOME type systems, although it’s not clear to me when it fails. Is there an equivalent of “LALR subset” for Datalog?
Or I guess Coq and Agda are the most general solutions? What’s the difference between Datalog and them? My guess off the cuff would be that Datalog can be faster for a subset of type systems (similar to how YACC guarantees linear time for LALR grammars)??
And then you also have
Hand-written algorithm that doesn’t explicitly construct the path/derivation : ??? / Recursive Descent
I guess the name for that is just – “I interpreted the typing rules as something you can check and not a way to generate proofs (a kind of inversion)” – BUT I’m not clear exactly when that falls down?
Or I guess Coq and Agda are the most general solutions? What’s the difference between Datalog and them?
Coq and Agda are simply Haskell with dependent types added. In other words, these languages are not something completely alien. They are relatively normal programming languages, i.e. you simply directly describe the computation you need. Datalog/Prolog is something completely different. They are alien. You don’t express the algorithm directly; you specify how to derive relations.
Coq and Agda don’t have Prolog-like capabilities, i.e. they don’t have Prolog-style proof search (as far as I know)
Coq and Agda are simply Haskell with dependent types added.
You might be able to argue this for Agda, but not Coq. In particular, Haskell’s internal logic is not consistent (the empty type is inhabited!) but this is not the case for Coq.
Again: Coq is (conceptually) just Haskell with dependent types. Coq is able to prove mathematical facts thanks to the Curry-Howard correspondence, i.e. “X is a proof of fact Y” corresponds to “X is an expression of type Y”, and this gives us the ability to represent arbitrary proofs. Also, in addition to its dependently typed core, Coq has automation support (so-called tactics). Tactics let Coq infer some proofs automatically. But even if Coq did not have this automation, it would still be able to represent proofs; they would just be bigger.
I suggest reading Barendregt, “Lambda Calculi with Types”; this book shows what dependent types are and how they can represent proofs. It also actively uses type system notation (aka computer science metanotation), with explanation. https://repository.ubn.ru.nl/bitstream/handle/2066/17231/17231.pdf (but this PDF contains a small number of typos; if you really want, I can try to find a better version).
I recommend watching this video: https://www.youtube.com/watch?v=dCuZkaaou0Q (it is linked from the comments on that Alexis King answer). It is really cool! It explains in simple terms a lot of facts about type system notation (the author calls it “computer science metanotation”, CSM).
When I said that Prolog is alien, of course, I didn’t mean that it is absolutely impossible to understand. :) I simply meant that Prolog doesn’t follow the usual execution model of normal programming languages, as opposed to Coq, Haskell, C++, etc. But Prolog is, in fact, simple. Essentially, Prolog is CSM. In the video above, at 6:21, the author says that CSM is a non-deterministic language. And at some point he says:
in fact the language [CSM] feels a little bit like Prolog and as an experiment I wrote a toy compiler that translates the latex expressions in my Emacs buffer for these notations
In short, Coq/Agda is a full-blown programming language with conditionals, recursion, data types and all such. You can write normal (pure functional) programs in it (assuming that we call Haskell programs “normal”). Also, Coq has dependent types, which allow you to encode proofs and many other things. And Coq also has automation, which sometimes allows you to infer these proofs (so you don’t have to enter them explicitly). In Prolog you cannot write “normal” programs. It has a completely different execution model. But you still can write all algorithms in Prolog; you just have to encode them in a different programming model. And Prolog matches CSM. I.e. Prolog is a hack which makes normal programming harder, but which is good for representing CSM.
But keep in mind that Prolog cannot represent all CSM features. In particular, Prolog cannot represent capture-free substitution (it is discussed in the video above). If you want to represent substitution, you need a Prolog variant with lambda support, i.e. lambda-Prolog ( https://www.lix.polytechnique.fr/~dale/lProlog/ ), in particular Makam ( http://adam.chlipala.net/papers/MakamICFP18/MakamICFP18.pdf, a very good read!). Makam matches CSM even better than Prolog. But, as far as I understand, Makam doesn’t search truly all possible proofs. (I’m not sure here.) I have in my TODO list the item “Try to find a Prolog/CSM implementation which truly searches all proofs. If such doesn’t exist, then write my own”.
Also, here is another comment of mine in this thread, where I present how to represent all ZFC proofs in one single data type in Lean 3 (Lean 3 is very similar to Coq and Agda): https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_rtpme5 . This is not how proofs are usually represented in dependently-typed languages, but it shows the elegance of dependently typed programming.
Also, you recently asked something like “is there some LALR equivalent in computer science metanotation”. As far as I understand, you asked whether there is some subset of all possible CSM programs which is good for simple and efficient implementation. Well, yes, it exists! I would call it “CSM with specified modes”. At 7:53 in the video above, the author says that some parts of CSM are “input” and some are “output”. Labels “input” and “output” attached to rules are called “modes” in the Prolog world. These labels guide execution; they tell how to execute our “CSM/Prolog program”. Not all CSM programs can be labeled with modes. But if you were able to specify modes, then your program can be easily executed; i.e. it now belongs to the “LALR subset”. There are Prolog variants which mandate that you specify modes, such as Mercury. I said above that I’m not sure whether there are Prolog implementations which actually search all proofs. Fortunately, if you were able to specify modes, then many Prolog implementations will do, i.e. all of them will now be able to correctly execute your program. But I will repeat that nearly all of them lack substitution, so you are limited to Makam. So the conclusion is this: if you have no modes, then I’m not sure whether there exists any single CSM implementation which will do. If you have modes, then you will be able to correctly execute your CSM in Makam.
If you have any more questions, ask, I’m happy to answer
Thanks for the response! I did watch the Steele video on CSM a few years ago when it came out, but I will watch it again – I actually don’t remember the Prolog bits
As far as I understand, you asked whether there is some subset of all possible CSM programs which is good for simple and efficient implementation. Well, yes, it exists! I would call it “CSM with specified modes”. At 7:53 in the video above, the author says that some parts of CSM are “input” and some are “output”. Labels “input” and “output” attached to rules are called “modes” in the Prolog world
This part is interesting, thank you – I will see if watching the video causes anything to click
In fact that video might have been around the time when I played with a type inferencer for Python in Prolog, without much success - https://github.com/andychu/hatlog
Combining my brief experience and the essay, that’s where I got “it’s not programming and it’s not math”. (Judging from the comments, there is some debate about that. However, now that I read the comments again, I stand by my judgement – the people advocating Prolog have not accomplished much with it. It’s a bit like Forth, but less practical.)
The big question is: why don’t we have a library of Prolog type checkers for Java, Go, Rust, etc., similar to the library of ANTLR parsers for Java, Python, JavaScript, and more? (The analogy hints at an answer: the generated ANTLR parsers aren’t that good.)
FWIW my main interest is to write a SHORT and correct type checker for the metalanguage of https://oilshell.org – typed Python with algebraic data types (stretch goal: control flow dependent null safety, which MyPy has)
It is actually based on ~50K lines of code I’ve written over the last few years, that pass MyPy – what type system features do I use?
Surprisingly, this is not a trivial question!!! MyPy is very rich.
And there is a constraint of compatibility with the C++ type system – MyPy and C++ have a VERY large common subset!
So I think that is an interesting problem, and a bit under-explained from the engineering perspective
I read the first 10 chapters of TAPL, and reviewing the basics was useful. However, I don’t think the later parts were particularly well-suited to the engineering of a type checker; e.g. I came away with the impression that some implementations were asymptotically inefficient, but actually I’m not even sure how to check that
The parts on “subtyping” didn’t help me, I think they were concentrating more on type inference and such
Thanks for the tips … I am probably going to ignore the dependent types stuff since the type system I want to write is much simpler, but knowing the rough relationships is useful, especially connections to practical tools (the Souffle datalog one seemed most interesting to me)
Okay, let me tell you more about Prolog and its implementations.
But first, a disclaimer. I deeply understand mathematical logic, and I deeply understand how Prolog should work and what hypothetically can be achieved in Prolog. I have a master’s degree in mathematical logic. But I don’t know what features actual Prolog implementations have. In fact, I never actually wrote any programs in Prolog-like languages. Also, I really deeply understand CSM (computer science metanotation).
Let’s begin with an example. Here is one simple type system written as CSM:
Context-free grammar (i.e. abstract syntax, i.e. BNF):
Terms: x, y ::= [x] | length(x) | 42
Types: t, u ::= int | arr(t)
Rules (i.e. Horn clauses):
----------------
|- 42 : int
|- x : arr(t)
----------------
|- length(x) : int
|- x : t
----------------
|- [x] : arr(t)
Note that [x] above means an array with exactly one element: x. I.e. I cheated by describing one-element arrays only. :)
As can easily be seen, the relation |- length([[42]]) : int is provable in the CSM above.
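Spelled out, the derivation chains the rules, starting from the axiom:
|- 42 : int                  (axiom)
|- [42] : arr(int)           (rule for [x])
|- [[42]] : arr(arr(int))    (rule for [x] again)
|- length([[42]]) : int      (rule for length)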
Also note that in the CSM above, |- has no meaning without : and vice versa. I.e. |- x : t is simply an atomic construct.
This was a very simple example of CSM. If you didn’t understand it, then it will be very hard to understand anything below. Ask any questions if you want.
Okay, now let’s translate this CSM to Prolog. I will use Mercury ( https://mercurylang.org/ ). So, here we go:
% Comments start with "%"
% Note that this code will not compile yet; I will explain later how to compile it
% Here is the header, you may ignore it:
:- module hello.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.

% Now the grammar
:- type term                % Terms
    --->    sing(term)      % [x]
    ;       length(term)    % length(x)
    ;       forty_two.      % 42
% "type term" above is simply an algebraic data type. It corresponds to "enum Term { ... }" in Rust

:- type typ                 % Types. I used "typ" instead of "type", because "type" is a Mercury keyword
    --->    int             % int
    ;       arr(typ).       % arr(t)

% Here we declare our typing relation, i.e. "|- x : t", i.e. "x has type t"
:- pred has_type(term, typ).    % |- x : t

% Now the rules. First the conclusion, then the premises
has_type(forty_two, int).
has_type(length(X), int) :- has_type(X, arr(T)).
has_type(sing(X), arr(T)) :- has_type(X, T).

% Now main
main(!IO) :-
    ( if has_type(length(sing(sing(forty_two))), T) then
        io.write(T, !IO)
    else
        io.write_string("fail", !IO)
    ).
typ and term are what I call syntactical types, i.e. types of the nodes of our AST. They should not be confused with the actual types of our target language (I will call them target types), such as int and arr(int). Target types are represented by values of the syntactical type typ.
But… the code above will not compile, because I left out the mode and determinism markers. In other words, Mercury is not able to process arbitrary CSM; we must explicitly tell Mercury how it should process it, i.e. we should guide Mercury’s execution. Okay, let’s supply the mode and determinism markers (this is a diff):
-:- pred has_type(term, typ).
+:- pred has_type(term::in, typ::out) is semidet.
Okay, now the code will compile. You may see warnings, but they are okay. The program will print the correct target type, i.e. “int”.
What did we just do? We specified that the term is input and the type is output. I think this is completely reasonable. See again at 7:34 in Steele’s talk. In his video the term is input and the type is output, too. He is essentially talking about modes.
How to run this code? I installed Mercury on Debian using these instructions: http://dl.mercurylang.org/deb/ . Then I created the file hello.m with the contents above. Then I compiled it using mmc --make hello, and ran it using ./hello; echo (the echo is needed because ./hello doesn’t print a newline).
I hope you understood the program above; otherwise I think you will not understand anything below.
Now let’s discuss desirable properties a hypothetical Prolog implementation may have.
Property 1. It should have support for user-defined algebraic data types (ADTs), such as typ and term above. As you can see, this is mandatory for translating CSM to Prolog. As we can see, Mercury has this property. Also note that an ADT definition doesn’t necessarily look like an ADT definition. For example, let’s consider Makam. Here is the Makam tutorial: https://astampoulis.github.io/blog/makam-tutorial-01/ . The author gives an example there that looks like declarations of some functions. But conceptually it defines an ADT named expr and its variants called stringconst, etc. So, Makam satisfies property 1, too, despite its documentation seeming not to mention the term “ADT”. But Datalog, at least as described in Wikipedia ( https://en.wikipedia.org/wiki/Datalog ), seems not to satisfy this property. But it is possible that actual Datalog implementations add this property.
Property 2. Support for lambdas, aka binders, aka capture-free substitution. Well, “capture-free substitution” and “support for lambdas” are probably different things, but they are still very closely related. This is the same capture-free substitution Steele talks about at 33:15. Okay, so lambdas and substitution are very desirable properties. For example, let’s consider https://en.wikipedia.org/wiki/Simply_typed_lambda_calculus . In the section “Typing rules” we see rule 3. It involves a lambda. In this article http://adam.chlipala.net/papers/MakamICFP18/MakamICFP18.pdf we can see that this rule can be encoded in Makam like so:
typeof (lam T1 E) (arrow T1 T2) :-
    (x:term -> typeof x T1 -> typeof (E x) T2).
Yes, this is exactly the same rule, modulo variable names. Note that E here refers to what I call a syntactical function. Or you may call it a syntactical substitution. Or an AST with a hole. Also, note that idiomatic Makam code doesn’t use contexts (the context is called Γ in that Wikipedia article), so we don’t see any Γ in the Makam code. So, well, this Makam code absolutely exactly matches rule 3 from Wikipedia, but this may be hard to understand. Anyway, what I want to say is this: property 2 is desirable for encoding CSMs, and Makam has this property.
Property 3. There is no need to augment the CSM with markers to process it. Well, as we saw, Mercury doesn’t satisfy this property, because one needs to specify mode and determinism markers.
Property 4. It is possible to enter any CSM into this Prolog implementation. This property doesn’t hold for Mercury, because not every CSM can be augmented with correct modes.
Property 5. If a fact has a proof, then the implementation can find it. This property seems not to hold for original Prolog. As far as I understand, original Prolog has quirky, unreliable semantics. Prolog looks like a logic engine, but (as far as I understand) it doesn’t implement logic correctly. So, AFAIK it is possible to construct a CSM with a particular fact such that the fact is provable, but Prolog will not find its proof. AFAIK this is exactly the same complaint we see in the article you gave me: https://www.amzi.com/articles/prolog_under_the_hood.htm . So, Prolog is dishonest and unreliable. (When I say “unreliable” I mean “unreliable as a logic engine”.) So, please, never use original Prolog. But AFAIK you can enter any CSM (assuming it has no lambdas) into Prolog, without any additional markers. So, it seems Prolog satisfies properties 1, 3 and 4. Well, there is some cheating here: Prolog is untyped, so it doesn’t have ADTs, but AFAIK we can somewhat emulate ADTs using functors, so let’s assume property 1 holds. Also, AFAIK Prolog doesn’t have lambdas, so property 4 holds with the condition “assuming the CSM doesn’t have lambdas”. So, in Prolog we have 1, 3 and 4, but we don’t have 5. Mercury seems to be honest. If there is a proof, Mercury will find it. So, Mercury is honest; you can rely on it. Mercury is a true logic engine, as opposed to Prolog. The Prolog woes from https://www.amzi.com/articles/prolog_under_the_hood.htm don’t apply. Mercury has 5, but it doesn’t have 3 and 4. So there is a trade-off here: either we restrict the set of possible programs and place a burden on the programmer in the form of writing mode markers (Mercury), or we allow somewhat arbitrary CSMs but get unreliable, broken semantics (original Prolog).
At this point you may ask: “But what exactly do you mean when you say that Prolog is unreliable?” Well, let’s consider this Prolog program:
f(X) :- f(X).
f(a).
Here a is some fixed constant and X is a variable. Then I asked Prolog (I used SWI-Prolog) whether f(a) is true. Prolog looped and did not produce any answer. It failed with a stack overflow. But it is obvious that f(a) is true! It is simply present in our hypotheses! So, yes, the main problem with Prolog from my point of view is here: it is very easy to trick Prolog into an infinite loop. And for this reason I state that Prolog doesn’t satisfy property 5.
Now let’s compare with Mercury. Well, I just entered a similar program into Mercury and got an infinite loop, too. :( But I have read here https://www.mercurylang.org/information/doc-latest/mercury_ref/Implementation_002ddependent-extensions.html that I can specify --check-term to check for loops. I specified this option and yes, I got a message at compile time. So, yes, if your program belongs to a particular set of programs (this set inclusion can be tested at compile time by specifying particular options), then you are guaranteed to get a proof if one exists. (Well, everywhere I say “get a proof” I don’t mean getting the actual proof; I mean that the implementation can say “yes, it is true”, or that the implementation will produce some answer, say, the target type of your term.) (Also, Mercury doesn’t know that the function “io.write” terminates. So when writing programs for real you should specify the option --enable-term instead [see https://www.mercurylang.org/information/doc-latest/mercury_ref/Implementation_002ddependent-extensions.html ] and mark which functions you want to check for termination.) (Also, I spent some time playing with --check-term. It seems that this checker is very stupid; in a lot of cases it cannot prove termination when I know for sure that the program terminates. But still I will say that overall Mercury is reliable enough.)
At this point you may ask: “But what if Prolog has a termination checker, too?” Well, possibly; I don’t want to check this. Anyway, I simply have a feeling that Prolog is an unreliable mess. For Mercury I don’t have such a feeling. All the checks already present in Mercury, together with --check-term, make it a very reliable system.
Okay, so the Prolog program above turns out to be a very hard task for Prolog implementations. Either it is rejected at compile time (this is a violation of property 4), or it runs and loops forever without producing an answer (this is a violation of property 5).
Recently I said this:
I have in my TODO list the item “Try to find a Prolog/CSM implementation which truly searches all proofs. If such doesn’t exist, then write my own”
I meant that I want to find (or write) a Prolog implementation which satisfies properties 1, 3, 4 and 5 at the same time. I don’t know whether such an implementation already exists. And I know how such an implementation would work, i.e. I can imagine the algorithm. In simple terms, such an algorithm will try everything in parallel. Such an implementation will produce an answer (i.e. “true”) for the above program in finite time. But there is another trade-off here: such an implementation will usually be slower than alternative implementations, such as Mercury. Mercury has its restrictions and mode markers for a reason: they allow an efficient implementation.
Prolog programmers take note: unlike Mercury, Prolog programs commit to the first solution of Gc. The Prolog equivalent of the above goal would have Y = 9 as its only solution, not Y = 25 or Y = 49
This seems like another proof of Prolog’s unreliability (but I didn’t try to deeply understand this sentence).
Property 6. When there are multiple answers, the implementation can find them all. For example, when a given term can have multiple target types. Well, Mercury has this property; I was able to write a program which exploits it: https://paste.debian.net/1289302/ . I had to use solutions for this.
Property 7. The implementation is typed. Well, original Prolog doesn’t have this property. Mercury does. Makam does, too.
Property 8. The implementation can find answers with free variables. (Yes, this property is distinct from property 6.) For example, let’s assume that our first CSM example also has syntax for empty lists, i.e. our ADT for terms now looks like so:
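:- type term                % Terms
    --->    sing(term)      % [x]
    ;       length(term)    % length(x)
    ;       forty_two       % 42
    ;       empty_list.     % [] -- the new empty-list syntax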
Unfortunately, this would mean that solving has_type(empty_list, U) will get the answer U = arr(T), i.e. the answer will contain a free variable. This seems not to be supported by Mercury. So for such situations you will need some unnatural encoding. So, Mercury does not have property 8.
Property 9. Computation always terminates. Yes, you may say that this is the same as property 5. But it is not. Property 5 says: if there is at least one solution and we asked for at least one solution, then some solution will be printed and then the program will terminate. But if there is no solution, it is okay to loop forever. Property 9 says that the program will always terminate, even if there is no solution. Properties 1, 4, 5 and 9 cannot all hold at the same time, because this would mean that we can determine in finite time whether a given formula is provable in a given logic. This is impossible, because we can encode a Turing machine as a CSM, and this would mean that we can check in finite time whether a Turing machine terminates, which is, as we know, impossible.
I said above that I can imagine an algorithm which has properties 1, 3, 4, 5. It will not have property 9. I.e. it will loop forever in many cases when there is no solution.
Also, that hypothetical algorithm will have property 6. We can run a CSM which specifies the axioms of propositional logic, and the program will run ad infinitum, printing all provable facts. The program will run forever, and any provable fact will eventually appear in its output.
End of list of properties.
Okay, so I hope one day I will check whether various Prolog implementations have these properties (this may happen, say, in 10 years). This is the list of Prolog implementations I will consider:
You may try to do the same. At least you now know what to look for.
FWIW my main interest is to write a SHORT and correct type checker for the metalanguage of https://oilshell.org
I think the implementation of Oilshell itself should be in some well-known language. Haskell and OCaml are good for implementing languages. Also see https://hirrolot.github.io/posts/compiler-development-rust-or-ocaml.html . If you also want Oilshell to be fast, then possibly Rust is a good choice. Yet another option is to wait until Mojo is released.
But okay, let’s assume you chose to implement Oilshell in your own Python dialect. What language should its type checker be implemented in? Well, again, I suggest simply using some “normal” language, such as Haskell or OCaml. This will allow you to directly write your type checker. It will be relatively short. You may also choose something Prologish. But as I described above, this is a minefield. Still, depending on the exact task, it may be an acceptable solution.
If you already more or less understand the algorithm of your type checker, i.e. you understand its control flow, then this means you have already assigned modes to your typing judgements in your head. This means Mercury may be a good solution. Another situation is if you can write the rules, but don’t understand the control flow of your checker, i.e. you have no algorithm. Then assigning modes will be difficult.
Anyway, if you really want to try something Prologish, then try to write your rules in a text editor in Prolog syntax. Then try to understand which properties from the above list you will need from a Prolog implementation. And then find a Prolog implementation which has these properties. :)
But again: all these Prolog variants don’t get as much care (i.e. devs’ effort) as OCaml and Haskell. Also, it is possible they will require some effort to integrate with a bigger system, i.e. to pass information from them and to them.
control flow dependent null safety
Sounds very unusual. I have no idea how to implement this in Prologish languages. OCaml and Haskell recommended.
Basically I think Prolog does not have the YACC property of helping you check that your grammar is implementable efficiently, and of reporting ambiguities
As I said above, Mercury has similar properties.
OK so then my question is something like – can I translate type checkers in OCaml to Prolog with modes?
If you already have a type checker in OCaml, why translate it to Prolog? But okay, if you want, then it is simple. If you have a function f : a -> b, then its Mercury equivalent will be :- pred f(a::in, b::out) is det.
Does Prolog (or datalog) have any Yacc-like properties that make it better than a general purpose language, when used as a metalanguage?
Yes, Prologish languages have an advantage: they directly correspond to CSM, i.e. they allow you to write typing rules directly. Some of them, such as Makam and Mercury, also seem to have capture-free substitution built in, which is very good, too.
Why did TAPL choose ML for their implementation language rather than Prolog/datalog?
I think because they wanted to show how the program is actually implemented.
And again I think even the ML implementations are far from “engineering reality”, as far as I can see
Rust was implemented in OCaml first. Coq is implemented in OCaml. CompCert is implemented in OCaml. Isabelle is implemented in Isabelle/ML, which is an ML dialect. So, OCaml is pretty good “engineering reality”. It is a full-blown programming language. You can do anything with it, including writing ActivityPub servers (I recently saw an article on writing an ActivityPub server in OCaml).
(Personally I know Haskell, but don’t know OCaml. Still, I have read many articles on OCaml and I see that it is a good language.)
then where does “Prolog with modes” fall down?
Prologish languages allow very concise programs compared to general-purpose languages. But at the same time, it is possible that at some point you will hit some unavoidable restriction of such a Prologish language. I.e. Prologish programs are shorter, but the languages are very strict. (Also, this post [ https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_b8ricb ] was written as a reply to yourself. I only saw it by chance.)
Right he says the first metalanguage is non-deterministic, but this one is deterministic
I suggest looking carefully at these two slides and understanding why one is deterministic and the other is not. In the slide at 6:00, both conclusions have the same form: no_conflict with arbitrary parameters. So if we want to check no_conflict for some set of arguments, we don’t know beforehand which rule to apply, so we have to run both. But note that in this slide we see a very small part of the CSM. We don’t see the whole CSM, so we don’t know the input/output modes. But even if all arguments to no_conflict are input, i.e. all of them are known, we still don’t know which rule to choose. I.e. even in the simplest case we have non-determinism.
Now let’s look at the slide at 7:00. This time it seems we see the whole set of rules. And we know the intended meaning of our relation (i.e. ... |- ... : ...): we know that it is supposed to mean “has type”, and thus we know that the gamma/context and the term are inputs and the type is output. Now we can ask ourselves: is it possible, from the shape of the inputs alone (i.e. the context and the term), to see which rule to apply? Yes, it is possible! If the term is a variable, then the first rule applies; if the term is a lambda, then the second rule applies; etc. I recommend trying to imagine the full algorithm for type checking for this slide.
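For reference, here is a minimal sketch of that algorithm in OCaml (names made up; the context is a simple association list); the shape of the input term alone determines which rule applies:

type ty = TInt | TArrow of ty * ty

type term =
  | Var of string
  | Lam of string * ty * term   (* lambda with an annotated argument *)
  | App of term * term
  | Lit of int

(* The context and the term are inputs; the type is the output. *)
let rec type_of (ctx : (string * ty) list) (t : term) : ty option =
  match t with
  | Lit _ -> Some TInt
  | Var x -> List.assoc_opt x ctx                   (* variable rule *)
  | Lam (x, t1, e) ->                               (* abstraction rule *)
      (match type_of ((x, t1) :: ctx) e with
       | Some t2 -> Some (TArrow (t1, t2))
       | None -> None)
  | App (f, a) ->                                   (* application rule *)
      (match type_of ctx f, type_of ctx a with
       | Some (TArrow (t1, t2)), Some t1' when t1 = t1' -> Some t2
       | _ -> None)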
(delayed response) Thanks for the detailed response! The first example makes sense to me – it’s not too different from the Bool-Int language that’s the first one in TAPL (typed_arith, I think they call it).
And yeah it’s not surprising that Mercury can easily deal with it.
So it’s like ~50 lines each for the type checker and evaluator. I actually like TypeScript now, having not used it before. The type system does seem as expressive as OCaml’s, or even more !!!
Also thanks for the Datalog links at the end – those are close to my interests, and I also want to make a little HTML demo for my statically typed language, so I might take some cues. (actually I just noticed the Lingo one is 17 MB of JS!! I want something a bit simpler)
The goal may be to write enough of a language to compile a Mandelbrot generator to WASM, from a Lispy syntax, but statically typed.
Okay, so I hope one day I will check whether various Prolog implementations have these properties (this may happen, say, in 10 years).
Hm, about those 9 properties… It’s an interesting thing to think about, and I’m not qualified to judge, but I would say there are some fundamental reasons that programming is not math.
Basically you could maybe have something with those properties for toy languages… but to me the interesting goal is whether you can specify and prototype real languages (again, like parts of Rust were prototyped with Datalog). So again, where does it fall down:
There is a big gap there, and I think if you attempt such a thing, you’ll see some stark engineering tradeoffs. I mean, the most obvious one is that exponential algorithms are not practical.
Sounds like a great goal though! I want programming language impls to be short and correct. (And the TypeScript experiment has turned out to be interesting – the total is 750 lines now, but most of it is parsing, not types or semantics, even though it’s a Lispy syntax !! )
Those problems sound great for a Ph.D. , probably even too ambitious for one :)
My former teammate Fergus Henderson (from ~2006) worked on the Mercury language before we worked together. And there is also a guy on Reddit /r/programminglanguages who worked on Mercury
To me it seems like you need a blog to explain this stuff :-) Can we move CSM and programming closer together? It’s a worthy goal!
I have found the blog to be great for tracking progress on ideas and implementations, and keeping yourself honest!
These examples are trivial. They can easily be rewritten in any language which supports algebraic data types and imperative programming at the same time, such as Rust and OCaml. (But in Rust you have to mess with lifetimes, so I don’t recommend Rust.)
In OCaml the code will be similarly short. Again, I suggest this article on OCaml: https://hirrolot.github.io/posts/compiler-development-rust-or-ocaml.html . OCaml syntax may be seen as difficult, but it is just syntax. On a conceptual level the same simple things happen as in your TypeScript code. The only actual semantic difference is this: in OCaml, mutable values have to be specially marked with ref, similarly to Cell and RefCell in Rust. In exchange you get a proper type system (as described in the King article above).
as expressive as OCaml’s, or even more !!!
This is an absolute exaggeration. OCaml supports many advanced type features, such as GADTs.
The goal may be to write enough of a language to compile a Mandelbrot generator to WASM
:) I recently thought about generating the Mandelbrot set. And surprisingly, I realized that no tools I am aware of can precisely render it. Moreover, tools are even unable to precisely draw the graph of “sin(1/x)”. Go draw it various ways (using Google or Mathematica or something else). You will see artifacts. So possibly one day I will create a tool for proper rendering of such graphs
but most of it is parsing
Well, for every language there are good parsing libraries (or tools). Just use them. They are cool. Parsing preexisting languages, such as Rust, may be challenging (they often require some hacks). But the libraries are perfect for creating your own languages. Just make sure to use context-free parsing, not PEG and not parser combinators (parser combinators are Rust’s nom, Haskell’s parsec, etc.). I can tell you why I think so, if you want.
To me it seems like you need a blog to explain this stuff
I watched the first ~40 minutes of the Steele video again – can’t believe that was over 5 years ago when I watched it the first time (when I knew much less)
Essentially, Prolog is CSM
So this is exactly WHY I hacked on that Python type inferencer in Prolog – I wanted it to be true, but I think it turns out to be NOT true.
And I think you are essentially saying the same thing:
I have in my TODO list the item “Try to find a Prolog/CSM implementation which truly searches all proofs. If such doesn’t exist, then write my own”.
Basically I think Prolog does not have the YACC property of helping you check that your grammar is implementable efficiently, and of reporting ambiguities. (Although YACC also falls short – I don’t use grammars for all parsing, only some parsing)
At 7:53 in the video above, the author says that some parts of CSM are “input” and some are “output”. Labels “input” and “output” attached to rules are called “modes” in the Prolog world. These labels guide execution; they tell how to execute our “CSM/Prolog program”. Not all CSM programs can be labeled with modes
Right he says the first metalanguage is non-deterministic, but this one is deterministic (although it says Deterministic? in the slide title, probably because that fact is implicit in the paper – they didn’t call that out)
OK so I will have to look up “modes” in Prolog … maybe that helps. I didn’t encounter it the first time around.
OK so then my question is something like – can I translate type checkers in OCaml to Prolog with modes? OCaml pattern matching is deterministic.
And now, is there any benefit to doing Java, or Featherweight Java, or Go in Prolog with modes? Or is OCaml a better metalanguage?
Does Prolog (or datalog) have any Yacc-like properties that make it better than a general purpose language, when used as a metalanguage?
Why did TAPL choose ML for their implementation language rather than Prolog/datalog? (And again I think even the ML implementations are far from “engineering reality”, as far as I can see)
As far as I understand, “higher order type theory” simply means that at the type level you can have “things” which have their own types. This is not rocket science. Haskell has this. And even Rust and even C++ have this! Rust’s Vec and C++’s vector are functions which take a type and return a type. I.e. (in Rust pseudosyntax) Vec is a function of type fn(type) -> type
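In OCaml terms (as a small illustration), a parametrized type constructor is exactly such a type-level function:

(* list takes a type and returns a type: *)
type ints = int list            (* "apply" list to int *)
type matrix = float list list   (* apply it twice *)

(* Abstracting over the type-level function itself (a higher-kinded type)
   needs a module functor in OCaml: *)
module type CONTAINER = sig
  type 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end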
if you have a decent dependent type system you can use it to generate proofs like what you get in Coq. And there are a lot of people who get really far in Haskell with various extensions.
But the reason people reach for stuff like Coq or Lean is because these languages, beyond the type system, also have tooling to help you write the proofs. The end result is still just dependently typed functions, but kind of like how do-notation makes Haskell’s monadic code more reasonable to grok, things like Coq’s tactics make it easier to write certain kinds of proofs. But ultimately you’re still “just” using the type system (or rather, the type system + the fact that you can provide a definition for a function) to generate the proof.
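A tiny Lean 3 illustration of that point (made up for this comment): the same fact proved once as an explicit term and once with tactics; both produce a term in the end.

-- Lean 3: the same proposition proved as an explicit term...
theorem t1 (a b : Prop) (ha : a) : b → a := λ _, ha
-- ...and with tactics, which merely help construct that term:
theorem t2 (a b : Prop) (ha : a) : b → a :=
begin intros hb, exact ha end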
I can take your parsing analogy to the extreme: CSM (computer science metanotation) programs are actual grammars! More specifically: they are W-grammars ( https://en.wikipedia.org/wiki/Van_Wijngaarden_grammar ). More specifically: if a CSM program has one input and its rules break this input down into parts, then there is a corresponding W-grammar.
The W-grammar’s metagrammar (see the Wikipedia article above) corresponds to the usual context-free grammar of the CSM relations and terms themselves. It is called BNF in that video ( https://www.youtube.com/watch?v=dCuZkaaou0Q ).
The W-grammar’s hypergrammar represents our target language, i.e. the language which is the input (i.e. what is called “input” at 7:53 in that video).
If you want, I can continue the analogy and tell what corresponds to what. Again: not all CSMs correspond to W-grammars. Our CSM should have one input, and that input should become smaller in every rule.
In other words, the W-grammar (i.e. the W-grammar hypergrammar) represents a program in the target language (i.e. the language we want to type check, for example, Rust). So, a given W-grammar represents all good (say) Rust programs. But unlike normal context-free grammars, the W-grammar also type checks the Rust program
Hmm … well I guess given the shape of context-free grammars and of type systems, it’s not too surprising you can find a higher level mathematical correspondence
I guess I am interested more in engineering consequences – YACC finding O(n) algorithms is a major engineering consequence
i.e. I don’t find termination/non-termination to be that useful an engineering property – it’s relatively weak – because O(n^3) is often as bad as non-terminating in practice, let alone O(2^n), etc.
I try to start with a codebase or a tool, say TypeScript or Rust, and Prolog or Souffle Datalog, and ask what makes it work
Right now I don’t see evidence of many people using DSLs and tools to write type checkers.
For example TypeScript is in this 50,000 line file and it seems like they just write down the rules and don’t bother with any math :) (I could be mistaken)
Type systems and semantics often go hand-in-hand, because you usually want “this expression has type t” to mean “evaluating this expression gives a value of type t”, and proving that involves invoking the semantics. The reality is more subtle for things like non-termination, but “(plus (plus 2 3) 3) has type int” is not really verifiable without providing some sort of notion of semantics.
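A minimal sketch of that connection in OCaml (a made-up toy language): the semantics is an evaluator, and soundness means a well-typed term evaluates to a value of the claimed type.

type term = Num of int | Plus of term * term

(* The semantics: evaluation. *)
let rec eval = function
  | Num n -> n
  | Plus (a, b) -> eval a + eval b

(* eval (Plus (Plus (Num 2, Num 3), Num 3)) = 8, an int value, which is
   what backs up the typing claim "(plus (plus 2 3) 3) has type int". *)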
Oops nevermind, this whole subthread is more about me not understanding WebAssembly than anything :-/
For some reason I didn’t realize that WASM is indeed statically typed (probably because when you look at it from the perspective of a C program, you lose almost all the types. The GC proposal puts a lot back though)
Actually figure 3 seems to help – from the very first line, it looks like it’s saying that all the t.unop and t.binop have a consistent type t, while t.testop and t.relop give you an i32
I think it took me a good year before I could understand any of the notation used in type system “math” with some confidence. This stack exchange answer would have been a godsend because it’s such a simple mental model when explained this way.
I would buy this post remastered into a pretty infographic for sure!
I just discovered that a (simple subset of) computer science metanotation ( https://www.youtube.com/watch?v=dCuZkaaou0Q , i.e. type system notation) is equivalent to GADTs. Moreover, GADTs can represent substitution, too! (via HOAS)
For example, here is one big Lean 3 GADT which represents all valid proofs of ZFC theorems (see the end of this comment).
Note that:
I expressed capture-free substitution! Using HOAS!
Whole code is nearly 50 lines!
Yes, this is the whole of ZFC starting from scratch, i.e. from logical axioms, i.e. from absolutely nothing. I don’t reuse Lean’s logic
I was able to represent ZFC’s axiom schemas, too
Valid values of this type are ZFC’s valid proofs
Also note:
I’m not sure that my HOAS approach is sound
I didn’t test this code
I skipped nearly all ZFC axioms :) But it is obvious one can fill them all in
Was this fact already discovered by somebody else? Or am I first?
-- This is Lean 3. It can be tested on https://leanprover-community.github.io/lean-web-editor
-- ZFC as GADT. I will construct GADT such that its values are exactly valid ZFC proofs
inductive Term : Type
inductive Formula : Type
| Elem : Term -> Term -> Formula
| Equals : Term -> Term -> Formula
| And : Formula -> Formula -> Formula
| Or : Formula -> Formula -> Formula
| Not : Formula -> Formula
| Impl : Formula -> Formula -> Formula
| All : (Term -> Formula) -> Formula
| Exists : (Term -> Formula) -> Formula
infixl ` :& `:60 := Formula.And
infixl ` :| `:50 := Formula.Or
infixr ` :-> `:40 := Formula.Impl
inductive Proof : Formula -> Type
| A1 : Π a b : Formula, Proof (a :-> (b :-> a))
| A2 : Π a b c : Formula, Proof ((a :-> (b :-> c)) :-> ((a :-> b) :-> (a :-> c)))
| A3 : Π a b : Formula, Proof ((a :& b) :-> a)
| A4 : Π a b : Formula, Proof ((a :& b) :-> b)
| A5 : Π a b : Formula, Proof (a :-> (b :-> (a :& b)))
| A6 : Π a b : Formula, Proof (a :-> (a :| b))
| A7 : Π a b : Formula, Proof (b :-> (a :| b))
| A8 : Π a b c : Formula, Proof ((a :-> c) :-> (b :-> c) :-> ((a :| b) :-> c))
| A9 : Π a b : Formula, Proof ((a :-> b) :-> (a :-> Formula.Not b) :-> Formula.Not a)
| A10 : Π a : Formula, Proof (Formula.Not (Formula.Not a) :-> a)
| A11 : Π (fm : Term -> Formula) (tm: Term), Proof (Formula.All fm :-> fm tm) -- HOAS!
| A12 : Π (fm : Term -> Formula) (tm: Term), Proof (fm tm :-> Formula.Exists fm)
| MP : Π a b : Formula, Proof (a :-> b) -> Proof a -> Proof b -- Modus ponens
| R2 : Π (a: Formula) (b : Term -> Formula), (Π x: Term, Proof (a :-> b x)) -> Proof (a :-> Formula.All b) -- Cool!
| R3 : Π (a: Formula) (b : Term -> Formula), (Π x: Term, Proof (b x :-> a)) -> Proof ((Formula.Exists b) :-> a)
| Refl : Π x: Term, Proof (Formula.Equals x x)
| Sym : Π x y: Term, Proof (Formula.Equals x y) -> Proof (Formula.Equals y x)
| Trans : Π x y z: Term, Proof (Formula.Equals x y) -> Proof (Formula.Equals y z) -> Proof (Formula.Equals x z)
| Eq1 : Π x1 x2 y: Term, Proof (Formula.Equals x1 x2) -> Proof (Formula.Elem x1 y) -> Proof (Formula.Elem x2 y)
| Eq2 : Π x1 x2 y: Term, Proof (Formula.Equals x1 x2) -> Proof (Formula.Elem y x1) -> Proof (Formula.Elem y x2)
-- lambdas in types!
| ZFCExt : Proof (Formula.All (fun x, Formula.All (fun y, Formula.All (fun z, ((Formula.Elem z x :-> Formula.Elem z y) :& (Formula.Elem z y :-> Formula.Elem z x))) :-> Formula.Equals x y)))
-- Remaining axioms can be written, too. The only interesting ones are axiom schemes, so here goes axiom schema of specification:
| ZFCSpec : Π (phi : Term -> Formula) (z : Term), Proof (Formula.Exists (fun y, (Formula.All (fun x, (Formula.Elem x y :-> (Formula.Elem x z :& phi x)) :& ((Formula.Elem x z :& phi x) :-> Formula.Elem x y)))))
Non-algorithmic presentations of type systems take a bit more work to translate into type checking algorithms, but that doesn’t necessarily prevent that work from being done. That said, some sets of rules may not be expressible with decidable or efficient algorithms. Sometimes restricting them a bit can help make this easier. That’s part of the challenge of type system design!
For example, for a long time, people wanted to turn Martin-Löf Type Theory (i.e. dependent type theory) into type systems with type inference that could be checked on computers… but it turns out that in the general case this is undecidable. Over the years people figured out that using a combination of bidirectional type checking and pattern unification you can restrict the programs people are able to write to make type inference tractable. A similar pattern played out with type checking for higher-rank types.
Yeah I think this is my main question in the comment below - https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_xbvbwg
The way I think is that math is “generative” – you have rules that build up huge, infinite sets, and then you reason about the sets
While programming is “recognition-based” – you write an algorithm to try to determine if a sequence of symbols or a (program, type) pair is in the set [1]
So I think you can start from either end. You can just “write the algorithm”, but then people can discover years later that the type system is unsound:
https://hackernoon.com/java-is-unsound-28c84cb2b3f
Also it seems that people discover years later that the type system is Turing complete? i.e. with C++ templates – I believe they were originally intended as a way to avoid duplicating code, and then people discovered you can do crazy stuff like write regex/CFG interpreters at compile time with them.
There seem to have been multiple similar “discoveries” about Rust - https://news.ycombinator.com/item?id=26445332
I guess the problem with Turing-completeness is that you never know if the type checker terminates … and also from a practical perspective, I think those type checkers tend to be slow, and you have more trouble reasoning about their performance. (I think Java types before generics must have been a more or less linear algorithm? There is the Featherweight Java subset in TAPL. Probably Go was also designed to type check more or less linearly?)
Or you can start with the math, but here it is not clear to me how exactly you arrive at the implementation. When I mentioned that Rust core devs used Datalog to prototype the ownership system, I think what they were doing was trying to find a balance between the speed of type checking / decidability and correctness?
Then after they decide on the rules, they just implement them by hand, “proving it in their heads” ? (Sorta like you write a recursive descent parser – you will end up with some minor divergences from the grammar, but in practice it’s good enough)
[1] I took these words from the original PEG paper (Ford 2004), which I think is a bit ingenious because it does away with the thing everyone has problems with (ambiguity). BUT it also retains the “declarative” nature of grammars, in that there are at least 2 separate algorithms to implement PEGs (backtracking and packrat parsing):
https://www.oilshell.org/blog/2016/11/01.html#further-thoughts-generation-vs-recognition
I believe Prolog is the programming language based around relations rather than functions.
Yes. It has to be a function to be efficiently implementable. But no, it doesn't have to be a function to be (possibly inefficiently) implementable. We can always implement a given type system by trying all possible proofs/derivations in parallel (i. e. by breadth-first search on the tree of all possible proofs). Such a search does not necessarily terminate. But if a proof exists, then it terminates.
Hindley-Milner is efficiently implementable, but its typing relation is not a function, as terms can have different types (there's a most general, principal, type among them)
I’m pretty sure it depends on how the rules are presented. You can have algorithmic presentations and non-algorithmic presentations of the same type system.
If one typing relation is a function and another one is not, they are different typing relations! The smaller one can be more useful though, if the bigger one is a transitive closure of the smaller one.
Less abstractly, it’s true both:
There is a programming language which somewhat implements derivation of relations. It is called Prolog. And see also its various variants, such as Mercury. But I'm not sure whether Prolog implementations implement search across all possible proofs. And I have a secret plan for creating such a Prolog implementation, which will search truly all proofs
I don’t think it’s true, in general, an expression could have many types.
Eg, id x = x has type a -> a, but it also has type Int -> Int.
In HM and in more powerful type systems, id has type ∀x . x -> x. In general, in PLT we mostly consider type systems which have the principal type property, i.e. terms have only one type.
That's not correct. "Principal type" means that there exists a most general type such that any other type is a specialization of this most general type. It doesn't say that the principal type is the only type.
One thing that was mentioned in passing but might help with some elaboration is that the inference rules can be seen as defining a way to construct proof trees. Rules with premises are the branches, and axioms are the leaves. In the case of a type system, this would let you construct a proof that a program conforms to a given type.
To build a proof tree you can stick the rules together a bit like lego (with care to make sure that the premises match appropriately with the conclusions). Some example proof trees can be found in “Bidirectional Typing Rules: A Tutorial” by David Christiansen.
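For example (a toy instance of my own, in the same ASCII notation used elsewhere in this thread): stacking the variable axiom on top of the abstraction rule gives a complete two-rule proof tree for the identity function at Int. The axiom is the leaf (nothing above its line), and its conclusion slots in as the premise of the rule below:

 ----------------- (Var)
  x:Int ⊢ x : Int
 -------------------------- (Abs)
  ⊢ (λx:Int.x) : Int → Int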
Some insights that I found helpful as a result of this perspective:
Yeah I think that makes sense – I would also make an analogy to parsing (which I know better)
Now a couple questions
algorithm for finding a path through the tree : ??? / YACC finds a derivation for the LALR subset of CFGs; the parse tree “recapitulates” the derivation
there is more than one path through the tree: ??? / the grammar is “ambiguous”, I think YACC can pick an arbitrary interpretation
I guess Datalog is like the equivalent of YACC? I experimented with Prolog a couple years ago to write a type checker, without that much success. I did find somebody who said "Prolog is not programming, and it's not math", and I think I agree. (The over-dependence on ordering and cuts is a reason it's not math, but giving good error messages and reasoning about performance is a reason it's not programming)
I haven’t tried Datalog, but I know that Rust core devs used the Souffle datalog compiler to prototype their type system.
I think of Datalog as something that can perform a massive search through the tree for SOME type systems, although it’s not clear to me when it fails. Is there an equivalent of “LALR subset” for Datalog?
Or I guess Coq and Agda are the most general solutions? What’s the difference between Datalog and them? My guess off the cuff would be that Datalog can be faster for a subset of type systems (similar to how YACC guarantees linear time for LALR grammars ) ??
And then you also have
I guess the name for that is just – “I interpreted the typing rules as something you can check and not a way to generate proofs (a kind of inversion)” – BUT I’m not clear exactly when that falls down?
Coq and Agda are simply Haskell with dependent types added. In other words, these languages are not something completely alien. They are relatively normal programming languages. I. e. you simply directly describe the computation you need. Datalog/Prolog is something completely different. They are alien. You don't express the algorithm directly, you specify how to derive relations.
Coq and Agda don’t have Prolog-like capabilities. I. e. they don’t have Prolog-style proof search (as well as I know)
You might be able to argue this for Agda, but not Coq. In particular, Haskell’s internal logic is not consistent (the empty type is inhabited!) but this is not the case for Coq.
It says here Coq was used to “create a surveyable proof of the four color theorem”
https://en.wikipedia.org/wiki/Coq
Can any language with dependent types create such proofs? That wasn't my impression
The page says it can be viewed as both as functional programming language with dependent types, and that it implements “higher order type theory”
So maybe … not sure
Again: Coq is (conceptually) just Haskell with dependent types. Coq is able to prove mathematical facts thanks to the Curry-Howard correspondence. I. e. "X is a proof of fact Y" corresponds to "X is an expression of type Y", and this gives us the ability to represent arbitrary proofs. Also, in addition to its dependently typed core, Coq has automation support (so-called tactics). Tactics let Coq infer some proofs automatically. But even if Coq did not have this automation, it still would be able to represent proofs; they would just be bigger.
I suggest reading Barendregt, "Lambda Calculi with Types"; this book shows what dependent types are and how they can represent proofs. It also actively uses type system notation (aka computer science metanotation), with explanation. https://repository.ubn.ru.nl/bitstream/handle/2066/17231/17231.pdf (but this pdf contains a small number of typos; if you really want, I can try to find a better version).
I recommend watching this video: https://www.youtube.com/watch?v=dCuZkaaou0Q (it is linked from the comments on that Alexis King answer). It is really cool! It explains in simple terms a lot of facts about type system notation (the author calls it "computer science metanotation", CSM).
When I said that Prolog is alien, of course, I didn't mean that it is absolutely impossible to understand. :) I simply meant that Prolog doesn't follow the usual execution model of normal programming languages, as opposed to Coq, Haskell, C++, etc. But Prolog is, in fact, simple. Essentially, Prolog is CSM. In the video above at 6:21 the author says that CSM is a non-deterministic language. And at some point he says:
In short, Coq/Agda is a full-blown programming language with conditionals, recursion, data types and all that. You can write normal (pure functional) programs in it (assuming we call Haskell programs "normal"). Also Coq has dependent types, which allow you to encode proofs and many other things. And Coq has automation, which sometimes lets you infer these proofs (so you don't have to enter them explicitly). In Prolog you cannot write "normal" programs. It has a completely different execution model. But you can still write all algorithms in Prolog, you just have to encode them into a different programming model. And Prolog matches CSM. I. e. Prolog is a hack which makes normal programming harder, but which is good for representing CSM.
But keep in mind that Prolog cannot represent all CSM features. In particular, Prolog cannot represent capture-free substitution (it is discussed in the video above). If you want to represent substitution, you need a Prolog variant with lambda support, i. e. lambda-prolog ( https://www.lix.polytechnique.fr/~dale/lProlog/ ), in particular, Makam ( http://adam.chlipala.net/papers/MakamICFP18/MakamICFP18.pdf , a very good read!). Makam matches CSM even better than Prolog. But, as far as I understand, Makam doesn't search truly all possible proofs. (I'm not sure here.) I have an item in my TODO list: "Try to find a Prolog/CSM implementation which truly searches all proofs. If such doesn't exist, then write my own".
Also, here is another comment of mine in this thread, where I show how to represent all ZFC proofs in one single data type in Lean 3 (Lean 3 is very similar to Coq and Agda): https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_rtpme5 . This is not how proofs are usually represented in dependently-typed languages, but it shows the elegance of dependently typed programming.
Also, you recently asked something like "is there some LALR equivalent in computer science metanotation". As far as I understand, you asked whether there is some subset of all possible CSM programs which is good for simple and efficient implementation. Well, yes, it exists! I would call it "CSM with specified modes". At 7:53 in the video above the author says that some parts of CSM are "input" and some are "output". Labels "input" and "output" attached to rules are called "modes" in the Prolog world. These labels guide execution; they tell how to execute our "CSM/Prolog program". Not every CSM program can be labeled with modes. But if you were able to specify modes, then your program can be easily executed! I. e. it now belongs to the "LALR subset". There are Prolog variants which mandate that you specify modes, such as Mercury. I said above that I'm not sure whether there are Prolog implementations which actually search all proofs. Fortunately, if you were able to specify modes, then many Prolog implementations will work, i. e. all of them will now be able to correctly execute your program. But I will repeat that nearly all of them lack substitution, so you are limited to Makam. So the conclusion is this: if you have no modes, then I'm not sure there exists any single CSM implementation which will work. If you have modes, then you will be able to correctly execute your CSM in Makam.
If you have any more questions, ask, I’m happy to answer
Thanks for the response! I did watch the Steele video on CSM a few years ago when it came out, but I will watch it again – I actually don't remember the Prolog bits
This part is interesting, thank you – I will see if watching the video causes any thing to click
In fact that video might have been around the time when I played with a type inferencer for Python in Prolog, without much success - https://github.com/andychu/hatlog
This is the article I referred to about Prolog
https://news.ycombinator.com/item?id=18373401
Combining my brief experience and the essay, that's where I got "it's not programming and it's not math". (Judging from the comments, there is some debate about that. However, now that I read the comments again, I stand by my judgement – the people advocating Prolog have not accomplished much with it. It's a bit like Forth, but less practical.)
Big question is: why don't we have a library of Prolog type checkers for Java, Go, Rust, etc., similar to the library of ANTLR parsers for Java, Python, JavaScript, and more? (The analogy hints at an answer: the generated ANTLR parsers aren't that good.)
FWIW my main interest is to write a SHORT and correct type checker for the metalanguage of https://oilshell.org – typed Python with algebraic data types (stretch goal: control flow dependent null safety, which MyPy has)
It is actually based on ~50K lines of code I’ve written over the last few years, that pass MyPy – what type system features do I use?
Surprisingly, this is not a trivial question!!! MyPy is very rich.
And there is a constraint of compatibility with the C++ type system – MyPy and C++ have a VERY large common subset!
So I think that is an interesting problem, and a bit under-explained from the engineering perspective
FWIW this was the result of search for something engineering-focused last year, and it seems like it resonated with many people - https://old.reddit.com/r/ProgrammingLanguages/comments/ss3w6n/an_accessible_introduction_to_type_theory_and/
I read the first 10 chapters of TAPL, and reviewing the basics was useful. However I don’t think the later parts were particularly well-suited for engineering of a type checker, e.g. I came away with the impression that some implementations were asymptotically inefficient, but actually I’m not even sure how to check that
The parts on “subtyping” didn’t help me, I think they were concentrating more on type inference and such
Thanks for the tips … I am probably going to ignore the dependent types stuff since the type system I want to write is much simpler, but knowing the rough relationships is useful, especially connections to practical tools (the Souffle datalog one seemed most interesting to me)
Let me answer all your messages at once here.
Okay, let me tell you more about Prolog and its implementations. But first, a disclaimer. I deeply understand mathematical logic and I deeply understand how Prolog should work and what can hypothetically be achieved in Prolog. I have a master's degree in mathematical logic. But I don't know what features actual Prolog implementations have. In fact, I never actually wrote any programs in Prolog-like languages. Also, I really deeply understand CSM (computer science metanotation). Let's begin with an example. Here is one simple type system written as CSM:
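(A reconstruction — the original figure did not survive, so the exact rules may have differed — based on the description that follows:)

 ------------ (n is an integer literal)
  |- n : int

  |- x : t
 -----------------
  |- [x] : arr(t)

  |- x : arr(t)
 --------------------
  |- length(x) : int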
Note that [x] above means an array with exactly one element: x. I. e. I cheated by describing one-element arrays only. :)
As can easily be seen, the relation |- length([[42]]) : int is provable in the CSM above.
Also note that in the CSM above |- has no meaning without : and vice versa. I. e. |- x : t is simply an atomic construct.
This was a very simple example of CSM. If you didn't understand it, then it will be very hard to understand anything below. Ask any questions if you want.
Okay, now let’s translate this CSM to Prolog. I will use Mercury ( https://mercurylang.org/ ). So, here we go:
typ and term are what I call syntactical types. I. e. types of the nodes of our AST. They should not be confused with the actual types of our target language (I will call them target types), such as int and arr(int). Target types are represented by values of the syntactical type typ.
But… the code above will not compile, because I left out mode and determinism markers. In other words, Mercury is not able to process arbitrary CSM; we must explicitly tell Mercury how it should process it. I. e. we should guide Mercury's execution. Okay, let's supply mode and determinism markers (this is a diff):
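(The diff itself was lost; it presumably added declarations along these lines — a guess based on the has_type predicate used later in this thread, and the actual determinism category may have differed:)

% term is input, typ is output; a term has at most one target type,
% so the predicate succeeds at most once.
:- pred has_type(term, typ).
:- mode has_type(in, out) is semidet.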
Okay, now the code will compile. You may see warnings, but they are okay. The program will print correct target type, i. e. “int”.
What did we just do? We specified that the term is input and the type is output. I think this is completely reasonable. See again 7:34 in Steele's talk. In his video the term is input and the type is output, too. He essentially talks about modes.
How to run this code? I installed Mercury on Debian using these instructions: http://dl.mercurylang.org/deb/ . Then I created a file hello.m with the contents above. Then I compiled it using mmc --make hello, then I ran it using ./hello; echo (the echo is needed because ./hello doesn't print a newline).
This book https://www.mercurylang.org/documentation/papers/book.pdf was helpful when writing the code above.
I hope you understood the program above, otherwise I think you will not understand anything below.
Now let’s discuss desirable properties hypothetical Prolog implementation may have.
Property 1. It should have support for user-defined algebraic data types (ADTs), such as typ and term above. As you can see, this is mandatory for translating CSM to Prolog. As we saw, Mercury has this property. Also note that an ADT definition doesn't necessarily look like an ADT definition. For example, let's consider Makam. Here is a Makam tutorial: https://astampoulis.github.io/blog/makam-tutorial-01/ . The author gives this example there:
This looks like declarations of some functions. But conceptually we defined an ADT named expr and its variants called stringconst, etc. So Makam satisfies property 1, too, even though its documentation seems not to mention the term "ADT". But Datalog, at least as described in Wikipedia ( https://en.wikipedia.org/wiki/Datalog ), seems not to satisfy this property. It is possible, though, that actual Datalog implementations add it.
Property 2. Support for lambdas, aka binders, aka capture-free substitution. Well, "capture-free substitution" and "support for lambdas" are probably different things, but they are still very closely related. This is the same capture-free substitution Steele talks about at 33:15. Okay, so lambdas and substitution are very desirable properties. For example, let's consider https://en.wikipedia.org/wiki/Simply_typed_lambda_calculus . In the section "Typing rules" we see rule 3. It involves a lambda. In this article http://adam.chlipala.net/papers/MakamICFP18/MakamICFP18.pdf we can see that this rule can be encoded in Makam like so:
Yes, this is exactly the same rule modulo variable names. Note that E here refers to what I call a syntactical function. Or you may call it a syntactical substitution. Or an AST with a hole. Also, note that idiomatic Makam code doesn't use contexts (the context is called Γ in that Wikipedia article), so we don't see any Γ in the Makam code. So, well, this Makam code absolutely exactly matches that rule 3 from Wikipedia, but this may be hard to understand. Anyway, what I want to say is this: property 2 is desirable for encoding CSMs, and Makam has this property.
Property 3. There is no need to augment the CSM with some markers to process it. Well, as we saw, Mercury doesn't satisfy this property, because one needs to specify modes and determinism markers.
Property 4. It is possible to enter any CSM into this Prolog implementation. This property doesn't hold for Mercury, because not every CSM can be augmented with correct modes.
Property 5. If a fact has a proof, then the implementation can find it. This property seems not to hold for original Prolog. As far as I understand, original Prolog has quirky, unreliable semantics. Prolog looks like a logic engine, but (as far as I understand) it doesn't implement logic correctly. So, AFAIK it is possible to construct a CSM with a particular fact such that this fact is provable, but Prolog will not find its proof. AFAIK this is exactly the same complaint which we see in the article you gave me: https://www.amzi.com/articles/prolog_under_the_hood.htm . So, Prolog is dishonest and unreliable. (When I say "unreliable" I mean "unreliable as a logic engine".) So, please, never use original Prolog. But, AFAIK you can enter any CSM (assuming it has no lambdas) into Prolog. Without any additional markers. So, it seems Prolog satisfies properties 1, 3 and 4. Well, there is some cheating here: Prolog is untyped, so it doesn't have ADTs, but AFAIK we can somewhat emulate ADTs using functors, so let's assume property 1 holds. Also, AFAIK Prolog doesn't have lambdas, so property 4 holds with the condition "assuming the CSM doesn't have lambdas". So, in Prolog we have 1, 3 and 4, but don't have 5. Mercury seems to be honest. If there is a proof, Mercury will find it. So, Mercury is honest, you can rely on it. Mercury is a true logic engine, as opposed to Prolog. The Prolog woes from https://www.amzi.com/articles/prolog_under_the_hood.htm don't apply. Mercury has 5, but it doesn't have 3 and 4. So there is a trade-off here: either we restrict the set of possible programs and place a burden on the programmer in the form of writing mode markers (Mercury), or we allow somewhat arbitrary CSMs but get unreliable, broken semantics (original Prolog).
At this point you may ask: "But what exactly do you mean when you say that Prolog is unreliable?" Well, let's consider this Prolog program:
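(The listing did not survive; reconstructed from the description below, it was essentially:)

% a fact f(a), plus a clause that sends depth-first search into a loop
f(X) :- f(X).
f(a).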
Here a is some fixed constant and X is a variable. Then I asked Prolog (I used SWI-Prolog) whether f(a) is true. Prolog looped and did not produce any answer. It failed with a stack overflow. But it is obvious that f(a) is true! It is simply present in our hypotheses! So, yes, the main problem with Prolog from my point of view is here: it is very easy to trick Prolog into an infinite loop. And for this reason I state that Prolog doesn't satisfy property 5.
Now let's compare with Mercury. Well, I just entered a similar program into Mercury and got an infinite loop, too. :( But I have read here https://www.mercurylang.org/information/doc-latest/mercury_ref/Implementation_002ddependent-extensions.html that I can specify --check-term to check for loops. I specified this option and yes, I got a message at compile time. So, yes, if your program belongs to a particular set of programs (this set inclusion can be tested at compile time by specifying particular options), then you are guaranteed to get a proof if it exists. (Well, everywhere I say "get a proof" I don't mean getting the actual proof; I mean that the implementation can say "yes, it is true", or that the implementation will produce some answer, say, the target type of your term.) (Also, Mercury doesn't know that the function io.write terminates. So when writing programs for real you should specify the option --enable-term instead [see https://www.mercurylang.org/information/doc-latest/mercury_ref/Implementation_002ddependent-extensions.html ] and mark which functions you want to check for termination.) (Also, I spent some time playing with --check-term. It seems that this checker is very stupid; in a lot of cases it cannot prove termination when I know for sure that the program terminates. But still I will say that overall Mercury is reliable enough.)
At this point you may ask: "But what if Prolog has a termination checker, too?" Well, possibly; I don't want to check this. Anyway, I simply have a feeling that Prolog is an unreliable mess. For Mercury I don't have such a feeling. All the checks already present in Mercury, together with --check-term, make it a very reliable system.
Okay, so the Prolog program above turns out to be a very hard task for Prolog implementations. You either reject it at compile time (a violation of property 4), or it runs and loops without producing an answer (a violation of property 5).
Recently I said this:
I meant that I want to find (or write) a Prolog implementation which satisfies properties 1, 3, 4 and 5 at the same time. I don't know whether such an implementation already exists. And I know how such an implementation would work. I. e. I can imagine the algorithm. In simple terms, such an algorithm will try everything in parallel. Such an implementation will produce an answer (i. e. "true") for the above program in finite time. But there is another trade-off here: such an implementation will usually be slower than alternative implementations, such as Mercury. Mercury has its restrictions and mode markers for a reason: they allow an efficient implementation.
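For what it's worth, a minimal sketch of that idea in SWI-Prolog (my own, with made-up predicate names) is a vanilla meta-interpreter driven by iterative deepening — not literally breadth-first, but with the same completeness property:

% prove/1 retries with ever larger depth bounds, so no infinite branch
% can starve the others. If a proof exists at any finite depth it is
% found (property 5); if no proof exists, the search may still run
% forever (so property 9 is not gained).
prove(Goal) :- between(1, inf, Depth), prove(Goal, Depth).

prove(true, _) :- !.
prove((A, B), Depth) :- !, prove(A, Depth), prove(B, Depth).
prove(Goal, Depth) :-
    Depth > 0,
    Depth1 is Depth - 1,
    clause(Goal, Body),
    prove(Body, Depth1).

% The looping example from above:
:- dynamic f/1.
f(X) :- f(X).
f(a).

% ?- prove(f(a)).  succeeds (at depth 1), where plain  ?- f(a).  overflows the stack.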
Here is an interesting passage from https://www.mercurylang.org/documentation/papers/book.pdf (p. 51):
This seems like another proof of Prolog’s unreliability (but I didn’t try to deeply understand this sentence).
Property 6. When there are multiple answers, the implementation can find them all. For example, when a given term can have multiple target types. Well, Mercury has this property; I was able to write a program which exploits it: https://paste.debian.net/1289302/ . I had to use solutions for this.
Property 7. The implementation is typed. Well, original Prolog doesn't have this property. Mercury has it. Makam has it, too.
Property 8. The implementation can find answers with free variables. (Yes, this property is distinct from property 6.) For example, let's assume that our first CSM example also has syntax for empty lists. I. e. our ADT for terms now looks like this:
Then we would get this additional rule:
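(Reconstructed — the original snippet was lost; in Prolog-style syntax the rule would be:)

% an empty list is an array of T, for any element type T
has_type(empty_list, arr(_ElemType)).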
Unfortunately, this would mean that solving has_type(empty_list, U) will get the answer U = arr(T), i. e. the answer will contain a free variable. This seems not to be supported by Mercury. So for such a situation you will need some unnatural encoding. So, Mercury does not have property 8.
Property 9. Computation always terminates. Yes, you may say that this is the same as property 5. But it is not. Property 5 says: if there is at least one solution and we asked for at least one solution, then some solution will be printed and then the program will terminate. But if there is no solution, it is okay to loop forever. Property 9 says that the program will always terminate, even if there is no solution. Properties 1, 4, 5 and 9 cannot all hold at the same time. Because this would mean that we could determine in finite time whether a given formula is provable in a given logic. This is impossible, because we can encode a Turing machine as a CSM, and this would mean that we could check in finite time whether a Turing machine terminates, which is, as we know, impossible.
I said above that I can imagine an algorithm which has properties 1, 3, 4, 5. It will not have property 9. I. e. it will loop forever in many cases when there is no solution.
Also, that hypothetical algorithm will have property 6. We can run a CSM which specifies the axioms for propositional logic, and the program will run ad infinitum, printing all provable facts. The program will run forever, and any provable fact will eventually appear in its output.
End of list of properties.
Okay, so I hope one day I will check whether various Prolog implementations have these properties (this may happen, say, in 10 years). This is the list of Prolog implementations I will consider:
You may try to do the same. At least you now know what to look for.
I think the implementation of Oilshell itself should be in some well-known language. Haskell and Ocaml are good for implementing languages. Also see https://hirrolot.github.io/posts/compiler-development-rust-or-ocaml.html . If you also want Oilshell to be fast, then Rust is possibly a good choice. Yet another variant is to wait until Mojo is released.
But okay, let’s assume you chose to implement Oilshell in your own Python dialect. What language to implement its type checker in? Well, again, I suggest simply using some “normal” language. Such as Haskell or Ocaml. This will allow you to directly write your type checker. It will be relatively short. You may also chose something Prologish. But as I described above, this is minefield. But depending on exact task this may be acceptable solution.
If you already more or less understand the algorithm of your type checker, i. e. you understand its control flow, then this means you have already assigned modes to your type judgements in your head. This means Mercury may be a good solution. Another situation is if you can write the rules, but don't understand the control flow of your checker, i. e. you have no algorithm. Then assigning modes will be difficult.
Anyway, if you really want to try something Prologish, then try to write your rules in a text editor in Prolog syntax. Then try to understand which properties from the above list you will need from a Prolog implementation. And then find a Prolog implementation which has these properties. :)
Another possibly good solution is Makam. There is a very good tutorial here: http://adam.chlipala.net/papers/MakamICFP18/MakamICFP18.pdf . Also see https://astampoulis.github.io/blog/ . In this article https://astampoulis.github.io/blog/makam-tutorial-01/ the author proceeds straight to writing an interpreter and a parser in Makam.
But again: all these Prolog variants don't have as much care (i. e. devs' effort) put into them as Ocaml and Haskell. Also, it is possible they will require some effort to integrate with a bigger system, i. e. to pass information from and to them.
Sounds very unusual. I have no idea how to implement this in Prologish languages. Ocaml and Haskell recommended
As I said above Mercury has similar properties
If you already have a type checker in OCaml, why translate it to Prolog? But, okay, if you want, then it is simple. If you have a function f : a -> b, then its Mercury equivalent will be :- pred f(a::in, b::out) is det.
Yes, Prologish languages have an advantage: they directly correspond to CSM, i. e. they allow you to write typing rules directly. Some of them, such as Makam and Mercury, also seem to have capture-free substitution built in, which is very good, too.
I think because they wanted to show how actually the program is implemented.
Rust was implemented in Ocaml first. Coq is implemented in Ocaml. CompCert is implemented in Ocaml. Isabelle is implemented in Isabelle/ML, which is an ML dialect. So, Ocaml is a pretty good "engineering reality". It is a full-blown programming language. You can do anything in it, including writing ActivityPub servers (I recently saw an article on writing an ActivityPub server in Ocaml).
(Personally I know Haskell, but don’t know Ocaml. Still I have read many articles on Ocaml and I see that this is a good language.)
Prologish languages allow very concise programs as opposed to general-purpose languages. But at the same time it is possible that at some point you will hit some unavoidable restriction of such a Prologish language. I. e. Prologish languages are shorter and they are very strict. (Also, this post [ https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_b8ricb ] was written as your answer to yourself. I saw it by chance only.)
I suggest looking carefully at these two slides and understanding why one is deterministic and the other is not. In the slide at 6:00 both conclusions have the same form: no_conflict with arbitrary parameters. So if we want to check no_conflict for some set of arguments, we don't know beforehand which rule to apply, so we have to run both. But note that in this slide we see a very small part of the CSM. We don't see the whole CSM, so we don't know the input/output modes. But even if all arguments to no_conflict are input, i. e. all of them are known, then we still don't know which rule to choose. I. e. even in the simplest case we have non-determinism.
Now let's see the slide at 7:00. This time it seems we see the whole set of rules. And we know the intended meaning of our relation (i. e. ... |- ... : ...): we know that it is supposed to mean "has type of", and thus we know that the gamma/context and the term are inputs and the type is output. Now we can ask ourselves: is it possible, by the shape of the inputs alone (i. e. context and term), to see which rule to apply? Yes, it is possible! If the term is a variable, then the first rule applies; if the term is a lambda, then the second rule applies, etc. I recommend trying to imagine the full algorithm for type checking for this slide (see the sketch below).
Also, I highly recommend this cool demo: https://petevilter.me/post/datalog-typechecking/ , https://lingo-workbench.dev/ .
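Here is that algorithm as a small Prolog sketch (mine; constructor and predicate names are made up): the outermost constructor of the input term selects exactly one rule, which is what makes the slide's system deterministic:

% Simply typed lambda calculus, syntax-directed: the shape of the
% (input) term picks the rule. Contexts are lists of Name-Type pairs.
has_type(Ctx, var(X), T)                    :- lookup(Ctx, X, T).
has_type(Ctx, lam(X, T1, E), arrow(T1, T2)) :- has_type([X-T1 | Ctx], E, T2).
has_type(Ctx, app(E1, E2), T)               :- has_type(Ctx, E1, arrow(T1, T)),
                                               has_type(Ctx, E2, T1).

lookup([X-T | _], X, T) :- !.               % nearest binding wins (shadowing)
lookup([_ | Ctx], X, T) :- lookup(Ctx, X, T).

% ?- has_type([], lam(x, int, var(x)), T).   gives  T = arrow(int, int).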
This demo is possibly of interest, too: https://github.com/voodoos/mlts . It is a toy programming language implemented in Lambda-Prolog.
(delayed response) Thanks for the detailed response! The first example makes sense to me – it’s not too different than the Bool-Int language that’s the first one in TAPL (typed_arith I think they call it).
And yeah it’s not surprising that Mercury can easily deal with it.
FWIW I also transcribed something very similar to that Bool-Int language, based on some prompting from matklad in the “typescript” thread - https://lobste.rs/s/hkkcan/typescript_is_surprisingly_ok_for
https://github.com/oilshell/blog-code/blob/master/typescript/check.ts
https://github.com/oilshell/blog-code/blob/master/typescript/eval.ts
So it’s like ~50 lines each for the type checker and evaluator. I actually like TypeScript now, having not used it before. The type system does seem as expressive as OCaml’s, or even more !!!
header.ts has the types - https://github.com/oilshell/blog-code/blob/master/typescript/header.ts
Also thanks for the Datalog links at the end – those are close to my interests, and I also want to make a little HTML demo for my statically typed language, so I might take some cues. (actually I just noticed the Lingo one is 17 MB of JS!! I want something a bit simpler)
The goal may be to write enough of a language to compile a Mandelbrot generator to WASM, from a Lispy syntax, but statically typed.
Hm about those 9 properties … It’s an interesting thing to think about, and I’m not qualified to judge, but I would say there are some fundamental reasons that programming is not math..
Basically you could maybe have something with those properties for toy languages… but to me the interesting goal is whether you can specify and prototype real languages (again, like parts of Rust were prototyped with Datalog). So again, where does it fall down:
There is a big gap there, and I think if you attempt such a thing, you’ll see some stark engineering tradeoffs. I mean the most obvious one is the exponential algorithms are not practical.
Sounds like a great goal though! I want programming language impls to be short and correct. (And the TypeScript experiment has turned out to be interesting – the total is 750 lines now, but most of it is parsing, not types or semantics, even though it’s a Lispy syntax !! )
Those problems sound great for a Ph.D. , probably even too ambitious for one :)
My former teammate Fergus Henderson (from ~2006) worked on the Mercury Language before we worked together. And also there is a guy on Reddit /r/programminglanguages who also worked on Mercury
To me it seems like you need a blog to explain this stuff :-) Can we move CSM and programming closer together? It’s a worthy goal!
I have found the blog to be great for tracking progress on ideas and implementations, and keeping yourself honest!
I didn’t use TypeScript, but I have read this good article from Alexis King: https://lexi-lambda.github.io/blog/2020/08/13/types-as-axioms-or-playing-god-with-static-types/ , so I consider TypeScript bad language. For compilers and interpreters I still suggest Rust, Haskell and Ocaml.
These examples are trivial. They can be easily rewritten in any language which supports algebraic data types and imperative programming at the same time, such as Rust and Ocaml. (But in Rust you have to mess with lifetimes, so I don't recommend Rust.)
In Ocaml the code will be similarly short. Again, I suggest this article on Ocaml: https://hirrolot.github.io/posts/compiler-development-rust-or-ocaml.html . Ocaml syntax may be seen as difficult, but it is just syntax. On a conceptual level the same simple things happen as in your TypeScript code. The only actual semantic difference is this: in Ocaml mutable values have to be specially marked with ref, similarly to Cell and RefCell in Rust. In exchange you get a proper type system (as described in the King article above).
That is an absolute exaggeration. Ocaml supports many advanced type features, such as GADTs.
:) I recently thought about generating the Mandelbrot set. And surprisingly I realized that no tools I am aware of can precisely render it. Moreover, tools are even unable to precisely draw the graph of "sin(1/x)". Go draw it various ways (using Google or Mathematica or something else). You will see artifacts. So possibly one day I will create some tool for properly rendering such graphs.
Well, for every language there are good parsing libraries (or tools). Just use them. They are cool. Parsing pre-existing languages, such as Rust, may be challenging (they often require some hacks). But the libraries are perfect for creating your own languages. Just make sure to use context-free parsing, not PEG and not parser combinators (parser combinators are Rust's nom, Haskell's parsec, etc). I can tell you why I think so, if you want.
Thanks. I don’t have a blog. I have Mastodon: https://types.pl/@safinaskar
I watched the first ~40 minutes of the Steele video again – can’t believe that was over 5 years ago when I watched it the first time (when I knew much less)
So this is exactly WHY I hacked on that Python type inferencer in Prolog – I wanted it to be true, but I think it turns out to be NOT true.
This article I linked claims that Prolog is better understood as a language with backtracking and unification - https://www.amzi.com/articles/prolog_under_the_hood.htm
And I think you are essentially saying the same thing:
Basically I think Prolog does not have the YACC property of helping you check that your grammar is implementable efficiently, and of reporting ambiguities. (Although YACC also falls short – I don’t use grammars for all parsing, only some parsing)
Right he says the first metalanguage is non-deterministic, but this one is deterministic (although it says Deterministic? in the slide title, probably because that fact is implicit in the paper – they didn’t call that out)
OK so I will have to look up “modes” in Prolog … maybe that helps. I didn’t encounter it the first time around.
https://stackoverflow.com/questions/30427203/what-is-the-meaning-of-or-in-prolog
OK so then my question is something like – can I translate type checkers in OCaml to Prolog with modes? OCaml pattern matching is deterministic.
And now is there any benefit to doing Java, or Featherweight Java, or Go in Prolog with modes? Or is it OCaml a better metalanguage?
Does Prolog (or datalog) have any Yacc-like properties that make it better than a general purpose language, when used as a metalanguage?
Why did TAPL choose ML for their implementation language rather than Prolog/datalog? (And again I think even the ML implementations are far from “engineering reality”, as far as I can see)
Another question is – it’s not enough for “Prolog with modes” to exist, it also has to be expressive enough
My guess is this deterministic metalanguage only works for a small minority of languages, e.g. if you take
in roughly ascending order of “complexity” or “featurefulness”
then where does “Prolog with modes” fall down? My guess is it falls down even BEFORE Java
Because otherwise we would see people writing more of those type systems in Prolog, rather than a few toy blog posts here and there
Actually it seems like Prolog WITHOUT modes even falls down before Java
As far as I understand, "higher order type theory" simply means that at the type level you can have "things" which have their own types. This is not rocket science. Haskell has this. And even Rust and even C++ have this! Rust's Vec and C++'s vector are functions which take a type and return a type. I. e. (in Rust pseudosyntax) Vec is a function of type fn(type) -> type.
If you have a decent dependent type system you can use it to generate proofs like what you get in Coq. And there are a lot of people who get really far in Haskell with various extensions.
But the reason people reach for stuff like Coq or Lean is because these languages, beyond the type system, also have tooling to help you write the proofs. The end result is still just dependently typed functions, but kind of like how do syntax makes Haskell's monadic code more reasonable to grok, things like Coq's tactics make it easier to write certain kinds of proofs. But ultimately you're still "just" using the type system (or rather, the type system + the fact that you can provide a definition to a function) to generate the proof.
I can take your parsing analogy to the extreme: CSM (computer science metanotation) programs are actual grammars! More specifically: they are W-grammars ( https://en.wikipedia.org/wiki/Van_Wijngaarden_grammar ). More specifically: if a CSM program has one input and its rules break this input down into parts, then there is a corresponding W-grammar.
The W-grammar's metagrammar (see the Wikipedia article above) corresponds to the usual context-free grammar of the CSM relations and terms themselves. It is called BNF in that video ( https://www.youtube.com/watch?v=dCuZkaaou0Q ).
The W-grammar's hypergrammar represents our target language, i. e. the language which is the input (i. e. what is called "input" at 7:53 in that video).
If you want, I can continue the analogy and tell you what corresponds to what. Again: not all CSMs correspond to a W-grammar. Our CSM should have one input, and that input should become smaller in every rule.
In other words, a W-grammar (i. e. the W-grammar's hypergrammar) represents a program in the target language (i. e. the language we want to type check, for example, Rust). So a given W-grammar represents all good (say) Rust programs. But unlike normal context-free grammars, a W-grammar also type checks the Rust program.
Hmm … well I guess given the shape of context-free grammars and of type systems, it’s not too surprising you can find a higher level mathematical correspondence
I guess I am interested more in engineering consequences – YACC finding O(n) algorithms is a major engineering consequence
i.e. I don’t find termination/non-termination to be that useful an engineering property – it’s relatively weak – because O(n^3) is often as bad as non-terminating in practice, let alone O(2^n), etc.
I try to start with a codebase or a tool, say TypeScript or Rust, and Prolog or Souffle Datalog, and ask what makes it work
Right now I don’t see evidence of many people using DSLs and tools to write type checkers.
For example TypeScript is in this 50,000 line file and it seems like they just write down the rules and don’t bother with any math :) (I could be mistaken)
https://github.com/microsoft/TypeScript/blob/main/src/compiler/checker.ts
But Rust does use tools for prototyping, but not implementation, which is closer to what I would like to explore
I will check out the video again as mentioned, thanks!
Also note that the WebAssembly spec is written in exactly the same type system notation.
??? You mean about figure 2 here
Those are the small-step semantics, i.e. evaluation rules
WebAssembly has types but they are dynamic types – i32, i64, f32, f64
The metalanguage syntax is similar, but it looks distinct to me because there’s no typing relation
https://people.mpi-sws.org/~rossberg/papers/Haas,%20Rossberg,%20Schuff,%20Titzer,%20Gohman,%20Wagner,%20Zakai,%20Bastien,%20Holman%20-%20Bringing%20the%20Web%20up%20to%20Speed%20with%20WebAssembly.pdf
Type systems and semantics often go hand-in-hand, because you usually want "this expression has type t" to mean "evaluating this expression gives a value of type t", and proving that involves invoking the semantics. The reality is more subtle for things like non-termination, but "(plus (plus 2 3) 3) has type int" is not really verifiable without providing some sort of notion of semantics.
RE WASM in particular, you can check out this paper. There’s been a lot of work in adding actual static verification of the spec for WASM regarding these in more recent papers, but I don’t have the links on me right now https://www.cl.cam.ac.uk/~caw77/papers/mechanising-and-verifying-the-webassembly-specification.pdf
These are static types
See Figure 3 in the paper for the typing relations. But what I assume the parent is referring to is the online specification.
Oops, thanks! Yes that verification pass seems important (although it’s apparently not typing of data)
I’m not sure what you mean by “typing of data”?
Oops nevermind, this whole subthread is more about me not understanding WebAssembly than anything :-/
For some reason I didn’t realize that WASM is indeed statically typed (probably because when you look at it from the perspective of a C program, you lose almost all the types. The GC proposal puts a lot back though)
Actually figure 3 seems to help – from the very first line, it looks like it's saying that all the t.unop and t.binop instructions have a consistent type t, while t.testop and t.relop give you an i32.
So I'm reading that as four separate instructions that are statically verified, not a single one with dynamic types.
I think it took me a good year before I could understand any of the notation used in type system “math” with some confidence. This stack exchange answer would have been a god send because it’s such a simple mental model when explained this way.
I would buy this post remastered into a pretty infographic for sure!
I just discovered that a (simple subset of) computer science metanotation ( https://www.youtube.com/watch?v=dCuZkaaou0Q , i. e. type system notation) is equivalent to GADTs. Moreover, GADTs can represent substitution, too! (via HOAS)
For example, here is one big Lean 3 GADT which represents all valid proofs of ZFC theorems: (see the end of this comment)
Note that:
Also note:
Was this fact already discovered by somebody else? Or am I first?
Also note that computer science metanotation is somewhat equivalent to Twelf ( http://twelf.org/wiki/Theorem_prover ), Lambda Prolog ( https://www.lix.polytechnique.fr/~dale/lProlog/ ), Makam ( https://github.com/astampoulis/makam ), somewhat to Automath ( https://www.cs.ru.nl/~freek/aut/ ), somewhat to Prolog and Mercury
Here I discuss what properties type system notation implementation should have: https://lobste.rs/s/11k4ri/how_should_i_read_type_system_notation#c_5him0l