One example, perhaps, of what the author means by a “discovered language” might be Haskell because it is largely syntactic sugar over a core language which has only 9 components. There is an interesting talk about the core language’s 9 components and how they work.

Into the Core - Squeezing Haskell into Nine Constructors by Simon Peyton Jones

This is precisely why I love the Lisp family of languages. Lisps provide a few general primitives that allow users to build their own abstractions as needed. Most languages end up baking a lot of assumptions into the core of the language, and many of those end up being invalidated as time goes on. Development practices change, the nature of the problems the language is applied to changes, and so on. Language designers end up having to extend the language to adapt it to new problems, and this ultimately makes the language complex and unwieldy.

Isn’t the author arguing for less of that? Consider:

On one hand, we have something that was built to accomplish a task… On the other hand, we have something that describes how things can relate to each other… It seems like the popular perception of programming languages falls more in the ‘vacuum cleaner’ camp: that a programming language is just something for translating a description of computation into something machine-readable… I think that this ‘features focused’ development style can cause people to ignore too much of the structure that the features contribute to (or even that the features are destroying structure).

In theory, Lisp primitives can be used to build a principled, coherent set of abstractions, just as Haskell is syntactic sugar over a small set of features. But “abstractions as needed” sounds like ad-hoc solutions to specific problems, which is characteristic of languages on the invented side of the spectrum.

I think these are completely orthogonal concerns. Baking abstractions into the language doesn’t make them any more principled than building them in user space. The abstractions provided by the language generally have to have a much larger scope, and if you don’t get them right on the first try, you end up having to live with that.

I guess I don’t understand why this article causes you to say “This is precisely why I love the Lisp family of languages”. The reasons you give for loving them seem like reasons you shouldn’t love them, according to the article. But I also think Lisp is interesting for those reasons. Not disagreeing with your point, just trying to find the connection to the article.

The article talks about the ‘features focused’ development style, where languages tend to accumulate a lot of special-case features for solving very specific problems. Lisps avoid this problem by providing a small set of general patterns that can be applied in a wide range of contexts, and by giving users the power to create their own abstractions. This approach allows useful patterns to be discovered through use and experimentation. Once such patterns are identified, they can be turned into general-purpose libraries. At the same time, users can still create features that make sense for their problems without polluting the core language.

Alternatively, programs can be close to the silicon and replicate common mistakes by not designing around them, or languages can pretend there is no silicon.

I have seen many argue that certain programming languages are discovered. I disagree. For the lambda calculus, Turing machines, and recursive functions, the argument goes that since three independently developed formalisms turned out to be equivalent, they must be independent discoveries of the same “thing” that already exists “out there”.

This is false. Church, Gödel, and Turing all, at some point, worked in the same place. Consider that they could have exchanged ideas either directly or indirectly (through colleagues).

Programmers in the functional camp also easily fall into the trap of believing that algebraic data types, type classes, and other features such as unification are “out there” to be “discovered”. The fact that these languages can be understood in terms of a minimal calculus only highlights a fundamental fact:

We are good at transmitting information about programming. Two people can learn and interpret information from two different sources, and still derive the “same” structure.

We could just as well celebrate that “programming education” is so effective, since so many people are drawn to see the same “light”.

Maybe this is the case, maybe not. There are some very interesting connections. I apologize in advance for category theory.

First, the category Set, whose objects are sets and arrows are functions, is also a topos. In fact it is the simplest topos, in a sense; Set can be built by considering all of the different ways to examine a point as a topological space.

Every topos is Cartesian closed, which means that it has operations which resemble currying and application. Indeed, every topos has an internal logic, and there is a famous correspondence, Curry-Howard, which connects this logic to an internal language. Every Cartesian closed category has an internal language that looks like typed lambda calculus, and thus so does every topos. This connection is sometimes called the Curry-Howard-Lambek correspondence.
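Cartesian closure is the categorical fact behind currying: arrows A × B → C correspond exactly to arrows A → C^B. A minimal sketch of that bijection in Python (the helper names `curry` and `uncurry` are just illustrative, not taken from any library mentioned here):

```python
# In a Cartesian closed category, arrows A x B -> C are in bijection
# with arrows A -> C^B. In programming terms: two-argument functions
# and their curried forms carry the same information.

def curry(f):
    """Turn f : (A, B) -> C into A -> (B -> C)."""
    return lambda a: lambda b: f(a, b)

def uncurry(g):
    """Turn g : A -> (B -> C) back into (A, B) -> C."""
    return lambda a, b: g(a)(b)

add = lambda a, b: a + b
inc = curry(add)(1)               # partial application falls out for free
print(inc(41))                    # 42
print(uncurry(curry(add))(2, 3))  # round trip: 5
```

The two directions are mutually inverse, which is exactly the “internal language” flavour of the correspondence: nothing is gained or lost by moving between the two forms.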

Now, having laid out the foundations, you can consider this listing of examples of the correspondence.

To address your points more directly: it is true that Church, Gödel, and Turing were all on very similar paths. Gödel and Turing, along with Nash, also hinted at the basics of modern complexity theory, talking about what would become P and NP. Support for the Church-Turing thesis comes, to me, from Turing-complete systems being shown to be Turing-equivalent (so far), and from metalogical results like those of not just Gödel and Tarski, but also more modern progress in constructive mathematics, where it is not false that all real numbers are computable and all computable functions are continuous, just as in Turing’s model.

Algebraic data types, as the name implies, arise from algebras on types, like in the typedefs mini-language. They are not limited to “functional” programmers; Zephyr ASDL is used by (my language) Monte, Python, Oil, and probably others.
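The “algebra” here is literal: a sum type has as many inhabitants as its parts added together, and a product type as many as its parts multiplied. A small illustration in Python (the `Left`/`Right` encoding is an assumption for this sketch, not drawn from typedefs or ASDL):

```python
# Sum types add inhabitant counts; product types multiply them.
from dataclasses import dataclass
from itertools import product

@dataclass(frozen=True)
class Left:
    value: object

@dataclass(frozen=True)
class Right:
    value: object

def sum_inhabitants(a, b):
    """All values of the sum type A + B (tagged union)."""
    return [Left(x) for x in a] + [Right(y) for y in b]

def product_inhabitants(a, b):
    """All values of the product type A x B (pairs)."""
    return list(product(a, b))

bools = [False, True]   # |Bool| = 2
trits = [0, 1, 2]       # a 3-valued type

print(len(sum_inhabitants(bools, trits)))      # 2 + 3 = 5
print(len(product_inhabitants(bools, trits)))  # 2 * 3 = 6
```

This counting argument is exactly why ADT descriptions port so easily between languages: the structure is arithmetic on types, independent of any one language’s syntax.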

Some type classes naturally arise, in the sense that we can imagine topoi where class-like features would be granted automatically to all computations, without changing the underlying logic. This is possible because logic is portable between topoi. In particular, the Haskell typeclass `Ord` corresponds to every type having a free (partial) order, and `Eq` to equality; we could get both for free by working in a topos like Pos, the category of partially ordered sets. Logic in this topos is the same as in Set, but richer in the values.

Unification arises in a beautiful way, to me. In Rel, the category whose objects are sets, like Set, but whose arrows are relations, the directions of arrows do not matter much. If we bend all of the arrows to one side, then the unifications bulge out on the other, and computation in Rel can consist entirely of first bending the arrows (choosing which values are inputs and which are outputs) and then reducing the diagram piece by piece, checking each unification as it bulges out.
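The “checking each unification as it bulges out” step can be made concrete with a miniature unifier in the µKanren style. A sketch in Python, under the assumption that logic variables are represented as integers and everything else (strings, tuples) is term structure; it omits the occurs check, as minimal presentations often do:

```python
# Miniature structural unification: a substitution maps integer
# logic variables to terms; unify extends it or fails with None.

def walk(t, s):
    """Resolve a term through the substitution until it is not a bound variable."""
    while isinstance(t, int) and t in s:
        t = s[t]
    return t

def unify(u, v, s):
    """Return a substitution extending s that makes u and v equal, or None."""
    u, v = walk(u, s), walk(v, s)
    if u == v:
        return s
    if isinstance(u, int):          # u is an unbound variable
        return {**s, u: v}
    if isinstance(v, int):          # v is an unbound variable
        return {**s, v: u}
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        for a, b in zip(u, v):      # unify component-wise, threading s
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None                     # clash: distinct atoms or shapes

# unify (x, "b") with ("a", y): x binds to "a", y binds to "b"
print(unify((0, "b"), ("a", 1), {}))  # {0: 'a', 1: 'b'}
```

A full µKanren adds fresh-variable allocation, goals, and a stream-based search on top of essentially this one function, which matches the diagrammatic picture: each bulge is one call to `unify`.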

What I find entirely fascinating is that, according to the categorical logic, there should be at least one more family of logics besides the Cartesian-closed (“functional”) and monoidal-closed (“relational”, “logic”) languages, corresponding to weaker closure, and there is: linear logic! Linear logic is being explored right now in PLD and PLT circles, and its relevance was predicted by the mathematical framework.

Your description of unification via Rel is fascinating. What should I read to understand it better?

There is a paper, Knowledge Representation in Bicategories of Relations, which has diagrams that depict relations in a visceral way. I usually imagine a pure logical language, too, like µKanren, which can be viewed as an internal language for the categorical diagrams.

I apologize for not having a page with images which show exactly what I described. I started work once on a writeup of how µKanren or other pure logical languages relate to relational categories, via the relatively obvious functor which sends diagrams to µKanren programs, but didn’t think that there would be much interest.

Awesome, thanks.

Even without collaboration, the combination of consistency and minimalism that acts as the fitness function for mathematical discovery is a pretty intense constraint, so it should be unsurprising that lots of the things we invent under those constraints are equivalent.

If you’ve got an urn with ten thousand green balls, one hundred light-blue balls, and two dark-blue balls, and we’ve got a rule that we only pick blue balls, we shouldn’t be surprised that a lot of the ones we pick are light blue – because we’ve made the green balls irrelevant to the selection. In the same way, if from the problem space of mathematical ideas we only choose consistent and minimal sets of ideas, and most forms of consistent minimality have certain attributes in common, we shouldn’t be surprised that we keep choosing ideas with those attributes – the majority of potential mathematical ideas are irrelevant to our selection purposes, and the rest are biased in particular ways.

The languages people claim are ‘discovered’ are merely very internally consistent. Consistency is valuable in and of itself, because it makes it easy to predict new behaviors from knowledge of a priori rules. But consistency alone is enough to vastly increase the likelihood of overlap with other consistent systems (like… all of mathematics).

This is basically my position on the topic.

I can’t justify platonism; it doesn’t seem sensible to me. But I do think that if we solve similar problems while filtering for simplicity, then we’re likely to solve them in similar ways.

Beyond the lambda calculus, Turing machines, and recursive functions being equivalent, what is also very important is that no non-equivalent alternative has been found since. You can’t blame that on everyone having exchanged ideas: all kinds of mathematical inventions/discoveries were made by people whose contemporaries disliked and disagreed with their conclusions for as long as possible. The invention/discovery of complex numbers, quaternions, Gödel’s incompleteness theorems, cardinal numbers via Cantor’s diagonalization argument… the list is endless. It seems quite unlikely that a non-equivalent alternative will still be found, independent of whether you consider it a discovery or an invention.

“We can’t ‘point at bits of monoid’ and say how much they contribute to some purpose.”

This doesn’t seem right to me. Monoids aren’t the best example because they’re so simple, but it’s common for mathematicians/Haskellers to argue about which algebraic structures/constraints are best for reasoning about a topic. Consider arrows, free monads, monad transformers, and all that jazz.