I think there’s an underexplored happy place (that D doesn’t entirely reach, but sort of gestures at) involving a fusion of Algol, Self, ML and the unlisted eighth language concept, metaprogramming (templates, macros, :gag: C preprocessor): pervasive type inference, mutable state in objects, immutable values with rich built-in semantics (tuples, sumtypes), lambdas and functional algorithms at the coarse domain level but (opt-in) mutable state (!) at the lowest level. The two big weaknesses of the ML family (IMO) are coarse state mutation and very fine algorithms; tail recursion is cute but iteration will always be nearer to the CPU’s heart. And even an ML program does (as a matter of observable behavior) have mutable state when it’s executed; OOP lets you model this fact where insistence on purity at all levels does not.
Then you end up, in the hexagonal model [1], with three levels that use almost entirely distinct languages:
at the infrastructure (RPC, IO) level, pervasive metaprogramming uses compile-time introspection to mostly autogenerate encode/decode logic, IO functions and automatically check the code against the interface spec.
This is tested sparsely by running an instance of the program against wire-level mock services.
at the application (state mutation) level, classes separate concerns and encapsulate mutations of different aspects of the application’s state, and also process inbound and outbound messages at a high level.
This is tested with class mocking and stub services.
at the domain (computation of business outcomes) level, pure type-inferred algorithms operate on immutable values, making them easy to test and understand. Here there is heavy use of ML features.
This is tested with unittests.
at the (not listed in the hexagonal model) implementation level, where the rubber hits the road, it’s C again, or rather Fortran: a moderately high-level representation of CPU features in a format easily digestible for the compiler. Opt-in mutation, opt-in pointers. Small units of work.
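A rough sketch of how the application, domain and implementation levels above might look side by side. This is only an illustration in Python, not anything from the hexagonal model itself: the names `Deposit`, `Withdrawal`, `apply_transaction`, `replay` and `Account` are all made up for the example, and the infrastructure level is left out.

```python
from dataclasses import dataclass
from typing import Union


# Domain level: immutable values (a small sum type) and a pure function.
@dataclass(frozen=True)
class Deposit:
    amount: int


@dataclass(frozen=True)
class Withdrawal:
    amount: int


Transaction = Union[Deposit, Withdrawal]


def apply_transaction(balance: int, tx: Transaction) -> int:
    """Pure: returns a new balance and never touches shared state."""
    if isinstance(tx, Deposit):
        return balance + tx.amount
    if isinstance(tx, Withdrawal):
        if tx.amount > balance:
            raise ValueError("insufficient funds")
        return balance - tx.amount
    raise TypeError(f"unknown transaction: {tx!r}")


# Implementation level: a plain loop with a local, opt-in mutable accumulator.
def replay(transactions, start: int = 0) -> int:
    balance = start
    for tx in transactions:
        balance = apply_transaction(balance, tx)
    return balance


# Application level: a class that encapsulates one mutable aspect of the state.
class Account:
    def __init__(self, balance: int = 0) -> None:
        self._balance = balance
        self._history = []

    @property
    def balance(self) -> int:
        return self._balance

    def handle(self, tx: Transaction) -> None:
        # The only place where state actually changes.
        self._balance = apply_transaction(self._balance, tx)
        self._history.append(tx)
```

The domain functions can be unit-tested with plain values, while `Account` is the only thing that would need mocks or stubs around it.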
Addendum: I think OOP gets a bad rap because people try to use it at the domain level, not the application level. A warning sign is when you see a class hierarchy that’s deeper than two levels. The important insight is that classes should encapsulate components of the service as a whole that can change state and trigger actions independently. They should never represent values.
I gave a list of 7 formal specification ur languages over at the orange site:
Temporal Logic (LTL, CTL, TLA)
Relational algebra (Z, B, Alloy)
Guarded Command Language (Promela, SPIN)
Process Calculi (CSP, FDR)
Labelled Transition Systems (Petri Nets, mCRL2)
Just drawing a diagram and figuring out the semantics later (UML)
Abstraction over an existing programming language
This is all real messy, and I see lots of overlapping concepts and unclear cases. Are state machines their own thing, part of LTS, or an implementation detail of the specification approach? Should we be distinguishing between the mathematical formalisms and the languages themselves? What about the vaster world of specifying code and not designs? etc etc etc
Great essay! A few thoughts / minor disagreements —
I would not lump assembly/machine code in with the Algol family. It has very different characteristics like direct branching, heavy use of a stack, a finite register set, total lack of data structures… and of course it’s homoiconic.
Squeak is probably the most important surviving Smalltalk. It’s notable for being largely implemented in itself.
Erlang doesn’t belong in the same category as Smalltalk; its message passing is a separate mechanism from procedure calls. I see Erlang as a functional language with actors. I’m not sure if the Actor model is worth its own category of language. If so, its Ur-language would be Act1, according to Wikipedia. Maybe other inherently-concurrent languages like E fall in the same category.
[1] https://vaadin.com/blog/ddd-part-3-domain-driven-design-and-the-hexagonal-architecture
I would be more precise: Actors are missing. Actors are like objects and also like lambda calculi, so actor languages are like Lisps or Self; but the semantics are usually concurrent in a way which the typical Lisp or Self descendant lacks. The ur-language for concurrent actors is presumably E.
Or maybe Erlang?
Oh yeah, I wasn’t even thinking of that.
Arguably actors live on most prominently in the microservices/event-sourcing design.
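To make the "like objects, but concurrent" point concrete, here is a minimal sketch of an actor in Python. It is an illustration only, not how E or Erlang implement actors: the `CounterActor` class, its message names and the `demo` helper are invented for the example, and a thread plus a queue stand in for a real actor runtime.

```python
import queue
import threading


class CounterActor:
    """Owns its state privately; the only way in is an asynchronous message."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message, reply_to=None):
        # Unlike a method call, send() returns immediately.
        self._mailbox.put((message, reply_to))

    def _run(self):
        # Messages are processed one at a time, in arrival order.
        while True:
            message, reply_to = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)


def demo():
    actor = CounterActor()
    for _ in range(3):
        actor.send("increment")
    reply = queue.Queue()
    actor.send("get", reply_to=reply)
    result = reply.get(timeout=1)
    actor.send("stop")
    return result
```

The difference from an ordinary object is that `send` never runs the handler on the caller's thread; the reply, if any, comes back as another message.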
Good article.
If you want an absolute gem of a programming language, check out Factor. Development seems to have slowed down a bit after Slava Pestov stopped working on it, but it was already in an amazing state at that point. Because it’s a concatenative language, it’ll rewire your brain in how you can (or have to) do things.
Highly recommended.
Yes, I was about to recommend Factor as the first choice for Forthlikes. I don’t know gForth, but if it’s like most trad Forths it’s a lot harder to get your head around.
For learning how to build a Forth, good resources are JonesForth (x86 assembly) or Quackery (Python.) The former has exhaustively commented source code, the latter has a whole book describing the language and implementation. I recommend the exercise — I learned a LOT from reading the source code of FIG Forth for the 8080 as a teen.
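For a feel of what "building a Forth" involves at its very smallest, here is a toy evaluator sketched in Python. It is not taken from JonesForth or Quackery and skips nearly everything a real Forth has (no compile state, no return stack, no memory access); the `PRIMITIVES` table and the `evaluate` function are made up for the illustration.

```python
# Each primitive word is a function that mutates the data stack in place.
PRIMITIVES = {
    "+": lambda s: s.append(s.pop() + s.pop()),
    "*": lambda s: s.append(s.pop() * s.pop()),
    "dup": lambda s: s.append(s[-1]),
    "drop": lambda s: s.pop(),
    # Pops the top two items and pushes them back in reversed order.
    "swap": lambda s: s.extend([s.pop(), s.pop()]),
}


def evaluate(source):
    """Run a whitespace-separated program and return the final data stack."""
    stack = []
    definitions = {}  # user-defined words: name -> list of tokens

    def run(tokens):
        i = 0
        while i < len(tokens):
            token = tokens[i]
            if token == ":":
                # ": name body ;" defines a new word.
                end = tokens.index(";", i)
                definitions[tokens[i + 1]] = tokens[i + 2:end]
                i = end + 1
                continue
            if token in definitions:
                run(definitions[token])
            elif token in PRIMITIVES:
                PRIMITIVES[token](stack)
            else:
                # Anything else must be an integer literal.
                stack.append(int(token))
            i += 1

    run(source.split())
    return stack
```

For example, `evaluate(": square dup * ; 7 square")` leaves `[49]` on the stack: a program is just words written one after another, each consuming and producing stack items.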
Another great one is retroforth. Despite its name, it is a thoroughly modern forth. I feel like it combines some of the nice parts of Joy/Factor, while being a bit simpler. http://www.retroforth.org/
As for K’s, I think that ngn/k is the best one to try. https://codeberg.org/ngn/k
BQN would also be a great choice for an APL descendant language. https://mlochbaum.github.io/BQN/
Very relevant is Van Roy’s book and Oz, which attempts a similar breakdown https://www.amazon.com/-/es/Peter-Van-Roy/dp/0262220695
This poster summarizes Van Roy’s breakdown https://www.info.ucl.ac.be/~pvr/paradigms.html
I would add to this list:
m4
Tcl is a beautiful fusion of both. Current programming is infested with too many bad versions of them!
I think lisp covers the macro category well. Your others are worth calling out specifically though.
I think textual macros have a significantly different flavour from Lisp macros, and a distinct history, going back to Strachey’s GPM in the 1960s.
TRAC and SAM76 are other significant macro languages. (Ted Nelson teaches TRAC in his classic book “Computer Lib”.)
No mention of the actual Ur language!
https://en.wikipedia.org/wiki/Ur_(programming_language)
The article is mistaken: Forth was originally written iteratively over time in FORTRAN, as a helper deck of cards Chuck Moore would carry around with him beginning as far back as the 1960s. Controlling radio telescopes was just his thing.
Very good list but I think that it’s worth considering scripting languages, in ergonomics if not particularly fundamental features. Command line languages (bash and its predecessors and descendants) fit into this category for me, but I’m not sure what the ur-scripting-lang would be for them. Perl is definitely notable but too recent; awk maybe? SNOBOL?
REXX?
It is definitely lacking a lineage that starts with Z of formal modeling/constraint programming