Note that the second paragraph is a hypothesis. It is generally accepted that the mutability of languages is the reason the strong form doesn’t hold, but the relationship between patterns of thought and language is still very fuzzy.
The latter hypothesis depends on the idea that programming languages are immutable, and I find that somewhat misleading. There are two ways in which you can modify a language to express thoughts that couldn’t be expressed in the old language:
Add new vocabulary (or redefine existing vocabulary).
Add (or modify) grammar.
The first of these is far more common in natural languages. Most fields have some form of jargon, which extends an existing language with new terms to cover domain-specific knowledge. In most programming languages you can add the following (a rough sketch in code follows the list):
New proper nouns (objects / structures / records)
New nouns (field / variable names)
New verbs (methods / functions)
New adjectives (generic types parameterised on another type)
New adverbs (higher-order functions)
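A rough Java sketch of that list, as an illustrative aside: every name here (Dough, Bread, Oven, Tray, compose) is invented for the example, and the part-of-speech labels in the comments simply follow the mapping above, which later replies in the thread dispute.

import java.util.function.Function;

// Illustrative program-defined "jargon", one item per category in the list above.
class Jargon {
    static class Dough {}                  // new types; their instances get proper-noun names
    static class Bread {}

    static class Oven {
        Dough contents;                    // a new field name ("noun")
        Bread bake(Dough d) {              // a new method ("verb")
            return new Bread();
        }
    }

    static class Tray<T> {                 // a new "adjective": a generic type parameterised on another type
        T item;
    }

    // A new "adverb": a higher-order function that changes how other functions are applied.
    static <A, B, C> Function<A, C> compose(Function<A, B> f, Function<B, C> g) {
        return a -> g.apply(f.apply(a));
    }
}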
In some ways, programming languages are more extensible than natural languages because the dictionary, which defines the meaning of the word (in a prescriptivist world view, at least), is carried around with the word. You can use a new word only in a context that carries its definition (via source code or some compiled representation) and that means that you always have a mechanism for providing both the new word and its definition to consumers of your jargon.
This is a space that I keep trying to persuade psycholinguists to look at more because I think it has some fascinating implications for language design. In particular, I came across a paper about 15 years ago that showed that only about 10-20% of humans naturally think in terms of hierarchies. This was borne out by the success of early versions of iTunes (which replaced a hierarchical file structure with a filter mechanism and was incredibly popular with non-programmers). I hypothesise that the fact that hierarchies are so deeply entrenched in the grammar of most programming languages is a significant factor in the fact that only 10-20% of people find it easy to learn to program. Most languages have hierarchy for namespaces, for scopes, for call stacks, and so on. They’re so pervasive that we don’t notice that they’re there and it’s hard to imagine a language without them, yet the most accessible languages (Excel with its flat grid, visual data-flow languages with their pipelines) don’t have any kind of strict hierarchy.
Correct, everything beyond the first paragraph is my own speculation.
The point you make about extending vocabulary is a good one, but I don’t agree with your mapping of parts of speech.
For one, I don’t think nouns map to variable/value/field names. Consider this Java snippet:
Dog fido = new Dog();
Here, it seems clear that fido (variable name) is a proper noun, and it is instead Dog (a type) that is a noun.
This is a bit of a philosophical point, but I also don’t think verbs map well to functions, at least not in a language where they’re first-class. Say I have a function that accepts dough and returns bread. If I can assign that a name and pass it around as I can data, it seems better described as an oven rather than the action of baking. I would say that function types correspond to nouns, and function names to proper nouns.
This means that the things we can most flexibly define are proper nouns, which are unambiguously not part of the language. Just as my name is not a part of English, my variable/value names are not a part of a programming language.
We can still define new nouns and adjectives, in the form of types and traits, but our power to do this in most languages is very weak, as we can only define new types as hierarchies of existing ones.
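A tiny hedged sketch of that constraint in Java (the names Bakeable, Dough and SourdoughStarter are invented): the new “adjective” is a trait/interface and the new “nouns” are types, and each can only be introduced by relating it to vocabulary that already exists.

interface Bakeable {}                      // a new trait ("adjective")
class Dough implements Bakeable {}         // a new type ("noun"), introduced in relation to the trait
class SourdoughStarter extends Dough {}    // further vocabulary only as a hierarchy over existing types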
This is a bit of a philosophical point, but I also don’t think verbs map well to functions, at least not in a language where they’re first-class. Say I have a function that accepts dough and returns bread. If I can assign that a name and pass it around as I can data, it seems better described as an oven rather than the action of baking. I would say that function types correspond to nouns, and function names to proper nouns.
(Note: not parent!)

This is indeed a bit of a philosophical question, as in, it’s a question that even Aristotle struggled with in De Interpretatione, so it’s literally 2500 years old. (Caveat: if you want to read it, be careful, because his definitions for ‘noun’ and ‘verb’ map really badly onto modern English grammar.)
That being said, “a name [denoting an action] that you can pass around as data” would correspond to a verb in the infinitive or present participle form. As in, you can read (defun add (x y) ...) as either “to add two numbers” (FWIW this is what a lot of old English recipe books used for recipe names ;-) ) or “[a description of] adding two numbers”. Aristotle had it easy: he only needed the ancient Greek infinitive to cover all that ground. But it could also be worse – I think modern English grammar no longer distinguishes between participle and gerund form, so those two suffice, but some Romance languages need four forms to cover all that ground (infinitive, participle, gerund and supine).
Verbs in the present participle form act as nouns, so present participle forms that have come to have enough of a denotative sense on their own (like “reading”) are actually listed in the dictionary as nouns. Verbs in the infinitive can act as nouns, too (“To see the world is all he wanted”) but you won’t find “to see” listed as a noun.
Why do these count as verbs? The answer is also language-dependent, and there are more criteria in the English language, but two of them are particularly relevant to your example:
First, and one which Aristotle would’ve likely used as well [1]: “oven” is inherently countable, whereas (edit: a manipulable description of) “the act of baking” is not (and not because it’s a mass noun, like furniture or money). Two distinct ovens carry out the same act of baking, so the act of baking must be distinct from any individual oven. On the other hand, an oven can be built out of bricks, but the act of baking cannot be, so the act of baking must also be distinct from any oven in the world, current or future. (A further argument, that the idea of baking must also be distinct from the idea of an oven, as it would’ve been impossible for the idea of something that performs an action to exist without the idea of that action already being in existence, would’ve got me expelled from the lyceum, but I’m gonna make it just to annoy the school principal, for old time’s sake).
Second, because the act of baking can have tenses and voices and can take complements, none of which applies to an oven.
Some of these are a little hard to translate to functions per se (but so is the act of baking) because code inherently has no tense (although I guess one could argue the tense is implied: all code is in the future tense, all logs are in the past tense, and as for the present maybe Zeno had a point about that stupid tortoise of his). However, some of them do readily translate: functions, for example, can take complements (in fact the ones that take arguments don’t even work without them), whereas nouns do not.
[1] As in, he would’ve used this to distinguish between “ovens” and “baking” as separate types of things (separate causes, among others). Aristotle’s De Interpretatione is 1/3 semiotics, 1/3 metaphysics, and 1/3 a grammar of the Ancient Greek language, which is why his classification of nouns and verbs maps so badly to modern English grammar (besides the obvious, that ancient Greek is not English, thank God). Aristotle would’ve classified “oven”, “baking” and “baked” as nouns, “is baking” and “is baked” as verbs, and “was baking” as a case of a verb.
This was some interesting context, thanks for sharing!
Two distinct ovens carry out the same act of baking, so the act of baking must be distinct from any individual oven.
I fixated a bit on this, and I actually think it makes my case. I can theoretically write many different functions that all accept dough and return bread; these are all individual ovens with proper noun names. These all have the same type though, :: Dough -> Bread (a la Haskell) or Function<Dough, Bread> (a la Java), which is the abstract concept of baking or oven.
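A hedged Java sketch of that claim (the names are invented for the example): two distinct “ovens” with their own proper-noun names, both inhabiting the single type Function<Dough, Bread>.

import java.util.function.Function;

class Bakery {
    static class Dough {}
    static class Bread {}

    // Two distinct "ovens": different definitions, different proper-noun names...
    static final Function<Dough, Bread> brickOven = d -> new Bread();
    static final Function<Dough, Bread> laserOven = d -> new Bread();

    // ...but one shared type, which is the closest thing here to the abstract "baking" / "oven".
    static Bread bakeWith(Function<Dough, Bread> oven, Dough d) {
        return oven.apply(d);
    }
}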
We’re kind of running into the same problem scholastic philosophers started running into during the Renaissance – reasoning by analogy is attractive but it’s a dangerous thing to do, because there are many things that apply to physical processes which accept dough and return bread, but don’t apply to functions in code.
That being said, this is actually a fun exercise so what the hell :-D.
I can come up with a “function that accepts dough and returns bread” by subjecting it to laser beams of different wavelengths. It doesn’t have to happen in an enclosure that traps heat: water from the outer layer evaporates more quickly than the water inside, so a short-wavelength laser will create a dry outer layer, trapping the more liquid core, which will then be easy to heat up using a longer-wavelength one (e.g. in the microwave region). Arguably, it’s baking, because it accepts dough and returns bread, but since I’m poking laser beams at the dough, this is a lightsaber, not an oven. Baking can’t be both an oven and a lightsaber. (Although you might be interested to think of it in Platonic terms: both an oven and a lightsaber could be said to partake in the idea of baking, which is a thought I find particularly interesting because it implies Martha Stewart and Darth Vader have a lot of common ground).
Or, coming at it from a more, erm, Aristotelian angle. A function that accepts dough and returns bread describes a particular type of change. In order for it to be an oven, an oven (I mean an actual oven, or at least the abstract idea of an oven, although that would also get us expelled from the lyceum) should also describe a particular type of change. But it doesn’t: first of all, an oven isn’t a description, and second, it’s an object (or a class of objects) which can be used to enact many kinds of change: the changing of dough into bread, but also the change of clay into pottery (which is sort of like baking but certainly not the same function), or the change of a combustible material into ash, smoke and vapour (which is nothing like baking).
Now, to give ancient philosophers a break: unfortunately, I haven’t gone to school in an English-speaking country so I can’t say for certain but it seems to me that American schools (presumably British ones, too, but I’ve not spoken to enough British people) teach grammar in the same backwards way that they taught me grammar when I was in school. One of the many silly things they tell you, mostly because it makes it very easy for teachers to inflict psychological pain upon their students, is that “the type” and “the meaning” of a word is what determines what part of speech it is: if it’s an object, it’s a common noun, unless it’s a person or a particular object that was given a name, in which case it’s a proper noun (but not just any particular object, that would just be an articulated noun). If it’s an action, it’s a verb.
This causes a great deal of confusion, and not only about nouns and verbs, but also about nouns and adjectives, and especially adjectives and adverbs, where it’s particularly obvious that it’s bollocks. For example, they teach you that adjectives are supposed to describe an object: “health” is obviously a noun, but what about “healthy”? It’s an adjective in “I’ve never seen a healthy Panda” but an adverb in “It’s important to eat healthy”.
In fact, at least in the grammar of European languages (I honestly have no idea how this goes for others), it’s the syntactic behaviour of a word that determines what part of speech it is (edit: that is to say, most modern authors define parts of speech not just in terms of meaning, but also in terms of what syntactic role a word plays). Or, to put it in possibly more adequate terms (caveat: I’m not sure if I’m using the correct English terminology here), parts of speech are syntactic classes, rather than semantic categories: words are assigned to one part of speech or another (noun, verb etc.) based on the structural relationships between them and other items in a larger grammatical structure, rather than on meaning.
(Edit: that’s why a verb, for example, is described not only in terms of what it conveys through its meaning, but also in terms of what it conveys through its relationship with other words in a sentence: it’s not just “what it means”, but also what role it can play in a sentence (a predicate), and how it may be changed in relation to a wider context – i.e. whether it can have tense, voice, number, gender and so on.)
That is why the same word sometimes displays the behaviour of different parts of speech depending on context. For example, “to write” can act as a noun, playing the role of a subject in a sentence: “To write seemed foolish under these circumstances”. But it can also act as a verb: “I intended to write you earlier”.
So the name that denotes a particular sequence of instructions applied to a certain set of arguments (“a function”) can “act as” a verb or a noun without breaking any law :-). That’s why it “feels like a noun” when you do this:
btn->on_click_cb = &my_click_callback;
but it “feels like a verb” when you do this:
btn->on_click_cb(&btn, &data);
There’s an even more complicated story in the English language about how verb phrases and noun phrases work. But I honestly don’t know it well enough, because English isn’t my first language so I never studied English grammar in that much depth (also, I honestly hated grammar in school, and that definitely didn’t help).
Edit: oh, yeah. That raises the obvious question: well, if it’s not (just) the meaning of the word that determines what part of speech it is, how come we say “write” is a verb? I mean, I literally just gave an example where “to write” acts as a noun, but if you look up “to write” in the dictionary it says verb. Why do we treat these things – nouns, verbs, adjectives – like they’re totally distinct things that never overlap?
Well, first, for the most part, they don’t, so it’s convenient. Second, there’s actually some tortuous history here, which you can sort of guess from the etymology: “noun” comes from “nomen”, name. It was supposed – “was” as in, back in the Renaissance or so, when we first started compiling complete descriptive grammars of European languages, Latin, that is, because it’s always f&^@ing Latin – to mean exactly that: a word that names something, whatever that is. “Verb” comes from “verbum”, “word”, which seems like it’s entirely arbitrary until you realize the Latins (from which we took the word) inherited the dichotomy of Greek grammar between “onoma” (name) and “rhema” (word, as in saying, utterance, something that is said).
Plato is, I think, the first one who operated with this distinction, between rhema (words that describe actions) and onoma (words that describe those who perform actions). The distinction is a little obscure (here) and there’s been plenty of speculation about how and why Plato came up with those names. The one that’s probably least controversial IMHO, as in, it relies the least on understanding Plato’s metaphysics while making it least likely for the ghost of Plato to haunt you because you’re using it wrong, is that “rhema” are “things that are said of others”, whereas “onoma” are, well, said “others”.
However, it’s worth bearing in mind that Plato was explicitly talking only about a fairly restricted type of sentences. The word that Plato uses is commonly translated as “discourse” (that’s the term used above, too) – which, in the most generous interpretation of Plato’s text, would be defined as “an utterance that can be either true or false”. The fact that we then kept this terminology for every sentence and word out there is entirely on us. But the fact that meaning is insufficient to determine what part of speech a word is (or, to put it another way, that the same words can be a different part of speech in different contexts, so there’s no 1:1 mapping between them) has dawned on many other people since Plato’s time.
tl;dr IMHO you’re not wrong to think of functions as either verbs or nouns; there are plenty of words that act as either in different contexts. I am also extremely fun at parties.
That being said, this is actually a fun exercise so what the hell :-D.
Damn right it is!
It is kind of an exercise in backwards reasoning though. In truth, my argument about function types being nouns only holds because the people who invented first-class functions already believed functions should be equivalent to data, and so implemented the language that way.
I think you’re right on the money with the on_click example, the act of applying a function is what makes it a verb.
I think you’re right on the money with the on_click example, the act of applying a function is what makes it a verb.
Yeah, I didn’t want to amend my post with that because it was already way too long but, even though we call them by the same name, on_click in the function call and on_click in the function definition, for example, are instances of different things. We don’t need to summon the ghosts of dead Greek philosophers to figure that out, it’s enough to think of how a compiler would treat the most trivial case – procedures (no arguments) in a language without higher-order functions. The former would be just an alias for an address, whereas the latter is a binding of a scope (and a sequence of instructions in it) to a compilation state. That’s why you can replace the former with an address, but not the latter.
Ancient Greek philosophers sure had a lot of free time…

Your imperative view is valid, but there’s also a relational way of considering verbs: let verbs designate relations, not functions.
They’re so pervasive that we don’t notice that they’re there and it’s hard to imagine a language without them, yet the most accessible languages (Excel with its flat grid, visual data-flow languages with their pipelines) don’t have any kind of strict hierarchy.
I think @crazyloglad was interested in some applications of hierarchy-less models for UIs (among others), too, and I’ve no idea if @-mentions get you pinged in any way but maybe they do :-D?
@mentioning did not ping me in, though I did just stumble upon this from the thread being briefly mentioned on IRC. I take it that you are referring to Pipeworld.
Pipeworld is indeed poking at the intersection of decomposing hierarchies into data streams, presenting that as different aural/visual representations and recomposing that dynamically into one or several hierarchies by the user, and a handful of other things – though I am at a loss for describing it in a more approachable way; there is a lot to unpack in there and, lacking other incentives, I channeled my inner Diogenes and just dumped some cryptic visuals and feature descriptions.
A lot of the interest was born out of poking around in SCADA systems and the hypotheticals of how these would evolve as more and more cheap compute gets introduced and the ‘we are not connected to a larger communications network’ gneigh-sayers stopped horsing around. Multiple stakeholders have different needs from shared compute over shared constraints, so how should the entirety of the user interface be formed to satisfy and encourage collaboration between them — that sort of reasoning.
Looping back a bit to the thread and article in question, though it dips into the same realm, how about a detour through music and the role of sheet music and its notation from the vantage point of the composer; individual performers; conductor; the orchestra and the audience. How much of the evolution of music notation are we actually peepholing here?
Hm, interesting hypothesis. I would point to R and Matlab as other examples of mostly non-programmer languages without much hierarchy and namespacing. The users are highly technical, but the code I’ve seen uses less namespacing than code from programmers. Actually, when they move to Python, there seem to be frequent complaints about imports and the like (which is exacerbated by Python’s complex packaging mechanism).
Shell basically has one namespace for commands / functions, and I think people like that too. However, I do think non-programmers have problems with the file system hierarchy, which good shell scripts will make use of.

Shutt discusses extensibility.
I like the overall point of comparing human language as a framework for thought in the same way your first programming language is a framework for programming, but I have a few comments about the linguistic half, as someone with a degree in it.
I think comparing humans to compilers is a bit of a mistake. Compilers have a binary acceptable/unacceptable state, while humans process language on a range of acceptability, and it’s that range of acceptability that allows human language to change. The fact that if I change my compiler it won’t run on your machine isn’t a minor technical difficulty - it’s a fundamental difference.
This is because language change doesn’t just happen on a communal scale - it happens on an individual level as well. The language a person speaks is always changing, so you always have to be able to change your internal processor.
This also means that the distinction they draw between codified and ad-hoc languages is correct, except that all natural languages, linguistically speaking, are ad-hoc by their definition, so they’re just redefining the dichotomy they mention.
As a side note, I think the author is missing the most interesting comparison of natural language and computer languages - HTML. With its explicit “be liberal in what you accept” philosophy, the web allows users to write technically invalid HTML while still displaying acceptable webpages. This is a much better analogy for human language, where you can still understand someone speaking a different dialect.
This also means that the distinction they draw between codified and ad-hoc languages is correct, except that all natural languages, linguistically speaking, are ad-hoc by their definition, so they’re just redefining the dichotomy they mention.
It isn’t the same dichotomy though; I did give two examples that this dichotomy classifies differently. I made this distinction to illustrate the same point you made: that computer languages (as we use them today at least) aren’t comparable in terms of evolution to “natural” natural languages, because they can’t be ad-hoc.
As a side note, I think the author is missing the most interesting comparison of natural language and computer languages - HTML. With its explicit “be liberal in what you accept” philosophy, the web allows users to write technically invalid HTML while still displaying acceptable webpages. This is a much better analogy for human language, where you can still understand someone speaking a different dialect.
This is an interesting case, but I think it still has the same shortcoming. There’s no way to convey new constructs or reach consensus on them. It really only enables backwards-compatibility, which, I guess, is what it was designed for.
I’m not sure I’m convinced by the linguistic argument here.
The author writes off the strong hypothesis of linguistic determinism on the following basis:
We can propose a mechanism to explain this starting from two premises: humans are capable of abstract thought in the absence of language, and are capable of modifying their languages. From these, we can conclude not only that the strong version is false, but that the relationship appears to flow in the other direction: one thinks an abstract thought, and modifies their language to express it. Thought determines language.
I get that this is a programming blog post and not a linguistics PhD thesis, but it really skips over a lot of the important details – were things really this simple, nobody would have accepted strong linguistic determinism in the first place. But more importantly, an approach based on the modifiability of natural language sells language itself a bit short, and this has consequences for the rest of the author’s argument.
Yes, the author is absolutely right to point out the ways in which language changes, and to use this as an argument against strong linguistic determinism. But in the context of PLT, I don’t think it’s relevant. The fact of the matter is that in a huge number of cases, speakers do not have the power to modify the language they use to express a given idea. Perhaps they are L2 speakers, or bound to a particular formal standard – it doesn’t matter, because since the development of the very first language families, humans have had all the grammatical tools they need to express anything they want in any language. In light of this, the so-called “weak Sapir-Whorf” hypothesis can be summarised from the perspective of a language user in the following maxim:
All languages can express all ideas, but every language sees the world in a different way.
That is to say, no modification needed! There is nothing I can say in English that can’t be translated roughly into any other language. Where does that leave us? Well, the points about the modifiability of programming languages for the most part still stand, of course. I’m just not convinced that the comparison to natural languages in this context is particularly productive. If programming languages are like natural languages, then it surely shouldn’t particularly matter which one you use: I can express my ideas in German or in Russian with more-or-less the same (average) level of efficiency, even if the structures involved are different. But that isn’t the case when it comes to programming languages – some languages, when compared to others for example, have very tangible differences in the likely ‘correctness’ of the resulting program, or in the effort required on the programmer’s part to write it. That’s because programming languages are not like natural languages, and in particular, exhibit different properties when it comes to modifiability.
(NB: there’s also a lot to be said here about natural language development – the author’s argument depends somewhat on the idea that language changes according to its speakers’ needs, which is by no means a given – but I don’t know nearly enough about this to argue about it!)
but it really skips over a lot of the important details
Agreed, I don’t have a background in linguistics, so my ideas are lacking nuance and should be taken with a pinch of salt. I tried to make clear with my word choice that this was a speculative exploration of the topic.
speakers do not have the power to modify the language they use to express a given idea. Perhaps they are L2 speakers
I would contend this point. Where I live, there’s a language variant specific to L2 speakers. Formal standards do restrict modification, but I did address that in the aside about codified vs ad-hoc languages. L2 speakers are more likely to be held to formal standards, but I think that has more to say about culture and politics than it does about the nature of language.
All languages can express all ideas, but every language sees the world in a different way.
This is an interesting idea to explore. Is this the case because there is a certain universality to language, or because humans from different cultures have similar ideas they commonly wish to express on account of being human, and so sculpt their languages to be able to express the same ideas?
I would offer a different analysis of this in regards to programming languages. General purpose programming languages are Turing complete, so for any program in one language, I could write a program in any other that represents the same output/state change. In this sense all general purpose programming languages could be said to be able to express ideas equivalently. How they express those ideas might be different though due to constraints like “no mutable data”, “no first-class functions” etc. This looks similar to how you can express the same idea in German and Russian, but might have to express it differently. From this perspective programming languages come out looking rather similar to natural ones.
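A small, hedged illustration of that last point (a sketch, not from the post): the same sum expressed once with a mutable accumulator and a loop, and once under a “no mutable data” style as a fold; a constraint like “no first-class functions” would similarly force a different but equivalent expression.

import java.util.List;

class SameIdea {
    // Imperative expression: a mutable accumulator and an explicit loop.
    static int sumLoop(List<Integer> xs) {
        int total = 0;
        for (int x : xs) {
            total += x;
        }
        return total;
    }

    // The same idea under a "no mutable data" constraint: a fold, no reassignment.
    static int sumFold(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }
}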
Thanks for your reply, didn’t realise you were the author! It was a very interesting post, so thank you for sharing it here.
Formal standards do restrict modification, but I did address that in the aside about codified vs ad-hoc languages. L2 speakers are more likely to be held to formal standards, but I think that has more to say about culture and politics than it does about the nature of language.
I agree with you here, I’m not sure I made my point particularly clear. What I mean is that modifiability is in no way a prerequisite to the free use of language. There are people who speak (a proposed variant of) reconstructed Proto-Indo-European for fun and are in no way limited in the concepts they can express.
Is this the case because there is a certain universality to language, or because humans from different cultures have similar ideas they commonly wish to express on account of being human, and so sculpt their languages to be able to express the same ideas?

Well, if you believe Noam Chomsky’s fan club, there is definitely a very significant universal component to language. At any rate, the evidence for this theory is very compelling.
How they express those ideas might be different though due to constraints like “no mutable data”, “no first-class functions” etc.
The important thing to note here is that for programming languages, these differences have a significant impact on the end result. We need to develop new programming languages because they allow us to write more efficient/correct/desirable programs more easily. More than 6000 years of natural language development, on the other hand, have not resulted in any improvement to our ability to express ideas using language; language change is just a thing that happens, with consequences for culture but no consequences for the efficiency or usefulness of language as a system. That’s why I hold that modifiability plays a fundamentally different role in PLT.
Good essay.

The major nit I had with it was the reduction of computer languages to the concept of a compiler. The problem holds across all implementations of boolean technology: mathematical relationships do not map to human languages. Compilers are the easiest way to see this, but you can’t loosely type or abstract your way to some higher level computer language and make the problem go away. It’s baked in.
Bonus points for the author understanding that intelligent thought comes first, spoken/performed languages later. If this weren’t the case, deaf illiterate people would be unintelligent, and that’s obviously not true at all.
As coders we want to make the tech work like our human languages. It is the endless struggle and the basis for all analysis. But you gotta realize that this is never going to happen. Otherwise you’ll end up spending a lot of mental effort for nothing.
I started with Pascal, and I wanted anonymous functions, map, and fold before I discovered languages that had them (and was delighted when I did). I suppose it proves the abstract thought hypothesis.
I’m not a very smart person though. Perhaps some people are just more predisposed to thinking outside their ready-made language constructs. Maybe they are the same people who like made-up words.
It seems that the problem with the Java example is indeed syntactic. I’d never seen lambdas in Java before and did a double take at how awkward the notation is. In any case, Java as a teaching language is probably not the best idea anyway - there’s so much up-front ceremony that you’ll either have to explain or handwave away. Scheme is much better because it’s so minimal and simple (of course, I’m biased), but you could do so much better when starting with Python, for example. If you have to teach a language that’s widely used in industry, that would probably be my first choice over C++, Java, JavaScript or PHP.
Regarding extensibility in languages resulting in unreadable code if you rely too heavily on macros: doesn’t that make sense? If I started to converse with you in English using some made-up words, you might be able to follow me by gleaning the meaning through context (or interrupting me and asking - similar to viewing the source of a macro), but if I add too many of those, I’ll have lost you. That doesn’t mean it’s not valuable to be able to invent and introduce new words (macros) sparingly, if they are used within a small group of people using the same vocabulary (programming team).
At least with macros you don’t need a custom compiler, so you can just use my code without needing to know exactly all the macros I introduced to make my own life easier.
Not sure where I fall on this essay as a whole, but it is not easy to implement a conforming Common Lisp or Scheme. The toy meta-circular evaluator in SICP is just a toy. Both languages are big (CL bigger) and both entail very serious technical challenges.
It’s relatively easy to build a Lisp dialect if you can lean heavily on the host language. I think that’s really the point - most of the early MIT AI lab research like e.g. PLANNER, CONNIVER and of course Scheme itself were all done on top of earlier Lisps (mostly Lisp 1.5 or MacLisp, I think). And think what you will of Paul Graham’s Arc, but it was also built on top of another Lisp (Racket, in this case, which, ironically, is now using Chez Scheme as a host language, but didn’t start out that way). You can start small and build it out (Scheme wasn’t as fully featured as it is today, of course). Also, it doesn’t have to be s-expression based; there’s a JavaScript implementation written in Guile and a teaching subset of Java in Racket. Also, Julia is partially written in Scheme. Of course, at least starting out with s-expressions is a lot easier since you don’t need to write a parser.
Common Lisp is quite large, and R7RS-large grows by the month; but their core operational models are not huge (this is more true of Scheme than CL, given e.g. CLOS), and an implementation of the interesting bits is not prohibitively difficult.
I’d agree about Scheme except for tail calls and continuations and hygienic macros. From my point of view a Scheme is next to useless without tail calls and syntax-case, the latter of which is definitely not trivial.
So, for context, I knew nothing about hygienic macros before today. I was vaguely aware of their purpose, and of scheme’s syntax for them, but that was it. Here is a quick-‘n’-dirty implementation of syntax-rules for s7 scheme I was able to devise in a little over two hours, based on r7rs and a little googling to explain the behaviour of nested captures (which I think I got right). It doesn’t actually implement hygiene—ironic, but it is fairly trivial—nor a couple of other features, but I do not think any major pieces are missing.
I am willing to aver that it is a bit more complicated than I thought, but if someone like me, whose prior exposure to hygienic macros was effectively nil, is able to construct a near-passable implementation in 2 hours, I think my original statement that it is not prohibitively difficult stands.
Note that the second paragraph is an hypothesis. It is generally accepted that the fact that languages are mutable is the reason that the strong form doesn’t hold, but the relationship between patterns of thought and language is still very fuzzy.
The later hypothesis depends on the idea programming languages are immutable and I find that somewhat misleading. There are two ways in which you modify a language to express thoughts that couldn’t be expressed in the old language:
The first of these is far more common in natural languages. Most fields have some form of jargon, which extends an existing language with new terms to cover domain-specific knowledge. In most programming languages you can add:
In some ways, programming languages are more extensible than natural languages because the dictionary, which defines the meaning of the word (in a prescriptivist world view, at least), is carried around with the word. You can use a new word only in a context that carries its definition (via source code or some compiled representation) and that means that you always have a mechanism for providing both the new word and its definition to consumers of your jargon.
This is a space that I keep trying to persuade psycholinguists to look at more because I think it has some fascinating implications on language design. In particular, I came across a paper about 15 years ago that showed that only about 10-20% of humans naturally think in terms of hierarchies. This was born out by the success of early versions of iTunes (which replaced an hierarchical file structure with a filter mechanism and was incredibly popular with non-programmers). I hypothesise that the fact that hierarchies are so deeply entrenched in the grammar of most programming languages is a significant factor in the fact that only 10-20% of people find it easy to learn to program. Most languages have hierarchy for namespaces, for scopes, for call stacks, and so on. They’re so pervasive that we don’t notice that they’re there and it’s hard to imagine a language without them, yet the most accessible languages (Excel with its flat grid, visual data-flow languages with their pipelines) don’t have any kind of strict hierarchy.
Correct, everything beyond the first paragraph is my own speculation.
The point you make about extending vocabulary is a good one, but I don’t agree with your mapping of parts of speech.
For one, I don’t think nouns map to variable/value/field names. Consider this Java snippet:
Dog fido = new Dog()
Here, it seems clear that fido (variable name) is a proper noun, and it is instead Dog (a type) that is a noun.
This is a bit of a philosophical point, but I also don’t think verbs map well to functions, at least not in a language where they’re first-class. Say I have a function that accepts dough and returns bread, If I can assign that a name and pass it around as I can data, it seems better described as an oven, rather than the action of baking. I would say that function types correspond to nouns, and function names to proper nouns.
This means that the things we can most flexibly define are proper nouns, which are unambiguously not part of the language. Just as my name is not a part of English, my variable/value names are not a part of a programming language.
We can still define new nouns and adjectives, in the form of types and traits, but our power to do this in most languages is very weak, as we can only define new types as hierarchies of existing ones.
(Note: not parent!)
This is indeed a bit of a philosophical question, as in, it’s a question that even Aristotle struggled with in De Interpretatione, so it’s literally 2500 year-old. (Caveat: if wanna read it, be careful, because his definition for ‘noun’ and ‘verb’ map really badly to modern English grammar).
That being said, “a name [denoting an action] that you can pass around as data” would correspond to a verb in the infinitive or present participle form. As in, you can read
(defun add (x y) ...)
as either “to add two numbers” (FWIW this is what a lot of old English recipes books used for recipe names ;-) ) or “[a description of] adding two numbers”. Aristotle had it easy, he only needed the ancient Greek infinitive to cover all that ground. But it could also be worse – I think modern English grammar no longer distinguishes between participle and gerund form, so those two suffice, but some Romance languages need four forms to cover all that ground (infinitive, participle, gerund and supine).Verbs in the present participle form act as nouns, so present participle forms that have come to have enough of a denotative sense on their own (like “reading”) are actually listed in the dictionary as nouns. Verbs in the infinitive can act as nouns, too (“To see the world is all he wanted”) but you won’t find “to see” listed as a noun.
Why do these count as verbs? The answer is also language-dependent, and there are more of them in the English language, but there are two of them that are particularly relevant to your example:
Some of these are a little hard to translate to functions per se (but so is the act of baking) because code inherently has no tense (although I guess one could argue the tense is implied: all code is in the future tense, all logs are in the past tense, and as for the present maybe Zeno had a point about that stupid tortoise of his). However, some of them do readily translate: functions, for example, can take complements (in fact the ones that take arguments don’t even work without them), whereas nouns do not.
This was some interesting context, thanks for sharing!
I fixated a bit on this, and I actually think it makes my case. I can theoretically write many different functions that all accept dough and return bread, these are all individual ovens with proper noun names. These all have the same type though, which is the abstract concept of baking or oven.
:: Dough -> Bread
a la Haskell
Function<Dough, Bread>
a la Java
We’re kind of running into the same problem scholastic philosophers started running into during the Renaissance – reasoning by analogy is attractive but it’s a dangerous thing to do, because there are many things that apply to physical processes which accept dough and return bread, but don’t apply to functions in code.
That being said, this is actually a fun exercise so what the hell :-D.
I can come up with a “function that accepts dough and returns bread” by subjecting it to laser beams of different wavelengths. It doesn’t have to happen in an enclosure that traps heat: water from the outer layer evaporates more quickly than the one inside, so a high-wavelength laser will create a dry outer layer, trapping the more liquid core, which will then be easy to heat up using a lower-wavelength laser (e.g. in the microwave region). Arguably, it’s baking, because it accepts dough and returns bread, but since I’m poking laser beams at the dough, this is a lightsaber, not an oven. Baking can’t be both an oven and a lightsaber. (Although you might be interested to think of it in Platonic terms: both an oven and a lightsaber could be said to partake in the idea of baking, which is a thought I find particularly interesting because it implies Martha Stewart and Darth Vader have a lot of common ground).
Or, coming at it from a more, erm, aristotelic angle. A function that accepts dough and returns bread describes a particular type of change. In order for it to be an oven, an oven (I mean an actual oven, or at least the abstract idea of an oven, although that would also get us expelled from the lyceum) should also describe a particular type of change. But it doesn’t: first of all, an oven isn’t a description, and second, it’s an object (or a class of objects) which can be used to enact many types of changing: the changing of dough into bread, but also the change of clay into pottery (which is sort of like baking but certainly not the same function), or the change of a combustible material into ash, smoke and vapour (which is nothing like baking).
Now, to give ancient philosophers a break: unfortunately, I haven’t gone to school in an English-speaking country so I can’t say for certain but it seems to me that American schools (presumably British ones, too, but I’ve not spoken to enough British people) teach grammar in the same backwards way that they taught me grammar when I was in school. One of the many silly things they tell you, mostly because it makes it very easy for teachers to inflict psychological pain upon their students, is that “the type” and “the meaning” of a word is what determines what part of speech it is: if it’s an object, it’s a common noun, unless it’s a person or a particular object that was given a name, in which case it’s a proper noun (but not just any particular object, that would just be an articulated noun). If it’s an action, it’s a verb.
This causes a great deal of confusion, and not only about nouns and verbs, but also about nouns and adjectives, and especially adjectives and adverbs, where it’s particularly obvious that it’s bollocks. For example, they teach you that adjectives are supposed to describe an object: “health” is obviously a noun, but what about “healthy”? It’s an adjective in “I’ve never seen a healthy Panda” but an adverb in “It’s important to eat healthy”.
In fact, at least in the grammar of European languages (I honestly have no idea how this goes for others), it’s the syntactic behaviour of a word that determines what part of speech it is (edit: that is to say, most modern authors define parts of speech not just in terms of meaning, but also in terms of what syntactic role it plays). Or, to put it in possibly more adequate terms (caveat: I’m not sure if I’m using the correct English terminology here), parts of speech are syntactic classes, rather than semantic categories: word are assigned to one part of speech or another (noun, verb etc.) based on the structural relationships between these elements and other items in a larger grammatical structure, rather than meaning.
(Edit: that’s why a verb, for example, is described not only in terms of what it conveys through its meaning, but also in terms of what it conveys through its relationship with other words in a sentence: it’s not just “what it means”, but also what role it can play in a sentence (a predicate), and how it may be changed in relation to a wider context – i.e. whether it can have tense, voice, number, gender and so on.)
That is why the same word sometimes displays the behaviour of different parts of speech depending on context. For example, “to write” can act as a noun, playing the role of a subject in a sentence: “To write seemed foolish under these circumstances”. But it can also act as a verb: “I intended to write you earlier”.
So the name that denotes a particular sequence of instructions applied to a certain set of arguments (“a function”) can “act as” a verb or a noun without breaking any law :-). That’s why it “feels like a noun” when you do this:
but it “feels like a verb” when you do this:
There’s an even more complicated story about this in the English language about how verb phrases and noun phrases work. But I honestly don’t know it well enough because English isn’t my first language so I never studied English grammar in that much depth (also I honestly hated grammar in school, and that definitely didn’t help)
Edit: oh, yeah. That raises the obvious question: well, if it’s not (just) the meaning of the word that determines what part of speech it is, how come we say “write” is a verb? I mean, I literally just gave an example where “to write” acts as a noun, but if you look up “to write” in the dictionary it says verb. Why do we treat these things – nouns, verbs, adjectives – like they’re totally distinct things that never overlap?
Well, first, for the most part, they don’t, so it’s convenient. Second, there’s actually some tortuous history here, which you can sort of guess from the etymology: “noun” comes from “nomen”, name. It was supposed – “was” as in, back in the Renaissance or so, when we first started compiling complete descriptive grammars of European languages, Latin, that is, because it’s always f&^@ing Latin – to mean exactly that: a word that names something, whatever that is. “Verb” comes from “verbum”, “word”, which seems like it’s entirely arbitrary until you realize the Latins (from which we took the word) inherited the dichotomy of Greek grammar between “onoma” (name) and “rhema” (word, as in saying, utterance, something that is said).
Plato is, I think, the first one who operated with this distinction, between rhema (words that describe actions) and onoma (words that describe those who perform actions). The distinction is a little obscure (here) and there’s been plenty of speculation about how and why Plato came up with those names. The one that’s probably least controversial IMHO, as in, it relies the least on understanding Plato’s metaphysics while making it least likely for the ghost of Plato to haunt you because you’re using it wrong, is that “rhema” are “things that are said of others”, whereas “onoma” are, well, said “others”.
However, it’s worth bearing in mind that Plato was explicitly talking only about a fairly restricted type of sentences. The word that Plato uses is commonly translated as “discourse” (that’s the term used above, too) – which, in the most generous interpretation of Plato’s text, would be defined as “an utterance that can be either true or false”. The fact that we then kept this terminology for every sentence and word out there is entirely on us. But the fact that meaning is insufficient to determine what part of speech a word is (or, to put it another way, that the same words can be a different part of speech in different contexts, so there’s no 1:1 mapping between them) has dawned on many other people since Plato’s time.
tl;dr IMHO you’re not wrong to think of funtions as either verbs or nouns, there are plenty of words that act as either in different contexts. I am also extremely fun at parties.
Damn right it is!
It is kind of an exercise in backwards reasoning though, In truth my argument about function types being nouns, only holds because the people who invented first-class functions already believed functions should be equivalent to data, and so implemented the language that way.
I think you’re right on the money with the
on_click
example, the act of applying a function is what makes it a verb.Yeah, I didn’t want to amend my post with that because it was already way too long but, even though we call them by the same name,
on_click
in the function call andon_click
in the function definition, for example, are instances of different things. We don’t need to summon the ghosts of dead Greek philosophers to figure that out, it’s enough to think of how a compiler would treat the most trivial case – procedures (no arguments) in a language without higher-order functions. The former would be just an alias for an address, whereas the latter is a binding of a scope (and a sequence of instructions in it) to a compilation state. That’s why you can replace the former with an address, but not the latter.Ancient Greek philosophers sure had a lot of free time…
Your imperative view is valid, but there’s also a relational way of considering verbs: let verbs designate relations, not functions.
I think @crazyloglad was interested in some applications of hierarchy-less models for UIs (among others), too, and I’ve no idea if @-mentions get you pinged in any way but maybe they do :-D?
@mentioning did not ping me in, though I did however just stumble upon this from the thread being briefly mentioned on IRC. I take it that you are referring to Pipeworld.
Pipeworld is indeed poking at the intersection of decomposing hierarchies into data streams, presenting that as different aural/visual representations and recomposing that dynamically into one or several hierarchies by the user, and a handful of other things – though I am at a loss for describing it in a more approachable way; there is a lot to unpack in there and lacking other incentives, channeled my inner Diogenes and just dumped some cryptic visuals and feature descriptions.
A lot of the interest was born out of poking around in SCADA systems and the hypotheticals of how these would evolve as more and more cheap compute gets introduced and the ‘we are not connected to a larger communications network’ gneigh- sayers stopped horsing around. Multiple stakeholders have different needs from shared compute over shared constraints, so how should the entirety of the user interface be formed to satisfy and encourage collaboration between them — that sort of reasoning.
Looping back a bit to the thread and article in question, though it dips into the same realm, how about a detour through music and the role of sheet music and its notation from the vantage point of the composer; individual performers; conductor; the orchestra and the audience. How much of the evolution of music notation are we actually peepholing here?
Hm interesting hypothesis. I would point to R and Matlab as other examples of mostly non-programmer languages without much hierarchy and namespacing. The users are highly technical but the code I’ve seen uses less namespacing than code from programmers. Actually when they move to Python, there seem to be frequent complaints about imports and the like (which is exacerbated by Python’s complex packaging mechanism)
Shell basically has one namespace for commands / functions, and I think people like that too. However I do think non-programmers do have problems with the file system hierarchy, which good shell scripts will make use of.
Shutt discusses extensibility.
I like the overall point of comparing human language as a framework for thought in the same way your first programming language is a framework for programming, but I have a few comments about the linguistic half, as someone with a degree in it.
I think comparing humans to compilers is a bit of a mistake. Compilers have a binary acceptable/unacceptable state, while humans process language on a range of acceptability, and it’s that range of acceptability that allows human language to change. The fact that if i change my compiler it won’t run on your machine isn’t a minor technical difficulty - it’s a fundamental difference.
This is because language change doesn’t just happen on a communal scale - it happens on an individual level as well. The language a person speaks is always changing, so you always have to be able to change your internal processor.
This also means that the distinction they draw between codified and ad-hoc languages is correct, except that all natural languages, linguistically speaking, are ad-hoc by their definition, so they’re just redefining the dichotomy they mention.
As a side note, I think the author is missing the most interesting comparison of natural language and computer languages - HTML. With its explicit “be liberal in what you accept” philosophy, the web allows users to write technically invalid HTML while still displaying acceptable webpages. This is a much better analogy for human language, where you can still understand someone speaking a different dialect.
It isn’t the same dichotomy though, I did give two examples that this dichotomy classifies differently. I made this distinction to illustrate the same point you made, that computer languages (as we use them today at least) aren’t comparable in terms of evolution to “natural” natural languages because they can’t be ad-hoc.
This is an interesting case, but I think it still has the same shortcoming. There’s no way to convey new constructs or reach consensus on them. It really only enables backwards-compatibility, which, I guess, is what it was designed for.
I’m not sure I’m convinced by the linguistic argument here.
The author writes off the strong hypothesis of linguistic determinism on the following basis:
I get that this is a programming blog post and not a linguistics PhD thesis, but it really skips over a lot of the important details – were things really this simple, nobody would have accepted strong linguistic determinism in the first place. But more importantly, an approach based on the modifiability of natural language sells language itself a bit short, and this has consequences for the rest of the author’s argument.
Yes, the author is absolutely right to point out the ways in which language changes, and to use this as an argument against strong linguistic determinism. But in the context of PLT, I don’t think it’s relevant. The fact of the matter is that in a huge number of cases, speakers do not have the power to modify the language they use to express a given idea. Perhaps they are L2 speakers, or bound to a particular formal standard – it doesn’t matter, because since the development of the very first language families, humans have had all the grammatical tools they need to express anything they want in any language. In light of this, the so-called “weak Sapir-Whorf” hypothesis can be summarised from the perspective of a language user in the following maxim:
That is to say, no modification needed! There is nothing I can say in English that can’t be translated roughly into any other language. Where does that leave us? Well, the points about the modifiability of programming languages for the most part still stand, of course. I’m just not convinced that the comparison to natural languages in this context is particularly productive. If programming languages are like natural languages, then it surely shouldn’t particularly matter which one you use: I can express my ideas in German or in Russian with more-or-less the same (average) level of efficiency, even if the structures involved are different. But that isn’t the case when it comes to programming languages – some languages, when compared to others for example, have very tangible differences in the likely ‘correctness’ of the resulting program, or in the effort required on the programmer’s part to write it. That’s because programming languages are not like natural languages, and in particular, exhibit different properties when it comes to modifiability.
(NB: there’s also a lot to be said here about natural language development – the author’s argument depends somewhat on the idea that language changes according to its speakers’ needs, which is by no means a given – but I don’t know nearly enough about this to argue about it!)
Agreed, I don’t have a background in linguistics, so my ideas are lacking nuance and should be taken with a pinch of salt. I tried to make clear with my word choice that this was a speculative exploration of the topic.
I would contest this point. Where I live, there’s a language variant specific to L2 speakers. Formal standards do restrict modification, but I addressed that in the aside about codified vs ad-hoc languages. L2 speakers are more likely to be held to formal standards, but I think that says more about culture and politics than about the nature of language.
This is an interesting idea to explore. Is this the case because there is a certain universality to language, or because humans from different cultures have similar ideas they commonly wish to express on account of being human, and so sculpt their languages to be able to express the same ideas?
I would offer a different analysis of this in regards to programming languages. General purpose programming languages are Turing complete, so for any program in one language, I could write a program in any other that represents the same output/state change. In this sense all general purpose programming languages could be said to be able to express ideas equivalently. How they express those ideas might be different though due to constraints like “no mutable data”, “no first-class functions” etc. This looks similar to how you can express the same idea in German and Russian, but might have to express it differently. From this perspective programming languages come out looking rather similar to natural ones.
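To make that concrete, here’s a rough sketch in Scheme (the names sum-mutable and sum-pure are just illustrative): the same idea, summing a list, expressed once with mutation and once under a “no mutable data” constraint.

;; The same idea (sum a list) expressed under two different constraints.
;; With mutation allowed:
(define (sum-mutable lst)
  (let ((total 0))
    (for-each (lambda (x) (set! total (+ total x))) lst)
    total))

;; Under a "no mutable data" constraint, re-expressed as recursion:
(define (sum-pure lst)
  (if (null? lst)
      0
      (+ (car lst) (sum-pure (cdr lst)))))

(sum-mutable '(1 2 3 4))  ; => 10
(sum-pure '(1 2 3 4))     ; => 10

The observable result is identical; only the grammar used to get there differs.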
Thanks for your reply, didn’t realise you were the author! It was a very interesting post, so thank you for sharing it here.
I agree with you here, I’m not sure I made my point particularly clear. What I mean is that modifiability is in no way a prerequisite to the free use of language. There are people who speak (a proposed variant of) reconstructed Proto-Indo-European for fun and are in no way limited in the concepts they can express.
Well, if you believe Noam Chomsky’s fan club, there is definitely a very significant universal component to language. At any rate, the evidence for this theory is very compelling.
The important thing to note here is that for programming languages, these differences have a significant impact on the end result. We need to develop new programming languages because they allow us to write more efficient/correct/desirable programs more easily. More than 6000 years of natural language development, on the other hand, have not resulted in any improvement to our ability to express ideas using language; language change is just a thing that happens, with consequences for culture but no consequences for the efficiency or usefulness of language as a system. That’s why I hold that modifiability plays a fundamentally different role in PLT.
Good essay.
The major nit I had with it was the reduction of computer languages to the concept of a compiler. The problem holds across all implementations of boolean technology: mathematical relationships do not map to human languages. Compilers are the easiest way to see this, but you can’t loosely type or abstract your way to some higher level computer language and make the problem go away. It’s baked in.
Bonus points for the author understanding that intelligent thought comes first, spoken/performed languages later. If this weren’t the case, deaf illiterate people would be unintelligent, and that’s obviously not true at all.
As coders we want to make the tech work like our human languages. It is the endless struggle and the basis for all analysis. But you gotta realize that this is never going to happen. Otherwise you’ll end up spending a lot of mental effort for nothing.
Shameless plug: a related essay, which includes a nice diagram: https://danielbmarkham.com/the-canyon/
I started with Pascal, and I wanted anonymous functions, map, and fold before I discovered languages that had them (and was delighted when I did). I suppose it proves the abstract thought hypothesis.
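For what it’s worth, the thing I kept wanting turned out to be only a few lines once a language let me write it down; something roughly like this sketch (my-fold is just an illustrative name):

;; Roughly the fold I wanted in Pascal: collapse a list with a
;; combining function and a starting value.
(define (my-fold f init lst)
  (if (null? lst)
      init
      (my-fold f (f init (car lst)) (cdr lst))))

(my-fold + 0 '(1 2 3 4))                              ; => 10
(my-fold (lambda (acc x) (cons x acc)) '() '(1 2 3))  ; => (3 2 1)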
I’m not a very smart person though. Perhaps some people are just more predisposed to thinking outside their ready-made language constructs. Maybe they are the same people who like made-up words.
It seems that the problem with the Java example is indeed syntactic. I’d never seen lambdas in Java before and did a double take at how awkward the notation is. In any case, Java as a teaching language is probably not the best idea anyway - there’s so much up-front ceremony that you’ll either have to explain or handwave away. Scheme is much better because it’s so minimal and simple (of course, I’m biased), but even starting with Python, for example, would be a big improvement. If you have to teach a language that’s widely used in industry, Python would probably be my first choice over C++, Java, JavaScript or PHP.
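To illustrate the minimalism point, here’s a trivial sketch in Scheme; the anonymous function is essentially just the idea itself, with no surrounding ceremony:

;; Square every element of a list: no class, no functional interface,
;; no stream pipeline, just the function being passed.
(map (lambda (x) (* x x)) '(1 2 3 4))   ; => (1 4 9 16)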
Regarding extensibility in languages resulting in unreadable code if you rely too heavily on macros: doesn’t that make sense? If I started to converse with you in English using some made-up words, you might be able to follow me by gleaning the meaning through context (or interrupting me and asking - similar to viewing the source of a macro), but if I add too many of those, I’ll have lost you. That doesn’t mean it’s not valuable to be able to invent and introduce new words (macros) sparingly, if they are used within a small group of people sharing the same vocabulary (a programming team).
At least with macros you don’t need a custom compiler, so you can just use my code without needing to already know all the macros I introduced to make my own life easier.
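A toy sketch of what I mean (while is a made-up word here; R7RS doesn’t define one): the definition travels with the code, so a reader, or the compiler, always has it to hand.

;; A made-up "word": a while loop, defined in terms of words the
;; language already has. The definition lives alongside the code
;; that uses it.
(define-syntax while
  (syntax-rules ()
    ((_ test body ...)
     (let loop ()
       (when test
         body ...
         (loop))))))

;; Using the new word:
(define (countdown n)
  (while (> n 0)
    (display n) (newline)
    (set! n (- n 1))))

If you run into while without knowing it, reading the define-syntax form is the “interrupting me and asking” step.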
Not sure where I fall on this essay as a whole, but it is not easy to implement a conforming Common Lisp or Scheme. The toy meta-circular evaluator in SICP is just a toy. Both languages are big (CL bigger) and both entail very serious technical challenges.
It’s relatively easy to build a Lisp dialect if you can lean heavily on the host language. I think that’s really the point - most of the early MIT AI lab research, e.g. PLANNER, CONNIVER and of course Scheme itself, was done on top of earlier Lisps (mostly Lisp 1.5 or MacLisp, I think). And think what you will of Paul Graham’s Arc, but it was also built on top of another Lisp (Racket, in this case, which, ironically, now uses Chez Scheme as a host language, but didn’t start out that way). You can start small and build it out (Scheme wasn’t as fully featured as it is today, of course). It doesn’t have to be s-expression based, either; there’s a JavaScript implementation written in Guile and a teaching subset of Java in Racket, and Julia is partially written in Scheme. Of course, at least starting out with s-expressions is a lot easier since you don’t need to write a parser.
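To give a feel for how far the host can carry you, here’s a toy sketch (nowhere near a conforming implementation, just the shape of the trick): the host Scheme supplies the reader, the data representation and even the closures, so the “new” Lisp fits in a page.

;; A toy evaluator for a tiny Lisp, leaning on the host Scheme for
;; symbols, pairs and closures.
(define (lookup sym env)
  (cond ((assq sym env) => cdr)
        (else (error "unbound" sym))))

(define (toy-eval expr env)
  (cond ((symbol? expr) (lookup expr env))
        ((not (pair? expr)) expr)                 ; numbers etc. are self-evaluating
        ((eq? (car expr) 'quote) (cadr expr))
        ((eq? (car expr) 'if)
         (if (toy-eval (cadr expr) env)
             (toy-eval (caddr expr) env)
             (toy-eval (cadddr expr) env)))
        ((eq? (car expr) 'lambda)                 ; host closures do the heavy lifting
         (lambda args
           (toy-eval (caddr expr)
                     (append (map cons (cadr expr) args) env))))
        (else (apply (toy-eval (car expr) env)
                     (map (lambda (e) (toy-eval e env)) (cdr expr))))))

;; A tiny global environment borrowed from the host:
(define toy-env (list (cons '+ +) (cons '* *) (cons 'car car) (cons 'cdr cdr)))

(toy-eval '((lambda (x y) (+ x (* y y))) 2 3) toy-env)   ; => 11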
Common Lisp is quite large, and R7RS-large grows by the month, but their core operational models are not huge (this is more true of Scheme than CL, given e.g. CLOS), and an implementation of the interesting bits is not prohibitively difficult.
I’d agree about Scheme except for tail calls and continuations and hygienic macros. From my point of view a Scheme is next to useless without tail calls and syntax-case, the latter of which is definitely not trivial.
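For anyone who hasn’t run into it, the standard tiny example of what hygiene buys you (and roughly what a from-scratch macro expander has to get right) looks like this:

;; Classic hygiene test: the macro introduces a binding for t, but it
;; must not capture the t that the caller wrote.
(define-syntax my-or
  (syntax-rules ()
    ((_) #f)
    ((_ e) e)
    ((_ e1 e2 ...)
     (let ((t e1))
       (if t t (my-or e2 ...))))))

(define t 'outer)
(my-or #f t)   ; => outer with hygiene; an unhygienic expander yields #f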
[Comment removed by author]
So, for context, I knew nothing about hygienic macros before today. I was vaguely aware of their purpose, and of scheme’s syntax for them, but that was it. Here is a quick-‘n’-dirty implementation of syntax-rules for s7 scheme I was able to devise in a little over two hours, based on r7rs and a little googling to explain the behaviour of nested captures (which I think I got right). It doesn’t actually implement hygiene—ironic, but it is fairly trivial—nor a couple of other features, but I do not think any major pieces are missing.
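In case it helps anyone else, “nested captures” refers to a pattern in which one ellipsis sits inside another, so each inner group has to stay paired with its enclosing one; a small made-up example (my-let*-lists exists only for illustration):

;; name is captured at ellipsis depth 1, val at depth 2; the expansion
;; has to keep each group of vals with its own name.
(define-syntax my-let*-lists
  (syntax-rules ()
    ((_ ((name val ...) ...) body ...)
     (let* ((name (list val ...)) ...)
       body ...))))

(my-let*-lists ((xs 1 2 3) (ys 4 5))
  (append xs ys))   ; => (1 2 3 4 5)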
I am willing to aver that it is a bit more complicated than I thought, but if someone like me, whose prior exposure to hygienic macros was effectively nil, was able to construct a near-passable implementation in two hours, I think my original statement that it is not prohibitively difficult stands.