> If humans are going to be reading, supporting, and re-writing code, I don’t see why we’d want to eschew one’s strongest language, say, English, in favor of one that reads like hieroglyphics.
Look at what humans already do in domains that require precision, as programming does. “reads like hieroglyphics” is a fair description of a lot of mathematical notation, yet mathematicians have long preferred symbology to English - if you think programmers shouldn’t then you should be able to explain why mathematicians do. Lawyers write “English” but in a famously stilted style, full of clunky standardized constructions, to the point that it’s sometimes considered a distinct dialect (“legalese”) and one could reasonably ask whether a variant that used a symbol-based language would be more readable. And of course it bears remembering that the most popular human first language is notated not with alphabetics but with ideograms where distinct concepts generally have their own distinct symbols.
Even within programming, humans who are explaining an algorithm to other humans tend not to use English but rather a mix of “pseudocode” (sometimes likened to Python) and mathematical notation.
> Erlang is notorious for being ‘ugly,’ but I wonder what that’s all about. Truly. I like to think most of my Erlang code is composed of English sentences, lumped together into paragraphs, and contained in solitary modules. It’s familiar, unsurprising, and quite beautiful when one’s naming is in top form.
Really? Most languages allow sensible plain English names for concepts. The part of Erlang that’s notoriously ugly - and the part that makes it read very unlike English - is its unusual punctuation style. If the author really finds Erlang English-like to read, I can only assume this is because they’re much more familiar with Erlang than other languages, rather than the language being objectively more English-like than, say, Python or Java.
Erlang’s punctuation style is more like English than most C-likes out there. Let’s take this statement:
I will need a few items on my trip: if it’s sunny, sunscreen, water, and a hat; if it’s rainy, an umbrella, and a raincoat; if it’s windy, a kite, and a shirt.
In a C-style language (here, Go) it might look something like:

    switch weather.Now() {
    case weather.Sunny:
        sunscreen()
        water()
        hat()
    case weather.Rainy:
        umbrella()
        raincoat()
    case weather.Windy:
        kite()
        shirt()
    }

In Erlang it could just be:

    case weather() of
        sunny -> sunscreen(), water(), hat();
        rainy -> umbrella(), raincoat();
        windy -> kite(), shirt()
    end.

You even get to keep the `,` for enumeration/sequences, `;` for alternative clauses, and `.` for full stops!
Mathematicians use a seamless hybrid of prose and formula:
For every v ∈ V, there exists w ∈ V such that v + w = 0.
Similarly in code, some parts are more like formulae, some are more like prose, and some are more like tables or figures… and it’s interesting to consider separate syntaxes for these different types of definitions.
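For comparison, here is the same statement written purely in symbols. This is only a sketch of the trade-off: what $V$ is (presumably a vector space or additive group) is not specified in the comment.

$$\forall v \in V \;\; \exists w \in V : \; v + w = 0$$

The hybrid version keeps "for every", "there exists" and "such that" as English words and reserves symbols for the objects and the equation itself.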
Have a look at the Inform 7 manual’s section on equations, for an example. Here is a (formal, compiling, working) definition of what should happen when the player types `push cannonball` (I’ve used bullet lists in Markdown to get indentation without monospace):
Equation - Newton’s Second Law
- F=ma

where F is a force, m is a mass, a is an acceleration.

Equation - Principle of Conservation of Energy
- mgh = mv^2/2

where m is a mass, h is a length, v is a velocity, and g is the acceleration due to gravity.

Equation - Galilean Equation for a Falling Body
- v = gt

where g is the acceleration due to gravity, v is a velocity, and t is an elapsed time.

Instead of pushing the cannon ball:
- let the falling body be the cannon ball;
- let m be the mass of the falling body;
- let h be 1.2m;
- let F be given by Newton’s Second Law where a is the acceleration due to gravity;
- let v be given by the Principle of Conservation of Energy;
- let t be given by the Galilean Equation for a Falling Body;
- say “You push [the falling body] off the bench, at a height of [h], and, subject to a downward force of [F], it falls. [t to the nearest 0.01s] later, this mass of [m] hits the floor at [v].”;
- now the falling body is in the location.
(Yes, the Inform 7 compiler will solve equations for you. Why aren’t normal programming languages capable of this kind of high school math? Are we living in some kind of weird bubble?)
> Why aren’t normal programming languages capable of this kind of high school math?
They are. Inform 7 uses English words as syntax, but otherwise is a normal programming language. Why do you think this can’t be solved in, say, Python?
Of course it can be solved in any general purpose programming language, but none of them feature dimensional analysis or algebraic equation solving out of the box in a convenient and natural way… yet Inform 7 does, oddly.
I find this interesting because that stuff would seem to be the most obvious use case for computing machines from, like, a 1930s perspective.
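To make the "not out of the box" point concrete, here is a rough C++ sketch of the same falling-body calculation. Everything in it is an assumption made for illustration: the `Quantity` template, the type aliases, `square_root`, and the value used for `g` are hypothetical and not taken from any library or from the thread. It only shows roughly how much has to be built by hand.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal units type (not a real library): the exponents of
// mass (M), length (L) and time (T) live in the type, so returning a
// velocity where a time is expected would fail to compile.
template <int M, int L, int T>
struct Quantity {
    double value;  // magnitude in SI base units (kg, m, s)
};

using Scalar       = Quantity<0, 0, 0>;
using Mass         = Quantity<1, 0, 0>;
using Length       = Quantity<0, 1, 0>;
using Time         = Quantity<0, 0, 1>;
using Velocity     = Quantity<0, 1, -1>;
using Acceleration = Quantity<0, 1, -2>;
using Force        = Quantity<1, 1, -2>;

// Multiplication adds the dimension exponents.
template <int M1, int L1, int T1, int M2, int L2, int T2>
constexpr Quantity<M1 + M2, L1 + L2, T1 + T2>
operator*(Quantity<M1, L1, T1> a, Quantity<M2, L2, T2> b) {
    return {a.value * b.value};
}

// Division subtracts them.
template <int M1, int L1, int T1, int M2, int L2, int T2>
constexpr Quantity<M1 - M2, L1 - L2, T1 - T2>
operator/(Quantity<M1, L1, T1> a, Quantity<M2, L2, T2> b) {
    return {a.value / b.value};
}

// Square root halves each exponent, so every exponent must be even.
template <int M, int L, int T>
Quantity<M / 2, L / 2, T / 2> square_root(Quantity<M, L, T> q) {
    static_assert(M % 2 == 0 && L % 2 == 0 && T % 2 == 0,
                  "square root needs even dimension exponents");
    return {std::sqrt(q.value)};
}

constexpr Acceleration g{9.81};  // assumed value for gravity, in m/s^2

// Newton's Second Law, F = ma, with a = g.
constexpr Force weight(Mass m) { return m * g; }

// Conservation of energy, mgh = mv^2/2, rearranged by hand to v = sqrt(2gh).
Velocity impact_speed(Length h) { return square_root(Scalar{2.0} * g * h); }

// Galilean equation, v = gt, rearranged by hand to t = v/g.
Time fall_time(Velocity v) { return v / g; }
```

With `h = 1.2` m as in the Inform 7 example, `fall_time(impact_speed(Length{1.2}))` type-checks as a `Time`. Both the unit bookkeeping and the algebraic rearrangement had to be written out manually here, which is the part Inform 7 is described above as doing itself.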
Ah, my apologies for completely misunderstanding what you said.
I completely agree. Python (and maybe Basic?) come close, but even they fall dramatically short of the “language as code” that Inform 7 achieves. I wonder what keeps Inform 7 from becoming a more general purpose programming language?
Also, English isn’t everyone’s strongest language. As natural languages go, it’s pretty complex and inconsistent. If you want your code to be understood by people who are more familiar and comfortable with other natural languages, then your own familiarity with English isn’t necessarily such an advantage in writing code.
I like this essay, and it makes me think about a couple things.
> …the length and detail of a name should be proportional to its scope: A variable that is referenced across multiple files ..
It’s funny, because I like to write CoffeeScript, and you can’t really do that: variables are file-scoped by default. It’s a preference thing. (Some people like to shadow lexical bindings? not I)
Another issue with picking long names vs short names is that often (depending on language), naming is itself a leaky abstraction, because you just know it’s gonna allocate an entire word if you want “int cc_exp = credit_card_expiration; ..cc_exp..”. I don’t think all languages give you aliases for free. We also contort to make things fit 80 characters. Variables can’t have spaces (except Common LISP?), most can’t have dashes, I really want subscripts!—it feels like those password prompts: “5-8 characters, alphanumeric only”. At least we don’t have 8.3 FILENA~1. We learned to name files! (except the pain with spaces, again..) The tools often give naming a hard/constrained time, since it’s somehow never a priority.
I wish a lot of code were just a little more wasteful/redundant to be a lot more readable.
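On the question of whether aliases come for free: in C++, at least, a reference is a second name for an existing object rather than a copy. A minimal sketch follows, assuming a hypothetical `Card` struct. Whether any storage is used is ultimately up to the compiler, though optimizing compilers generally need none for a local reference like this.

```cpp
#include <cassert>

struct Card {
    int credit_card_expiration;  // hypothetical field, e.g. a month index
};

// Binds a short local name to the long one without copying the value.
int months_until_expiry(const Card& card, int current_month) {
    const int& cc_exp = card.credit_card_expiration;  // alias: same object, second name
    return cc_exp - current_month;
}

// An alias tracks later changes to the original; a copy does not.
bool alias_tracks_changes() {
    Card card{10};
    int copy = card.credit_card_expiration;     // separate object
    int& cc_exp = card.credit_card_expiration;  // same object
    card.credit_card_expiration = 20;
    return cc_exp == 20 && copy == 10;
}
```

The short name is confined to the few lines where it is used, while the long descriptive name remains the one visible across the wider scope.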
Your point about spaces is a bit confusing because on the one hand you complain you can’t have spaces in variables, but then complain about problems with spaces in filenames. I am not sure what you want at this point? Maybe some kind of `variable notation`?
Another language that allows spaces in variables is SQL. You can have spaces in e.g. table names just fine.
I want spaces in variable names & filenames. I do tend to use spaces in filenames, but I also forget to quote my “$@” bash variables so it tends to hurt. I think it’s crazy some people (usually programmers..) always_use_underscores.txt to sidestep it altogether. I actually really like SQL in this regard, since you’re right: spaces are ok, and aliases are free (you can even start a column name with a number). I also much prefer spaces in regexes like Perl 6 does right.
That example is a little misleading. In C and C++ it could be written like this, which arguably reads more like English than either Erlang or Go:

All you need is the full stop (`.`) instead of `{}` and then you have all the tokens of Erlang!
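The snippet this reply refers to is missing from this copy of the thread. As a guess at what it might have looked like (not the commenter's actual code), a C++ version using the comma operator could read as follows. The `Weather` enum, the `bag` vector, and the stub functions are hypothetical scaffolding added so the sketch compiles.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Weather { Sunny, Rainy, Windy };

std::vector<std::string> bag;  // records what gets packed, for demonstration only

void sunscreen() { bag.push_back("sunscreen"); }
void water()     { bag.push_back("water"); }
void hat()       { bag.push_back("hat"); }
void umbrella()  { bag.push_back("umbrella"); }
void raincoat()  { bag.push_back("raincoat"); }
void kite()      { bag.push_back("kite"); }
void shirt()     { bag.push_back("shirt"); }

// The built-in comma operator evaluates its operands left to right,
// so each clause reads as a comma-separated list ended by a semicolon.
void pack(Weather w) {
    switch (w) {
        case Weather::Sunny: sunscreen(), water(), hat(); break;
        case Weather::Rainy: umbrella(), raincoat(); break;
        case Weather::Windy: kite(), shirt(); break;
    }
}
```

One difference from the Erlang version remains in this sketch: each clause still needs an explicit `break`, which Erlang's `;` separator makes unnecessary.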
Was going to say something along these lines. These discussions tend to be extremely anglocentric.
Additionally, many of us work with distributed teams, where there’s no one “strongest language” anyway.
The problem with English, of course, is the difficulty in properly parsing and converting it into a syntax tree that everyone can agree with:
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo
Using this as an occasion to trot out one of my favorite essays on naming: https://blog.janestreet.com/whats-in-a-name/
This would’ve been much improved by the inclusion of actual code examples. :(