I do think the importance of lexical issues in language design is really, profoundly under-appreciated. But, that said, it’s not even a central part of what the author is advocating for.
Let me observe that the Web is just a poor messaging system. Replace that old clunky HTTP with something like ZeroMQ, and you remove even the remaining cost to scaling. Suddenly you don’t need massive servers at the heart of your networks. We can move from Soviet-style central planning to a real market. “You want some code? I got some code.”

“Soviet-style central planning” is a rather spurious insult here, but I do tend to agree that we need better protocols for decentralized messaging. The author talks about the goal of building infrastructure that can manage a household network-of-things, without having to keep track of every single thing individually.
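For what it’s worth, here is roughly the shape of exchange the quoted passage seems to be gesturing at - a minimal request/reply sketch using pyzmq, with made-up endpoint and payload names, not anything from the article:

```python
import zmq

def serve(endpoint="tcp://*:5555"):
    """One device answers "you want some code?" requests; runs in its own process."""
    ctx = zmq.Context()
    rep = ctx.socket(zmq.REP)
    rep.bind(endpoint)
    snippets = {"dim_lights": "set_brightness(0.2)"}  # hypothetical payloads
    while True:
        want = rep.recv_string()
        rep.send_string(snippets.get(want, ""))       # "I got some code."

def ask(endpoint="tcp://localhost:5555"):
    """Another device asks a peer for code and gets source back over the wire."""
    ctx = zmq.Context()
    req = ctx.socket(zmq.REQ)
    req.connect(endpoint)
    req.send_string("dim_lights")
    return req.recv_string()
```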
They advocate, also, for this infrastructure to send source code over the wire; whether that’s reasonable depends on the computational resources we assume these devices have. Certainly there’s an analogy, though the author doesn’t make it, to SQL, which is based on a similar notion that the server that actually has the data locally is in a far better position to optimize a query against it than any other server can possibly be, and that the result is less data over the wire, saving bandwidth. Of course, in the specific case of SQL, it’s intentionally not an imperative language (in the general case), because optimization would be a much harder problem if it were… This ZeroScript proposal is imperative, and loses that benefit, but the somewhat nebulous design goals don’t seem to include data access, so perhaps that’s okay.
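To make the SQL comparison concrete, here is a tiny sketch of the contrast (mine, not the author’s), using SQLite: the declarative query lets the engine that owns the data choose the plan and ship back only the matching rows, while the imperative version drags everything across the boundary and filters on the caller’s side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device TEXT, temp REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [("thermostat", 21.5), ("fridge", 4.0), ("oven", 180.0)])

# Declarative: describe the result; only matching rows cross the boundary.
hot = conn.execute("SELECT device FROM readings WHERE temp > 100").fetchall()

# Imperative equivalent: pull every row over and filter it ourselves.
hot_too = [d for (d, t) in conn.execute("SELECT device, temp FROM readings")
           if t > 100]

print(hot, hot_too)  # [('oven',)] ['oven']
```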
They’ve decided, as programmers often do, that the fundamental problem is best solved with a new language. I always feel that such things should be done with careful attention to scope - making it too narrow, targeting it only at network-of-things devices, is likely to seriously hurt its adoption. Also, the author has neglected to discuss how they believe this language will be better for its purpose than JavaScript, Lua, or other incumbents.
Still, new language projects are always worth looking at. They often give useful data about pain points in existing languages, and sometimes there are novel solutions worth remembering for one’s own ill-advised efforts. :)
How is a double-quote ambiguous? Also, < and > already have perfectly meaningful uses, and to me are very confusing as string delimiters.

I suppose it is ambiguous whether " is opening or closing a string, whereas < and > are directional. In theory a language could require “ and ” characters instead of ", but I would personally find that frustrating. My keyboard doesn’t have “ or ” keys. That being said, I think plain old " is fine.

If this is even a problem (which I don’t think it is), you could use ' and ", though they look less directional than < and >. But at least they don’t have mathematical importance.

Or just use UTF-8 fancy quotes “ and ” (which it appears Lobste.rs will do for you -> “ ”).
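As a small aside on why directionality matters, here is a minimal sketch (not from the thread) contrasting paired delimiters, where each character says whether it opens or closes and nesting falls out for free, with straight quotes, where a lone " only gets its meaning from surrounding state:

```python
def spans(text, open_ch, close_ch):
    """Top-level (start, end) index pairs for directional, nestable delimiters."""
    result, stack = [], []
    for i, ch in enumerate(text):
        if ch == open_ch:
            stack.append(i)
        elif ch == close_ch and stack:
            start = stack.pop()
            if not stack:
                result.append((start, i))
    return result

print(spans("say «hello «nested» world» twice", "«", "»"))  # [(4, 25)]

def quote_spans(text):
    """With '"' as both opener and closer, every other quote is just assumed to close."""
    idx = [i for i, ch in enumerate(text) if ch == '"']
    return list(zip(idx[0::2], idx[1::2]))

print(quote_spans('say "hello "nested" world" twice'))  # [(4, 11), (18, 25)] - nesting is lost
```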
Parallel lines in geometry :)
Or as some people used to do before rich text editors automatically inserted “smart-quotes”, use `this' to quote.
Maybe the author was going for something like guillemets that are easier to type.
Yeah, that’s my guess at the motivation. I sympathize with wanting programming to be less English-centric, but ASCII quotation marks are on most keyboard layouts, while the various quotation delimiters that other human languages use are each only on the layouts for those languages.
There’s way too much variety for a move to these to actually increase inclusiveness… Just look through Quotation mark - Summary table for all languages (wikipedia). Be sure to notice how the commonly-used options include « … », »…«, „…“, 「…」 and its vertical equivalent ﹁ ︰ ﹂, which looks okay in the edit box but has been flattened to one line in the preview.
I wouldn’t want to be in the position of having to advocate for one of these over the others. Yes, it sucks that English is the default, but any solution should do more than changing the default.
(I am a bit of a broken record on this project, but I happen to advocate for basing future languages on self-describing binary data formats with stylesheets that map them to and from textual syntaxes. Then you could just swap them out for each user.)
I think the right approach is to support most of these quoting pairs at equivalent priority. Even staying just within ASCII, it’s sometimes quite nice to have both () and [] available for grouping, for clarity in math-ish code.
I am 100% in agreement with you that future languages should have their canonical rep be something like a binary AST that is presentable in as many text syntaxes as you like. I’ve thought a bit about something similar to zeroscript, and I think an interesting direction to take it is:
1.) a universal parsing/grammar mechanism
2.) that surfaces structure in an SEXP like byte exact binary format that can play the same role as protobuffers, avro, etc
3.) a minimalist scheme/lisp like homoiconic transformation language that operates within the format.
The closest existing thing I’m aware of to this is the OMeta stuff from Viewpoints.
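To illustrate the general shape (a toy of my own, not OMeta and not any particular format): a tiny AST with a deterministic, byte-exact encoding, plus two “stylesheets” that render the same tree into different surface syntaxes.

```python
import struct

CALL, STR = 1, 2

def encode(node):
    """Toy canonical form: tag byte, length-prefixed payload, child count, children."""
    tag, payload, children = node
    raw = payload.encode("utf-8")
    data = bytes([tag]) + struct.pack(">I", len(raw)) + raw
    data += struct.pack(">I", len(children))
    for child in children:
        data += encode(child)
    return data

def render(node, quotes=('"', '"')):
    """A 'stylesheet': same tree, per-user choice of string delimiters."""
    tag, payload, children = node
    if tag == STR:
        return quotes[0] + payload + quotes[1]
    return payload + "(" + ", ".join(render(c, quotes) for c in children) + ")"

ast = (CALL, "greet", [(STR, "hello", [])])

print(encode(ast).hex())               # the shared wire/file form
print(render(ast))                     # greet("hello")
print(render(ast, quotes=("«", "»")))  # greet(«hello»)
```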
Kind of like the .NET framework’s CLR?
Thanks for asking. No, the .NET CLR is a virtual machine; it gets (as far as I’m aware) an intermediate representation which you wouldn’t want to use as source code. This would be an abstract syntax tree, but serialized. Of course, to make this useful there needs to be a “text” editor that understands it! And there also needs to be a grep equivalent, because programmers rely heavily on that sort of thing and that’s been a major stumbling block for previous binary-source-code efforts.
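A sketch of what the grep equivalent might feel like, with a hypothetical (kind, name, children) node layout standing in for the serialized tree - the point being that the query is structural, so it can find call sites without also matching the definition:

```python
def ast_grep(node, kind=None, name=None, path=()):
    """Yield paths to nodes matching a kind and/or name, like grep but over structure."""
    k, n, children = node
    here = path + (n,)
    if (kind is None or k == kind) and (name is None or n == name):
        yield here
    for child in children:
        yield from ast_grep(child, kind, name, here)

program = ("module", "thermostat",
           [("def", "set_target", [("call", "clamp", [])]),
            ("def", "tick", [("call", "set_target", []), ("call", "log", [])])])

for path in ast_grep(program, kind="call", name="set_target"):
    print(" > ".join(path))  # thermostat > tick > set_target
```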
I have gotten decently far with this, but it’s still at least a year from minimal usefulness. The data format is my first step, or rather the schema format, because I think that’s actually the hardest part of the problem space. My write-up doesn’t really reflect my latest ideas on why it’s important or what role it will fill once it works, but I haven’t had time to revise it… You can have a look here if you want. Sorry about the JavaScript requirement.
FWIW most .NET assemblies can be disassembled into quite readable C# code.
Okay. I guess I have no opinion, then, on whether this is sufficiently high-fidelity to be round-tripped on every edit. I still think a solution that isn’t tied to .NET would be ideal. :)
That’s true, I wasn’t saying you should use .NET, but was just wondering if it was what you were thinking.
The CLR can make round-trips to and from any of the .NET languages without loss in quality, mainly because all the classes and stuff are in the metadata.
From what I understand of your idea though, it sounds like you don’t want the virtual machine layer, but that the text editor would simply read the code and display it differently based on localization settings or something?
Ugh - I just looked, and got an unpleasant reminder that the RTF-to-weird-format translation thing that I cobbled together still doesn’t handle apostrophes. Embarrassing. Oh well. :)
So true… and double quotes are not ambiguous; we use them to delimit special strings (think quotations) in English, so why should there be anything wrong with that in source code?
Don’t fix what isn’t broken.
And the point that < and > already have meaningful uses is incredibly true. How is this

more readable than this?

Just use y .LE. z .AND. x .GT. a!

Is this a joke or are you serious?