From the second sentence on, this post is just full of misunderstandings. It’s confusing the issue of being context free vs. decidable or statically parsed. I mention the same Perl example here in Parsing Bash is Undecidable
This is is a case where you have to run arbitrary code to fully parse Perl or shell, or in C++’s case, arbitrary computation at compile time during template instantiation.
Most languages do not have the problem that bash, Perl, and C++ have – e.g. Go, Python, Java, Rust, Julia. Those languages aren’t context free either. And there is some matter of perspective, see:
Whether it’s context free doesn’t matter that much in a practical sense, and is probably open for debate. It’s not a clear question because of the issues mentioned in Trevor Jim’s articles (are you considering the lexical structure, etc.)
Being decidable is very desirable for engineering. Larry Wall has said that Raku / Perl 6 specifically fixes the bug in Perl 5.
Does it? Raku’s parser is extensible, and you can run arbitrary code in the parser. So it is not decideable if a given parse will even terminate. (That said, if the parse does terminate, you are left with a single, unambiguous parse tree. I find this uninteresting from a human standpoint, and from a machine standpoint, you can easily JIT, as at least one historical APL implementation did.)
As far as I understand the extensibility is limited by some kind of delimiter.
So it depends on what you define as “parsing” – I think you can parse statically while ignoring the extensible parts? Whereas in Perl 5 there are places where you can’t do that – I think it affects the rest of the file.
It’s similar to the Trevor Jim articles – you have to define “parsing” to say if it’s context-free or not.
I don’t have a link handy for the quote, but my memory is that Wall said something like, “In Perl 5, we were sometimes confused about what language we’re parsing …” which I interpreted to mean that.
In any case the extensibility for Perl 6 is intentional, while I’m pretty sure that rule for Perl 5 is unintentional.
I think C++ metaprogramming is mostly unintentional as well. And the “feedback” back into the parser is due to the “lexer hack” or typename-identifier problem, as far as I remember.
People were “discovering” how to program the template instantiate mechanism for many years, and never used the feedback into the parser. Similar issue with Swift, although I think there is no feedback into the parser and Swift is statically parseable: https://forums.swift.org/t/swift-type-checking-is-undecidable/39024
The bash/ksh feature was unintentional, and OpenBSD ksh actually fixed it after my report, and no one noticed because they didn’t use it!
being context-free doesn’t really have any useful engineering properties.
It does make things easier to implement and less convoluted, which for the developers I would argue is a good property. I believe that’s the reason most new languages do “name: type” in signatures rather than “type name”. (Yes the parsing is still context dependant, but… Less?)
By deferring the resolution until later, Go can be parsed faster. On the other hand it is not quite context-free.
I think this is a matter of perspective. From the perspective of evaluation (whether it be interpretation or compilation), no interesting language can be meaningfully context-free. However, from the perspective of the parser, this is indeed context-free.
You could imagine an alternative parser that intertwines some context-sensitive semantic analysis, and returns a Call or a Conversion instead of a CallOrConversion node. That would still be a perfectly valid Go implementation, since this element of the grammar is expectedly ambiguous in the specification.
From the second sentence on, this post is just full of misunderstandings. It’s confusing the issue of being context free vs. decidable or statically parsed. I mention the same Perl example here in Parsing Bash is Undecidable
https://www.oilshell.org/blog/2016/10/20.html#appendix-parsing-c-perl-and-make-is-also-undecidable
This is is a case where you have to run arbitrary code to fully parse Perl or shell, or in C++’s case, arbitrary computation at compile time during template instantiation.
Most languages do not have the problem that bash, Perl, and C++ have – e.g. Go, Python, Java, Rust, Julia. Those languages aren’t context free either. And there is some matter of perspective, see:
http://trevorjim.com/how-to-prove-that-a-programming-language-is-context-free/
http://trevorjim.com/python-is-not-context-free/
http://trevorjim.com/c-and-cplusplus-are-not-context-free/
However, it doesn’t really matter as being context-free doesn’t really have any useful engineering properties.
Being decidable is very desirable for engineering. Larry Wall has said that Raku / Perl 6 specifically fixes the bug in Perl 5.
OSH fixes the bug in bash … dynamic parsing is hidden behind options like
shopt -s eval_unsafe_arith
.This is a bad article and I don’t recommend taking anything away from it …
Just to clarify:
Does it? Raku’s parser is extensible, and you can run arbitrary code in the parser. So it is not decideable if a given parse will even terminate. (That said, if the parse does terminate, you are left with a single, unambiguous parse tree. I find this uninteresting from a human standpoint, and from a machine standpoint, you can easily JIT, as at least one historical APL implementation did.)
As far as I understand the extensibility is limited by some kind of delimiter.
So it depends on what you define as “parsing” – I think you can parse statically while ignoring the extensible parts? Whereas in Perl 5 there are places where you can’t do that – I think it affects the rest of the file.
It’s similar to the Trevor Jim articles – you have to define “parsing” to say if it’s context-free or not.
I don’t have a link handy for the quote, but my memory is that Wall said something like, “In Perl 5, we were sometimes confused about what language we’re parsing …” which I interpreted to mean that.
In any case the extensibility for Perl 6 is intentional, while I’m pretty sure that rule for Perl 5 is unintentional.
I think C++ metaprogramming is mostly unintentional as well. And the “feedback” back into the parser is due to the “lexer hack” or typename-identifier problem, as far as I remember.
People were “discovering” how to program the template instantiate mechanism for many years, and never used the feedback into the parser. Similar issue with Swift, although I think there is no feedback into the parser and Swift is statically parseable: https://forums.swift.org/t/swift-type-checking-is-undecidable/39024
The bash/ksh feature was unintentional, and OpenBSD ksh actually fixed it after my report, and no one noticed because they didn’t use it!
My understanding is that it is not.
‘[S]ometimes confused about what language we’re parsing’ does not preclude the notion that none of the languages being parsed is itself extensible.
Err, replace ‘preclude the notion’ with ‘imply’.
It does make things easier to implement and less convoluted, which for the developers I would argue is a good property. I believe that’s the reason most new languages do “name: type” in signatures rather than “type name”. (Yes the parsing is still context dependant, but… Less?)
There are lots of non-context free languages that are easy to implement! They just haven’t been defined mathematically.
Example: Calc regular languages (2017)
http://spw17.langsec.org/papers/grosch-taming-length-fiels.pdf
You could define 20 more sets like that.
Again, context-free isn’t a useful engineering property. Being LALR(1) is a useful property
Conversely, context free is too restrictive for lots of useful languages
I think this is a matter of perspective. From the perspective of evaluation (whether it be interpretation or compilation), no interesting language can be meaningfully context-free. However, from the perspective of the parser, this is indeed context-free.
You could imagine an alternative parser that intertwines some context-sensitive semantic analysis, and returns a Call or a Conversion instead of a CallOrConversion node. That would still be a perfectly valid Go implementation, since this element of the grammar is expectedly ambiguous in the specification.