1. 17
  1. 9

    From the second sentence on, this post is just full of misunderstandings. It’s confusing the issue of being context free vs. decidable or statically parsed. I mention the same Perl example here in Parsing Bash is Undecidable

    https://www.oilshell.org/blog/2016/10/20.html#appendix-parsing-c-perl-and-make-is-also-undecidable

    This is is a case where you have to run arbitrary code to fully parse Perl or shell, or in C++’s case, arbitrary computation at compile time during template instantiation.

    Most languages do not have the problem that bash, Perl, and C++ have – e.g. Go, Python, Java, Rust, Julia. Those languages aren’t context free either. And there is some matter of perspective, see:

    http://trevorjim.com/how-to-prove-that-a-programming-language-is-context-free/

    http://trevorjim.com/python-is-not-context-free/

    http://trevorjim.com/c-and-cplusplus-are-not-context-free/

    The standard way to show that a language is not context free is to use Ogden’s Lemma.

    However, it doesn’t really matter as being context-free doesn’t really have any useful engineering properties.

    Being decidable is very desirable for engineering. Larry Wall has said that Raku / Perl 6 specifically fixes the bug in Perl 5.

    OSH fixes the bug in bash … dynamic parsing is hidden behind options like shopt -s eval_unsafe_arith.


    This is a bad article and I don’t recommend taking anything away from it …

    1. 5

      Just to clarify:

      • Go is statically parseable
      • Whether it’s context free doesn’t matter that much in a practical sense, and is probably open for debate. It’s not a clear question because of the issues mentioned in Trevor Jim’s articles (are you considering the lexical structure, etc.)
      1. 1

        Being decidable is very desirable for engineering. Larry Wall has said that Raku / Perl 6 specifically fixes the bug in Perl 5.

        Does it? Raku’s parser is extensible, and you can run arbitrary code in the parser. So it is not decideable if a given parse will even terminate. (That said, if the parse does terminate, you are left with a single, unambiguous parse tree. I find this uninteresting from a human standpoint, and from a machine standpoint, you can easily JIT, as at least one historical APL implementation did.)

        1. 2

          As far as I understand the extensibility is limited by some kind of delimiter.

          So it depends on what you define as “parsing” – I think you can parse statically while ignoring the extensible parts? Whereas in Perl 5 there are places where you can’t do that – I think it affects the rest of the file.

          It’s similar to the Trevor Jim articles – you have to define “parsing” to say if it’s context-free or not.

          I don’t have a link handy for the quote, but my memory is that Wall said something like, “In Perl 5, we were sometimes confused about what language we’re parsing …” which I interpreted to mean that.

          In any case the extensibility for Perl 6 is intentional, while I’m pretty sure that rule for Perl 5 is unintentional.

          I think C++ metaprogramming is mostly unintentional as well. And the “feedback” back into the parser is due to the “lexer hack” or typename-identifier problem, as far as I remember.

          People were “discovering” how to program the template instantiate mechanism for many years, and never used the feedback into the parser. Similar issue with Swift, although I think there is no feedback into the parser and Swift is statically parseable: https://forums.swift.org/t/swift-type-checking-is-undecidable/39024

          The bash/ksh feature was unintentional, and OpenBSD ksh actually fixed it after my report, and no one noticed because they didn’t use it!

          1. 1

            As far as I understand the extensibility is limited by some kind of delimiter.

            My understanding is that it is not.

            ‘[S]ometimes confused about what language we’re parsing’ does not preclude the notion that none of the languages being parsed is itself extensible.

            1. 1

              Err, replace ‘preclude the notion’ with ‘imply’.

        2. 1

          being context-free doesn’t really have any useful engineering properties.

          It does make things easier to implement and less convoluted, which for the developers I would argue is a good property. I believe that’s the reason most new languages do “name: type” in signatures rather than “type name”. (Yes the parsing is still context dependant, but… Less?)

          1. 1

            There are lots of non-context free languages that are easy to implement! They just haven’t been defined mathematically.

            Example: Calc regular languages (2017)

            http://spw17.langsec.org/papers/grosch-taming-length-fiels.pdf

            You could define 20 more sets like that.

            Again, context-free isn’t a useful engineering property. Being LALR(1) is a useful property

            Conversely, context free is too restrictive for lots of useful languages

        3. 8

          By deferring the resolution until later, Go can be parsed faster. On the other hand it is not quite context-free.

          I think this is a matter of perspective. From the perspective of evaluation (whether it be interpretation or compilation), no interesting language can be meaningfully context-free. However, from the perspective of the parser, this is indeed context-free.

          You could imagine an alternative parser that intertwines some context-sensitive semantic analysis, and returns a Call or a Conversion instead of a CallOrConversion node. That would still be a perfectly valid Go implementation, since this element of the grammar is expectedly ambiguous in the specification.