1. 20
  1. 6

    What a timely reminder that any input is a program (remember Log4j?).

    This is related to langsec and its notion of (in)security. The latest Google Project Zero blog post describes quite the “weird machine”.

    1. 3

      Isn’t it more accurate to say that any input could be a program if you’re not careful?

      For example, the stdin of grep is an input, but it should NOT be a program, and isn’t in any real implementation.

      The regex arg should generally be considered a program. (And there is legitimate confusion over whether regexes are code or data; depending on the implementation, they have some properties of both. The appearance of regex DoS (ReDoS) in many systems is a good example of this.)

      OK, although I guess you could say that you can “program” the stdin input to make the regex blow up … which is true. Hm.
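
      A minimal sketch of that last point, assuming a backtracking engine such as Python's `re` module. The pattern and the helper `time_match` are illustrative choices, not taken from any system discussed here. The pattern stays fixed while only the input text grows, yet the time to reject the input rises steeply with its length:

      ```python
      import re
      import time

      # Nested quantifiers: a backtracking engine may try exponentially many
      # ways to split a run of "a"s before concluding that there is no match.
      PATTERN = re.compile(r"(a+)+$")


      def time_match(n: int) -> float:
          """Seconds spent failing to match n 'a's followed by one 'b'."""
          data = "a" * n + "b"  # plays the role of the "stdin" text
          start = time.perf_counter()
          result = PATTERN.match(data)
          elapsed = time.perf_counter() - start
          assert result is None  # the trailing "b" means it can never match
          return elapsed


      if __name__ == "__main__":
          # Hypothetical input sizes; each step of 4 adds far more than 4x work.
          for n in (10, 14, 18, 22):
              print(n, f"{time_match(n):.4f}s")
      ```

      Engines based on automata rather than backtracking (for example RE2) avoid this particular blow-up, which is one reason the code-versus-data answer depends on the implementation.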

    2. 3

      Note that context-free grammars can still yield weird machines when implemented. The weirdness of a weird machine comes from the fact that it is a concrete implementation of an abstract machine, not the complexity of the original grammar. State machines for old video games usually were context-free and had a finite number of states; nonetheless, they often yielded weird machines because the original developers did not account for all combinations of inputs.

      1. 3

        I think the langsec work generally makes a lot of good points, but the focus on grammars and especially context-free grammars was misleading.

        A frequent objection was the length prefix, which is good protocol design if you care about safety, but is not context-free.

        They wrote a follow-up about this:

        http://spw17.langsec.org/papers/grosch-taming-length-fiels.pdf

        https://news.ycombinator.com/item?id=16124895

        (This is related to some drafts of posts I wanted to write, like Context Free Grammars Don’t Have Useful Engineering Properties, but regular languages do.)
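
        A rough sketch of what a length-prefixed frame parser can look like, assuming a hypothetical wire format (2-byte big-endian length, then that many payload bytes) and a hypothetical `MAX_PAYLOAD` limit. Neither comes from the linked paper. The safety argument is that every bound is checked before any payload byte is touched:

        ```python
        import struct

        MAX_PAYLOAD = 1024  # hypothetical upper bound on payload size


        def parse_frame(buf: bytes) -> tuple[bytes, bytes]:
            """Split buf into (payload, rest); raise ValueError on bad input."""
            if len(buf) < 2:
                raise ValueError("truncated header")
            # Assumed format: unsigned 16-bit big-endian length prefix.
            (length,) = struct.unpack(">H", buf[:2])
            if length > MAX_PAYLOAD:
                raise ValueError("declared length exceeds limit")
            end = 2 + length
            if end > len(buf):
                raise ValueError("declared length exceeds available bytes")
            return buf[2:end], buf[end:]
        ```

        For example, `parse_frame(b"\x00\x03abcXY")` splits the buffer into the payload `b"abc"` and the remainder `b"XY"`, while a frame that declares more bytes than it carries is rejected instead of being read past its end.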

        1. 2

          Yes. But parsers for regular or context-free grammars can also be generated very easily, which would be preferable.