What a timely reminder that any input is a program (remember log4j?).
Related to langsec and its notion of (in)security. The latest Google Project Zero blog post describes a very “weird machine”.
Isn’t it more accurate to say that any input could be a program if you’re not careful?
For example, the stdin of grep is an input, but it should NOT be a program, and isn’t in any real implementation.
The regex arg should generally be considered a program. (And there is legitimate confusion over whether regexes are code or data; depending on the implementation they have some properties of both. The appearance of Regex DOS in many systems is a good example of this.)
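To make the Regex DOS point a bit more concrete, here is a minimal Python sketch. It assumes a backtracking engine (Python's `re` is one); the pattern `(a+)+$` and the helper name `match_seconds` are made up for illustration, and it says nothing about how any particular grep behaves. The pattern is fixed, and only the data line changes, yet the data controls how much work the matcher does.

```python
import re
import time

# Nested quantifiers: a textbook pattern that backtracks exponentially
# in backtracking engines such as Python's `re`. Illustrative only.
EVIL = re.compile(r"(a+)+$")


def match_seconds(n: int) -> float:
    """Time one failed match against n 'a' characters followed by '!'."""
    line = "a" * n + "!"  # the trailing '!' forces every attempt to fail
    start = time.perf_counter()
    result = EVIL.match(line)
    elapsed = time.perf_counter() - start
    assert result is None  # the line never matches; only the cost varies
    return elapsed


if __name__ == "__main__":
    # Each extra 'a' roughly doubles the work, so keep n small.
    for n in (8, 12, 16, 20):
        print(n, f"{match_seconds(n):.4f}s")
```

An engine that compiles the pattern to an automaton and never backtracks would not show this growth, which is one reason the answer to "code or data?" depends on the implementation.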
OK although I guess what you can say is you can “program” the stdin input to make the regex blow up … which is true. Hm
Note that context-free grammars can still yield weird machines when implemented. The weirdness of a weird machine comes from the fact that it is a concrete implementation of an abstract machine, not the complexity of the original grammar. State machines for old video games usually were context-free and had a finite number of states; nonetheless, they often yielded weird machines because the original developers did not account for all combinations of inputs.
I think the langsec work generally makes a lot of good points, but the focus on grammars and especially context-free grammars was misleading.
The frequent objection was the length prefix, which is a good protocol design if you care about safety, but not context-free.
They wrote a follow-up about this:
http://spw17.langsec.org/papers/grosch-taming-length-fiels.pdf
https://news.ycombinator.com/item?id=16124895
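As a rough illustration of why a length prefix can be the safe choice even though it is not context-free, here is a small Python sketch of a reader for one record. The wire format (4-byte big-endian length, then payload), the cap `MAX_LEN`, and the name `read_record` are all assumptions for the example, not taken from the paper. The point is that the reader knows how many bytes to expect before it looks at any of them, and can reject a bad length up front.

```python
import struct

MAX_LEN = 1 << 16  # hypothetical upper bound on a single payload


def read_record(buf: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Parse one record: 4-byte big-endian length, then that many bytes.

    Returns the payload and the offset of the next record.
    """
    if len(buf) - offset < 4:
        raise ValueError("truncated length field")
    (length,) = struct.unpack_from(">I", buf, offset)
    if length > MAX_LEN:
        raise ValueError("declared length exceeds limit")
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("declared length exceeds available data")
    return buf[start:end], end


if __name__ == "__main__":
    payload, next_offset = read_record(b"\x00\x00\x00\x03abcXYZ")
    print(payload, next_offset)
```

Every check happens before the payload is touched, so nothing in the payload can influence how far the reader advances.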
(This is related to some drafts of posts I wanted to write, like Context Free Grammars Don’t Have Useful Engineering Properties, but regular languages do.)
Yes. But parsers for regular or context free grammars can also be very easily generated, which would be preferable.
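For the regular case, the generated code tends to boil down to a transition table plus a loop. A hand-written Python sketch of that shape, for the made-up language `[+-]?[0-9]+` (the state names, `TABLE`, and `accepts` are hypothetical, not the output of any real generator):

```python
# States of a small DFA for the regular language [+-]?[0-9]+
START, SIGN, DIGITS = range(3)
ACCEPTING = {DIGITS}


def char_class(ch: str) -> str:
    """Map a character to the input class used by the table."""
    if ch in "+-":
        return "sign"
    if ch in "0123456789":  # ASCII digits only, unlike str.isdigit()
        return "digit"
    return "other"


# Any (state, class) pair missing from the table is a rejection.
TABLE = {
    (START, "sign"): SIGN,
    (START, "digit"): DIGITS,
    (SIGN, "digit"): DIGITS,
    (DIGITS, "digit"): DIGITS,
}


def accepts(text: str) -> bool:
    """Run the DFA: one table lookup per character, no backtracking."""
    state = START
    for ch in text:
        state = TABLE.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ("-42", "42", "+", "4a", ""):
        print(repr(sample), accepts(sample))
```

The work is one lookup per input character and the only memory is the current state, so the input cannot steer the recognizer into doing anything more.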