The elephant in the room here, at least to me, is the existence of LR (and GLR) parsing techniques that are faster than PEG, are powerful enough to handle the examples mentioned, and have the added advantage of requiring you to sort out or explicitly allow ambiguities in your grammar before it will compile.
I would like to add Earley parsing, especially with the Leo optimization, to the mix. Earley parsers are especially easy to implement.
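To give a feel for "easy to implement", here is a minimal sketch of an Earley recognizer in Python. It is only an illustration under some assumptions: the grammar has no empty rules, it only answers yes/no rather than building a parse tree, and it does not include the Leo optimization. The names (`Item`, `earley_recognize`, `GRAMMAR`) are made up for this example.

```python
from collections import namedtuple

# An Earley item: rule head, rule body, dot position, and the chart index
# where recognition of this rule started.
Item = namedtuple("Item", "head body dot origin")


def earley_recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of bodies (tuples of symbols).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Assumption: no empty (epsilon) rules, which keeps the completer simple.
    """
    n = len(tokens)
    chart = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(i, item):
        if item not in seen[i]:
            seen[i].add(item)
            chart[i].append(item)

    for body in grammar[start]:
        add(0, Item(start, body, 0, 0))

    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while we walk it
            item = chart[i][j]
            j += 1
            if item.dot < len(item.body):
                sym = item.body[item.dot]
                if sym in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for body in grammar[sym]:
                        add(i, Item(sym, body, 0, i))
                elif i < n and tokens[i] == sym:
                    # Scanner: the terminal after the dot matches the input.
                    add(i + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for this head.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.body)
                            and parent.body[parent.dot] == item.head):
                        add(i, parent._replace(dot=parent.dot + 1))

    return any(it.head == start and it.dot == len(it.body) and it.origin == 0
               for it in chart[n])


# An ambiguous, left-recursive grammar: E -> E "+" E | "n"
GRAMMAR = {"E": [("E", "+", "E"), ("n",)]}
```

With this grammar, `earley_recognize(GRAMMAR, "E", "n + n + n".split())` accepts the input even though the grammar is ambiguous and left-recursive, which is the kind of grammar that ordered-choice PEGs and plain LR generators handle differently.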
I’ll add sklogic’s trick of using two parsers: a fast one, such as LALR, for input with no errors, and a second one that is good at dealing with errors. If ambiguity remains, replace the first with a hybrid like Elkhound, which includes GLR as an option.
A blog post by the author of the Janet programming language on PEGs and how they work:
I think https://leafo.net/guides/parsing-expression-grammars.html is a good introduction as well, more from a user’s point of view.
The lead author of Lua wrote one of the main PEG libraries (http://www.inf.puc-rio.br/~roberto/lpeg/). It is widely used in the community for a lot of things, but the parser of Lua itself is still custom, hand-written C. The reason is mostly performance, I think: storing data as Lua source code used to be very common, and there are still code bases out there that evaluate 100 MB files regularly.
I personally haven’t heard about many 100 MB Lua scripts, although I did find a parsing-related bug with an 80 MB script that caused performance problems.
I use LPEG a lot, to the point where I tend to use it over the built-in Lua patterns (which are somewhat like regular expressions). I even used it at work for real-time processing of SIP messages, and we really haven’t had performance issues dealing with millions of calls per day. The killer feature for me? LPEG expressions are composable, which makes it easier to write and test grammars.
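For anyone who has not seen what "composable" means in practice, here is a rough sketch of the idea in Python. This is not LPEG's API, just hypothetical combinators (`lit`, `char_in`, `seq`, `choice`, `many`) written for illustration: every pattern is an ordinary value, so small pieces can be tested alone and then combined into bigger ones.

```python
import string

# A pattern is a function (text, pos) -> end position on success, None on failure.


def lit(s):
    """Match the literal string `s`."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def char_in(chars):
    """Match a single character taken from `chars`."""
    def parse(text, pos):
        return pos + 1 if pos < len(text) and text[pos] in chars else None
    return parse


def seq(*patterns):
    """Match all patterns one after another."""
    def parse(text, pos):
        for p in patterns:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*patterns):
    """PEG ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for p in patterns:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


def many(p, at_least=0):
    """Greedy repetition, as in PEGs (no backtracking into the loop)."""
    def parse(text, pos):
        count = 0
        while True:
            end = p(text, pos)
            if end is None or end == pos:  # stop on failure or no progress
                break
            pos, count = end, count + 1
        return pos if count >= at_least else None
    return parse


def matches(p, text):
    """True if pattern `p` consumes the whole of `text`."""
    return p(text, 0) == len(text)


# Small pieces, each testable on its own...
number = many(char_in(string.digits), at_least=1)
name = many(char_in(string.ascii_letters), at_least=1)

# ...and reused to build larger patterns.
version = seq(number, lit("."), number)
host_port = seq(name, lit(":"), number)
endpoint = choice(host_port, name)
```

Because `number` and `name` are plain values, they can be checked in isolation before `version`, `host_port`, or `endpoint` are built from them, which is the property that makes grammars written this way easy to test piece by piece.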
There’s a great D project called pegged where you write the PEG grammar as a string and the D compiler produces a parser for it!