1. 6
  1.  

  2. 2

    Note that LR(1) parsers are memory-intensive …, so something like full C grammar can require up to 4GBs of free RAM.

    wat. Shouldn’t this literally just be a state transition table and a token stack?

    1. 5

      With LALR(1), sure (which I plan to add later). Check this out: https://stackoverflow.com/questions/24174458/lr1-parser-state-size-still-an-issue

      1. 2

        Oh wow, I didn’t know the state tables for pure LR(1) were that big. Cool, thanks! Any comment on the “Minimal LR(1) parser” algorithm mentioned at the end of that thread? That leads me to HYACC but I don’t find much else about it.

    2. 1

      You might look at Pager’s algorithm or ELR(1) if you’re interested in reducing the table size.

      1. 2

        I implemented Pager’s algorithm – or, more accurately, a hybrid of Pager’s algorithm, Chen’s thesis, and some guesswork – in grmtools. As that suggests, Pager’s algorithm is a bit harder to implement than one would like!

        The IELR paper (I wondered if ELR is a typo for IELR? But maybe there’s another state reduction algorithm I don’t know about!) makes a convincing case that Pager’s algorithm has flaws, and that IELR corrects those flaws. I haven’t implemented IELR, though if I was to revisit that part of grmtools/lrtable, I would start with IELR and not Pager.

      2. 1

        There’s also Lemon parser generator (which is used by SQLite, from the same authors).

        1. 1

          Lemon is LALR(1), so it’s completely different.