1. 21

This post provides a complete tour of Earley parsing along with Aycock’s epsilon fix, Leo’s right recursion optimization, and extraction of parse trees from parse forests.

I am trying to see how useful Pyodide is when compared to Jupyter notebooks for students. I have tried to convert a portion of our chapter on parsing.

I am looking for feedback on structure as well as content. Is this better as a Jupyter notebook like the original or is this format reasonable? Does my explanations make sense? Would this be useful to you (as a student)?

Note: Pyodide takes a little time to initialize, but it should be faster to initialize than spinning up the binder service from Jupyter (but slower to execute).

This is the writeup.


  2. 2

    I remember Robert Strandh trying to convince everyone who would listen that we should be using Earley parsers back in 2005 or so.

    He was probably right but I didn’t understand dynamic programming back then.

      1. 4

        Indeed, I link to that and also Marpa in the beginning paragraphs. Hopefully, being interactive, this tutorial can share the space?

        1. 2

          Well, if you implement Leo’s optimisation, you already have a significant advantage, compared to my tutorial. I’m personally very interested in closing that gap, so I’m definitely going to read your tutorial.

          1. 1

            Thanks! your tutorial was one of the main references I used while implementing most parts.

      2. 2

        I’ve been looking for one of these write-ups since forever. It’s a fantastic resource!

        1. 2

          Nicely explained. You may want to examine relational parsing, a different approach which also has the same characteristic performance properties as Marpa-style Early parsing.

          1. 2

            While on the subject of novel techniques, a non-Early-like (sorry for being off-topic) that I really like is the Pika Parser. It has superb error recovery, which is how I found it in the first place.

            1. 1

              Indeed, there is also the Glush parser that is very recent. There is also the derivatives approach for parsing. I hope that at some point, we are able to bring all these together (implementations in the same language, accepting same grammar format), and do a systematic study of pros and cons.

              1. 1

                The linked-to paper is an excellent overview of parsing techniques.

            2. 1

              I would add some examples for each defined term (at the end of each definition). I always get tripped up by this vocab when reading about parsing, so every little hint helps.

              1. 1

                Thanks!, I will do that.