1. 25
  1. 2

    Nice writeup, thanks! Would nom be a good option to parse pcap files and various contained network protocols (Ethernet, IP, TCP/UDP, application level messages)? I need to extract some information from large pcap files, but I need to look at various fields of the whole stack.

    1. 5

      nom not only has support, but was in fact built specifically as a binary parser from the beginning, so it should work excellently for this use case. And it has dedicated support for streaming parsers that’s been steadily improving.

      1. 3

        Yep, it has specific functionality for binary protocol parsing. It’s been a while since I used it, and from memory its streaming support wasn’t great, but for pcap files in particular I’d expect it to work really well. (I could be way out of date on the streaming support; they’ve had more than one major version since I used it.)
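
        To make the pcap case concrete, here is a rough sketch of the combinator shape applied to the 24-byte classic pcap global header. It deliberately uses only the standard library rather than nom itself, so `PResult`, `take`, `u16_at`, `u32_at` and `pcap_header` are all made-up names for illustration, not nom APIs. Each function returns the remaining input plus the parsed value, which is the same idea nom builds on. It assumes the microsecond-resolution magic number only.

```rust
// Combinator-style binary parsing with std only (hypothetical helpers, not nom).
// Every parser takes the remaining bytes and returns (rest, value) on success.
type PResult<'a, T> = Option<(&'a [u8], T)>;

#[derive(Debug, PartialEq)]
struct PcapHeader {
    little_endian: bool,
    version_major: u16,
    version_minor: u16,
    snaplen: u32,
    linktype: u32,
}

// Split off exactly `n` bytes, or fail if the input is too short.
fn take<'a>(input: &'a [u8], n: usize) -> PResult<'a, &'a [u8]> {
    if input.len() < n {
        return None;
    }
    let (head, rest) = input.split_at(n);
    Some((rest, head))
}

// Read a u16 in the byte order announced by the file's magic number.
fn u16_at<'a>(input: &'a [u8], le: bool) -> PResult<'a, u16> {
    let (rest, b) = take(input, 2)?;
    let arr = [b[0], b[1]];
    let v = if le { u16::from_le_bytes(arr) } else { u16::from_be_bytes(arr) };
    Some((rest, v))
}

// Read a u32 in the byte order announced by the file's magic number.
fn u32_at<'a>(input: &'a [u8], le: bool) -> PResult<'a, u32> {
    let (rest, b) = take(input, 4)?;
    let arr = [b[0], b[1], b[2], b[3]];
    let v = if le { u32::from_le_bytes(arr) } else { u32::from_be_bytes(arr) };
    Some((rest, v))
}

// Parse the classic pcap global header by chaining the small parsers above.
fn pcap_header<'a>(input: &'a [u8]) -> PResult<'a, PcapHeader> {
    let (rest, magic) = take(input, 4)?;
    // The magic 0xa1b2c3d4 is written in the writer's native byte order.
    let little_endian = match magic {
        [0xd4, 0xc3, 0xb2, 0xa1] => true,
        [0xa1, 0xb2, 0xc3, 0xd4] => false,
        _ => return None,
    };
    let (rest, version_major) = u16_at(rest, little_endian)?;
    let (rest, version_minor) = u16_at(rest, little_endian)?;
    // Skip thiszone (4 bytes) and sigfigs (4 bytes); this sketch ignores them.
    let (rest, _skipped) = take(rest, 8)?;
    let (rest, snaplen) = u32_at(rest, little_endian)?;
    let (rest, linktype) = u32_at(rest, little_endian)?;
    let header = PcapHeader { little_endian, version_major, version_minor, snaplen, linktype };
    Some((rest, header))
}
```

        Calling `pcap_header(&bytes)` hands back the unparsed tail, so the per-packet records and then the Ethernet/IP/TCP layers can be parsed from `rest` in the same style. With nom you would get the primitive parsers and the error handling from the library instead of writing them by hand.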

      2. 2

        Very nice. I have previously ported a parser for the MongoDB Language Model from JavaScript’s PEG to Rust’s pest.rs[1] to support my MongoDB to PostgreSQL translation layer Oxide[2].

        My initial idea was to port it to nom, but since I was new to Rust altogether I felt a little bit intimidated by the completely new approach vs. changing from one representation of the grammar to the other (PEG -> pest.rs).

        I plan on re-evaluating this idea, and I was wondering if anyone has any other good resources to learn more, especially in video format.


        [1] https://github.com/fcoury/mongodb-language-model-rust
        [2] https://github.com/fcoury/oxide

        1. 2

          ugghhhh this is the blog post I needed 3 months ago when I was trying to write a large parser with nom. I completed it but it’s way messier than this

          1. 1

            Does anyone know of a parsing combinator library that supports concrete syntax trees and fast updates? This would be for a language server.

            1. 5

              For LSP, you also want error-resilient parsing: even if the input is garbage, you want to get some sort of best-effort syntax tree out of it. I don’t know of a parser combinator library for that.

              My general advice would be:

              • if you want to implement a language server for a whole bunch of different languages, go with tree-sitter
              • if you only have one lang to deal with, strongly consider a hand-written parser
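
              As a rough illustration of what error-resilient means in a hand-written parser, here is a minimal sketch for a toy grammar of `let NAME = NUMBER ;` statements. The grammar, the `Node` type and the `parse`/`parse_let` functions are all invented for this example. It shows only the recovery idea (record an error node, skip to the next `;`, keep going), not a full concrete syntax tree or incremental updates.

```rust
// A toy error-resilient parser: it always returns a tree, even for garbage.
#[derive(Debug, PartialEq)]
enum Node {
    Let { name: String, value: i64 },
    // Tokens that were skipped while recovering from a syntax error.
    Error(Vec<String>),
}

// Try to parse one `let NAME = NUMBER ;` statement from the front of `toks`.
// Returns the node and the number of tokens consumed.
fn parse_let(toks: &[&str]) -> Option<(Node, usize)> {
    match toks {
        ["let", name, "=", value, ";", ..] => {
            let is_ident = name.chars().all(|c| c.is_alphanumeric() || c == '_');
            if !is_ident {
                return None;
            }
            let value = value.parse::<i64>().ok()?;
            Some((Node::Let { name: name.to_string(), value }, 5))
        }
        _ => None,
    }
}

fn parse(src: &str) -> Vec<Node> {
    // Whitespace-separated tokens keep the sketch short; a real lexer
    // would also track spans and keep whitespace and comments.
    let toks: Vec<&str> = src.split_whitespace().collect();
    let mut nodes = Vec::new();
    let mut i = 0;
    while i < toks.len() {
        match parse_let(&toks[i..]) {
            Some((node, used)) => {
                nodes.push(node);
                i += used;
            }
            None => {
                // Recovery: skip up to and including the next `;`, then resume.
                let start = i;
                while i < toks.len() && toks[i] != ";" {
                    i += 1;
                }
                if i < toks.len() {
                    i += 1;
                }
                let skipped = toks[start..i].iter().map(|t| t.to_string()).collect();
                nodes.push(Node::Error(skipped));
            }
        }
    }
    nodes
}
```

              For example, `parse("let a = 1 ; let = oops ; let b = 2 ;")` yields a `Let` node for `a`, one `Error` node holding the four skipped tokens, and a `Let` node for `b`, so the statements on either side of the mistake still show up in the tree.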