    I want to point out a couple of inaccuracies in the article.

    Well, because it’s gosh-darn hard to do it (parsing) the right way

    I don’t think this is the case. For a reasonable language, writing a parser is not hard. rust-analyzer’s parser is just 2.5k lines of code, and stupidly simple code at that (just a bunch of top-level functions, no generics, no lambdas). There’s a video about how that works.
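
    To give a flavor of that style, here is a minimal toy sketch of hand-written recursive descent in Rust: plain top-level functions, no generics, no lambdas. For brevity it evaluates a tiny expression grammar directly; rust-analyzer’s real parser emits syntax-tree events instead, but the control-flow shape is the same.

    ```rust
    // Toy grammar: expr ::= term ('+' term)*, term ::= atom ('*' atom)*,
    // atom ::= NUM | '(' expr ')'. Not rust-analyzer's code, just the shape.

    #[derive(Debug, Clone, PartialEq)]
    enum Token {
        Num(i64),
        Plus,
        Star,
        LParen,
        RParen,
    }

    struct Parser {
        tokens: Vec<Token>,
        pos: usize,
    }

    impl Parser {
        fn peek(&self) -> Option<&Token> {
            self.tokens.get(self.pos)
        }
        fn bump(&mut self) -> Option<Token> {
            let t = self.tokens.get(self.pos).cloned();
            self.pos += 1;
            t
        }
    }

    fn expr(p: &mut Parser) -> i64 {
        let mut lhs = term(p);
        while p.peek() == Some(&Token::Plus) {
            p.bump();
            lhs += term(p);
        }
        lhs
    }

    fn term(p: &mut Parser) -> i64 {
        let mut lhs = atom(p);
        while p.peek() == Some(&Token::Star) {
            p.bump();
            lhs *= atom(p);
        }
        lhs
    }

    fn atom(p: &mut Parser) -> i64 {
        match p.bump() {
            Some(Token::Num(n)) => n,
            Some(Token::LParen) => {
                let v = expr(p);
                assert_eq!(p.bump(), Some(Token::RParen), "expected `)`");
                v
            }
            other => panic!("unexpected token: {:?}", other),
        }
    }

    fn main() {
        // 1 + 2 * (3 + 4) evaluates to 15
        let mut p = Parser {
            tokens: vec![
                Token::Num(1), Token::Plus, Token::Num(2), Token::Star,
                Token::LParen, Token::Num(3), Token::Plus, Token::Num(4),
                Token::RParen,
            ],
            pos: 0,
        };
        println!("{}", expr(&mut p));
    }
    ```

    Each grammar production maps onto one function, which is why parsers in this style stay readable even at a few thousand lines.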

    I would say that the hard bit is elsewhere: committing to supporting (at least lightweight) semantic analysis for programming languages, and creating something that a parser can be plugged into usefully. We could have had LSP a couple of decades ago, but the open-source community just didn’t move beyond ctags, probably because this is a hard coordination problem.

    Perfect syntax highlighting.

    For perfect syntax highlighting, you need to augment parse trees with semantic information; just parsing gives you a reasonable baseline of non-approximate syntax highlighting.
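
    A small illustration of the gap (the names here are made up for the example): a purely syntactic highlighter classifies keywords, literals, and punctuation exactly, but it cannot tell what an identifier refers to.

    ```rust
    #![allow(non_upper_case_globals)] // lowercase const, to make the point

    const x: i32 = 1;

    fn demo() -> i32 {
        // Syntactically, every `x` below is the same identifier token.
        let a = x; // semantically: resolves to the const `x`
        let x = 2; // a shadowing local binding
        let b = x; // semantically: resolves to the local `x`
        a + b
    }

    fn main() {
        println!("{}", demo()); // prints 3
    }
    ```

    A semantic highlighter can color the first `x` as a constant and the later ones as a local variable; a parse tree alone cannot make that distinction.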

    The reason (most) LSP servers don’t offer syntax highlighting is because of the drag on performance.

    I have doubts that this assertion is true. The implication is definitely incorrect – the slow part of semantic highlighting is semantic analysis (resolving identifiers & inferring types), not the RPC overhead. Even then, it’s not particularly slow.
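
    For a sense of scale, the wire format is deliberately cheap: the LSP encodes semantic tokens as a flat array of integers, five per token (delta line, delta start, length, token-type index, modifier bitset), with positions relative to the previous token. Here is a sketch of that encoding; the `Tok` struct and the legend indices are invented for illustration.

    ```rust
    struct Tok {
        line: u32,  // 0-based line of the token
        start: u32, // 0-based column of its first character
        len: u32,   // length in characters
        kind: u32,  // index into the server-declared token-type legend
        mods: u32,  // bitset of token modifiers
    }

    fn encode(tokens: &[Tok]) -> Vec<u32> {
        let mut out = Vec::with_capacity(tokens.len() * 5);
        let (mut prev_line, mut prev_start) = (0, 0);
        for t in tokens {
            let delta_line = t.line - prev_line;
            // the column is relative only if the token stays on the same line
            let delta_start = if delta_line == 0 { t.start - prev_start } else { t.start };
            out.extend_from_slice(&[delta_line, delta_start, t.len, t.kind, t.mods]);
            prev_line = t.line;
            prev_start = t.start;
        }
        out
    }

    fn main() {
        let tokens = [
            Tok { line: 0, start: 0, len: 2, kind: 1, mods: 0 }, // e.g. `fn`
            Tok { line: 0, start: 3, len: 4, kind: 2, mods: 0 }, // e.g. `main`
        ];
        println!("{:?}", encode(&tokens)); // [0, 0, 2, 1, 0, 0, 3, 4, 2, 0]
    }
    ```

    Shipping that flat array over JSON-RPC is the easy part; computing the `kind` and `mods` fields is where the time goes, because that requires name resolution and type inference.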

    If adoption of semantic highlighting has been slow, a more important reason seems to be that it’s simply a relatively recent addition to the LSP.

    To clarify, tree-sitter is super cool, and it indeed makes it much easier to scale support to many niche languages. The point I want to make, though, is that “complexity of parsing” is not the reason why our code editors don’t come with parsers for popular languages.

    1. Could you provide a URL to the sources of the rust-analyzer parser?

    2. I want to read this article because of the extensive Twitter discourse around hand-hacked vs. generated parsers.