1. 21
  1. 8

    This is a really fascinating write-up - especially the footnote about their ngram index design and how regular trigrams weren’t quite performant enough for their particular application: https://github.blog/2023-02-06-the-technology-behind-githubs-new-code-search/#fn-69904-bignote

    1. 7

      Hello Lobsters! Thanks for sharing this. I work on this product, happy to answer questions about it or the blog post.

      1. 1

        I couldn’t quite grasp the algorithm for the sparse grams—the chester example seemed to break down into trigrams, and I wasn’t clear on why hes was selected while ste wasn’t. I suspect you could make a post just on this topic, though. :-)

      2. 1

        Bitbucket’s code search is also implemented using Rust.

        1. 1

          Interesting! Are there any details about it on the web?

          1. 2

            Don’t think there’s much public, sadly. Indexing uses syntect. SQS is used for queuing the jobs. The search is mostly powered by Elasticsearch with some custom analysers.