1. 31
  1. 6

    This looks cool, and kudos for using LMDB. The 35MB binary size leapt out at me, though: what contributes to that? (LMDB itself is only ~100KB.) Lots of stemming tables and stop-word lists?

    1. 4

      Rust isn’t entirely svelte either. It doesn’t take too many transitive dependencies before you’re at 10MB.

      1. 3

        How do you call yourself minimalist when you’re pulling in that many dependencies?

        1. 11

          To be clear, I’m not the author. But if I were, this would come off more as a personal dig than a real question. Be kind. :)

          1. -1

            Holy passive aggressiveness, Batman :)

            Remind me to avoid rhetorical questions in the future.

          2. 7

            It depends on what you compare it to. An Elasticsearch x86-64 gzipped tarball is over 340MB: https://www.elastic.co/downloads/elasticsearch

        2. 4

          If anyone wants to figure this out, two tools to use are:

          I am 0.3 certain that at least one significant component is serialization code: Rust serialization is fast, but is rumored to inflate binaries quite a bit. I haven’t measured that directly, but I did observe compile-time hits due to serialization.

          1. 3

            My guess is assets for the web UI are packed into the binary

          2. 5

            “Tantivy, which is also written in Rust and sits on top of Lucene”

            It’s inspired by Lucene; sitting on top of it would be tricky, given that Lucene is a Java library.

            1. 2

              Thanks. I cleared that up.

            2. 4

              Another Rusty alternative is Sonic: https://github.com/valeriansaliou/sonic

              1. 4

                It’s cool but, unlike TypeSense, it doesn’t support HA, although that is high on the “under consideration” list.

                The Rust SDK is also a bit surprising. For instance, the function that adds documents to the index, add_documents(), is async and therefore “returns” a future, but that future resolves to a data structure representing “progress”, which seems redundant and error-prone. So to wait for add_documents() to complete, you need a wait_for_pending_update() loop (which has some arbitrary defaults, BTW) instead of simply writing add_documents(...).await?.
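
                To make the complaint concrete, here is a rough, std-only sketch of the two API shapes. Everything in it is hypothetical and made up for illustration (`Index`, `Progress`, `Status`, `wait_until_processed`, `add_documents_and_wait`, and the toy `block_on`); it mocks the behaviour described above and is not the SDK's actual code or signatures.

                ```rust
                use std::future::Future;
                use std::pin::pin;
                use std::sync::Arc;
                use std::task::{Context, Poll, Wake, Waker};

                /// Hypothetical status of an indexing task (not a real SDK type).
                #[derive(Debug, PartialEq)]
                enum Status {
                    Enqueued,
                    Processed,
                }

                /// Hypothetical "progress" handle: awaiting `add_documents` yields this,
                /// not the finished result.
                struct Progress {
                    polls_until_done: u32,
                }

                impl Progress {
                    /// Pretends to ask the server for the task status.
                    async fn status(&mut self) -> Status {
                        if self.polls_until_done == 0 {
                            Status::Processed
                        } else {
                            self.polls_until_done -= 1;
                            Status::Enqueued
                        }
                    }

                    /// The extra wait loop the caller has to remember to run.
                    /// Returns how many status checks were needed.
                    async fn wait_until_processed(mut self) -> u32 {
                        let mut checks = 0;
                        loop {
                            checks += 1;
                            if self.status().await == Status::Processed {
                                return checks;
                            }
                        }
                    }
                }

                /// Hypothetical index client.
                struct Index;

                impl Index {
                    /// Shape A: `.await` only means "enqueued".
                    async fn add_documents(&self, docs: &[&str]) -> Progress {
                        Progress { polls_until_done: docs.len() as u32 }
                    }

                    /// Shape B: `.await` means "indexed".
                    async fn add_documents_and_wait(&self, docs: &[&str]) -> u32 {
                        self.add_documents(docs).await.wait_until_processed().await
                    }
                }

                /// Waker that does nothing; enough for futures that never stay pending.
                struct NoopWake;

                impl Wake for NoopWake {
                    fn wake(self: Arc<Self>) {}
                }

                /// Toy executor that polls in a busy loop. Use a real runtime in practice.
                fn block_on<F: Future>(fut: F) -> F::Output {
                    let waker = Waker::from(Arc::new(NoopWake));
                    let mut cx = Context::from_waker(&waker);
                    let mut fut = pin!(fut);
                    loop {
                        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
                            return value;
                        }
                    }
                }

                /// Runs both shapes: returns the status seen right after shape A's
                /// `.await`, and the number of checks shape B hid from the caller.
                fn demo() -> (Status, u32) {
                    let index = Index;
                    let docs = ["a", "b", "c"];
                    let after_a = block_on(async {
                        let mut progress = index.add_documents(&docs).await;
                        progress.status().await
                    });
                    let checks = block_on(index.add_documents_and_wait(&docs));
                    (after_a, checks)
                }
                ```

                In this mock, `demo()` returns `(Status::Enqueued, 4)`: after shape A's `.await` the documents are still only enqueued, whereas shape B does not return until the task reports `Processed`. With shape A, forgetting the wait loop compiles fine and silently races with indexing, which is the error-prone part.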

                I also don’t see any support of atomic transactions in contrast to e.g. Tantivy.

                1. 1

                  Seems very interesting. I wonder how it compares to Xapian.