1. 20
  1.  

  2. 4

    I was going to share this in a new thread, but since it’s related I’ll share it here.

    (Not trying to take over your thread, just trying to keep the home page clean.)

    I found a really nice locate alternative written in Rust, and it’s really good (at least in my opinion).

    https://github.com/mosmeh/indexa

    1. 3

      Is it as fast as this one though?

      1. 3

        In terms of searching for the file based on the query, yes. It feels almost instantaneous.

        However, it’s used for interactive selection. That way you can wrap commands around it like:

        emacs "$(ix)", vim "$(ix)", or mpv "$(ix)"

        Selecting the file you want in indexa will output the full path to stdout.
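
        A minimal sketch of such a wrapper, assuming only what is described above (ix prints the selected path to stdout). The function name open_with and the PICKER override variable are made up for illustration and are not part of indexa. It also skips the command when nothing was selected, which a bare "$(ix)" would not do.

        ```shell
        # open_with CMD [ARGS...]: run the picker, then pass the chosen path to CMD.
        # Skips CMD entirely when nothing was selected (e.g. the picker was cancelled).
        # PICKER is a hypothetical override for testing; it defaults to ix.
        open_with() {
            path=$("${PICKER:-ix}") || return
            [ -n "$path" ] || return 1
            "$@" "$path"
        }
        ```

        With that defined, open_with mpv or open_with emacs behaves like the one-liners above, minus the empty-argument case.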

        1. 2

          I just tried out both. plocate is much more of a near drop-in replacement for mlocate. plocate ingests (usually pre-built) mlocate databases while ix does its own file tree walking. This makes plocate’s build time 100s of times faster than ix’s, and it gets to share the DB builds that usually already run from cron.

          In terms of query time, plocate runs in low single-digit milliseconds. ix seems to have no non-interactive mode. The only way to make it non-interactive would appear to be a pseudo-terminal (the setup & control of which might well dominate run time).

          1. 1

            Actually, indexa creates its own database, then tree-walks from that.

            And I did say that indexa was used for interactive selection.

            I said that in terms of searching, yes, it’s just as fast.

            1. 2

              You can reimplement indexa with plocate and fzf. It would probably be faster and use less disk space for the file database.
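
              A minimal sketch of that combination, assuming both plocate and fzf are installed. The function name pick and the PICK_LOCATE / PICK_FILTER override variables are hypothetical, added only so the pieces can be swapped out; they are not part of either tool.

              ```shell
              # pick PATTERN...: list paths matching the pattern(s) with plocate,
              # then let fzf narrow them down interactively and print the choice.
              # PICK_LOCATE and PICK_FILTER are hypothetical overrides for testing.
              pick() {
                  "${PICK_LOCATE:-plocate}" "$@" | "${PICK_FILTER:-fzf}"
              }
              ```

              Something like vim "$(pick notes)" would then open whichever matching path was selected, in the same way as the "$(ix)" wrappers.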

              1. 1

                I meant that “just as fast” is hard to know. indexa is only interactive, so one is stuck with “how fast my screen changes”. plocate was taking 5 ms. A 10x slower indexa at 50 ms might well “look” the same, roughly “movie frame instant”. I’m not saying indexa does take 50 ms on my test file hierarchy. I just don’t know. It’s hard to measure. :-) That was the point of my 2nd paragraph. Sorry it was unclear. It could be under 1 ms or maybe up to 100 ms. A more careful comparison is warranted before drawing “just as fast” conclusions, though.

                1. 2

                  For example, if I type time ix -q MissingFile and hit the ENTER key twice in as rapid succession as I am physically able, then the lowest time I can see is about 75 ms. Meanwhile, if I run strace -tttfs16 -o/dev/shm/ix.st ix -q MissingFile and then grep -C2 'execve\|read.0,' /dev/shm/ix.st, I see about 75-85 ms elapse up to the calls just before the read(0,..). That is some 15..17x slower than plocate on the same test data.

                  These are admittedly lame benchmarks & include all screen/terminal set up time in both cases and strace/ptrace mode overheads in the more precise benchmark. Whoever wrote indexa already added -q. If they just add a -n non-interactive option to just print any answers then performance would be much easier to compare.
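
                  For the non-interactive side, a rough best-of-N timer can be sketched as below. This is only a sketch: best_ms is a made-up helper name, it assumes GNU date (for the %N nanoseconds format), and the two date forks per run add overhead of their own, so it suits coarse comparisons rather than sub-millisecond ones.

                  ```shell
                  # best_ms N CMD [ARGS...]: run CMD N times with its output discarded
                  # and print the fastest wall-clock time in whole milliseconds.
                  # Assumes GNU date, whose %N format yields nanoseconds.
                  best_ms() {
                      runs=$1; shift
                      best=
                      i=0
                      while [ "$i" -lt "$runs" ]; do
                          start=$(date +%s%N)
                          "$@" >/dev/null 2>&1
                          end=$(date +%s%N)
                          elapsed=$(( (end - start) / 1000000 ))
                          # Keep the minimum: the least-disturbed run is the best estimate.
                          if [ -z "$best" ] || [ "$elapsed" -lt "$best" ]; then
                              best=$elapsed
                          fi
                          i=$((i + 1))
                      done
                      echo "$best"
                  }
                  ```

                  E.g., best_ms 20 plocate MissingFile. It only works for commands that run without a terminal, so today it covers plocate; ix would need the -n option suggested above before it could be timed the same way.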

                  The strace output shows a lot of milliseconds spent in memory allocation system calls, though. So, I am not optimistic that this Rust tool would come out much better than 10x slower than plocate, even carefully assessed. Also, for my test data, indexa|ix uses 286 MiB while plocate uses only 4 MiB. So, I would have to agree with @Foxboron that plocate + fzf would likely be more efficient in multiple metrics.

      2. 3

        I think the last time I used locate was last century. Just curious, what do people use locate for?

        1. 4

          I have a mix of NVMe, SSD, as well as Winchester & use locate all the time. Even mlocate on pure NVMe is dozens of times faster than a non-DIMM-cached find: for me, 30 seconds vs 1.2 seconds. And that’s NVMe. It gets worse with SSD and much worse for Winchester. With Winchesters fully engaged and uncached, this would have been several minutes.

          It’s true that to leverage the pre-built DB you must have some inkling of substrings, and you also don’t have i-node metadata to work with. It is very useful even so.

          E.g., to check whether Gentoo had an ebuild for plocate I did locate -r 'plocate.*ebuild' which took about 700 ms. After I installed plocate, I ran plocate 'plocate'|rg ebuild which took 1.8 ms, almost 400x faster than mlocate.
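
          The ratios quoted in this comment are easy to double-check from the raw timings. A tiny sketch, where speedup is just an illustrative helper name wrapping awk:

          ```shell
          # speedup SLOW FAST: print how many times faster FAST is than SLOW,
          # rounded to the nearest whole number. Both values must share a unit.
          speedup() {
              awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", slow / fast }'
          }

          speedup 700 1.8    # mlocate regex query vs. plocate | rg, in ms; prints 389
          speedup 30 1.2     # uncached find vs. mlocate, in seconds; prints 25
          ```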

          Anyway, with just mlocate DBs the contrast is between 100s of thousands or millions of IOs vs. however many IOs it takes to read a DB file of a few 100 MiB, which is a big contrast. The index plocate makes takes that speed diff up another level.

          1. 3

            SSDs made it obsolete. Running find is no longer a system-halting thing.
            But I could see it being useful on large network storage or archive HDDs.

          2. 2

            Nice to have a faster alternative. A little sad that the filesystem is still slow enough that it’s needed.

            1. 1

              These large single-file databases are typically built by cron jobs overnight. Filesystems instead keep things up-to-date to the microsecond and typically colocate metadata near file data on storage. So, the problems are just different. A large, out-of-date single-file database will always be much faster to search (though you may or may not have large file sets to search through or care about the “absolute” time).