I just tried out both. plocate is much closer to a drop-in replacement for mlocate. plocate ingests (usually pre-built) mlocate databases while ix does its own file tree walking. This makes plocate's database build hundreds of times faster than ix's, as well as letting it share the DB builds usually run from cron.
In terms of query time, plocate runs in low single-digit milliseconds. ix seems to have no non-interactive mode. The only way to make it non-interactive would appear to be a pseudo-terminal (the setup & control of which might well dominate run time).
I meant that “just as fast” is hard to know. indexa is only interactive, so one is stuck with “how fast my screen changes”. plocate was taking 5 ms. A 10x slower indexa at 50 ms might well “look” the same, roughly “movie frame instant”. I’m not saying indexa does take 50 ms on my test file hierarchy. I just don’t know. It’s hard to measure. :-) That was the point of my 2nd paragraph. Sorry it was unclear. It could be under 1 ms or maybe up to 100 ms. A more careful comparison is warranted before claiming “just as fast”, though.
For example, if I type time ix -q MissingFile and hit the ENTER key twice in as rapid succession as I am physically able, the lowest time I can see is about 75 ms. Meanwhile, if I run strace -tttfs16 -o/dev/shm/ix.st ix -q MissingFile and then grep -C2 'execve\|read.0,' /dev/shm/ix.st, I see times around 75-85 ms up to the calls just before the read(0,..). That is some 15..17x slower than plocate on the same test data.
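If anyone wants to repeat this, the gap between successive strace -ttt timestamps can be computed mechanically rather than eyeballed. A minimal awk sketch; the two-line log here is made up for illustration (with -f, the PID is field 1 and the epoch-seconds timestamp is field 2):

```shell
# Toy two-line strace -tttf log (PID, epoch-seconds timestamp, call):
cat > /tmp/ix.st <<'EOF'
4242 1700000000.010000 execve("/usr/bin/ix", ...) = 0
4242 1700000000.085000 read(0, ...) = 1
EOF
# Print the millisecond delta between consecutive syscalls;
# the biggest deltas show where the time actually went.
awk '{ t = $2; if (prev) printf "%.3f ms  %s\n", (t - prev) * 1000, $0; prev = t }' /tmp/ix.st
```

On a real log, piping the output through sort -rn | head surfaces the worst offenders.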
These are admittedly lame benchmarks: they include all screen/terminal setup time in both cases, and strace/ptrace overhead in the more precise one. Whoever wrote indexa already added -q. If they also added a -n non-interactive option that just prints any answers, performance would be much easier to compare.
Looking at the strace shows many milliseconds spent in memory allocation system calls, though. So I am not optimistic that this Rust thing, carefully assessed, is much better than 10x slower than plocate. Also, for my test data, indexa/ix uses 286 MiB while plocate uses only 4 MiB. So I would have to agree with @Foxboron that plocate + fzf would likely be more efficient by multiple metrics.
I have a mix of NVMe, SSD, and Winchester drives & use locate all the time. Even mlocate on pure NVMe is dozens of times faster than a non-DIMM-cached find: for me, 30 seconds vs. 1.2 seconds. And that’s NVMe. It gets worse with SSD and much worse with Winchesters. With Winchesters fully engaged and uncached, this would have taken several minutes.
It’s true that to leverage the pre-built DB you must have some inkling of the substrings involved, and you don’t have i-node metadata to work with either. It is very useful even so.
E.g., to check whether Gentoo had an ebuild for plocate I ran locate -r 'plocate.*ebuild', which took about 700 ms. After I installed plocate, I ran plocate plocate | rg ebuild, which took 1.8 ms, almost 400x faster than mlocate.
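For a slightly less lame wall-clock number than eyeballing time output, a tiny helper works for any of these commands. A sketch assuming GNU date (for the %N nanosecond format); the two queries shown are the ones from above and need the tools installed:

```shell
# Rough wall-clock timer in ms for a single command.
# Run each query a few times so both tools get a warm page cache.
ms() {
  start=$(date +%s%N)
  "$@" > /dev/null 2>&1
  end=$(date +%s%N)
  echo "$(( (end - start) / 1000000 )) ms: $*"
}
ms sh -c "locate -r 'plocate.*ebuild'"
ms sh -c "plocate plocate | rg ebuild"
```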
Anyway, with just mlocate DBs the contrast is between hundreds of thousands or millions of IOs vs. however many IOs it takes to read a DB file of a few hundred MiB, which is a big contrast. The index plocate builds takes that speed difference up another level.
These large single-file databases are typically built by cron jobs overnight. Filesystems instead keep things up to date to the microsecond and typically colocate metadata near file data on storage. So the problems are just different. A large, out-of-date single file will always be much faster to search (though you may or may not have large file sets to search through, or care about the “absolute” time).
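For concreteness, that overnight build is usually just a one-line cron entry along these lines; the exact updatedb path and schedule are distro-specific, and most distros install an equivalent cron.daily script or systemd timer for you:

```shell
# System crontab entry: rebuild the locate DB at 03:00 daily.
0 3 * * * root /usr/sbin/updatedb
```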
I was going to share this in a new thread, but since it’s related I’ll share it here.
(Not trying to take over your thread, just trying to keep the home page clean.)
I found a really nice locate alternative written in Rust, and it’s really good (at least in my opinion).
https://github.com/mosmeh/indexa
Is it as fast as this one though?
In terms of searching for the file based on the query, yes. It feels almost instantaneous.
However, it’s used for interactive selection. That way you can wrap commands around it like:
emacs "$(ix)", vim "$(ix)", or mpv "$(ix)"
Selecting the file you want in indexa will output the full path to stdout.
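Those wrappers can be made a bit more robust as shell functions. A sketch (the name ie is made up; it assumes a cancelled selection yields empty output, and guards against passing that to the editor):

```shell
# Pick a file with ix and open it in $EDITOR; do nothing when
# the picker produces no selection.
ie() {
  f=$(ix) && [ -n "$f" ] && "${EDITOR:-vi}" "$f"
}
```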
Actually, indexa creates its own database, then tree-walks from that.
And I did say that indexa was used for interactive selection.
I said that in terms of searching, yes, it’s just as fast.
You can reimplement indexa with plocate and fzf. It would probably be faster and use less disk space for the file database.
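That reimplementation is only a few lines of shell. A rough sketch (pfz is a made-up name; assumes plocate and fzf are on PATH):

```shell
# indexa-like picker: plocate streams every path matching the
# pattern, fzf does the interactive fuzzy narrowing, and the
# chosen path lands on stdout, the same contract as ix.
pfz() {
  plocate "$1" | fzf
}
# usage:  vim "$(pfz conf)"
```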
I think the last time I used locate was last century. Just curious, what do people use locate for?
SSDs made it obsolete. Running find is no longer a system-halting thing. But I could see it being useful on large network storage or archive HDDs.
Nice to have a faster alternative. A little sad that the filesystem is still slow enough that it’s needed.