I think the author buried the lede. ag is fast enough for anything I’ve thrown at it; this is the real reason to check out the tool:
ripgrep also implements full support for .gitignore, whereas there are many bugs related to that functionality in The Silver Searcher. Of the things that don’t work in The Silver Searcher, ripgrep supports .gitignore priority (including in parent directories and sub-directories), whitelisting and recursive globs.
Just knowing that the author has used ag enough to know how annoying those bugs are is reason enough for me to pay attention.
ag is more general than ripgrep, which is probably why there is a slight speed difference.
I like my generality unless I really, really need the speed. I think this is why GNU utils are still used a lot today: they are versatile rather than built for one single purpose, which goes against the UNIX philosophy a bit.
Good job making something of an alternative to ag in Rust. Competition is always awesome to see. At least we can say ripgrep is not only fast for source code searching, but also memory safe?
Would you mind elaborating on what you meant by ag being more general than ripgrep?
(There are absolutely some features in ag that aren’t in rg, but it goes both ways, and I don’t think any of them are of the galactic variety, but I could be missing something!)
which is probably why there is a slight speed difference
To be fair, my blog post was meant to stop speculation like this by providing both data and analysis that explains the speed difference.
At least we can say ripgrep is not only fast for source code searching, but also memory safe?
Sure, but the other selling point of ripgrep is that it’s not just for source code searching. You can search anything at faster-than-grep speeds. At least, that’s what I claim anyway!
And yes, of course, writing it in Rust was pure joy. <3
ag can do binary searching, whereas ripgrep is UTF-8 specific? Correct me if I’m wrong, please!
rg is strictly superior here. rg can handle any ASCII-compatible encoding just fine (and will happily munch on pure binary data), just like ag. Additionally, rg has Unicode support. e.g., \w matches all Unicode word characters.
Could you help me understand where the confusion was? I’d like to fix whatever documentation led you to conclude this. :-)
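To make the Unicode point concrete, here is a minimal sketch of what "Unicode-aware \w" means in practice. It uses Python's standard `re` module purely as a stand-in, not ripgrep's regex engine, and the sample string is made up for illustration.

```python
import re

# Sample text with accented letters, written with escapes so the code points
# are unambiguous (precomposed i-diaeresis and e-acute).
text = "na\u00efve caf\u00e9"

# Unicode-aware \w (Python's default for str patterns): accented letters
# count as word characters, so each word is matched whole.
unicode_words = re.findall(r"\w+", text)

# ASCII-only \w: restricted to [a-zA-Z0-9_], so accented letters break words.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

print(unicode_words)  # ['naïve', 'café']
print(ascii_words)    # ['na', 've', 'caf']
```

A tool without Unicode-aware character classes behaves like the second case on non-ASCII text.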
Nice Job! I’m curious, do you know if there would be an algorithm difference on SSDs vs HDs? Does it make sense to buffer at the native SSD block size for example…
For the most part, I ignored disk I/O by 1) running 3 warmup iterations of every command before taking measurements and 2) ensuring that I benchmarked on a machine that could fit the corpora in memory.
That’s not to say that benchmarking the differences on disks isn’t important or interesting; it’s just not something I did. My kind-of-sort-of perception was that if files need to be read from disk, then there probably isn’t that much difference in the tools. (I’ve been informed today that this was wrong of me to assume, by the way.)
The thing about benchmarks is that there’s always more to do. At over 18,000 words, I had to stop somewhere. :-)
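The warmup-then-measure approach described above can be sketched roughly as follows. This is a hypothetical helper, not the benchmark suite from the blog post: the name `bench` and its `warmup` and `runs` parameters are assumptions for illustration, and it measures wall-clock time only.

```python
import subprocess
import time


def bench(cmd, warmup=3, runs=5):
    """Run `cmd` `warmup` times untimed, then return `runs` wall-clock timings in seconds.

    The warmup runs are meant to pull the corpus into the OS page cache so
    that the timed runs mostly measure the tool, not disk I/O.
    """
    for _ in range(warmup):
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling something like `bench(["rg", "fn .*(.*)"])` and `bench(["ag", "fn .*(.*)"])` from inside the same repository would give comparable lists of timings, assuming both tools are on the PATH. This only helps if the corpus actually fits in memory, as noted above.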
I find it to be just a tiny bit slower than ag, comparing these two commands:
time rg "fn .*(.*)"
time ag "fn .*(.*)"
Is there any way I can reproduce it? Is it on data that we can both access? What platform?
What happens if you pass --mmap to rg? (I wonder if there’s more to the memory map story than I’ve let on in my blog.)
I ran these two on a mid-sized repo (4k lines of Rust).
rg ranges from 0.015 to 0.025 seconds while ag ranges from 0.009 to 0.014 seconds total CPU time.
I didn’t run very comprehensive benchmarks though :P
Would you believe it if I said that I never once benchmarked any of the tools on very small repositories such as that? I’d say you’re comfortably well within “startup time matters” size.
Of course, that isn’t to say it isn’t important. There’s no technical reason why we shouldn’t be just as fast as ag in this case. I created an issue for it. Thanks!
I just ran them against the dragonflybsd codebase (25k files, 8 million lines), and rg won.
time rg "fn .*(.*)":
0.79s user 0.33s system 203% cpu 0.548 total
time ag "fn .*(.*)":
1.79s user 0.57s system 140% cpu 1.676 total
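For a rough sense of scale, the ratios implied by those two `time` lines can be worked out directly. This is a small sketch using only the numbers quoted above; the variable names are illustrative.

```python
# Figures copied from the two `time` outputs above (seconds).
rg_user, rg_sys, rg_total = 0.79, 0.33, 0.548
ag_user, ag_sys, ag_total = 1.79, 0.57, 1.676

# Wall-clock speedup of rg over ag on this one run.
wall_speedup = ag_total / rg_total

# CPU time is user + system, summed across all threads.
rg_cpu = rg_user + rg_sys
ag_cpu = ag_user + ag_sys
cpu_ratio = ag_cpu / rg_cpu

print(f"wall-clock: rg is about {wall_speedup:.2f}x faster")  # about 3.06x
print(f"CPU time:   ag used about {cpu_ratio:.2f}x as much")  # about 2.11x
```

This is a single run on a single machine, so the ratios are indicative rather than a benchmark result.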