Always nice to see my posts here :)
SIEVE is cool, and so simple it’s surprising that it doesn’t seem to have been described before Zhang et al’s paper. I’m not sure whether the SIEVE-k variant is practically useful, but it does show that it’s a nice composable component (and could be composed in more elaborate ways, such as in the same authors’ S3-FIFO).
The SIEVE paper says in passing that 2Q-SIEVE (2Q with each LRU replaced by SIEVE) outperforms 2Q. Wouldn’t 2Q-SIEVE also be more scan-resistant than SIEVE, and maybe have better blended performance than SIEVE-2?
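For concreteness, here's a minimal Python sketch of SIEVE as described above: a FIFO queue, one visited bit per object, and a hand that sweeps from tail to head. It's a readability sketch, not the paper's reference implementation; a real one would use a doubly linked list and a hash map so hits and evictions are O(1), whereas the `list.index`/`insert` calls here are O(n).

```python
class SieveCache:
    """Minimal SIEVE sketch: FIFO order + visited bits + a moving hand.

    Retained objects stay in place (unlike FIFO-Reinsertion, which moves
    them to the head), so a cache hit only flips a bit.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.order = []     # index 0 = head (newest), last index = tail (oldest)
        self.visited = {}   # key -> bool
        self.store = {}     # key -> value
        self.hand = None    # key the hand currently points at (None = start at tail)

    def get(self, key):
        if key in self.store:
            self.visited[key] = True   # a hit just sets the visited bit
            return self.store[key]
        return None

    def put(self, key, value):
        if key in self.store:
            self.store[key] = value
            self.visited[key] = True
            return
        if len(self.order) >= self.capacity:
            self._evict()
        self.order.insert(0, key)      # new objects enter at the head
        self.store[key] = value
        self.visited[key] = False

    def _evict(self):
        # Sweep from the hand (or the tail) toward the head: spare visited
        # objects in place, clearing their bits; evict the first unvisited one.
        i = self.order.index(self.hand) if self.hand in self.store \
            else len(self.order) - 1
        while self.visited[self.order[i]]:
            self.visited[self.order[i]] = False
            i -= 1
            if i < 0:                  # swept past the head: wrap to the tail
                i = len(self.order) - 1
        victim = self.order.pop(i)
        del self.store[victim], self.visited[victim]
        # The hand parks on the next object toward the head.
        self.hand = self.order[i - 1] if self.order and i > 0 else None
```

Note that a retained object is never moved: surviving a sweep costs only a cleared bit, which is what makes the algorithm so simple.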
If I am not mistaken, the eviction algorithm we use in TigerBeetle is more-or-less SIEVE-k:
https://github.com/tigerbeetle/tigerbeetle/blob/25425afcdd6a9b6405d839b4775e9e256aae5939/src/lsm/set_associative_cache.zig#L310-L338
TigerBeetle uses CLOCK-2, see https://x.com/jorandirkgreef/status/1672151424833142785?s=20
Ah, thanks, I see where I was confused now! That other SIEVE article (https://cachemon.github.io/SIEVE-website/blog/2023/12/17/sieve-is-simpler-than-lru/#meet-sieve) says:

“The key difference is where a retained object is kept. SIEVE keeps it in the old position, while FIFO-Reinsertion inserts it at the head, together with newly inserted objects.”

But TigerBeetle’s implementation doesn’t shift anything: it’s just a circular buffer, the elements stay put, and only the clock hand moves. There’s no contradiction here, though: logically TB does reinsertion, it’s just that physically nothing needs to be moved for that. That is, linked list vs ring buffer is the salient difference here, not just an implementation detail. With a linked list, it’s true that SIEVE doesn’t need to move elements, while CLOCK has to. With a ring buffer, CLOCK doesn’t move anything, and SIEVE isn’t really possible, as, unlike CLOCK, it wants to insert and remove elements at different positions in the queue!
Yup, FIFO-Reinsertion and CLOCK are different implementations of the same algorithm: FIFO-Reinsertion uses a queue, while CLOCK uses a ring buffer. It is not trivial to implement SIEVE on a ring buffer due to the need to copy elements. I discussed this with Joran, and he decided to stick with CLOCK-2. But for the set-associative cache in TB, hardware cache eviction designs, e.g. RRIP, might be a good option.
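To illustrate the ring-buffer side of that comparison, here's a minimal Python sketch of plain CLOCK on a fixed array of slots. Entries never move; only the hand advances, which is exactly why reinsertion is free here. (CLOCK-2, as discussed above for TigerBeetle, generalizes the single reference bit to a small counter; this sketch keeps just one bit.)

```python
class ClockCache:
    """CLOCK on a fixed ring buffer: slots never move, only the hand does.

    A hit sets a slot's reference bit; the sweeping hand clears bits and
    reclaims the first slot whose bit is already clear.
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity   # each slot holds (key, value) or None
        self.ref = [False] * capacity    # one reference bit per slot
        self.index = {}                  # key -> slot number
        self.hand = 0

    def get(self, key):
        i = self.index.get(key)
        if i is None:
            return None
        self.ref[i] = True               # hit: set the reference bit in place
        return self.slots[i][1]

    def put(self, key, value):
        i = self.index.get(key)
        if i is not None:                # update in place, no movement
            self.slots[i] = (key, value)
            self.ref[i] = True
            return
        # Sweep: referenced slots get a second chance (bit cleared, hand
        # advances); the first unreferenced slot is reclaimed for the new entry.
        while self.slots[self.hand] is not None and self.ref[self.hand]:
            self.ref[self.hand] = False
            self.hand = (self.hand + 1) % len(self.slots)
        if self.slots[self.hand] is not None:
            del self.index[self.slots[self.hand][0]]
        self.slots[self.hand] = (key, value)
        self.ref[self.hand] = False
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % len(self.slots)
```

Contrast with SIEVE: here insertion happens wherever the hand reclaimed a slot, so insert and evict always touch the same position, which is what a ring buffer can express and SIEVE's head-insertion cannot.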
Wow, SIEVE is delightfully elegant. I’m always happy to be reminded that there are still small algorithms like this waiting to be invented/discovered.
I’ve worked on cache code that would have benefitted from it. I share the author’s surprise that apparently nobody had thought of it until now – I’ll admit that as I read the SIEVE paper, I couldn’t help feeling a little of, “Damn, why didn’t I think of this?”
Can recommend reading the Postgres buffer manager; it was a big inspiration for the in-house page cache at Neo4j, before Chris Vest chucked that and wrote the crazily optimised “Muninn” page cache that underpins Neo4j now.
Bufmgr: https://github.com/postgres/postgres/blob/master/src/backend/storage/buffer/bufmgr.c
Muninn: https://github.com/neo4j/neo4j/blob/5.13/community/io/src/main/java/org/neo4j/io/pagecache/impl/muninn/MuninnPageCache.java#L73
Timely! I just spent several days implementing ARC from the pseudocode in the paper, only to find that it doesn’t help my use case (a B-tree persistent store, sort of like LMDB). I’m not sure if I did something wrong or am not testing the right workloads…
SIEVE is much, much simpler, so now that I’ve got a pluggable CachePolicy interface I’ll try implementing it too and see how it works.
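For a pluggable setup like the one described, the interface might look something like the sketch below. The commenter's actual CachePolicy isn't shown, so these method names are guesses, not the real API; the point is that SIEVE slots in cleanly because a hit only needs to set a bit.

```python
from collections import deque
from typing import Hashable, Protocol


class CachePolicy(Protocol):
    """Hypothetical shape of a pluggable eviction-policy interface."""

    def on_insert(self, key: Hashable) -> None: ...
    def on_hit(self, key: Hashable) -> None: ...
    def choose_victim(self) -> Hashable: ...


class FifoPolicy:
    """Trivial FIFO policy satisfying the interface, useful as a baseline."""

    def __init__(self) -> None:
        self.queue: deque = deque()

    def on_insert(self, key: Hashable) -> None:
        self.queue.append(key)

    def on_hit(self, key: Hashable) -> None:
        pass  # FIFO ignores hits; a SIEVE policy would set a visited bit here

    def choose_victim(self) -> Hashable:
        return self.queue.popleft()  # oldest entry is evicted first
```

With the eviction decision isolated behind hooks like these, ARC, SIEVE, and FIFO variants can be benchmarked against the same store without touching the B-tree code.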
I highly recommend reading the Caffeine docs to get an idea of the state of the art in caching.
https://github.com/ben-manes/caffeine/wiki/Design
Some benchmarking of popular caching policies including ARC:
https://github.com/ben-manes/caffeine/wiki/Efficiency
Have you tried SIEVE? There are more evaluations at https://observablehq.com/@1a1a11a/sieve-miss-ratio-plots