@nwf pointed this out to me this morning. It made me happy for two reasons:
The authors say that this is early work, and I’d love to see it continued. I’d like to see a bit more evidence that the tag bit helps. If you simply treat any 64-bit value as a pointer when it refers to a valid entry in the VM map, do you get the same performance? If you look at adjacent values reachable from the capability that caused the fault, do you get better results? If you put the prefetched pages into a separate LRU queue or similar, what happens? If you track, per page, whether each prefetcher actually helps and switch between them accordingly, what happens?
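To make the first of those questions concrete, here is a rough sketch of the tag-free baseline I have in mind: scan a page as aligned 64-bit words and treat any word that falls inside a mapped range as a candidate pointer. This is not the paper's method. The names `candidate_pointers` and `pages_to_prefetch`, the `vm_map` representation (a sorted list of non-overlapping `(start, end)` ranges, end exclusive), little-endian words, and the 4 KiB page size are all assumptions made for illustration.

```python
import struct
from bisect import bisect_right


def candidate_pointers(page_bytes, vm_map):
    """Return the aligned 64-bit words in page_bytes that fall inside vm_map.

    vm_map is a hypothetical stand-in for the kernel's VM map: a sorted list
    of non-overlapping (start, end) address ranges, with end exclusive.
    """
    starts = [start for start, _ in vm_map]
    hits = []
    # Only 8-byte-aligned words are considered, as a conservative scan would.
    for offset in range(0, len(page_bytes) - 7, 8):
        (word,) = struct.unpack_from("<Q", page_bytes, offset)
        # Find the last range whose start is <= word, then check its end.
        i = bisect_right(starts, word) - 1
        if i >= 0 and word < vm_map[i][1]:
            hits.append(word)
    return hits


def pages_to_prefetch(page_bytes, vm_map, page_size=4096):
    """Return the distinct page base addresses that the candidates point into."""
    return sorted({p - p % page_size for p in candidate_pointers(page_bytes, vm_map)})
```

The obvious weakness is false positives: an integer that happens to look like a mapped address is treated as a pointer. That is exactly the difference the tag bit should remove, so comparing the two would show how much the tag is really buying.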
There is lots of interesting work to be done, and I’m looking forward to the follow-on papers.