The engineering that went into SwissTable (and F14) is impressive, but it’s a very complex design and implementation, and it seems to depend crucially on SIMD instructions for performance, which is not always acceptable (e.g. in the kernel). For a much simpler approach to open-addressed hashing that supports similar load factors, see Bidirectional Linear Probing: https://github.com/senderista/hashtable-benchmarks. (Note that I have not attempted to benchmark any of the implementations in this repo against any variant of the SwissTable algorithm.)
SwissTable is surprisingly simple in its design. The implementation is more complex than one might want because of the need to support C++11. If one could use C++14/20 the implementation would be much simpler (there is a lot of machinery to elide copies when one uses insert/emplace/operator).
Also SwissTable does not require SIMD. It works better with SIMD (16 comparisons per load) but it can work with scalars just as well (8 comparisons per load). In fact the ARM implementation uses scalars because there is no single cycle equivalent of pmovmskb for ARM.
Thanks for the explanation. I really should benchmark my BLP implementation against SwissTable for integer sets/maps (but that would require translating to C++, so not sure when it’ll happen).
Thanks, I wasn’t aware of that algorithm. The original paper confuses me, though: it seems to require a hash function that produces hashes with the same ordering as keys, so h(x) < h(y) iff x < y. That isn’t true for any real hash function (on strings) I know of. The hash functions in the repo all operate on ints, so maybe they can meet that criterion?
That’s right, the original paper assumes “ordered hash tables”, but that doesn’t imply that the hashed keys must have the same order as the original keys. You just need to store the hash codes explicitly along with the keys (unless you’re willing to pay the cost of constantly hashing keys). Robin Hood hash tables already do this (to determine distance from the original bucket), and BLP does the same thing. The reason I only used integer sets in the benchmarks (integer->integer maps would work as well) is that it’s something of a “sweet spot” for this algorithm: you can “hash” the keys with an integer permutation (rather than a one-way hash function), and only store the permuted form, reconstructing the original keys on demand by applying the inverse. The result is integer sets with constant-time lookup, in essentially the same space as a sorted array (but of course they can’t be iterated in the original order, which is the essential tradeoff).
Note that you can achieve constant-time lookup for compressed integer sets by e.g. using bucketed Elias-Fano coding over an integer permutation (with say cache-line-size buckets) in the static context, or Cleary hash tables in the dynamic context (which is essentially Elias-Fano coding combined with BLP).
Got it. Keeping the table ordered by hash code is an attractive property for some persistent hash table algorithms I’m looking at, which split pages by hash range.
I wonder if this paper on external (i.e. disk-based) minimal perfect hashing might be relevant?
They order entries by hash value as well. If you’re willing to “throw away the keys” (which is a precondition for minimal perfect hashing to be useful), then this approach needs only ~2.5 bits/key.
“The unique feature of ECT [entropy-coded tries] with respect to sort is that it does not use the original key as the sort key. Instead, it hashes each key and uses this hashed result as the sort key.”
Note that the use of hash codes as keys in a trie is by no means new: the Hash Array Mapped Trie (HAMT) is used as a standard map data structure in Clojure, Scala, and Haskell.
Also, it took me a while to figure out that Robin Hood hashing (at least in one version) actually orders its entries by hash code as well. Then I learned that Paul Khuong had figured out the same thing (https://pvk.ca/Blog/more_numerical_experiments_in_hashing.html). I haven’t seen this observation anywhere else though.
So both Robin Hood (at least the version that enforces the ordering invariant) and BLP can both be considered variations on “ordered hash tables”. BLP has a bit more flexibility though (it allows a cluster to “surround” its “target bucket” rather than forcing it to follow the “target bucket”), which is reflected in its better performance and space utilization.