If implementing this hash table, I recommend picking up my refinement to it.
> Using this modulus method means we’re limited by the range of uint32
Every mainstream 64-bit architecture features a widening or ‘high’ 64-bit multiply, which can be used to implement this with a full 64-bit range.
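A minimal sketch of that in Go, assuming a 64-bit hash x and a table of n slots (the name reduce is mine, not from the article):

```go
import "math/bits"

// reduce maps a 64-bit hash x into [0, n) by taking the high 64 bits of
// the 128-bit product x*n: a widening-multiply replacement for x % n
// that is not limited to the uint32 range.
func reduce(x, n uint64) uint64 {
	hi, _ := bits.Mul64(x, n)
	return hi
}
```

bits.Mul64 is intrinsified by the Go compiler on the mainstream 64-bit targets, so this lowers to the hardware widening/‘high’ multiply rather than a library call.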
> On supported platforms, the runtime hasher will use AES instructions to efficiently generate strong hashes
AES-based hashing appears to be quite fraught and to give poor-quality results in practice. AES as an encryption algorithm is secure, of course, but making AES secure requires running it for an appreciable number of rounds, at which point it becomes uninteresting as a fast hash function. Multiplication-based hashes seem to give better results in practice (both arithmetic and carry-less multiply).
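To make “multiplication-based” concrete, the core mixing step in the wyhash/mum family is just a widening multiply with the halves folded together; this is a sketch of that primitive, not the Go runtime’s hasher or anything from the article:

```go
import "math/bits"

// mix is the multiply-and-fold primitive used by wyhash-style hashes:
// multiply two 64-bit words into a 128-bit product, then XOR the halves.
func mix(a, b uint64) uint64 {
	hi, lo := bits.Mul64(a, b)
	return hi ^ lo
}
```

Carry-less-multiply hashes do the analogous folding over a CLMUL product instead of an arithmetic one.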
Do you have a link to your implementation?
No. It is quite specialised, not yet done, and optimised for performance rather than readability (unlike the prose description); as such, it is unlikely to be of much use to most people.
I bet people would like to use an even faster implementation :)
There’s an absence of numbers in this. The only graph is of memory consumption. My hunch is that you’d be hard pressed to observe a difference except for very large hashtables on a hot path.
Edit: looks like I missed one table of microbenchmarks.
There are benchmarks at the end of https://www.dolthub.com/blog/2023-03-28-swiss-map/#porting-swisstable-to-golang.
Specifically, in the published benchmarks, the builtin `map` is faster up to and including 1024 elements, and `swiss.Map` is faster from 8192 elements.

Given the central importance of the “Prolly Tree” (something I was unfamiliar with before this article), I was surprised that they basically never link directly to the really excellent introduction to the structure in the Noms documentation.
For those interested in the underlying structure without wading through the product blog, it’s here: https://github.com/attic-labs/noms/blob/master/doc/intro.md#prolly-trees-probabilistic-b-trees
I’m very thankful to the authors for exposing this. It is really cool.