It still sounds like “gear hashing” is just a re-branding of buzhash, except that in buzhash, you also pass in min/max block sizes, to trim the long and short tail from the block size distribution. What am I missing?
I was going to say the same thing. The article even links to a Wikipedia page mentioning Buzhash and block size tricks like you mention date as far back as the beginnings of CDC in Manber93/Spam Sum/LBFS. I do completely agree that Buzhash is simpler & probably usually faster and is overlooked in this space especially in teaching contexts. That said, Bob Uzgalis (“Buz”) actually died over 10 years ago now - and I’m not sure he did much more than popularize an S-box for this kind of hashing - DES in 1977 had S-boxes… Anyway, for the curious, there is a Nim impl of some of these ideas over at https://github.com/c-blake/ndup
I wouldn’t say a rebrand, but certainly closely related and quite similar to Gear Hashing. Buzhash is a rolling hash based on Cyclic Polynomials. Though both Gear and Buzhash use a table of byte-value to random integer, incurring an array lookup, one key difference is that Buzhash uses barrel-shifts and XORs while Gear uses regular shifts and ANDs.