1. 9
  1.  

  2. 3

    Here’s the companion talk by Pat Morin: https://www.youtube.com/watch?v=-1qRcDnbihA

    The Eytzinger layout and search has got to be the coolest data structure and most elegant algorithm I’ve learned in a long time. Plus, such a great “by-the-way” dive into all the essential elements of writing high performance code, and how mechanical sympathy trumps complexity analysis, in a single case study that has application to so many computer science problems.

    1. 2

      Laying out trees within a static array is a useful trick in general. Pity that the only widely used data structure that does this is the binary heap.

      1. 2

        I’ve experimented a little with the Eytzinger layout … problem is, the data I’m searching isn’t integers, it’s strings, so each array item is really a pointer to a string. I’m sure the string comparisons take so long they outweigh the differences in layout discussed here.

        1. 1

          Could you inline most strings that fit in e.g. 63 bits assuming you’re searching things like words in a dictionary? That way you can use the high bit to indicate a pointer jump to strings that don’t fit?

          This would be the hash table equivalent of open-addressing with a little bit of chaining for the bigger stuff.

          1. 2

            I might try something like that. But it turns out the biggest performance constraint of all in my code is the space the table takes up, because it lives in a page in a b-tree, and the more keys I can fit on a page, the less disk I/O there is.

            I keep trying to make my code fast, but when I look at profiles, they’re all dominated by pread and pwrite. So I really shouldn’t be reading papers like this one!