1. 18
  1.  

  2. 19

    There was a period of about a decade when tries were one of my low-level data structure secret weapons. Constant lookup time for a given key length with none of the rebucketing issues of a hashtable, and for some reason few people seemed to have ever learned about them so they had the added bonus of being a cool new thing I could show people.

    But then CPU caches got faster and faster, and low-level data structure performance started to be dominated by cache locality. “Constant number of instructions” and “constant execution time” became somewhat decoupled.

    The last time I reached for tries was when I was on the infrastructure team at a large web company circa 2008. We were using memcached and I’d looked at profiling data and discovered that it was spending a nontrivial amount of time computing key hashes.

    Aha, I thought, I know how to change this so no hashing is required, and it’ll be super quick to prototype! So I spent a couple hours replacing the hashtables with tries. The profiler confirmed that my implementation needed significantly fewer instructions to look up and insert values.

    But it was 4-5x slower in wall time because it was jumping from cache miss to cache miss walking the trie instead of doing a hash computation completely in the L1 cache and only hitting main memory a few times to do the hashtable access. After that, I never touched tries again.

    1. 3

      There was a period of about a decade when tries were one of my low-level data structure secret weapons.

      Exactly a decade ago, I was assigned with the task of optimising a web crawler cluster. A python dictionary holding necessary state was the bottleneck, using up all the machine memory. I found a python PATRICIA trie implementation that implemented the dict interface. It was a simple matter of adding an import and calling the trie constructor instead of dict(). Memory usage was reduced by a 50 fold with that simple changes. He freed up three quarters of the machines that very same day.

      1. 1

        For what it’s worth, tries are still great for compression. For example, Pyroscope uses them to great effect

      2. 5

        Nerd snipped myself into writing a Go version:

        type Trie struct {
        	children map[byte]*Trie
        }
        
        func NewTrie() *Trie {
        	return &Trie{make(map[byte]*Trie)}
        }
        
        func (t *Trie) Insert(word string) {
        	for _, c := range []byte(word) {
        		if t.children[c] == nil {
        			t.children[c] = NewTrie()
        		}
        		t = t.children[c]
        	}
        	t.children['\x00'] = nil
        }
        
        func (t *Trie) wordsWith(prefix string, yield func(string)) {
        	for c, t := range t.children {
        		if c == '\x00' {
        			yield(prefix)
        		} else {
        			t.wordsWith(prefix+string([]byte{c}), yield)
        		}
        	}
        }
        
        func (t *Trie) Autocomplete(prefix string) []string {
        	for _, c := range []byte(prefix) {
        		t = t.children[c]
        		if t == nil {
        			return nil
        		}
        	}
        	var words []string
        	t.wordsWith(prefix, func(word string) {
        		words = append(words, word)
        	})
        
        	return words
        }
        
        1. 2

          Thoughts:

          1. You could make this Unicode aware by changing byte to rune.
          2. You could keep it as a byte map but implement that as an array of 256 pointers to Tries. 2KB per node seems like a lot of overhead, but maybe it’s worth it?
          1. 3

            A pretty common technique I have used is to have a node for every half byte and use slices, so at most 16 pointers per node. Then, you can use an integer to represent which slots are used (slot 0 used => bit 0 set, etc), and use bits.OnesCount16 to find out which of the branches have been populated. It’s essentially the technique used in ideal hash trees by Bagwell, just half the size, as it’s a little neater to work with for bytes.

            1. 1

              If you insist that words are just 26 English alphabetical letters, then you could use an array of 27 pointers, which is probably pretty cheap.

              1. 1

                You could make this Unicode aware by changing byte to rune.

                i think typically folks build byte trie over utf8 encoding, to keep number of children small. That’s how Unicode automata generally tend to work

              2. 2

                t.children['\x00'] = nil that’s a funny hack :-) Why not just use a boolean?

                1. 1

                  I started with a boolean but then decided that a null character would be invalid anyway, so might as well use it.

              3. 3

                P.S. Please don’t talk to me about types and dataclasses, I will ridicule you to no end :-)

                My man!

                1. 3

                  P.S. Please don’t talk to me about types and dataclasses, I will ridicule you to no end :-)

                  What does this mean, @isagalaev?

                  1. 3

                    I’m a little tired of the condescending attitude expressed online here and there about type-checking being an obviously superior way of writing code. Especially in toy exercises aimed at readability. I’m not taking seriously anyone who would claim that slapping field(default_factory=dict) on the children field here constitutes any improvement.

                    1. 1

                      Ah that’s what you meant! I actually read the opposite into it: something in the vein of « don’t start testing me on data structures and typed languages as my knowledge is superior ». I was really confused about it, because it contrasted with using unannotated python, and also with the general tone of the article as a whole.

                      My 2 cents: it’s worth rephrasing as readers might leave with a wrong impression.

                      Thanks for the article, I found it very interesting!

                  2. 2
                        def autocomplete(self, prefix):
                            for char in prefix:
                                self = self.children.get(char)
                                if not self:
                                    return []
                            return list(self.words_with(prefix))
                    

                    ? One line shorter, one indent less, we are being specific which expression can KeyError, and we still do only a single dict lookup

                    1. 2

                      All true!

                      This bit of descending down the tree used to be in a separate method, and I didn’t like checking for None there and here at the call site, so changed it to catching KeyError. And it remained this way after that method got inlined.

                    2. 1

                      For a Ruby example of a Trie, Hanami’s router engine uses one.

                      1. 1

                        There are bunch of trie routers for Go too. Here’s one in the Chi router package: https://github.com/go-chi/chi/blob/0316d5a1df8598eceb137f5f77945be56810b564/tree.go#L74