1. 34
  1. 7

    Could you please add a warning about spoilers for AoC?

    1. 8

      There’s a neat trick to solve Advent of Code, day 6 in a single pass.

      If you read that and you still think he’s not going to talk about the solution, I don’t know what you need…

    2. 4

      A key piece of information that this description misses is that the problem relies on the alphabet only having the 26 letters. Any more than 32 and your overflow a u32.

      1. 3

        I benchmarked this in Go, and found the XOR+popcount approach to be slower than just using an array of 256 bools. See here: https://www.reddit.com/r/golang/comments/zh00t4/faster_go_code_by_being_mindful_of_memory_advent/izlwep1/

        Code if you want to bench it yourself:

        package main_test
        
        import (
        	"math/bits"
        	"testing"
        )
        
        var inputs = []struct {
        	data     []byte
        	hasDupes bool
        }{
        	{[]byte(""), false},
        	{[]byte("1"), false},
        	{[]byte("11"), true},
        	{[]byte("1234567890"), false},
        	{[]byte("12345678901"), true},
        	{[]byte("abcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"), false},
        	{[]byte("aabcdefghijklmnopqrstuvwxyz1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ"), true},
        }
        
        func hasDuplicatesMap(block []byte) bool {
        	seen := make(map[byte]bool)
        	for _, b := range block {
        		if seen[b] {
        			return true
        		}
        		seen[b] = true
        	}
        
        	return false
        }
        
        func hasDuplicatesSlice(block []byte) bool {
        	var seen [256]bool
        	for _, b := range block {
        		if seen[b] {
        			return true
        		}
        		seen[b] = true
        	}
        
        	return false
        }
        
        func hasDuplicatesBits(block []byte) bool {
        	bits := make([]byte, 32)
        
        	for _, b := range block {
        		i, mask := b/8, byte(1<<(b%8))
        		if bits[i]&mask != 0 {
        			return true
        		}
        		bits[i] |= mask
        	}
        
        	return false
        }
        
        func hasDuplicatesPopcount(block []byte) bool {
        	var bf [32]byte
        	for _, b := range block {
        		i, mask := b/8, byte(1<<(b%8))
        		bf[i] |= mask
        	}
        	uniques := 0
        	for _, b := range bf {
        		uniques += bits.OnesCount8(b)
        	}
        	return uniques < len(block)
        }
        
        
        func BenchmarkMap(b *testing.B) {
        	for i := 0; i < b.N; i++ {
        		input := inputs[i%len(inputs)]
        		if hasDuplicatesMap(input.data) != input.hasDupes {
        			b.Fatal(input)
        		}
        	}
        }
        
        // etc for other benchmarks
        
        1. 2

          I KNEW IT! I ran into XOR for some reason a few days prior to Day 6 and was like … there’s something here.

          I’m so happy someone far more wiser than I figured this out! Really bravo!

          This algorithm is just counting how many characters appear an odd number of times in a window of size N. That number will be N if-and-only-if they’re all unique!

          One thing that came to mind though when reading your post, wouldn’t two ‘strings’ that are anagrams of one another, have the same bitmask though?

          1. 2

            Yes, they would. But that’s not relevant to the AoC problem. The bitset (not bitmask) represents only the N-character window the problem asks about, not the entire string.

          2. 2

            I was afraid this was going to be swapping integers/pointers without a temporary, but no, it’s something else entirely.

            1. 1

              You can also use bit tricks to create a pseudo-set or, rather a pseudo bloom filter for letters. I use this for windowing algos, when it’s possible: https://gist.github.com/nomemory/40450c38c6fd1cbe1be2ba8afdb8efe9

              I love when people are (re) discovering bit tricks, a thing that was forgotten in the last decade or so.