1. 39

  2. 5

    Sometimes a millisecond IS fast, but most engineers' conception of what you can accomplish in 1ms is wildly pessimistic. Well-tuned databases backed by B+ trees on PCIe flash drives should have no trouble serving small point reads over the network in a DC at <1ms@90th, round-trip. You should be able to scatter-gather across several back-end services 3 network hops apart in one building, each performing minimal computation and not blocking on IO, in <1ms@50th.

    The longer your work takes, the more total work queues up in many systems, and that gets expensive very quickly. So keep your timeouts tight on systems that see real traffic; otherwise the number of outstanding requests simply explodes, work that would otherwise have succeeded gets bogged down by the overhead of the in-flight slow responses, and nobody is happy.
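    The queueing point can be made concrete with Little's law (a back-of-the-envelope sketch; the numbers and function name here are illustrative, not from the comment):

    ```python
    # Little's law: mean in-flight requests = arrival rate x mean latency.
    def in_flight(requests_per_sec, mean_latency_sec):
        return requests_per_sec * mean_latency_sec

    # At 10k req/s, 1 ms responses keep ~10 requests outstanding...
    print(in_flight(10_000, 0.001))  # 10.0
    # ...but let latency creep toward 1 s and you are holding ~10,000.
    print(in_flight(10_000, 1.0))    # 10000.0
    ```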

    A nice book is “Algorithms for Memory Hierarchies” which is all about interacting with a system in a way that is sympathetic to the size/performance trade-offs present in our hardware today.

    1. 2

      > I'm not going to explain what this code does

      Can someone explain what this code does?

      My understanding is that it's computing 1/(sum over i from 0 to arr.size-1 of 2^-arr[i]). For example, given the array [1,2,3], it returns 1/(2^-1 + 2^-2 + 2^-3).
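      That reading can be checked in a few lines of Python (a sketch of the expression as interpreted above, not the article's actual code):

      ```python
      def reciprocal_power_sum(arr):
          # 1 / (2^-arr[0] + 2^-arr[1] + ...), per the interpretation above.
          return 1.0 / sum(2.0 ** -x for x in arr)

      print(reciprocal_power_sum([1, 2, 3]))  # 1 / (0.5 + 0.25 + 0.125)
      ```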

      1. 2

        I agree with your analysis as far as it goes. The author said this is part of an implementation of HyperLogLog (wikipedia), which I am only vaguely familiar with, but I suspect reading the papers on it would clear up why this is useful. Overall, the algorithm does a constant-space estimate of the number of distinct values it's given, relying on the (hashed) values being uniformly distributed, and I guess this must be the incremental step of it.

        1. 1

          Per the wikipedia page, 2^n is the estimate of the count of values landing in a particular bucket if n is the max number of leading zeroes seen there, and the harmonic mean (which is where the 2^-n terms come from) is the way to average the buckets.
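          For reference, the raw estimator from the wikipedia page looks roughly like this (a hedged sketch; the function name is made up, and alpha is the standard bias-correction constant, strictly valid for m >= 128):

          ```python
          def raw_hll_estimate(buckets):
              # buckets[j] holds the max leading-zero count observed in bucket j.
              m = len(buckets)
              alpha = 0.7213 / (1 + 1.079 / m)  # bias correction for large m
              # m^2 / sum(2^-M[j]) is m times the harmonic mean of the
              # per-bucket estimates 2^M[j].
              return alpha * m * m / sum(2.0 ** -mj for mj in buckets)
          ```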