This post is about load balancing and queuing theory.
The fascinating subtitle of the blog post: “Using less information to make better decisions.”
The summary: if you load-balance by picking two random candidate servers and sending your request to the less-loaded of the two, you do almost as well as if you sent each request to the least-loaded of all servers. But the two-choice policy is much more robust against outdated load information; with stale data, everybody picks the same "least-loaded" server, and the resulting thundering herd overwhelms it.
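A minimal sketch of the two policies, assuming a shared list of per-server outstanding-request counts (the function names are mine, not from the post):

```python
import random

# Sketch of the two routing policies; `loads` is a hypothetical list of
# per-server outstanding-request counts.
def pick_server_least(loads):
    """Route to the globally least-loaded server."""
    return min(range(len(loads)), key=loads.__getitem__)

def pick_server_p2c(loads):
    """Power of two choices: sample two distinct servers at random
    and route to the less-loaded of the pair."""
    a, b = random.sample(range(len(loads)), 2)
    return a if loads[a] <= loads[b] else b
```

With fresh load data the two behave similarly; the interesting difference only shows up once `loads` is stale.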
(The blog post is about Mitzenmacher, Richa, and Sitaraman, 2001, which turns out to be a 60-page survey of fifteen years of research rather than a report of a new result.)
This is a really interesting result, and I wonder if analogous results might hold in cases like these:
A thing a lot of these have in common is that the underlying population is variable and mostly poor quality (lots of really terrible open houses, lots of potential lovers who would be just barely acceptable, lots of always-lose bandit arms), so best-of-two won't perform well unless there's some kind of bias in the original random candidate selection toward better candidates. But if you have that kind of bias, with some arbitrary distribution over the candidates, then this best-of-N-random-choices thing is just a way of transforming one such arbitrary distribution into another one, no? Perhaps it's a particularly computationally parsimonious way of doing so (maybe your open-house candidates are first selected according to their price distribution, and only then do you go to the trouble of reading the listings), but it doesn't seem super clear.
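The transformation can be made precise: if candidates are drawn i.i.d. with CDF F and you keep the best (largest) of N, the result has CDF F(x)^N. A quick Monte Carlo check of this for Uniform(0,1), where F(x) = x:

```python
import random

# Monte Carlo check that best-of-N transforms the candidate CDF F into
# F(x)**N; for Uniform(0,1), F(x) = x, so P(max of N draws <= x) = x**N.
def best_of_n_cdf(x, n, trials=100_000, seed=0):
    rng = random.Random(seed)
    hits = sum(
        max(rng.random() for _ in range(n)) <= x
        for _ in range(trials)
    )
    return hits / trials
```

For example, best-of-2 should land at or below 0.5 about a quarter of the time, since 0.5² = 0.25.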
You may find cuckoo hashing interesting; it's similar to your first idea, but with the effect of increasing the practical load factor, which on a lot of real hardware is a pretty good performance improvement. Not coincidentally, Michael Mitzenmacher, as well as doing a stint as dean of the CS department, has been studying cuckoo hashing for a while:
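For concreteness, a toy cuckoo hash, illustrative only (real implementations use separate tables, rehash on failure, and stronger hash functions than these multiplicative stand-ins): each key has two candidate slots, and an insert may evict the current occupant and re-place it in its alternate slot.

```python
# Toy cuckoo hash storing a set of integer keys. Illustrative only.
class CuckooHash:
    def __init__(self, size=64):
        self.size = size
        self.slots = [None] * size

    def _positions(self, key):
        # Two cheap multiplicative hashes; stand-ins for real hash functions.
        h1 = (key * 0x9E3779B1) % self.size
        h2 = ((key ^ 0xDEADBEEF) * 0x85EBCA6B) % self.size
        return h1, h2

    def insert(self, key, max_kicks=32):
        h1, h2 = self._positions(key)
        if self.slots[h1] == key or self.slots[h2] == key:
            return True  # already present
        pos = h1
        for _ in range(max_kicks):
            if self.slots[pos] is None:
                self.slots[pos] = key
                return True
            # Evict the occupant and send it to its alternate slot.
            key, self.slots[pos] = self.slots[pos], key
            h1, h2 = self._positions(key)
            pos = h2 if pos == h1 else h1
        return False  # a real implementation would rehash or grow here

    def contains(self, key):
        h1, h2 = self._positions(key)
        return self.slots[h1] == key or self.slots[h2] == key
```

Because every key has exactly two possible homes, lookups touch at most two slots even at load factors where linear probing would be scanning long runs.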
But if you have that kind of bias, with some arbitrary distribution over the candidates, then this best-of-N-random-choices thing is just a way of transforming one such arbitrary distribution into another one, no?
The delay in the simulated model is the key to why this approach works so well for the load-balancing problem. Generally, in this problem, it's not prohibitively expensive to keep track of the state of all downstream services, but you don't want to do that synchronously. In control-theory terms, acting on stale data introduces a "phase shift" which turns the negative feedback that should tame the system into positive feedback that causes oscillation.
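A deliberately stylized simulation of that failure mode, assuming every request within one snapshot interval sees the same stale load figures (all names and parameters are mine):

```python
import random

# Stylized model: within each "snapshot interval" every client routes
# using the same stale load snapshot; requests drain between intervals.
def simulate(policy, n_servers=10, rounds=20, arrivals=100, seed=1):
    rng = random.Random(seed)
    worst = 0
    for _ in range(rounds):
        snapshot = [0] * n_servers   # the stale view everyone shares
        loads = [0] * n_servers      # the true, evolving loads
        for _ in range(arrivals):
            loads[policy(snapshot, rng)] += 1
        worst = max(worst, max(loads))
    return worst

def least_loaded(snapshot, rng):
    return min(range(len(snapshot)), key=snapshot.__getitem__)

def two_choices(snapshot, rng):
    a, b = rng.sample(range(len(snapshot)), 2)
    return a if snapshot[a] <= snapshot[b] else b
```

Here least-loaded routing piles every arrival onto whichever server the stale snapshot said was emptiest, while the two-choice policy's sampling spreads the herd out even though its load comparison is equally stale.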
In other cases, as described by Mitzenmacher et al., it is parsimony of information or computation that drives this choice of algorithm.
Thank you very much! I hadn’t thought of it in terms of an op-amp oscillating due to phase shift, but you’re clearly right.
I guess the thing I’m wondering about is how to make this kind of approach work (either for computational parsimony or to damp oscillations) in an environment where most uniformly-random choices are going to be unacceptably bad. You could bias your two random choices toward likely-good choices, or maybe you could just make more than two uniformly random choices, say, ⌈Φ/P(a uniformly random choice is acceptable)⌉ choices, or something.
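That second idea might look like this sketch, with the acceptance probability supplied as a prior estimate rather than measured, and the constant `c` standing in for Φ (both names are mine):

```python
import math
import random

# Sketch of "make enough uniform choices": sample ceil(c / p) uniformly
# random candidates, where p is a prior estimate of the probability that
# a uniform draw is acceptable, then keep the best of the sample.
def best_of_enough(candidates, score, p_acceptable, c=2, rng=random):
    k = min(len(candidates), math.ceil(c / p_acceptable))
    return max(rng.sample(candidates, k), key=score)
```

The point of the ⌈c/p⌉ scaling is that the sample then contains about c acceptable candidates in expectation, recovering something like the two-choices setup over the acceptable subpopulation.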
Another perhaps too-obvious example: using median-of-N to select a partitioning element for Quicksort, particularly if you choose the N candidates at random.
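A sketch of that pivot rule, using random sampling so no fixed adversarial input can reliably force bad pivots:

```python
import random
import statistics

def pick_pivot(xs, n=3, rng=random):
    """Median of n randomly sampled elements as the Quicksort pivot."""
    return statistics.median_low(rng.sample(xs, min(n, len(xs))))

def quicksort(xs, n=3):
    """Quicksort with median-of-n random pivots (sketch; an in-place
    version would partition the array instead of building new lists)."""
    if len(xs) <= 1:
        return list(xs)
    p = pick_pivot(xs, n)
    return (quicksort([x for x in xs if x < p], n)
            + [x for x in xs if x == p]
            + quicksort([x for x in xs if x > p], n))
```

As with load balancing, a small random sample buys most of the benefit of inspecting everything: median-of-3 already makes the expected partition much more balanced than a single random pivot.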
We have empirically verified this result, and are seriously considering making a P2C (power of two choices) load balancer the default for Finagle.