1. 5

  2. 5

    A long time ago I wrote an implementation of a token-bucket-ish rate limiter with a completely different architecture that I’ve never seen anywhere else. The gist:

    1. Convert the allowable request rate to a request-interval (e.g. 100 req/min gives 600 ms/req).
    2. Multiply the bucket size (initial tokens) by the request-interval to get a burst-interval (e.g. 5 request bucket gives 3000 ms burst-interval).
    3. Store a next-allowed-request-timestamp per user, which always ratchets forward. When we receive a request from a user, if the next-allowed-request-timestamp exists and is less than burst-interval in the past, or it’s in the future, then simply increment it by request-interval. Otherwise, jump it forward to now - burst-interval + request-interval (burst-1 request times “in the past”). Since in the latter case, the previous value doesn’t matter, we can expire items from this table once they get old enough, e.g. in redis by doing a PEXPIREAT of (the new value of) next-allowed-request-timestamp + burst-interval.
    4. When a request comes in and the next-allowed-request-timestamp (prior to updating) is in the future, then block or throttle the request.

    This can be done in a distributed fashion by having servers stream enough access-log data to a central server to allow it to update the next-allowed-request-timestamps, and, whenever a next-available-request-timestamp is incremented to a future value, sending a broadcast message to all servers indicating who to throttle, and until when. The servers can then maintain their own local copy of the table that only contains information about who’s currently getting throttled, not the entire dataset. There are a few nice advantages of this:

    1. Contacting the central server isn’t in the request loop, so it’s a “zero-latency” approach.
    2. If the central server goes down, block messages just stop coming in, so it’s a “fail-permissive” approach.
    3. If there’s excessive latency in the system, rate-limiting might become less effective (rate-limit messages received too late to be useful) but it’s a fairly graceful degradation.

    I think this makes a nice middle-ground between the fully-distributed “as long as we get it kind of right” limiter discussed here and a fully-synchronized limiter.

    I have an implementation of it in node.js that I’m not entirely proud of (also it was written back in node 0.6 or 0.8 days so it probably doesn’t even run anymore) but I think that the idea is pretty much sound and I’ve been looking for a chance to resurrect it.

    1. 3

      Nice. Just to cross-reference, the first part sounds an awful lot like the Generic cell rate algorithm approach if I’ve read it correctly. Wikipedia even describes it in terms of a leaky bucket, which just goes to show how these things end up being more of a family of related methods, than individual discrete approaches.

      1. 1

        You’re right, that does seem like pretty much the same thing, once you get past the differences in terminology. Thanks, that’s a useful thing to know :)

        1. 0

          I think this is really just the standard way of (efficiently) implementing a token bucket. The linked article used a lot of obscure terminology to describe a very simple idea. I found the claim that “The leaky bucket is normally implemented using a background process that simulates a leak” to be ludicrous. Anyone who would actually do that is incompetent.

      2. 1

        This is actually a pretty standard approach that IIRC is described in the Redis docs. When I wrote a “sliding window” rate limiter for a take-home interview project (I wouldn’t use this design myself), I addressed the “trailing bucket” problem he described by weighting the count of the trailing bucket by the elapsed fraction of the current subinterval. E.g., take the sliding window to be 60m and the subinterval resolution to be 1m. Then if the current time was 12:00:20, I would count only 2/3 of the hits in the 11:00 subinterval (so logically from 11:00:20 - 11:01:00). My interviewers complained that this logic was incomprehensible and I didn’t get the job :) (I don’t think I’ve done a much better job of explaining it here, unfortunately.)

        1. 1

          It’s unfortunate that Redis API couldn’t support the first simple algorithm. I think it’d be better to just use Lua in Redis, or even use something else than Redis. Any programming language can easily hold an in-memory hash map and update it atomically.