1. 12

  2. 2

    Those are some good observations. Following them will help avoid some redesigns later on.

    I would also add that I prefer my caching to be ‘temporal-aware’, meaning that the underlying database/data model supports ‘as-of-time’ queries. The caching layer can then inspect the request and decide whether pre-cached data that was valid ‘as-of-time’ can be served back.

    To do that, we basically add to the cache key:

    • all the request parameters that affect the query result
    • the request time rounded down to a fixed interval (e.g. 30 minutes), so that a request sent at 14:45 can use the same data as requests at 14:30 and 14:59.

    This way we can decide, from a business perspective, how much ‘staleness’ is allowed in our API responses.
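As a rough illustration of the key described above, here is a minimal Python sketch. The names `time_bucket`, `cache_key`, and the `relevant` whitelist are hypothetical, and it assumes "rounded-off" means rounding down to the start of a 30-minute bucket, which is what the 14:30/14:45/14:59 example implies.

```python
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(minutes=30)  # assumed business-approved staleness window
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_bucket(as_of, bucket=BUCKET):
    """Round an aware datetime down to the start of its bucket."""
    whole_buckets = (as_of - EPOCH) // bucket  # timedelta // timedelta -> int
    return EPOCH + whole_buckets * bucket


def cache_key(route, params, relevant, as_of):
    """Build a key from the parameters that affect the query plus the time bucket."""
    kept = tuple(sorted((k, v) for k, v in params.items() if k in relevant))
    return (route, kept, time_bucket(as_of).isoformat())
```

With this shape, widening or narrowing `BUCKET` is the single knob that trades freshness for hit rate, and parameters outside `relevant` (tracing IDs and the like) never fragment the cache.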

    Unfortunately, figuring out which keys in the request affect the query, and what rounding to use for as-of-time, makes our caching strategy not very suitable for external cache tools…

    1. 2

      Unfortunately, figuring out which keys in the request affect the query, and what rounding to use for as-of-time, makes our caching strategy not very suitable for external cache tools…

      I’m sorry, I’m confused by this bit. Doesn’t the strategy you outlined here make caching your site with a reverse proxy really, really easy? If you do that, you can put long expiry times on everything, and a cache can easily discriminate on just the URL. (Did I misunderstand, and by “external cache tools” you don’t mean caching reverse proxies such as Varnish?)

      1. 2

        No problem.

        By external caching tools, yes, I meant something like Varnish. We ended up not going that route; perhaps we will re-examine it.

        The clients call the same URL, but with different query parameters. One particular parameter comes from parts of the JWT in the Authorization header.

        So the cache key in many, not all queries will contain the user ID. Those caches are not long-lived, though, because we do not expect user sessions to last more than, say, half an hour.

        Our current cache is in-process cache2k.

        1. 2

          So the cache key in many, not all queries will contain the user ID.

          Oof, that makes it hard to get a useful hit rate, yes.

          I know someone who once had a Varnish configuration which did roughly the following for incoming requests on some routes:

          • normalise the request a bit to improve the hit rate
          • make another HTTP request to a backend using curl (in C in the middle of the VCL!)
          • the route that the second request hit did nothing but check authn and return an authz result
          • the authz result got used as part of the cache key in vcl_hash(), but the cookies used for authn did not
          • the first request would then hit the backend or the cache as normal

          The extra HTTP request cost them some time (but not too much, because it only exercised a small fraction of the application server). In return they got a cache hit rate high enough to make it a net benefit to throughput. For example, if you had the “member” role in a particular subsection of the site, your requests into that section would be cached along with those of everybody else whose role there was also “member”.
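The flow above can be simulated in a few lines of Python to show why it raises the hit rate. This is only a sketch of the idea, not VCL and not the configuration described: `normalise`, `RoleKeyedCache`, `check_authz`, and `backend` are hypothetical stand-ins for the normalisation step, the `vcl_hash()` key, the second HTTP request, and the application server.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def normalise(url):
    """Sort query parameters and drop a trailing slash so equivalent URLs share a key."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query)))
    path = parts.path.rstrip("/") or "/"
    return path + ("?" + query if query else "")


class RoleKeyedCache:
    def __init__(self, check_authz, backend):
        self.check_authz = check_authz  # stands in for the cheap authn/authz route
        self.backend = backend          # stands in for the full application server
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url, cookies):
        # Cookies are used only to obtain the authz result; they are not in the key.
        role = self.check_authz(url, cookies)
        key = (normalise(url), role)
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = self.backend(key[0], role)
        return self.store[key]
```

Two different users whose authz result is the same role end up on the same key, so the second one is served from the cache even though their cookies differ.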

          1. 1

            Our mechanism (and our per-instance use of cache2k as an object cache) allows us to be a bit more nuanced about what it means to have a ‘cache hit’.

            Sure, we might not get a ‘response’ cache hit all the time, but a response is composed of, say, 10-15 separate entities, each of which might already have been cached (and some of which, due to updates, might already have been invalidated).

            This is the case with the user ID: it is possible that something like 80% of the objects needed to compose the response are already cached (many of the responses are basically ‘business model + something specific about the user’s usage of that business model’).

            That way, we might not hit the ‘response’ cache, but we will hit the object cache and create the response with a minimal hit to the database.
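To make the distinction concrete, here is a small Python sketch of composing a response from individually cached parts. It is not cache2k and not our actual code: `ObjectCache`, `compose_response`, and the `("usage", user_id)` key are hypothetical, and `load` stands in for a database query.

```python
class ObjectCache:
    """Caches individual entities and counts how often the database is touched."""

    def __init__(self, load):
        self.load = load      # stands in for a database lookup
        self.entries = {}
        self.db_loads = 0

    def get(self, key):
        if key not in self.entries:
            self.db_loads += 1
            self.entries[key] = self.load(key)
        return self.entries[key]

    def invalidate(self, key):
        # Called when an update makes a single cached entity stale.
        self.entries.pop(key, None)


def compose_response(cache, shared_keys, user_id):
    """Build a response from shared 'business model' parts plus one user-specific part."""
    body = {key: cache.get(key) for key in shared_keys}
    body["usage"] = cache.get(("usage", user_id))
    return body
```

A second user asking for the same shared parts causes only one database load (their own usage record), even though the response as a whole was never cached.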

            Caching the parts (and not just full responses) does cost us RAM, obviously, so we keep our caches ‘somewhat’ aware of the heap size allocated to our back-end Java processes.

            I think in general, Varnish would be able to handle response caches.

            But once you introduce caching of parts, and then awareness of temporal data and of the database queries that serve the request, a separate caching layer (with its own DSL) becomes a challenge, and the benefits it brings might not outweigh that cost.

            1. 2

              Yeah, your strategy makes a lot of sense.

              IME the place where an HTTP-layer cache like Varnish really shines is when you have something like an HTML templating library on the front and it is slow as molasses. Sometimes it is actually slower than the DB accesses that would be necessary to generate an average page.

              In that case, caching whole responses makes stuff go a hell of a lot faster, because the cache skips rendering to HTML too, not just the DB accesses. You can even cache large chunks of responses, using Edge Side Includes to take care of subparts that vary a lot, in order to raise the hit rate on the rest.

              Caching bits of fully rendered HTML inside the application server itself can also give big speedups, but it has some downsides: it’s likely that the cache will end up duplicated in each of the load-balanced application servers.
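A toy Python sketch of the Edge Side Includes idea follows. It is not how Varnish processes ESI; `render_page` and `fetch_fragment` are hypothetical names, and the regular expression only handles the simple self-closing `<esi:include src="..."/>` form.

```python
import re

# Matches only the simple self-closing form of the include tag.
ESI_TAG = re.compile(r'<esi:include src="([^"]+)"\s*/>')


def render_page(cached_template, fetch_fragment):
    """Fill each include in a cached page with a freshly fetched fragment."""
    return ESI_TAG.sub(lambda m: fetch_fragment(m.group(1)), cached_template)
```

The expensive, widely shared part of the page is rendered once and cached with the include tags left in place; only the small fragments that vary per user are fetched on each request.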

    2. 1

      “Rather than providing lots of query parameter options on one URL, such as /comments?userid={n}, we use separate URLs for distinct dimensions of the data.”

      Why is it bad to use query params in the URL of a GET?

      1. 2

        Guessing here,

        I think the author of the article has in mind a particular hierarchical cache model that matches the URL of the API request, e.g. a URL


        is better than


        and is better than /most-often-cached?parm1=&parm2=…

        In business apps that would mean that ‘business model’ parameters come first in the URL, and then they are followed by the user-id (or instance-id) of something.

        It also has some affinity with good ‘UX’, in the sense that your API should reflect the business model you are asking your API users to learn and follow.

        But I am not sure I agree with the author’s premise that creating a response that mixes ‘reference data’ (e.g. seating availability) and booking info is somehow undesirable.

        In my view, if you have lightweight simple clients that expect the ‘backend’ to compose complex responses mixing static (reference) and user-specific data – then the APIs should just accommodate that.