I am dealing with a read-only view of this right now. I happen to interact with a large Go fileserver on a diskless system. Spoiler: I don’t know the mathematically optimal way to configure these things.
The service serves small files very frequently, but it also has a long tail of large downloads (think big software installers and data files). It already sits behind a CDN, so the origin mostly deals with those long-tail requests.
There’s a multi-tier caching system: an in-memory LRU, a pool of RAM on nearby nodes, and object storage. Our LRU handles the small files extremely efficiently (it doesn’t even allocate, because it can stream a single immutable entry to multiple concurrent clients). It doesn’t do well with large files, though, because right now it has no concept of “this file is so large it’s going to blow through the cache”.
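One way to patch that gap, at least conceptually, is a size-based admission policy: the LRU simply refuses any entry bigger than some fraction of its capacity, so a single huge download can’t evict the whole hot set. This is a minimal sketch under assumed names and numbers (the `sizeAwareLRU` type, the 25% per-entry cap, and the byte sizes are all hypothetical, not our actual implementation):

```go
package main

import (
	"container/list"
	"fmt"
)

// sizeAwareLRU is an LRU byte cache that refuses to admit any entry larger
// than a fixed per-entry cap, so one large file can't flush the working set.
type sizeAwareLRU struct {
	capacity int64                    // total bytes the cache may hold
	maxEntry int64                    // admission cap: largest single entry allowed
	used     int64                    // bytes currently cached
	order    *list.List               // front = most recently used
	entries  map[string]*list.Element // key -> element; element.Value is *entry
}

type entry struct {
	key  string
	data []byte // immutable once cached; safe to stream to many readers
}

func newSizeAwareLRU(capacity int64, maxEntryFrac float64) *sizeAwareLRU {
	return &sizeAwareLRU{
		capacity: capacity,
		maxEntry: int64(float64(capacity) * maxEntryFrac),
		order:    list.New(),
		entries:  make(map[string]*list.Element),
	}
}

// Get returns the cached bytes and marks the entry recently used.
func (c *sizeAwareLRU) Get(key string) ([]byte, bool) {
	el, ok := c.entries[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).data, true
}

// Put admits data only if it fits under the per-entry cap; oversized files
// are rejected and should be streamed from the next tier instead.
func (c *sizeAwareLRU) Put(key string, data []byte) bool {
	size := int64(len(data))
	if size > c.maxEntry {
		return false // too big: would blow through the cache
	}
	if el, ok := c.entries[key]; ok {
		c.order.MoveToFront(el)
		return true
	}
	// Evict from the cold end until the new entry fits.
	for c.used+size > c.capacity {
		oldest := c.order.Back()
		ev := oldest.Value.(*entry)
		c.order.Remove(oldest)
		delete(c.entries, ev.key)
		c.used -= int64(len(ev.data))
	}
	c.entries[key] = c.order.PushFront(&entry{key: key, data: data})
	c.used += size
	return true
}

func main() {
	c := newSizeAwareLRU(1024, 0.25) // 1 KiB cache, 256 B per-entry cap
	fmt.Println(c.Put("small", make([]byte, 100))) // admitted
	fmt.Println(c.Put("huge", make([]byte, 512)))  // rejected by the cap
	_, ok := c.Get("small")
	fmt.Println(ok) // small file is still cached
}
```

A hard cap is the bluntest version; fancier admission schemes (e.g. frequency-based ones like TinyLFU) make the same call probabilistically, but even the blunt version stops the worst-case eviction storm.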
For our use case the right call might just be separate clusters tuned for particular workloads: serve small files diskless from RAM, and serve big files from a traditional Linux box with tons of page cache, or from NVMe devices.