Creating a fully descriptive model of modern web app latency is tough. There’s a whole bunch of considerations to keep in mind, and some abstractions that most folks don’t really “pierce the veil” behind.
I’ve thought about writing a modeling framework for web apps based on my own observations, but the whole project seems like a lot of work. At $WORK we’ve had good success measuring mean and 99th-percentile latencies (the mean is a familiar summary statistic and fully determines an exponential distribution, though it can hide the tail of a skewed one), but that is due more to how we set our SLAs than to an informed view of the experience we wish to offer users.
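For anyone who wants to try the mean-plus-p99 summary, here is a minimal sketch in Python. The function names (`percentile`, `summarize`) and the nearest-rank percentile definition are my own assumptions for illustration, not what we run at $WORK; real monitoring systems often estimate percentiles from buckets or sketches instead.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so that p=0 returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the mean and 99th-percentile latency of a batch of samples."""
    return {
        "mean": statistics.fmean(latencies_ms),
        "p99": percentile(latencies_ms, 99),
    }
```

Reporting both numbers side by side helps because a few very slow requests pull the mean up without saying how many users were affected, while the p99 says roughly where the slowest 1% begins.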
I’ve had decent success counting the requests that exceed some latency threshold (as an SLI), in addition to using histograms.
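Roughly, the threshold count and the histogram look like the sketch below. The names `fraction_over` and `histogram`, the bucket bounds, and the choice to treat a request exactly at the threshold as "not over" are all assumptions for the example.

```python
from bisect import bisect_left


def fraction_over(latencies_ms, threshold_ms):
    """Fraction of requests strictly slower than the threshold."""
    if not latencies_ms:
        raise ValueError("need at least one sample")
    slow = sum(1 for value in latencies_ms if value > threshold_ms)
    return slow / len(latencies_ms)


def histogram(latencies_ms, upper_bounds_ms):
    """Count samples per bucket.

    Bucket i holds values above the previous bound and at or below
    upper_bounds_ms[i]; one extra final bucket catches everything
    above the largest bound.
    """
    bounds = sorted(upper_bounds_ms)
    counts = [0] * (len(bounds) + 1)
    for value in latencies_ms:
        # bisect_left finds the first bound that is >= value.
        counts[bisect_left(bounds, value)] += 1
    return counts
```

For example, with latencies `[50, 120, 300, 80, 700, 100]`, a 250 ms threshold flags 2 of 6 requests, and bounds of `[100, 250, 500]` give bucket counts of `[3, 1, 1, 1]`.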