We use vegeta at work and it’s pretty great.
Importantly, it does not suffer from coordinated omission, which almost every other load generator does. Especially the ones you hack together on your own. Properly generating load is hard.
Coordinated omission is when the load generator eases off the pressure while the target is overloaded, so the slow periods end up under-sampled. It happens naturally, since you can’t keep an infinite number of outgoing requests active at once. You have to compensate for this effect.
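To make that concrete, here is a toy simulation (not how vegeta or any real tool is implemented, just a sketch). It assumes a single connection, a plan to send one request every `interval` seconds, and a made-up list of service times with one long stall. The function name `simulate` and all the numbers are my own illustration.

```python
def simulate(service_times, interval):
    """Single connection; one request is *planned* every `interval` seconds."""
    naive = []      # latency measured from the moment the request was actually sent
    corrected = []  # latency measured from the moment it was supposed to be sent
    clock = 0.0     # time at which the previous request finished
    for i, service in enumerate(service_times):
        intended = i * interval
        start = max(clock, intended)  # cannot send before the previous one returns
        finish = start + service
        naive.append(finish - start)
        corrected.append(finish - intended)
        clock = finish
    return naive, corrected


# 200 requests that each take 1 ms, except request 10, which stalls for 1 s.
service_times = [0.001] * 200
service_times[10] = 1.0
naive, corrected = simulate(service_times, interval=0.01)

slow_naive = sum(1 for x in naive if x > 0.05)
slow_corrected = sum(1 for x in corrected if x > 0.05)
print(slow_naive, slow_corrected)
```

The naive measurement records a single slow request, because the generator quietly stopped sending during the stall. Measuring from the intended send time shows that every request scheduled during and shortly after the stall was slow too.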
Coordinated omission sounds like something that will just result in overestimated performance numbers, but it’s worse than that. It can easily reverse figures, making performance improvements look like things got worse.
It seems like the Coordinated Omission problem is more or less a euphemism for backpressure. Backpressure is a cross-cutting concern, though. You can see it if any of the systems in the chain slows down, from the load generator, through the web server, appserver, etc. A bottleneck in any one of them makes the downstream systems look “better” because they are under reduced load.
The remedy is awareness of backpressure: as you add more concurrency, you watch that the graphs for throughput and requests/second increase in exactly the same manner as the number of VUs (virtual users). I don’t think this is a tool problem, just an area to have awareness of. For example, as I ramp up load, if I’m using a step pattern, bandwidth and hits/second should follow the step pattern. If they don’t, the test is invalid from the moment they deviate.
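A rough sketch of that check, assuming you can export the attempted and achieved requests/second per interval from whatever tool you use. The function name `first_deviation`, the 5% tolerance, and the sample numbers are all hypothetical.

```python
def first_deviation(attempted, achieved, tolerance=0.05):
    """Return the index of the first interval where the achieved load strays
    from the attempted load by more than `tolerance` (a fraction), else None."""
    for i, (want, got) in enumerate(zip(attempted, achieved)):
        if want > 0 and abs(got - want) / want > tolerance:
            return i
    return None


# Hypothetical step pattern: 100, then 200, then 300 requests/second.
attempted = [100, 100, 200, 200, 300, 300]
achieved = [100, 99, 198, 201, 240, 235]

bad_from = first_deviation(attempted, achieved)
if bad_from is not None:
    print(f"test invalid from interval {bad_from} onward")
```

Anything measured from the returned interval onward would be thrown out rather than reported.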
The worst tools, in my experience, are the ones that “throw load” when real users would be “stuck.” Those tools are far more dangerous and have worse failure modes. For example, if the page returned happens to have a 200 status code with a giant message saying “Critical System Failure,” a tool that ignores backpressure and slings load might show you improved performance once the error page appears, because it loads fast!
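One way to guard against that, sketched here with hypothetical names (`looks_healthy`, `must_contain`, `must_not_contain`), is to validate the response body and not just the status code. Which markers make sense depends entirely on your application.

```python
def looks_healthy(status, body, must_contain,
                  must_not_contain=("Critical System Failure",)):
    """Treat a response as a success only if the status is 200 AND the body
    contains the expected content and none of the known error markers."""
    if status != 200:
        return False
    if must_contain not in body:
        return False
    return not any(marker in body for marker in must_not_contain)


# A fast 200 that is really an error page should count as a failure.
print(looks_healthy(200, "<h1>Critical System Failure</h1>", "Welcome back"))
print(looks_healthy(200, "<h1>Welcome back, Sam</h1>", "Welcome back"))
```

Responses that fail this check should be counted as errors and kept out of the latency numbers.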
Good points! With your awareness of the problems involved, and willingness to declare a test invalid when the actual load doesn’t match the attempted load, I’m not too worried about you drawing the wrong conclusions – no matter what tool you use.
The problem you mention in your last paragraph is a problem whether or not your tool corrects for constant attempted load. Any tool ignores backpressure when the “backpressure mechanism” is a very fast 200 OK.
This is not what the question asked, but since details of the question reveal some gaps in knowledge about load testing, I’ll take a moment to address one issue:
Knowledge of how to properly measure and test load is way more important than choice of tooling. Gil Tene talks about this in an approachable way. A good start might be the YouTube video that got me down the rabbit hole a long time ago: https://m.youtube.com/watch?v=lJ8ydIuPFeU
Really good content in that, in general!
I have some thoughts on the Coordinated Omission problem, but I moved them to your comment above.