A simple approach is to set a fixed concurrency level - say 1 or 2 - and have at most that many requests in flight at a time. Your request rate then slows down or speeds up on its own, depending on how long the system takes to respond.
This requires you to make a conservative guess at the amount of concurrency the system you are talking to can withstand. I think that is often feasible, for example by counting the number of CPUs allocated to that system.
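
As a rough illustration of the fixed-concurrency idea, here is a minimal Python sketch using an asyncio semaphore. The names run_bounded, fake_request and demo are made up for this example, and the 10 ms sleep is only a stand-in for whatever the real system's response time is; treat it as one possible way to do it, not a definitive implementation.

```python
import asyncio


async def run_bounded(tasks, limit=2):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(task):
        # A new request starts only when one of the `limit` slots is free,
        # so a slow system automatically lowers the send rate.
        async with semaphore:
            return await task()

    # gather() returns results in the same order as `tasks`.
    return await asyncio.gather(*(run_one(t) for t in tasks))


async def demo():
    in_flight = 0
    peak = 0

    async def fake_request():
        # Hypothetical request: tracks how many calls overlap.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for the remote response time
        in_flight -= 1
        return "ok"

    results = await run_bounded([fake_request] * 10, limit=2)
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(demo())
    print(len(results), "requests done, peak concurrency", peak)
```

With a limit of N and an average response time of T seconds, this sends roughly N / T requests per second, so the rate adapts without you having to pick a requests-per-second number up front.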