[Comment removed by author]
The OS can still be your bottleneck in an I/O-heavy workload if you have a one-thread-per-request model. Your maximum number of concurrent requests is going to be limited by the number of threads that you are able to keep around at any one time. This is the thesis of the article.
The solution, then, is to break the one-thread-per-request model by reaching for libraries (or languages) that use lightweight threads/fibers. These are managed by the runtime instead of the OS and allow for about two orders of magnitude more concurrent operations than threads do.
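To make that concrete, here is a rough sketch in Python. It uses `asyncio` coroutines as a stand-in for lightweight threads/fibers (they are cooperative tasks scheduled by the runtime, not fibers in the strict sense), and `asyncio.sleep` as a stand-in for real I/O. The names `handle_request`, `serve`, and `run_demo` are made up for illustration and are not from the article.

```python
import asyncio
import threading


async def handle_request(request_id, stats, io_delay):
    # Record that this request is in flight and which OS thread runs it.
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    stats["threads"].add(threading.get_ident())
    # Simulated I/O wait: the task yields to the event loop here,
    # so the OS thread is free to run other tasks in the meantime.
    await asyncio.sleep(io_delay)
    stats["in_flight"] -= 1
    return request_id


async def serve(n_requests, io_delay):
    stats = {"in_flight": 0, "peak": 0, "threads": set()}
    # One runtime-managed task per request, all multiplexed by the event loop.
    results = await asyncio.gather(
        *(handle_request(i, stats, io_delay) for i in range(n_requests))
    )
    return len(results), stats["peak"], len(stats["threads"])


def run_demo(n_requests=10_000, io_delay=0.5):
    # Returns (completed requests, peak in-flight requests, OS threads used).
    return asyncio.run(serve(n_requests, io_delay))


if __name__ == "__main__":
    completed, peak, os_threads = run_demo()
    print(f"completed={completed} peak_in_flight={peak} os_threads={os_threads}")
```

Every request here waits at the same time on a single OS thread, which is the property that lets this style scale past what a thread-per-request design can hold open.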
Does anyone implement this? It's the naive base case, practically a strawman. A pool of threads is going to be far more common […]
Yes, this is what I (and the article, I assume) was referring to. Having a thread pool does not invalidate the conclusion; even though you can reuse threads, each in-flight request still occupies an OS thread for as long as it runs, so your overall concurrency is capped at the size of the pool.
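A small sketch of what I mean, in Python, assuming blocking I/O (simulated with `time.sleep`) and a hypothetical `run_pool` helper that just counts how many requests are in flight at once. The pool size and request count are arbitrary numbers picked for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_pool(n_requests=20, pool_size=4, io_delay=0.05):
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0}

    def handle_request(request_id):
        with lock:
            state["in_flight"] += 1
            state["peak"] = max(state["peak"], state["in_flight"])
        # Blocking I/O stand-in: the worker thread is parked here and
        # cannot pick up another request until this one finishes.
        time.sleep(io_delay)
        with lock:
            state["in_flight"] -= 1
        return request_id

    # Threads are reused across requests, but there are only pool_size of them.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle_request, range(n_requests)))
    # Returns (completed requests, peak in-flight requests).
    return len(results), state["peak"]


if __name__ == "__main__":
    completed, peak = run_pool()
    print(f"completed={completed} peak_in_flight={peak}")
```

All the requests complete, but the peak number in flight never goes above `pool_size`; the remaining requests sit in the queue waiting for a free thread.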