This article is an incredible performance deep dive, kudos to Netflix! If anyone is interested in the original issue, the replies to Brendan Gregg’s tweet mention many related resources, notably a type-pollution-agent and a related async-profiler discussion.
While running similar tests on Mac against a WebSocket application, I’ve also encountered crashes at a certain limit, with no diagnostic info either. I wonder if anyone could shed light on how these crashes could be investigated and whether there is a solution.
I personally find simple TCP proxies great for failure testing, e.g. when you want to know how your application behaves when the connection is interrupted suddenly, in blackhole situations, on timeouts, on malformed data, etc. It’s often quite hard to test these failure scenarios in any other way. I’ve never found a ready-to-use solution that works perfectly for all possible cases. Usually, it’s much simpler to quickly write your own in your favourite language and make it do exactly what you want.
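To give a rough idea of how little code such a proxy needs, here is a minimal Python sketch using only the standard library. The names `start_fault_proxy`, `mode` and `drop_after` are made up for this illustration, and the three modes (pass-through, cutting the connection after N response bytes, and a blackhole that swallows everything) are just examples of faults one might want; a real test proxy would likely need more care around timeouts and cleanup.

```python
import socket
import threading


def _pipe(src, dst, byte_budget=None):
    """Copy bytes from src to dst. With a budget, cut both sockets once it is spent."""
    sent = 0
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:
                dst.shutdown(socket.SHUT_WR)  # propagate a clean half-close
                return
            if byte_budget is not None:
                chunk = chunk[: byte_budget - sent]
            if chunk:
                dst.sendall(chunk)
                sent += len(chunk)
            if byte_budget is not None and sent >= byte_budget:
                break  # injected fault: interrupt the connection mid-stream
    except OSError:
        pass
    # Fault or error path: tear down both sides so the peer threads wake up.
    for s in (src, dst):
        try:
            s.shutdown(socket.SHUT_RDWR)
        except OSError:
            pass
        s.close()


def start_fault_proxy(upstream_addr, mode="pass", drop_after=0):
    """Listen on an ephemeral localhost port and proxy to upstream_addr.

    mode (hypothetical names): "pass" forwards everything, "drop" cuts the
    connection after drop_after response bytes, "blackhole" accepts the
    connection, swallows all input and never answers.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))
    listener.listen()

    def handle(client):
        if mode == "blackhole":
            try:
                while client.recv(4096):
                    pass  # read and discard, never reply
            except OSError:
                pass
            return
        upstream = socket.create_connection(upstream_addr)
        budget = drop_after if mode == "drop" else None
        threading.Thread(target=_pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=_pipe, args=(upstream, client, budget), daemon=True).start()

    def accept_loop():
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # listener was closed
            threading.Thread(target=handle, args=(client,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener  # listener.getsockname() gives the address to connect to
```

Point the application under test at `listener.getsockname()` instead of the real server, and pick the mode for the failure you want to reproduce. Thread-per-connection is used here only because it keeps the sketch short.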
This is a unique type of post. It sparked a whole new stream of language and library feature development. One of the most notable results is the design shift in Kotlin coroutines, which was described well by one of the lead language designers.
Not only Kotlin, Swift as well: https://github.com/apple/swift-evolution/blob/main/proposals/0304-structured-concurrency.md#cancellation
A very interesting paper! Another example of a metastable failure could be a recent Cloudflare outage, where a partial network partition caused a repeatedly failing Raft election cycle, which in turn caused a prolonged degraded state due to a replica rebuild, as described in Michael Pigott on Toward a Generic Fault Tolerance Technique.
Reading with interest, but only part-way through.
One small note: in one place you refer to “the listing on the left”, but this doesn’t work on mobile, because it doesn’t show the listings next to each other.
Thanks! The issue is a result of me migrating the blog from the old Clojure-based static generator to a fancier static Next.js setup. I need to figure out how to make the horizontal scroll more visible there, or maybe replace it with something more responsive.