a proper solution for true fault isolation would have been one microservice per queue per customer, but that would have required over 10,000 microservices
…Why would Segment create different, individual microservices for every customer?
I suspect they really meant “one worker process/queue per customer” so one customer’s sudden influx of work doesn’t delay another customer’s work. If it’s a Rails monolith, you could conceivably start 10,000 Sidekiq processes, one per customer, each handling only that customer’s workload.
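To make the "one worker/queue per customer" idea concrete, here is a minimal in-process sketch in Python using only the standard library. It is not Segment's design and not Sidekiq; the class name `PerCustomerWorkers` and the `handler` callback are hypothetical, and threads stand in for separate worker processes.

```python
import queue
import threading


class PerCustomerWorkers:
    """Hypothetical sketch: one FIFO queue and one worker thread per customer."""

    def __init__(self, handler):
        self._handler = handler          # called as handler(customer_id, job)
        self._queues = {}                # customer_id -> queue.Queue
        self._lock = threading.Lock()    # guards creation of queues/workers

    def submit(self, customer_id, job):
        """Enqueue a job, lazily creating the customer's queue and worker."""
        with self._lock:
            q = self._queues.get(customer_id)
            if q is None:
                q = queue.Queue()
                self._queues[customer_id] = q
                threading.Thread(
                    target=self._run, args=(customer_id, q), daemon=True
                ).start()
        q.put(job)

    def _run(self, customer_id, q):
        """Drain one customer's queue; other customers' queues are untouched."""
        while True:
            job = q.get()
            try:
                self._handler(customer_id, job)
            except Exception:
                pass  # assumption: a real system would log and retry here
            finally:
                q.task_done()

    def join(self):
        """Block until every job submitted so far has been handled."""
        with self._lock:
            queues = list(self._queues.values())
        for q in queues:
            q.join()
```

Because each customer has a dedicated queue and worker, a burst of 10,000 jobs for one customer only lengthens that customer's queue; a job submitted for another customer is picked up by that customer's own worker.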
In the monolith, problems connecting to one provider destination could have an adverse effect on all destinations and the entire system.
Hard to say without knowing more details, but I suspect that has nothing to do with it being a monolith; it’s a higher-level design flaw or a limitation of the implementation.
My main project at work is a monolith that connects to lots of external providers, some of them highly unreliable. We designed its provider interaction subsystem with an assumption that providers would be flaky and slow and have frequent outages, and we never have this kind of problem.
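One common way to get that kind of isolation inside a single process is a per-provider circuit breaker. The sketch below is only an illustration of the general technique, not the subsystem described above; `CircuitBreaker`, `CircuitOpenError`, `call_provider`, and the default thresholds are all hypothetical names and values.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a provider is temporarily disabled after repeated failures."""


class CircuitBreaker:
    """Hypothetical sketch: stop calling a provider after repeated failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Fail fast instead of waiting on a provider known to be down.
                raise CircuitOpenError("provider temporarily disabled")
            # Cool-down elapsed: allow one trial call; one more failure reopens.
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


# One breaker per provider, so an outage at one never trips another.
breakers = {}


def call_provider(name, func, *args, **kwargs):
    breaker = breakers.setdefault(name, CircuitBreaker())
    return breaker.call(func, *args, **kwargs)
```

The point is that failure state is tracked per provider rather than globally, so a provider that is down gets skipped quickly while calls to every other provider proceed as normal. In practice this would be combined with per-call timeouts so that a slow provider cannot tie up workers either.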