Interesting results, though obviously only directly useful for your distributed transactional database workloads.
Latency measurements are a little bit odd here, because you want good latency but not at the cost of correlated failures.
Yeah, the correlated failure issue was a question I had. My experience with AWS was that asking for a bunch of VMs simultaneously would inevitably lead to correlated failures.
Oh, just checked and they have special support for this:
https://cloud.google.com/compute/docs/instances/define-instance-placement
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html#placement-groups-spread
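For reference, here's a minimal sketch of what the AWS side looks like with boto3. The region, group name, AMI, and instance type are placeholders, not anything from the linked docs:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# A "spread" placement group puts each instance on distinct underlying
# hardware (separate racks); AWS caps it at 7 running instances per AZ.
ec2.create_placement_group(GroupName="etcd-spread", Strategy="spread")

# Launch the members into the group so the spread constraint applies.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="m5.large",
    MinCount=3,
    MaxCount=3,
    Placement={"GroupName": "etcd-spread"},
)
```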
Those are great for a few specific use cases. If I understand correctly, AWS only allows seven placement groups per zone for the types that are useful for availability. That’s far from nothing, and it’s great for things like a zonal etcd deployment. However, it’s not particularly useful when you have hundreds or thousands of VMs. There’s really no way for a public cloud to provide the kind of rack diversity you can get with your own data center.
Almost correct: it’s seven instances per PG, but you can have multiple PGs (multiple groups of 7).
You almost certainly never need size-7 groups unless you’re trying to implement your own disk controllers. Size-5 gives you N+2 (1 down for planned maintenance, 1 unplanned failure, 3 operational). Size-7 is only needed for N+3 (1 down for planned maintenance, 2 unplanned failures, since rewriting an entire drive really does take that long, 4 operational).
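To make that arithmetic explicit (my own back-of-envelope framing, not from the thread): the group size is just the operational quorum plus the outages you want to ride through simultaneously.

```python
def group_size(operational: int, planned_down: int, unplanned_down: int) -> int:
    """Instances needed so `operational` replicas survive the given outages."""
    return operational + planned_down + unplanned_down

print(group_size(3, 1, 1))  # N+2 -> 5 instances
print(group_size(4, 1, 2))  # N+3 -> 7 (drive rebuilds are slow enough to overlap)
```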
I think you mixed it up: as far as I read it, they only allow seven instances per placement group per AZ for the one type that’s useful for availability. It can still be useful when you have a ton of VMs, but at that point you need to start considering using the AZs as the failure domains (don’t put everything into a single AZ!); a sketch of that follows below.
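Concretely, a hypothetical sketch of treating the AZ as the failure domain by round-robining a fleet across zones (the zone names, fleet size, AMI, and instance type are all illustrative):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
zones = ["us-east-1a", "us-east-1b", "us-east-1c"]

# Spread 9 VMs evenly, 3 per AZ, so losing a zone costs one third of capacity.
for i in range(9):
    ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI
        InstanceType="m5.large",
        MinCount=1,
        MaxCount=1,
        Placement={"AvailabilityZone": zones[i % len(zones)]},
    )
```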