1. 4

  2. 1

    One mechanism of sharding that I think is much simpler and easier to scale is range-based sharding. In this scenario, you’d have the shards:

    • customers-1-100
    • customers-101-200
    • customers-201-300
    • customers-301-infinity

    Here, when you start you can simply have customers-1-infinity and as the database begins to reach say 50% capacity, cap it at say customers-1-100 and then make customers-101-infinity.

    Once the customers-1-100 shard grows to, say, 90% capacity, you can further, and simply split it in to customers-1-50 and customers-51-100, using fairly simple replication topologies to do this with little-to-no down time.

    This range-based mechanism means you don’t have to preemptively guess how many shards you want to hash by.

    Another way to do this would be to simply provision customers-1-100 and no -infinity shard, and monitor the highest customer ID you have, and preemptively provision another shard when you get “close” to the customer ID cap of the existing shard.