1. 7

    1. 2

      Released in 2002, Tangosol Coherence 1.2 introduced distributed scale-out shared-nothing auto-partitioning in-memory data management, with dynamic balancing and HA. Acquired by Oracle in 2007, and open sourced a few years ago: https://github.com/oracle/coherence/

      Other implementations appeared shortly thereafter, including GemStone (acquired by VMware/SpringSource) and GigaSpaces (based on Linda / tuple spaces, and built on Jini/JavaSpaces). More recently: Hazelcast and GridGain (open sourced as Apache Ignite).

    2. 2

      Every sharding solution has one critical component in its architecture. This component can go by various names, including coordinator, router, or director … The coordinator is the sole component aware of data distribution. It maps client requests to specific shards and then to the corresponding database instance. This is why clients must always route their requests through the coordinator.

      It’s definitely possible to implement sharding without requiring requests to go through a central coordinator. If you can assume a well-defined set of peers, then any consistent hash function can be executed on any node totally independently and will map a given key to a deterministic subset of those peers. And techniques like Rendezvous hashing can be used if the set of peers changes over time.
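      To make that concrete, here is a minimal sketch of Rendezvous (highest-random-weight) hashing. The function name, node labels, and the choice of SHA-256 are illustrative assumptions, not from any particular system; the point is that every client with the same peer list computes the same owner independently, with no coordinator in the path.

      ```python
      import hashlib

      def rendezvous_owner(key, nodes, replicas=1):
          # Score every (key, node) pair with a hash; the highest-scoring
          # node(s) own the key. No shared routing table is consulted.
          def score(node):
              h = hashlib.sha256(f"{node}:{key}".encode()).digest()
              return int.from_bytes(h[:8], "big")
          ranked = sorted(nodes, key=score, reverse=True)
          return ranked[:replicas]
      ```

      A useful property: removing a peer only remaps the keys that peer owned, because every other key’s highest-scoring node is unchanged.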

      While sharding involves splitting data across multiple standalone instances, it doesn’t inherently mean the system is distributed.

      If a system is split across multiple instances, is that not the definition of “distributed”?