I’ve been starting to dive into Cassandra over the past couple of weeks, so this is aptly timed.
My question:
How can you provide ordering using Cassandra if wall clocks aren’t trustworthy? I understand how to do this with vclocks but don’t know how to get started building a system that needs ordering and convergent data structures while using Cassandra.
A simple example is using a unique column for each write. Imagine a row “user:brett:friends” and I do two writes from two servers.
// Bad way:
write1: "user:brett:friends" set column "data" to "[5]"
write2: "user:brett:friends" set column "data" to "[7]"
read1: "user:brett:friends" get column "data"
// Clearly only one of those writes will win, so you'll get [5] or [7]
// Better:
write1: "user:brett:friends" set column "68e0f266-89c1-11e3-96fa-cd5de4cf87ee" to "5"
write2: "user:brett:friends" set column "723ba1a8-89c1-11e3-94ad-82121e36f60d" to "7"
read1: "user:brett:friends" get all columns, treat items like siblings and do merge/dupe handling in your app
// Final result will be [5, 7]
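To make the difference concrete, here is a minimal Python sketch that simulates both approaches in memory. It does not talk to Cassandra: the `Row` class, its `write`/`read_all` methods, and `merge_friends` are hypothetical stand-ins, and the explicit timestamps only mimic per-column last-write-wins in a simplified way.

```python
import uuid


class Row:
    """Hypothetical in-memory stand-in for one Cassandra row."""

    def __init__(self):
        # column name -> (timestamp, value)
        self.columns = {}

    def write(self, column, value, timestamp):
        # Simplified last-write-wins: the highest timestamp for a column survives.
        current = self.columns.get(column)
        if current is None or timestamp >= current[0]:
            self.columns[column] = (timestamp, value)

    def read_all(self):
        return {name: value for name, (_, value) in self.columns.items()}


def merge_friends(row):
    # Treat every column as a sibling; de-duplicate and sort in the app.
    return sorted(set(row.read_all().values()))


# Bad way: both servers overwrite the same column, so one write is lost.
bad = Row()
bad.write("data", [5], timestamp=100)
bad.write("data", [7], timestamp=101)
bad_result = bad.read_all()["data"]

# Better: each write lands in its own uniquely named column.
good = Row()
good.write(str(uuid.uuid1()), 5, timestamp=100)
good.write(str(uuid.uuid1()), 7, timestamp=101)
good_result = merge_friends(good)
```

With the shared column only the later write survives; with unique columns both values are still there at read time and the app decides how to merge them.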
Also, Cassandra 2.0 comes with its own consensus implementation (Paxos-based lightweight transactions), so you can do a conditional, compare-and-set style write on a key, which works much like holding a lock, without having to do the UUID dance. I can’t elaborate much because I don’t use it (almost all of my data is event logs, so TimeUUID column names are all I need).
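A rough sketch of what that consensus-backed write looks like from the client's side, assuming it behaves as a compare-and-set (in CQL this is expressed with an `IF` clause on the write). The `CasRow` class and `compare_and_set` method below are hypothetical and run in a single process; in the real thing the check and the write are agreed on across replicas.

```python
class CasRow:
    """Hypothetical single-process stand-in for a row with conditional writes."""

    def __init__(self):
        self.columns = {}

    def compare_and_set(self, column, expected, new):
        # Apply the write only if the column currently holds `expected`
        # (None means "column does not exist yet").
        if self.columns.get(column) != expected:
            return False
        self.columns[column] = new
        return True


row = CasRow()
first = row.compare_and_set("data", None, [5])    # column absent, so this applies
stale = row.compare_and_set("data", None, [7])    # rejected: column already exists
retry = row.compare_and_set("data", [5], [5, 7])  # re-read, then retry against the current value
```

The losing writer finds out that its write was rejected and can re-read and retry, instead of silently overwriting the other server's value.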
Thanks. That is generally in line with what I expected.