1. 7

  2. 4

    I read the Tango paper a few months ago and was generally disappointed: I felt that it didn’t make any novel contributions. Here is where, to me, Tango failed to be interesting:

    1. They claim to have a new idea for a high-performance log. Their proof of this claim is essentially that SSDs are fast, so by using something like Paxos to dole out log locations, they can scalably write log entries in order. One issue here is that this strategy of doling out logical log indices before actually committing data has been known for a long time (for instance, in HPC durable queues).
    2. They write a lot about a system that amounts to “Hey look! Use materialized views to get value object semantics over a WAL!” This is pretty much how all databases work, so I was disappointed that they didn’t have a new idea here.
    3. They point out that they can shard the log in all kinds of cool ways, to meet any workload. Unfortunately, they can only shard the data portion of the log, not the transaction manager (i.e. the log index provider), and in order to determine whether transactions can proceed concurrently or must be serialized, they use the most restrictive trick in the book: analyze all reads and writes to do standard optimistic concurrency control. The data portion of logs like this can obviously be cached and replicated to support the workload however needed.
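
To make the three points above concrete, here is a minimal single-process sketch of the general pattern: a sequencer that hands out log positions before data is written, a key/value view materialized by replaying the log, and an optimistic check over each transaction's read set. This is not Tango's code or API; `Sequencer`, `SharedLog`, `View`, and the record layout are all hypothetical names chosen for illustration, and a real system would replace the in-memory dict with replicated storage.

```python
from dataclasses import dataclass, field
from itertools import count


class Sequencer:
    """Hands out log positions before any data is written (hypothetical)."""

    def __init__(self):
        self._next = count()

    def reserve(self):
        return next(self._next)


@dataclass
class SharedLog:
    sequencer: Sequencer
    entries: dict = field(default_factory=dict)  # position -> record

    def append(self, record):
        pos = self.sequencer.reserve()  # step 1: reserve a logical index
        self.entries[pos] = record      # step 2: write the data at that index
        return pos


class View:
    """Key/value view materialized by replaying the log in position order."""

    def __init__(self, log):
        self.log = log
        self.state = {}
        self.versions = {}  # key -> log position of the last committed write
        self.outcomes = {}  # log position -> True (committed) / False (aborted)
        self.applied = 0    # next log position to replay

    def sync(self):
        while self.applied in self.log.entries:
            rec = self.log.entries[self.applied]
            # Optimistic check: commit only if every key the transaction
            # read is still at the version it observed.
            ok = all(self.versions.get(k, -1) == v
                     for k, v in rec["reads"].items())
            if ok:
                for k, val in rec["writes"].items():
                    self.state[k] = val
                    self.versions[k] = self.applied
            self.outcomes[self.applied] = ok
            self.applied += 1

    def read(self, key):
        return self.state.get(key), self.versions.get(key, -1)


log = SharedLog(Sequencer())
view = View(log)

log.append({"reads": {}, "writes": {"x": 1}})
view.sync()
_, ver = view.read("x")

# Two transactions that both read the same version of x: the first one in
# log order commits, the second one aborts.
log.append({"reads": {"x": ver}, "writes": {"x": 2}})
log.append({"reads": {"x": ver}, "writes": {"x": 3}})
view.sync()
```

Note that every append goes through the single `Sequencer`, which is the part that cannot be sharded in this sketch, while `entries` could be partitioned, cached, or replicated freely.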

    You can learn more about systems that behave like this by reading about Write Ahead Logs (for high performance transactions), Aeron [1] (HPC message queue with a similar style of fast appends), Datomic (uses a distributed WAL, materialized views, and adaptive replication [albeit simpler]), or Samza [2].

    Some areas of work that could, I think, improve the performance of systems like this:

    1. Come up with a limited form of transactions that would allow for logical merging of multiple logs, given limitations on transactions that span the transaction ID dispatcher.
    2. Come up with novel, adaptive ways to shard the queue to answer queries faster. Unfortunately, this is impossible with the structure of the usual queue, since the standard queue will have temporal locality and partition/entity locality. What about complex queries?

    [1] (please reply with a link to the project itself or video!) https://thestrangeloop.com/sessions/aeron-open-source-high-performance-messaging

    [2] http://www.youtube.com/watch?v=fU9hR3kiOK0