1. 14

  2. 3

    Repeating a comment I left on HN because I’m curious about something they mentioned.

    If the dedupe worker crashes for any reason or encounters an error from Kafka, when it re-starts it will first consult the “source of truth” for whether an event was published: the output topic.

    Does this mean that “on worker crash” the worker replays the entire output topic and compare it to the rocksdb dataset?

    Also, how do you handle scaling up or down the number of workers/partitions?

    1. 1

      Sorry I didn’t get to this sooner! Having recently had to deal with this piece of software I can answer it now.

      We have a few levels of recovery when the worker crashes:

      • First check the last message in the output topic and compare it to what’s in rocksdb. If they match, we assume that we’re good.
      • Second, if for whatever reason the output topic is behind but we have commited something in rocksdb, we can fetch the last message from the output topic and update our rocksdb to the input offset the output topic has stored in its metadata (in the last message) then reconsume from there in the input topic.
      • Finally, we can reconsume the entire output topic and compare each message ID to what’s in rocksdb before moving forward.

      As for your second question, that’s much more difficult to answer. We had to scale up the number of partitions and it was quite painful, a topic unto itself. The gist of it though is that we had two instances of dedupe running, one after another, then turned off the upstream one and wrote directly to the downstream one. There were many nuanced details in making sure there weren’t any duplicates though, enough for a blog post of it’s own.