1. 3

    Repeating a comment I left on HN because I’m curious about something they mentioned.

    If the dedupe worker crashes for any reason or encounters an error from Kafka, when it re-starts it will first consult the “source of truth” for whether an event was published: the output topic.

    Does this mean that “on worker crash” the worker replays the entire output topic and compares it to the rocksdb dataset?

    Also, how do you handle scaling up or down the number of workers/partitions?

    1. 1

      Sorry I didn’t get to this sooner! Having recently had to deal with this piece of software I can answer it now.

      We have a few levels of recovery when the worker crashes:

      • First check the last message in the output topic and compare it to what’s in rocksdb. If they match, we assume that we’re good.
      • Second, if for whatever reason the output topic is behind but we have committed something in rocksdb, we can fetch the last message from the output topic, update our rocksdb to the input offset the output topic has stored in its metadata (in the last message), and then reconsume from there in the input topic (see the sketch after this list).
      • Finally, we can reconsume the entire output topic and compare each message ID to what’s in rocksdb before moving forward.
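
      To make the second step concrete, here’s a minimal recovery sketch in Python. It assumes confluent-kafka for the consumer; the broker address, topic name, and “input_offset” header key are placeholders, and a plain dict stands in for the rocksdb handle, so this is just an illustration of the idea, not our actual code.

      ```python
      # Recovery sketch: treat the output topic as the source of truth and roll the
      # local state forward to whatever the last published message says.
      from confluent_kafka import Consumer, TopicPartition

      consumer = Consumer({
          "bootstrap.servers": "localhost:9092",   # placeholder broker address
          "group.id": "dedupe-recovery",           # placeholder group id
          "enable.auto.commit": False,
      })

      state = {}  # plain dict standing in for the rocksdb handle

      def last_output_message(topic, partition, timeout=5.0):
          """Fetch the last message currently sitting in one output-topic partition."""
          low, high = consumer.get_watermark_offsets(TopicPartition(topic, partition))
          if high <= low:
              return None                               # partition is empty
          consumer.assign([TopicPartition(topic, partition, high - 1)])
          return consumer.poll(timeout)

      def recover(partition):
          """Return the input-topic offset to resume consuming from."""
          msg = last_output_message("dedupe-output", partition)
          if msg is None:
              return 0                                  # nothing published yet
          headers = dict(msg.headers() or [])
          published = int(headers["input_offset"])      # input offset recorded at publish time
          if state.get("input_offset") != published:
              # The output topic wins: update local state to match what was published.
              state["input_offset"] = published
          return published + 1                          # reconsume the input topic from here
      ```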

      As for your second question, that’s much more difficult to answer. We had to scale up the number of partitions and it was quite painful, a topic unto itself. The gist of it, though, is that we had two instances of dedupe running, one after another, then turned off the upstream one and wrote directly to the downstream one. There were many nuanced details in making sure there weren’t any duplicates, enough for a blog post of its own.

    1. 6

      Sadly, the author doesn’t go into much detail. I think Python gets it wrong.

      Here’s the thing: I’m a visually oriented person. I imagine a stack as - well - a stack. If you ask me to draw a stack, I will draw it with the last frame on top.

      If you asked me to write down an exception history, I’d write: “A while evaluating B while…”.

      If you turn the whole thing upside down, it confuses the hell out of me.

      Scrolling is cheap.

      Also, bonus points for getting mad via garybernhardt, whom I really dislike, because he gets people to rant without engaging with the issue enough.

      1. 7

        Last frame on top also keeps the last frames (usually by far the most important) in roughly the same early character positions in things like log records. That lets you, for example, group a collection of stack traces by their first hundred characters to see which errors/exceptions you’re encountering most frequently.
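
        As a toy illustration of that grouping trick (plain Python, assuming you’ve already collected the traces as strings):

        ```python
        # With "last frame on top", the first ~100 characters of each trace already
        # name the failing frame, so a naive prefix count groups errors by where
        # they actually blew up.
        from collections import Counter

        def top_errors(stack_traces, prefix_len=100, n=10):
            """Count traces by their leading characters and return the most common."""
            return Counter(trace[:prefix_len] for trace in stack_traces).most_common(n)
        ```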

        1. 7

          I don’t believe there’s a “correct” way to print tracebacks. Neither method conveys more information than the other and they’re both cheap to implement. Sounds more like a religious thing to me.
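
          For what it’s worth, flipping the order in Python really is cheap. Here’s a rough sketch using a custom sys.excepthook (the function name is made up; this isn’t a built-in option):

          ```python
          import sys
          import traceback

          def most_recent_call_first(exc_type, exc, tb):
              # Print the exception itself first, then the frames innermost-first.
              for line in traceback.format_exception_only(exc_type, exc):
                  sys.stderr.write(line)
              sys.stderr.write("Traceback (most recent call first):\n")
              for line in reversed(traceback.format_list(traceback.extract_tb(tb))):
                  sys.stderr.write(line)

          sys.excepthook = most_recent_call_first
          ```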

          1. 2

            I just argued why the “traditional” method conveys information better for me (not more, but better). It probably does better for others too - that’s not religious, but a matter of preference. So… a preference would be great :D.