  2. 12

  I appreciate that a columnar data store confers many advantages on high-cardinality, time-oriented telemetry data, and this article was a really nice overview of the mechanics, but

    You pretty much need to use a distributed column store in order to have observability.

    feels like an over-reach that isn’t really supported by the facts brought to the table.

    Also,

    The result is blazing-fast query performance

    Which petition do I sign to put a moratorium on the word “blazing” in any technical context?

    1. 4

      feels like an over-reach

      They failed to set the scene. I guess their implied context is “you have many thousands of machines continuously spamming you with rich data points you need to query ad-hoc with various aggregation styles” which is not everyone’s experience, so the “need” is different.

      1. 3

        Yeah, it would be nice to see a low-level comparison of how their system handles high-cardinality metrics vs. Prometheus.

        1. 1

          I guess their implied context is “you have many thousands of machines continuously spamming you with rich data points you need to query ad-hoc with various aggregation styles” which is not everyone’s experience, so the “need” is different.

          Even then, though.

      2. 3

        I found it hard to get past how inaccurate the description of row-oriented stores is, although it’s correct enough to explain the difference.

        Seems clear that sharding by timestamp is a key technique for answering observability queries efficiently. The rest seems more tenuous, but at least plausible.

        1. 2

          The part I’m still failing to understand is how one moves from columnar stores, where events are ordered by timestamp, to tracing (which is something Honeycomb supports). Subsequent events in a trace don’t share timestamps, and now we have relational data (that is, two events are in a relation with each other).

          1. 2

            When computing an aggregate, you only have to read one column. That’s a substantial I/O saving. If you’re filtering first, you can read the filter column to build a bitmap (1 bit per row), then read the aggregation column; still far less I/O than reading whole records.
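            The two-pass scan described above can be sketched in Python, with in-memory arrays standing in for on-disk column files (toy data, purely illustrative):

```python
import array

# Columnar layout: each field is its own contiguous array (hypothetical events).
status = array.array("H", [200, 500, 200, 404, 500, 200])            # filter column
latency_ms = array.array("d", [12.0, 250.0, 9.0, 30.0, 310.0, 11.0])  # aggregation column

# Pass 1: read only the filter column and build a bitmap, 1 bit per row.
bitmap = 0
for i, s in enumerate(status):
    if s == 500:
        bitmap |= 1 << i

# Pass 2: read only the aggregation column, summing rows the bitmap selects.
# Total I/O: two narrow columns, not every field of every record.
total = sum(v for i, v in enumerate(latency_ms) if (bitmap >> i) & 1)
```

            In a real column store the arrays would be compressed blocks on disk, but the I/O argument is the same: two narrow columns are scanned instead of whole records.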

        2. 2

          Silly me, I thought “observability” was something to do with pub-sub. But it turns out it’s just a buzzword for “fast arbitrary queries”.

          The article also conflates row-oriented storage with fixed schemas. It’s not hard to add arbitrary columns to a row-oriented store; one common way is to store JSON, or some other semi-structured key-value map.
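          As a concrete sketch of that point, here is a row-oriented store with a fixed column plus a JSON blob for arbitrary per-event fields, using sqlite3 as the row store (assumes a SQLite build with the JSON1 functions, which is the default in recent builds; data is made up):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Row-oriented table: a fixed timestamp column plus a semi-structured JSON column.
conn.execute("CREATE TABLE events (ts INTEGER, attrs TEXT)")

# Two events with completely different attribute keys -- no schema change needed.
conn.execute("INSERT INTO events VALUES (?, ?)",
             (1, json.dumps({"service": "api", "status": 500})))
conn.execute("INSERT INTO events VALUES (?, ?)",
             (2, json.dumps({"region": "eu-west", "retries": 3})))

# Query into the semi-structured part with SQLite's json_extract().
rows = conn.execute(
    "SELECT ts FROM events WHERE json_extract(attrs, '$.status') = 500"
).fetchall()
```

          So “arbitrary columns” is orthogonal to row vs. column orientation; the trade-off is that filtering on a JSON key still reads whole rows.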

          That aside, this is a decent intro to column stores and their pros and cons. I’m not sure what makes theirs different/better than the others, though.

          1. 2

            Silly me, I thought “observability” was something to do with pub-sub. But it turns out it’s just a buzzword for “fast arbitrary queries”.

            No, observability refers to the nature of the data they’re ingesting — telemetry data from running services.