1. 6
  1.  

  2. 2

    I used this paper’s technique in a wireless sensor network research project (embedded hardware, very limited RAM) back in like 2006 or so. So, a fun trip down memory lane seeing it posted here! :)

    Is there nothing that builds upon this idea since? Surely stats dashboards (StatsD / Graphite, etc.) would want to use something like this I would think.

    1. 1

      I haven’t had a chance to look it over carefully yet, but this one seems to be very similar, though I’m not sure how it compares.

    2. 2

      For estimating histograms:

      We show that a very good estimate of any quantile can be obtained by locating points at the minimum, the maximum, and at the current estimates of (p/2)-, p-, and (1 + p)/2-quantiles. This divides observations into four cells whose boundaries (called marker heights) are ad- justed if necessary using a Piecewise-Parabolic (PP or P²) formula.

      1. 1

        This algorithm lets you calculate percentiles (or whatever quantile) in a constant amount of memory. On the other hand, the naive approach is to capture every data point and calculate the percentiles from that - which means high throughput streams of data will require a lot of memory. This seems to be an old algorithm that’s unfortunately been lost.

        1. 2

          I assure you that these algorithms are alive and well in databases.