Nice intro to HyperLogLog, Locality-Sensitive Hashing (LSH), and a few other probabilistic algorithms. Not in-depth, best for those new to this stuff.
Very nice writeup! Thanks for posting. A small point: the curse of dimensionality refers to the mathematical difficulty of obtaining enough data to sensibly partition high-dimensional feature spaces. This is independent of the computational difficulty mentioned here, i.e., the curse of dimensionality still exists even if we have infinitely powerful computers. The limiting factor is data, not compute power.
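To make that concrete, here is a rough sketch (not from the article; the function name `occupied_fraction` and the choice of 4 bins per dimension are just assumptions for illustration). It drops a fixed number of uniform random points into the unit hypercube, partitions each axis into equal bins, and reports what fraction of the resulting cells contain at least one point. The sample size stays the same while the number of cells grows as `bins_per_dim ** dims`, so coverage collapses no matter how fast the machine is.

```python
import random


def occupied_fraction(n_points, dims, bins_per_dim=4, seed=0):
    """Fraction of grid cells in [0, 1)^dims hit by at least one random point.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    occupied = set()
    for _ in range(n_points):
        # Map each coordinate to the index of the bin it falls in.
        cell = tuple(int(rng.random() * bins_per_dim) for _ in range(dims))
        occupied.add(cell)
    total_cells = bins_per_dim ** dims
    return len(occupied) / total_cells


if __name__ == "__main__":
    # Same 10,000 points each time; only the dimensionality changes.
    for d in (1, 2, 4, 8, 12):
        print(d, occupied_fraction(10_000, d))
```

With 12 dimensions there are already 4^12 (about 16.8 million) cells, so 10,000 points cannot touch more than about 0.06% of them. Getting back to decent coverage means collecting exponentially more data, which is the point about data being the limiting factor.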
Hey, article author here. You’re totally right. I did not describe that part well. Thank you for the feedback!
This connects well with the RethinkDB article posted earlier https://lobste.rs/s/dqxcch/jepsen_rethinkdb_2_1_5/comments/wlk4z5#c_wlk4z5:
And hey, here’s @tbm’s presentation on the topic: http://www.infoq.com/presentations/probabilistic-algorithms