This is really cool! Thanks for writing & sharing :)
I’m curious about a couple of things:
How do you recommend actually issuing alerts based on the query? I could imagine having a cron job execute the query and text/email/PagerDuty if there’s an anomaly returned for a given period, but I’m wondering if there’s a better existing solution here.
You mentioned at the end that there are some tools that provide similar functionality; I’m wondering if you could give a few examples? I know Datadog has great alarm tooling, but I would imagine those are less general-purpose than a technique like this.
Hey Jeff, glad you liked it.
How do you recommend actually issuing alerts based on the query?
Just like you said. A cron job executing the query at regular intervals and sending an email/text/whatever if it detects an anomaly. I know that there are some reporting tools (I use Redash for example) that have this ability as well.
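For what it's worth, here is a rough sketch of what such a cron-driven check could look like. Everything in it is an assumption for illustration: the `anomalies` table, the column names, the query text, and the `notify` callback are hypothetical stand-ins, and SQLite is used only so the example runs on its own. In practice you would plug in the real detection query and a real email/text sender.

```python
import sqlite3
from typing import Callable, List, Tuple

Row = Tuple[str, float]


def check_and_alert(
    conn: sqlite3.Connection,
    query: str,
    since: str,
    notify: Callable[[str], None],
) -> List[Row]:
    """Run the anomaly query and call notify() once if it returns any rows."""
    rows = conn.execute(query, (since,)).fetchall()
    if rows:
        lines = [f"{period}: {value}" for period, value in rows]
        notify("Anomalies detected:\n" + "\n".join(lines))
    return rows


def demo() -> Tuple[List[Row], List[str]]:
    # Hypothetical data: pretend the detection query's results live in a table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE anomalies (period TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO anomalies VALUES (?, ?)",
        [("2020-01-01 10:00", 12.0), ("2020-01-01 11:00", 97.0)],
    )
    sent: List[str] = []  # stand-in for an email/text/PagerDuty sender
    rows = check_and_alert(
        conn,
        "SELECT period, value FROM anomalies WHERE period >= ? ORDER BY period",
        "2020-01-01 11:00",
        sent.append,
    )
    return rows, sent


if __name__ == "__main__":
    found, messages = demo()
    for message in messages:
        print(message)
```

Saved as a script (say, a hypothetical `check_anomalies.py`), this could be scheduled with a crontab entry such as `*/5 * * * * python check_anomalies.py`. Passing only the window start (`since`) keeps each run from re-alerting on old periods.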
I’m wondering if you could give a few examples?
I imagine any monitoring tool should have this functionality. Datadog and Scout are two that come to mind.
The main point I wanted to convey in this article is that you can set up a pretty decent monitoring system with plain SQL, zero dependencies and no $$$. From my experience, this simple method can go a long way.
We use https://www.anodot.com/ (somewhere in the org). It’s downstream from our metrics gathering that I maintain (Graphite, Prometheus) but supports both AFAIK. I also don’t know if there’s a free/open source version or it’s a paid tool.
At home:
At work:
P.S. This is my first time posting on this thread :)
Is the Recurse Center a venue for ongoing technical / CS education? I had thought it was a boot camp focused on core CS skills.
Preparing for a public launch! Exciting after several months of building in private beta :)
That mostly means putting all new features on hold, and going through our backlog to see what’s essential and has been put off…