I was curious how my fellow 🦞s manage the logs generated by their services, be it at home or at work.
It looks like “Observability” is the cool new trend nowadays, with a heavy focus on structured logging and dynamic sampling, and many tools built to support both. So I’m wondering: what tools do my fellow crustaceans use, are they built in-house or a vendor product, and how has that been going?
I was reading a blog post on Observability and found a few interesting quotes:
> When a request enters a service, we initialize an empty Honeycomb blob and pre-populate it with everything we know or can infer about the request: parameters passed in, environment, language internals, container/host stats, etc. While the request is executing, you can stuff in any further information that might be valuable: user ID, shopping cart ID — anything that might help you find and identify this request in the future, stuff it all in the Honeycomb blob. When the request is ready to exit or error, we ship it off to Honeycomb in one arbitrarily wide structured event, typically 300-400 dimensions per event for a mature instrumented service.
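As I understand it, the “wide event” pattern in that quote boils down to one dict per request that accumulates fields and is shipped once at exit. A minimal sketch, assuming Python and purely hypothetical names (`WideEvent`, `handle_request`); a real setup would hand the payload to an SDK or log pipeline rather than print it:

```python
import json
import time


class WideEvent:
    """One arbitrarily wide structured event per request (hypothetical sketch)."""

    def __init__(self, **base_fields):
        # Pre-populate with everything known at request entry.
        self._fields = dict(base_fields)
        self._start = time.monotonic()

    def add(self, **fields):
        # Stuff in anything that might help identify this request later.
        self._fields.update(fields)

    def finish(self):
        # Finalize at exit/error and return the full payload for shipping.
        self._fields["duration_ms"] = round((time.monotonic() - self._start) * 1000, 2)
        return self._fields


def handle_request(params):
    event = WideEvent(endpoint="/payment", method="POST", params=params)
    # ... while handling the request, keep adding context ...
    event.add(user_id="u_123", cart_id="c_456")
    payload = event.finish()
    print(json.dumps(payload))  # in production: ship to the event backend instead
    return payload
```

The key design point, as I read it, is one event per request rather than many scattered log lines, so every dimension lands in the same row and can be queried together.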
> Dynamic sampling isn’t the dumb sampling that most people think of when they hear the term. It means consciously retaining ALL the events that we know to be high-signal (e.g. 50x errors), sampling heavily the common and frequent events we know to be low-signal (e.g. health checks to /healthz), sampling moderately for everything in between (e.g. keep lots of HTTP 200 requests to /payment or /admin, few of HTTP 200 requests to /status or /). Rich dynamic sampling is what lets you have your cake and eat it too.
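If I’ve understood the quote right, dynamic sampling is just per-signal sample rates chosen by event content, with the rate recorded so the backend can re-weight counts. A rough sketch, assuming Python and made-up rate numbers and route names:

```python
import random

# Hypothetical per-signal rates: 1 = keep every event, N = keep roughly 1 in N.
SAMPLE_RATES = {
    "error_5xx": 1,       # high-signal: retain ALL of these
    "healthcheck": 1000,  # low-signal: sample heavily
    "important_2xx": 5,   # e.g. 200s to /payment or /admin: keep lots
    "default_2xx": 100,   # e.g. 200s to /status or /: keep few
}


def classify(status, path):
    """Bucket an event by how much signal it likely carries."""
    if status >= 500:
        return "error_5xx"
    if path == "/healthz":
        return "healthcheck"
    if path in ("/payment", "/admin"):
        return "important_2xx"
    return "default_2xx"


def sample(status, path, rng=random.random):
    """Return (keep, sample_rate); the rate is shipped with the event so the
    backend can multiply counts back up to the true totals."""
    rate = SAMPLE_RATES[classify(status, path)]
    return rng() < 1.0 / rate, rate
```

The re-weighting detail is what makes this different from dumb head sampling: because each kept event carries its rate, aggregate queries stay accurate even though most low-signal events were dropped.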