This seems to be a well-thought-out proposal, and it’s pretty close to how I’ve seen structured logging implemented in other languages.
The issue I have with it is that structured logging feels a bit dated; every project I’ve been part of has moved to using OpenTelemetry to output traces/spans. We have it configured to output to the console in local development, and in other environments we send to an OpenTelemetry collector.
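That console-vs-collector switch doesn’t even need code changes; the OpenTelemetry SDKs read standard environment variables. A rough sketch (the `APP_ENV` variable and the collector endpoint here are assumptions, not anything from the spec):

```shell
# Hypothetical setup script; APP_ENV and the endpoint value are assumptions.
if [ "$APP_ENV" = "local" ]; then
  # Local development: print spans to the console
  export OTEL_TRACES_EXPORTER=console
else
  # Other environments: ship spans to an OpenTelemetry collector over OTLP
  export OTEL_TRACES_EXPORTER=otlp
  export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
fi
```

`OTEL_TRACES_EXPORTER` and `OTEL_EXPORTER_OTLP_ENDPOINT` are standard variables defined by the OpenTelemetry specification, so the same application binary runs unchanged in both environments.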
I have yet to see a situation where I have traces, and I’ve thought “what this needs is a log message”.
Logging is also not to be confused with console output as part of a CLI etc.; that’s still important!
We have people at work pushing to replace logging with tracing, and it’s been a mixed bag. We don’t have consistent collectors for every layer of our architecture, and for various reasons we apparently can’t/won’t collect traces without sampling.
It also seems like the otel library maintainers break their API on minor versions, which was frustrating as someone who had to integrate their code but definitely isn’t an expert.
Traces are cool when every service in your environment has the library configured, isn’t sampling, and your collectors are able to keep up.
The other use case that I cannot see traces ever replacing is any kind of access/request log that may be searched for audit purposes.
Ok apologies for the tracing rant, I’m glad that they are convenient and work for your use cases 👍
Everyone has different opinions :)
I think the sampling question is a really interesting one, regardless of whether you’re doing it on logs or traces. For instance, we sample a lot of HTTP traffic based on status code (and endpoint); not all traffic is equal!
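The kind of rule I mean could be sketched like this (a hypothetical sampling decision, not any real sampler API; the endpoint name and rates are made up):

```python
import random

# Hypothetical sampling rule: keep every server error, drop health checks,
# and keep only a small random slice of ordinary successful traffic.
def should_keep(status_code: int, endpoint: str, rate: float = 0.01) -> bool:
    if status_code >= 500:
        return True                     # errors are always worth keeping
    if endpoint == "/healthz":
        return False                    # health checks are pure noise
    return random.random() < rate       # sample the rest at a low rate
```

The same decision function works whether the thing being sampled is a log line or a trace, which is why I think the question is independent of which one you emit.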
As for the breakages, that could be down to pre-1.0.0 versions? Not totally sure on this, although I’ve definitely had issues when docs haven’t matched actual APIs.
One other nice side effect is when a downstream service suddenly starts appearing in your traces because they’ve started adding otel themselves; we get benefit without their data, but getting more just makes it even easier to debug (“when we send this particular kind of request, it hits a different codepath which is slow; let’s talk to them and see if we can change something in one or both of our systems”).
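The mechanism behind that is W3C Trace Context propagation: you already send a `traceparent` header on outgoing requests, and the moment the downstream service starts parsing it, their spans join your trace. A minimal sketch of the header format (real per the W3C spec, though real SDKs handle this for you):

```python
# Sketch of the W3C Trace Context "traceparent" header:
#   version "-" trace-id "-" parent-span-id "-" trace-flags
def make_traceparent(trace_id: str, span_id: str) -> str:
    # "00" is the current version; "01" means the trace is sampled.
    return f"00-{trace_id}-{span_id}-01"

def parse_traceparent(header: str) -> tuple[str, str]:
    # A downstream service only needs to split out the ids to join the trace.
    version, trace_id, parent_span_id, flags = header.split("-")
    return trace_id, parent_span_id
```

Because the caller sends this header whether or not the callee cares, adoption downstream is zero-coordination: they start reading it, and their spans show up under yours.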
Not all codebases need tracing, and the simplicity of logs is still good. I originally set up a tracing client in a new product at the start of this year but never used it, because logging was just easier with the existing Datadog integration and my aggregated logs match my local CLI logs.
Tracing is useful and it has its place for sure but not every service is large or complex enough to warrant the additional engineering overhead.
I’ve found on all projects that I’ve preferred tracing, even without any real infrastructure behind it (i.e. just JSON to the console).
To me, the engineering overhead of tracing is just using otel libraries vs a structured logging library; in other words, around the same cost of implementation.
OpenTelemetry tracing is basically a superset of structured logging. The differences basically come down to spans carrying a parent/child hierarchy (plus start/end timing) on top of the same structured attributes.
In my mind I don’t see the two as fundamentally different. Tracing is just a slightly evolved form of structured logging. So I definitely agree that if you have tracing you don’t need a second set of structured logging.
I think that’s pretty spot on, with the hierarchy being the most important part. I’d also add another difference: properties shared across all the spans in a trace don’t need to be duplicated on each one.
I’ve also seen tracing described as “Structured Logging on steroids”, which I think is pretty accurate too :)
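The “superset” relationship described above can be sketched in a few lines. This is a hypothetical schema for illustration, not any particular SDK’s wire format: a span carries the same key/value payload a structured log record does, plus identity, hierarchy, and timing.

```python
import time
import uuid

def log_record(message, **fields):
    # A structured log line: timestamp, message, flat key/value fields.
    return {"ts": time.time(), "msg": message, **fields}

def span(name, trace_id, parent_id=None, **attributes):
    # A span: the same attributes, plus identity, hierarchy, and timing.
    return {
        "name": name,
        "trace_id": trace_id,              # groups spans into one trace
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,            # the hierarchy log records lack
        "start": time.time(),              # spans also carry duration
        "attributes": attributes,          # same payload as the log record
    }

trace_id = uuid.uuid4().hex
root = span("handle_request", trace_id, http_route="/orders")
child = span("query_db", trace_id, parent_id=root["span_id"], table="orders")
```

Strip the `trace_id`/`span_id`/`parent_id`/`start` fields and a span collapses back into a structured log record, which is the sense in which tracing is “structured logging on steroids”.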