      I’ve actually had success with “log all the things” before, but usually in the form of what are essentially traces or request logs. Even this has its problems: it’s expensive, although often worth the cost, and it’s almost certain you will need to do ongoing compliance work.

      The thing that makes this work is that it’s a structured log with a fairly uniform schema. There’s also a well-trodden path to sampling (including sampling complete transactions, not just randomly dropping log lines).
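      Sampling complete transactions usually means making the keep/drop decision a deterministic function of the trace ID, so every service handling the same request makes the same choice. A minimal sketch of the idea (the function name and the 1% rate are illustrative, not from any particular tracing library):

```python
import hashlib

def keep_transaction(trace_id: str, sample_rate: float = 0.01) -> bool:
    """Decide whether to keep *every* log line for a transaction.

    The decision depends only on the trace ID, so any service that
    hashes the same ID reaches the same verdict -- sampled traces
    stay complete instead of having random lines dropped.
    """
    # Map the trace ID to a uniform value in [0, 1).
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate
```

      Because the hash is stable, you can re-run the same decision at ingest time, in a sidecar, or in a different service, and the sampled set stays consistent.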

      What is usually a disaster, though, is shipping unstructured printf-style logs. As the article mentions, if you don’t know who’s going to be consuming them, you don’t know which debug statements are load-bearing. I’ve seen big companies trying to reduce logging costs, but even though they knew 99% of the data was useless, they didn’t know which 99% it was.

      I think that a lot of companies are religiously saving random library output on stderr/stdout to long-term storage. Instead it’s often better to keep it in a local ring buffer of some sort.
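      The ring-buffer approach can be as simple as a bounded deque that only gets flushed when something actually goes wrong (the class and method names here are my own, just to illustrate):

```python
from collections import deque

class RingLog:
    """Keep only the most recent N log lines in memory.

    Instead of shipping every stderr/stdout line to long-term storage,
    old lines silently fall off the front; the buffer is dumped only
    when an error handler decides the recent context is worth keeping.
    """
    def __init__(self, capacity: int = 1000):
        self.buf = deque(maxlen=capacity)  # oldest lines are evicted

    def log(self, line: str) -> None:
        self.buf.append(line)

    def dump(self) -> list:
        # Called from an error path: everything here is recent context.
        return list(self.buf)
```

      You’d wire the noisy library output into `log()` and call `dump()` from an exception handler, so storage cost is bounded regardless of how chatty the libraries are.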

      [edit]

      I should add the domain I work in, because I think that matters. I’ve spent time in e-commerce and ad-tech. In e-commerce, if you have an average order value of $50, the percentage of revenue going to logging HTTP requests is relatively small. In ad-tech, you just can’t budget as much storage for every impression.