As you can see in the image below, we have a simple status code alert set up to fire if the average count of 404 errors is greater than 3 across my service. (The hidden beauty of this alert is that it dynamically adapts to the scope of my Frontend service, whether it’s on 2 hosts or across 20, but that’s another blog post!)
This assumes the traffic levels of your service are constant, but they may change with the time of day or with performance improvements or degradation as you add features. An alert at 3 errors per second for a service that’s doing only 1 request per second can never fire.
A ratio of failures to total requests is generally a better way to handle this, as it scales with traffic levels and is more in line with what both you and your users care about.
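To make the difference concrete, here is a minimal Python sketch comparing the two alert conditions. It is not tied to any particular monitoring product: the function names, the 3 errors per second threshold (borrowed from the example above), and the 5% error ratio are all assumptions chosen for illustration.

```python
def absolute_alert(errors_per_sec: float, threshold: float = 3.0) -> bool:
    """Fire when the raw error rate exceeds a fixed threshold (assumed: 3/s)."""
    return errors_per_sec > threshold


def ratio_alert(
    errors_per_sec: float,
    requests_per_sec: float,
    max_error_ratio: float = 0.05,  # hypothetical 5% threshold
) -> bool:
    """Fire when the share of failed requests exceeds max_error_ratio."""
    if requests_per_sec <= 0:
        # No traffic means there is no ratio to evaluate, so do not fire here.
        return False
    return errors_per_sec / requests_per_sec > max_error_ratio


# Low-traffic service: 1 request per second, and every request fails.
print(absolute_alert(errors_per_sec=1.0))                         # False
print(ratio_alert(errors_per_sec=1.0, requests_per_sec=1.0))      # True

# High-traffic service: 1000 requests per second, 4 of them failing.
print(absolute_alert(errors_per_sec=4.0))                         # True
print(ratio_alert(errors_per_sec=4.0, requests_per_sec=1000.0))   # False
```

In the low-traffic case the absolute alert stays silent during a total outage, while in the high-traffic case it fires on a 0.4% failure rate that users would likely never notice. The ratio check behaves sensibly in both. A zero-traffic period is a separate failure mode that a ratio cannot detect, so it probably deserves its own alert.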