1. 3

A detailed analysis of finding and fixing database driver bugs at scale in production.

  2. 1

    When I read the fix for the first problem (read to see if there is an error before writing), it occurred to me that a race remains: the timeout could still fire after the read but before the write.

    That is, this is not a “fix” in the sense that the problem can no longer occur. It makes the problem so unlikely that it is fixed for all practical purposes, at least while there are other bugs in the system.

    I find that interesting because it sits on the boundary between correctness and practicality.
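
The window described in this comment can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions, not the driver's actual code: `FakeConnection`, `looks_alive`, `checked_write`, and the `between_check_and_write` hook are hypothetical names invented for the illustration, and the server timeout is triggered by hand rather than by a real clock.

```python
class FakeConnection:
    """Hypothetical stand-in for a pooled database connection."""

    def __init__(self):
        self.closed_by_server = False

    def server_timeout(self):
        # Simulates the server dropping an idle connection.
        self.closed_by_server = True

    def looks_alive(self):
        # The "read before write" check: notices a connection that is already closed.
        return not self.closed_by_server

    def write(self, query):
        if self.closed_by_server:
            raise ConnectionError("write on a connection the server closed")
        return f"sent: {query}"


def checked_write(conn, query, between_check_and_write=None):
    """Check the connection, then write. The optional hook runs in the gap between the two."""
    if not conn.looks_alive():
        return "stale connection detected before writing"
    if between_check_and_write is not None:
        between_check_and_write()  # an ill-timed event lands here
    return conn.write(query)


# Case 1: the timeout fires before the check, so the check catches it.
early = FakeConnection()
early.server_timeout()
result_early = checked_write(early, "SELECT 1")

# Case 2: the timeout fires after the check but before the write.
# The check passes and the write still fails.
late = FakeConnection()
try:
    result_late = checked_write(
        late, "SELECT 1", between_check_and_write=late.server_timeout
    )
except ConnectionError as exc:
    result_late = f"error: {exc}"

print(result_early)
print(result_late)
```

In this sketch the check removes the first failure but not the second. It only narrows the vulnerable interval from "any time the connection sat idle" to the gap between the check and the write, which matches the comment's point that the fix makes the problem unlikely rather than impossible.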