1. 9
  1. 2

    When writing the post-mortem for this bug, I spotted that data in our staging and production services were different. And that’s why our data migration crashed and left one of the core tables in a broken state.

    Isn’t the discrepancy between staging and production data the root cause of your issue? If your staging data differs enough from production data for this to happen, I don’t think adding more migration testing is a real fix.