1. 19

  2. 2

    A note from the trenches: (especially) if your Postgres workload involves lots of transactions, make sure that your AUTOVACUUM and/or VACUUM processes run often enough, and to completion.

    If not, you can hit an issue where the database runs out of transaction IDs – xid is an int32, and very bad things happen when you hit the ceiling. The database will dutifully shut down and come back up read-only until a VACUUM process (manual, this time) is run to clean up various toast (temp) files. This takes days to weeks.

    I came distressingly close to this point last fall. I woke up one day to discover that my poor DBA skills had given me about ten days to migrate off that database, because I’d been pushing it too hard for months and AUTOVACUUM wasn’t keeping up. One very stressed week later, most of that workload had been migrated to DynamoDB, and the rest we were able to transfer to a fresh Postgres DB.

    Here’s a blow-by-blow of when this problem took Sentry down in 2015: https://blog.sentry.io/2015/07/23/transaction-id-wraparound-in-postgres.html

    Anyway, I hope this anecdote saves at least one Lobster from a week like I had.