1. 18

  2. 6

    Adding and removing columns, keeping track of the basic state of the database is the easy part.

    Something that would be worth expanding on; how do you test migrations? How do you verify your migrations do what you want and that your code works with the (now updated) data?

    Also important; when do you run migrations? How do you do zero downtime migrations?

    1. 1

      Heroku employee here from the Department of Data. For migrations that could potentially cause high-impact, we advise customers to create a fork of their database and run a migration there first. Observe any problems and figure out a rough timeline. This may add some additional cost because you are briefly paying for two databases, but the overall cost is low compared to the risk of production impact.

      This isn’t a pure apples-to-apples comparison since you don’t have a production workload, but it is far better than running the migration on your laptop or a tiny test database and hoping for the best.

      As far as zero downtime migrations go, this requires some further engineering effort. The author mentions adding columns without a default value and then backfilling data. Solutions like these are best since they involve nearly 0 production impact. However, discipline and follow-up are required since your database will be in a sub-ideal state for a time.

      As for when… At Heroku we tend to do migrations and maintenance during the day when help is more readability available and everyone is at their best. Some seasoned developers/sysadmins may insist on doing migrations at 3 AM on a Sunday, but this only increases the risk of mistakes since everyone is tired, grumpy, and trying to move as quickly as possible to get back to sleep.

    2. 2

      Somewhat on-topic, the best tool I’ve found for managing migrations is sqitch. Not only is it way better than letting whatever obnoxious ORM you’re using manage the migrations, it’s actually pretty pleasant to use.

      1. 2

        This looks very similar to Flyway, which I love for the exact reasons you’re highlighting: you write the updates in pure SQL, making it trivial to use things like UPDATE INDEX CONCURRENT and the like (though unlike sqitch, Flyway does use database version numbers—something I’m okay with, honestly). That said, I don’t honestly find ORM migration tools universally awful. There’s nothing wrong with Django’s/Rails’/EF’s built-in tools for small projects; I’d just go to something like Flyway when the project gets a bit older and you start caring about things like details index specifications/making use of things such as the PostgreSQL update patterns like here.

        1. 1

          One of my team mates at a previous job was able to open-source our schema management tool (https://github.com/dbsteward/dbsteward). It was developed in-house and used in a production environment for many years before things like ActiveRecord migrations were commonplace.

          It requires a good bit of XML/XSLT knowledge but it worked very well for us in a HIPAA/PHI environment where data had to be tightly controlled and audited. It was a useful tool back in 2011 when working in a fairly legacy home-grown PHP codebase. I haven’t used dbsteward since I left that job, so I can’t really comment on using it with more modern frameworks.