1. 12
    1. 2

      That is neat, the kind of things you are really thankful to the blog’s author to have written, as it’s a rather obscure edge-case (at least from my point of view).

      For those (like me) wondering why there’s no such mechanism on MySQL : https://stackoverflow.com/questions/25153532/why-is-it-a-vacuum-not-needed-with-mysql-compared-to-the-postgresql

      PS: I’d suggest removing the [ruby] tab, and go with a [postgre] one.

      1. 2

        Just one clarification as PostgreSQL vacuum just does a lot of things and I think this behavior could have also happened with MySQL.

        In this specific blog post, the problem was caused by outdated table statistics.

        If you look at the query plan: https://explain.depesz.com/s/7qJI

        You can see that the planner wrongly estimated the amount of rows the query would return (the “rows x” column and up/down arrows compared to “rows” column).

        Vacuum in PostgreSQL is a process for reclaiming removed rows, preventing ID wraparound, updating statistics, dumping the pending list of a GIN index to the main tree and other maintenance tasks. This is a costly process and takes time.

        There is also a VACUUM FULL which does full table rewrites. That’s something you almost never run or even want to run - it takes A LOT of time.

        There are also autovacuum processes that are automatically ran to do the above in smaller incremental steps.

        Now, in this particular case. In my humble opinion a vacuum was not needed. The author could have just ran ANALYZE TABLE order_items or even ANALYZE TABLE order_items purchase_order_number would have the same result in a fraction of the time.

        I’m not a MySQL expert, but that database also has statistics (https://dev.mysql.com/doc/refman/8.0/en/analyze-table.html) and a step like this also might help the planner after large imports or adding new columns to big tables. I don’t see anything in the linked stackoverflow post indicating that MySQL would be immune to outdated statistics leading to an inefficient plan.