1. 14

  2. 3

    A nice overview. Some things that jumped out to me:

    • pg_blocking_pids. You used to have to do a gnarly select on pg_stat_activity to figure this out; this seems much better.
    • jsonb_insert. Updating jsonb has been painful in the past, glad to see it get a bit better.
    • pg_stat_progress_vacuum seems like it could be nice for monitoring.
    • parallel sequential scan (and apparently aggregations too).
    • I can’t quite figure out what the deal is with the “Avoid Full-Table Vacuum” bit but it certainly sounds useful.
    1. 8

      I can’t quite figure out what the deal is with the “Avoid Full-Table Vacuum” bit but it certainly sounds useful.

      Postgres periodically has to run what is called a “freeze vacuum”. Previously a freeze vacuum would require reading all of the data for a table. In 9.6, they changed it to read only a subset of the table.

      Freeze vacuums are necessary because of how Postgres handles transactions. When processing a query, Postgres will ignore every row that was inserted after the query started, or deleted before the query started. To do this, in every row, Postgres keeps both the id of the transaction that inserted the row, and the id of the transaction that deleted the row. Since transaction ids come from a counter, the two transaction ids in the row, along with a list of transactions that were running when the query started allow you to determine whether a row should be visible to the query. If the transaction that inserted the row wasn’t running when the query started, and the id of the inserting transaction is smaller than the id of the transaction of the query, the row was inserted before the query started. Similarly logic is used to determine whether the row was deleted after the query started.

      The problem with this scheme is that transaction ids are only 32 bits. To work around this limitation, Postgres will periodically cleanup old transaction ids, so it can reuse them. When the newest transaction id reaches 2^32, the transaction id will wrap back around to the smallest transaction id (I believe 3 is the smallest since 0-2 are reserved for special purposes). By the time this happens, Postgres will have already replaced everywhere the transaction id 3 is used with a special transaction id (I believe 2) that represents “a really old transaction”.

      Making sure old transaction ids are no longer mentioned is done by the freeze vacuum. Once the difference between the oldest id still used in a table and the newest transaction id reaches some threshold, a vacuum freeze will be triggered on that table. The vacuum freeze will replace all transaction ids older than a certain point with the special transaction id.

      The vacuum freeze used to read every single page for the table and check every single row on that page. This means if you ran a vacuum freeze twice, it would read all of the data for the table twice, even though the second freeze was a no-op. The way they fixed this in 9.6 is by externally keeping track of what pages have only frozen tuples (tuples with the special transaction id). When the freeze vacuum freezes every row on a page, it will set the bit for that page. When a new row is inserted onto that page, or a row from that page is deleted, the bit is unset. The freeze vacuum will then skip over pages for which the bit is set since it knows that all of the rows on those pages are already frozen. If the table is append-only, this means the freeze vacuum won’t redo any work previously done.