1. 4

  1. 4

    I think this article describes audit tables, and it seems the argument is that an audit table has to be designed specifically for the table whose changes it tracks, which is why audit tables as a solution are not “automatic”.

    I’ve been working on a flexible audit table solution for Postgres, based on the official docs, some blog posts and advice from the Postgres IRC channel. It works like this:

    First you create the audit table to store the changes. Note that there is only one audit table, shared by however many tables you want to track. The assumption is that this table is only for storage; whenever you need to manipulate the data, you copy a subset of it into a temporary table for further work (a sketch of that follows the table definition below).

    CREATE TABLE IF NOT EXISTS public.audit
    (
      change_date timestamp with time zone NOT NULL DEFAULT now(),

      -- session_user may be the (or an) application's DB role or perhaps a developer's role
      session_user_name text NOT NULL,

      -- current_user may be set to something other than the user that started the session
      current_user_name text NOT NULL,

      -- this will be provided by the application (SET "application"."user" = 'bob';)
      -- useful if application users are stored in a table rather than as DB roles
      application_user_name text,

      -- the command that modified the data: 'I' (insert), 'U' (update) or 'D' (delete)
      action character(1) NOT NULL,

      -- the table where the data was modified
      table_name text NOT NULL,

      -- the table's OID. May be useful if indexed and you query the audit table for a specific relid
      relid oid NOT NULL,

      -- values identifying the changed row (its primary key columns)
      pkey jsonb,

      -- the row before and after the change. JSONB for schemaless data (this can store rows
      -- from multiple different tables). before_change is NULL for inserts and after_change
      -- is NULL for deletes
      before_change jsonb,
      after_change jsonb
    );
    
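
    For example, the “copy a subset into a temporary table” step mentioned above might look like this (pulling one table’s changes from the last day; shop.books is just an example name):

    CREATE TEMP TABLE recent_books_changes AS
    SELECT *
    FROM public.audit
    WHERE table_name = 'shop.books'
      AND change_date >= now() - interval '1 day';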

    Everything in that table except the application user name will be provided by Postgres; there’s no manual work needed to populate it.

    If it’s not guaranteed that the application will always provide the application_user_name, it’s convenient to set an empty default, since current_setting() raises an error when the variable is unset:

    SET "application"."user" = '';
    ALTER SYSTEM SET "application"."user" = '';
    SELECT pg_reload_conf();
    

    The ALTER SYSTEM part has to be done only once for a PG cluster; the plain SET just applies the value to the current session right away.
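
    Alternatively, on PostgreSQL 9.6 and newer you could skip the default entirely and pass missing_ok to current_setting in the trigger function below; it then returns NULL instead of raising an error when the variable is unset:

    SELECT current_setting('application.user', true);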

    Then you’ll have to define a function whose purpose is to record changes. It’s designed to be executed by a trigger; you’ll define such triggers for each table you want to audit.

    CREATE OR REPLACE FUNCTION public.audit() RETURNS trigger AS
    $BODY$
    DECLARE
      before JSONB; after JSONB;
      pkey JSONB;
      source record;
    BEGIN
      IF TG_OP = 'UPDATE' THEN
        -- skip no-op updates (the return value of an AFTER trigger is ignored anyway)
        IF NEW IS NOT DISTINCT FROM OLD THEN RETURN NULL; END IF;
        SELECT json_object_agg(key, value)::jsonb
        INTO after
        FROM (
          -- EXCEPT here eliminates fields that didn't change
          SELECT * FROM json_each_text(row_to_json(NEW.*))
          EXCEPT
          SELECT * FROM json_each_text(row_to_json(OLD.*))
        ) y;
        SELECT json_object_agg(key, value)::jsonb
        INTO before
        FROM (
          SELECT * FROM json_each_text(row_to_json(OLD.*))
          EXCEPT
          SELECT * FROM json_each_text(row_to_json(NEW.*))
        ) y;
        source := NEW;
      ELSIF TG_OP = 'DELETE' THEN
        -- a delete has no "after" image; record the full old row as the before image
        SELECT json_object_agg(key, value)::jsonb INTO before FROM json_each_text(row_to_json(OLD.*));
        source := OLD;
      ELSIF TG_OP = 'INSERT' THEN
        SELECT json_object_agg(key, value)::jsonb INTO after FROM json_each_text(row_to_json(NEW.*));
        source := NEW;
      END IF;

      -- collect the primary key columns of the audited table
      SELECT json_object_agg(key, value)::jsonb
      INTO pkey
      FROM (
        SELECT *
        FROM json_each(row_to_json(source.*)) AS j
        WHERE EXISTS (
          SELECT a.attname
          FROM pg_index i
          JOIN pg_attribute a ON a.attrelid = i.indrelid AND a.attnum = ANY(i.indkey)
          WHERE
            i.indrelid = TG_RELID -- simpler than building the name by hand, and quoting-safe
            AND i.indisprimary
            AND a.attname = j.key
        )
      ) y;

      INSERT INTO audit(
        session_user_name,
        current_user_name,
        application_user_name,
        action,
        table_name,
        relid,
        pkey,
        before_change,
        after_change
      )
      VALUES (
        session_user,
        current_user,
        current_setting('application.user'),
        SUBSTRING(TG_OP, 1, 1),
        CONCAT_WS('.', TG_TABLE_SCHEMA, TG_TABLE_NAME),
        TG_RELID,
        pkey,
        before,
        after
      );
      -- the return value of an AFTER row trigger is ignored
      RETURN NULL;
    END;
    $BODY$ LANGUAGE plpgsql VOLATILE COST 100;
    
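    If you want to sanity-check the primary-key lookup used in the function, the same catalog query can be run standalone against any table (shop.books here is just an example name):

    SELECT a.attname
    FROM pg_index i
    JOIN pg_attribute a ON a.attrelid = i.indrelid AND a.attnum = ANY(i.indkey)
    WHERE i.indrelid = 'shop.books'::regclass
      AND i.indisprimary;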

    Having done the above, you can now start auditing chosen tables, for example:

    CREATE TRIGGER audit
      AFTER INSERT OR UPDATE OR DELETE -- any combination works here. Note that UPDATE lets you restrict the trigger to chosen columns (UPDATE OF col1, col2, ...)
      ON shop.books
      FOR EACH ROW
      EXECUTE PROCEDURE public.audit();
    
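    A quick way to check that everything is wired up, assuming shop.books has id and title columns (hypothetical names for this example):

    SET "application"."user" = 'bob';
    UPDATE shop.books SET title = 'New title' WHERE id = 1;
    SELECT action, pkey, before_change, after_change FROM public.audit ORDER BY change_date DESC LIMIT 1;
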
    1. 1

      What’s the reasoning behind COST 100?

      1. 1

        It’s just a default. I haven’t put much thought into it.

    2. 1

      I get an “insecure site” warning in both Firefox & Chrome.

      1. 2

        Seems like a slightly buggy github pages setup. I found the post here: https://github.com/horia141/horia141.github.io/blob/master/_posts/2017-08-21-history-tables.md