Infrastructure at your Service

Daniel Westermann

PostgreSQL 13 – Autovacuum can now be triggered based on inserts

A common issue with insert only tables in PostgreSQL is, that autovacuum well never kick in as the formula for autovacuum considers obsoleted tuples since the last vacuum but not the number of inserted tuples. Because of this you usually run a manual vacuum when a table is bulk loaded right after the load or you schedule regular vacuums for tables you know are insert only. PostgreSQL 13 will finally solve that issue because autovacuum will now be also triggered based on the amount of inserts into a table (more on the exact formula below). As usual let’s do a small, reproducible, demo to see how that works.

I have the latest (as of today) PostgreSQL development version running in the default configuration, except for some logging parameters I usually change. The table I’ll be playing with is this one:

postgres=# create table t1 ( a int, b text, c date );
CREATE TABLE
postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 
 b      | text    |           |          | 
 c      | date    |           |          | 

When I bulk load that table in PostgreSQL before 13, autovacuum will never kick in. With PostgreSQL 13 we have two new configuration parameters that control autovacuum when it comes to inserts:

postgres=# show autovacuum_vacuum_insert_threshold;
 autovacuum_vacuum_insert_threshold 
------------------------------------
 1000
(1 row)

postgres=# show autovacuum_vacuum_insert_scale_factor;
 autovacuum_vacuum_insert_scale_factor 
---------------------------------------
 0.2
(1 row)

postgres=# 

The formula is:

vacuum insert threshold = autovacuum_vacuum_insert_threshold + autovacuum_vacuum_insert_scale_factor * number of tuples

When we start with an empty table autovacuum should kick in when we insert more than 1000 rows:

postgres=# insert into t1 (a,b,c) select i, i::text, now() from generate_series(1,1001) i;
INSERT 0 1001

Wait some time (autovacuum_naptime is 1 minute in the default configuration, so this is the maximum you will need to wait) and check pg_stat_user_tables for the last autovacuum:

postgres=# select last_autovacuum, autovacuum_count from pg_stat_user_tables where relname = 't1';
       last_autovacuum        | autovacuum_count 
------------------------------+------------------
 2020-03-29 15:25:42.75171+02 |                1
(1 row)

Perfect, that works. This is really a great improvement as you do not need to take special care of mostly insert only tables that do not receive a lot of updates or deletes. In addition you can probably skip all the manual vacuums after bulk loading.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure