Infrastructure at your Service

Daniel Westermann

PostgreSQL 13: parallel vacuum for indexes

Because of its implementation of MVCC PostgreSQL needs a way to cleanup old/dead rows and this is the responsibility of vacuum. Up to PostgreSQL 12 this is done table per table and index per index. There are a lot of parameters to fine tune auto vacuum but none of those allowed vacuum to run in parallel against a relation. The only option you had to allow auto vacuum to do more work in parallel was to increase autovacuum_max_workers so that more relations can be worked on at the same time. Working against multiple indexes of the same table was not possible. Once PostgreSQL 13 will be released this will change.

When you have a look at the help of vacuum you will notice a new option:

postgres=# \h vacuum
Command:     VACUUM
Description: garbage-collect and optionally analyze a database
Syntax:
VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]

where option can be one of:

    FULL [ boolean ]
    FREEZE [ boolean ]
    VERBOSE [ boolean ]
    ANALYZE [ boolean ]
    DISABLE_PAGE_SKIPPING [ boolean ]
    SKIP_LOCKED [ boolean ]
    INDEX_CLEANUP [ boolean ]
    TRUNCATE [ boolean ]
    PARALLEL integer

and table_and_columns is:

    table_name [ ( column_name [, ...] ) ]

URL: https://www.postgresql.org/docs/devel/sql-vacuum.html

By providing a positive integer to the “PARALLEL” option you tell vacuum how many background workers should be used to vacuum indexes for a given table (0 will disable parallel processing). Lets do a small test setup to demonstrate this:

postgres=# create table t1 as select i as a, i::text as b, now() as c from generate_series(1,3000000) i;
SELECT 3000000
postgres=# create index i1 on t1(a);
CREATE INDEX
postgres=# 
postgres=# create index i2 on t1(b);
CREATE INDEX
postgres=# create index i3 on t1(c);
CREATE INDEX

One table, four indexes. If we go for parallel 4 we should see four background workers doing the work against the indexes:

postgres=# update t1 set a=5,b='ccc',c=now() where mod(a,5)=0;
UPDATE 600000
postgres=# vacuum (parallel 4) t1;
VACUUM

As the table and the indexes are quite small we need to be fast but at least 2 parallel workers show up in the process list for the vacuum operation:

postgres 16688 15925 13 07:30 ?        00:01:07 postgres: postgres postgres [local] VACUUM
postgres 19184 15925  0 07:39 ?        00:00:00 postgres: parallel worker for PID 16688   
postgres 19185 15925  0 07:39 ?        00:00:00 postgres: parallel worker for PID 16688   

Nice. Please note that indexes are only considered for parallel vacuum when they meet the min_parallel_index_scan_size criteria. For FULL vacuum there will be no parallel processing as well.

Btw: The current maximal value is 1024:

postgres=# vacuum (parallel -4) t1;
ERROR:  parallel vacuum degree must be between 0 and 1024
LINE 1: vacuum (parallel -4) t1;

You can also see the parallel stuff on the verbose output:

postgres=# vacuum (parallel 4, verbose true) t1;
INFO:  vacuuming "public.t1"
INFO:  launched 2 parallel vacuum workers for index vacuuming (planned: 2)
INFO:  scanned index "i2" to remove 600000 row versions by parallel vacuum worker
DETAIL:  CPU: user: 0.24 s, system: 0.06 s, elapsed: 0.89 s
INFO:  scanned index "i1" to remove 600000 row versions
DETAIL:  CPU: user: 0.17 s, system: 0.10 s, elapsed: 1.83 s
INFO:  scanned index "i3" to remove 600000 row versions by parallel vacuum worker
DETAIL:  CPU: user: 0.16 s, system: 0.14 s, elapsed: 1.69 s
INFO:  scanned index "i4" to remove 600000 row versions by parallel vacuum worker
DETAIL:  CPU: user: 0.25 s, system: 0.09 s, elapsed: 1.17 s
INFO:  "t1": removed 600000 row versions in 20452 pages
DETAIL:  CPU: user: 0.17 s, system: 0.16 s, elapsed: 1.43 s
INFO:  index "i1" now contains 3000000 row versions in 14308 pages
DETAIL:  600000 index row versions were removed.
1852 index pages have been deleted, 640 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "i2" now contains 3000000 row versions in 14305 pages
DETAIL:  600000 index row versions were removed.
1851 index pages have been deleted, 640 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "i3" now contains 3000000 row versions in 14326 pages
DETAIL:  600000 index row versions were removed.
3941 index pages have been deleted, 1603 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "i4" now contains 3000000 row versions in 23391 pages
DETAIL:  600000 index row versions were removed.
5527 index pages have been deleted, 2246 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "t1": found 600000 removable, 3000000 nonremovable row versions in 21835 out of 22072 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 499
There were 132 unused item identifiers.
Skipped 0 pages due to buffer pins, 237 frozen pages.
0 pages are entirely empty.
CPU: user: 0.75 s, system: 0.36 s, elapsed: 5.07 s.
INFO:  vacuuming "pg_toast.pg_toast_16392"
INFO:  index "pg_toast_16392_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "pg_toast_16392": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 499
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure