Infrastructure at your Service

Daniel Westermann

Be careful with prepared transactions in PostgreSQL

PostgreSQL gives you the possibility for two-phase commit. You’ll might need that if you want an atomic distributed commit. If you check the PostgreSQL documentation there is a clear warning about using these kind of transactions: “Unless you’re writing a transaction manager, you probably shouldn’t be using PREPARE TRANSACTION”. If you really need to use them, you need to be very careful, that prepared transactions are committed or rollback-ed as soon as possible. In other words, you need a mechanism that monitors the prepared transactions in your database and takes appropriate action if they are kept open too long. If this happens you will run into various issues and it is not immediately obvious where your issues come from.

To start with, lets create a simple prepared transaction:

postgres=# begin;
BEGIN
postgres=*# create table t1 (a int);
CREATE TABLE
postgres=*# insert into t1 values (1);
INSERT 0 1
postgres=*# prepare transaction 'abc';
PREPARE TRANSACTION

From this point on, the transaction is not anymore associated with the session. You can verify that easily if you try to commit or rollback the transaction:

postgres=# commit;
WARNING:  there is no transaction in progress
COMMIT

This also means that the “t1” table that was created before we prepared the transaction is not visible to us:

postgres=# select * from t1;
ERROR:  relation "t1" does not exist
LINE 1: select * from t1;
                      ^

Although we are not in any visible transaction anymore, there are locks in the background because of our prepared transaction:

postgres=# select * from pg_locks where database = (select oid from pg_database where datname = 'postgres') and mode like '%Exclusive%';
 locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid |        mode         | granted | fastpath | waitstart 
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-----+---------------------+---------+----------+-----------
 relation |    12969 |    24582 |      |       |            |               |         |       |          | -1/562             |     | RowExclusiveLock    | t       | f        | 
 relation |    12969 |    24582 |      |       |            |               |         |       |          | -1/562             |     | AccessExclusiveLock | t       | f        | 
(2 rows)

There is one AccessExclusiveLock lock, wihch is the lock on the “t1” table. The other lock, “RowExclusiveLock”, is the lock that protects the row we inserted above. How can we know that? Well, currently this is only a guess, as the “t1” table is not visible:

postgres=# select relname from pg_class where oid = 24582;
 relname 
---------
(0 rows)

Once we commit the prepared transaction, we can verify, that it really was about “t1”:

postgres=# commit prepared 'abc';
COMMIT PREPARED
postgres=# select relname from pg_class where oid = 24582;
 relname 
---------
 t1
(1 row)

postgres=# select * from t1;
 a 
---
 1
(1 row)

We can also confirm that by again taking a look the locks:

postgres=# select * from pg_locks where database = (select oid from pg_database where datname = 'postgres') and mode like '%Exclusive%';
 locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath | waitstart 
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-----+------+---------+----------+-----------
(0 rows)

These locks are gone as well. So, not a big deal, as soon as the prepared transaction is committed all is fine. This is the good case and if it goes like that you will probabyl not hit any issue.

Lets create another prepared transaction:

postgres=# begin;
BEGIN
postgres=*# insert into t1 values(2);
INSERT 0 1
postgres=*# prepare transaction 'abc';
PREPARE TRANSACTION

First point to remember: Once you create a prepared transaction it is fully stored on disk:

postgres=# \! ls -la $PGDATA/pg_twophase/*
-rw------- 1 postgres postgres 212 Feb 26 11:24 /u02/pgdata/DEV/pg_twophase/00000233

Once it is committed the file is gone:

postgres=# commit prepared 'abc';
COMMIT PREPARED
postgres=# \! ls -la $PGDATA/pg_twophase/
total 8
drwx------  2 postgres postgres 4096 Feb 26 11:26 .
drwx------ 20 postgres postgres 4096 Feb 26 10:49 ..

Why is that? The answer is, that a prepared transaction even can be committed or rollback-ed if the server crashes. But this also means, that prepared transactions are persistent across restarts of the instance:

postgres=# begin;
BEGIN
postgres=*# insert into t1 values(3);
INSERT 0 1
postgres=*# prepare transaction 'abc';
PREPARE TRANSACTION
postgres=# \! pg_ctl restart 
waiting for server to shut down.... done
server stopped
waiting for server to start....2021-02-26 11:28:51.226 CET - 1 - 10576 -  - @ LOG:  redirecting log output to logging collector process
2021-02-26 11:28:51.226 CET - 2 - 10576 -  - @ HINT:  Future log output will appear in directory "pg_log".
 done
server started
postgres=# \! ls -la  $PGDATA/pg_twophase/
total 12
drwx------  2 postgres postgres 4096 Feb 26 11:28 .
drwx------ 20 postgres postgres 4096 Feb 26 11:28 ..
-rw-------  1 postgres postgres  212 Feb 26 11:28 00000234

Is that an issue? Imagine someone prepared a transaction and forgot to commit or rollback the transaction. A few days later someone wants to modify the application and tries to add a column to the “t1” table:

postgres=# alter table t1 add column b text;

This will be blocked for no obvious reason. Looking at the locks once more:

 locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |        mode         | granted | fastpath |           waitstart           
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------------+---------+----------+-------------------------------
 relation |    12969 |    24582 |      |       |            |               |         |       |          | 3/4                | 10591 | AccessExclusiveLock | f       | f        | 2021-02-26 11:30:30.303512+01
 relation |    12969 |    24582 |      |       |            |               |         |       |          | -1/564             |       | RowExclusiveLock    | t       | f        | 
(2 rows)

We can see that pid 10591 is trying to get the look but cannot get in (granted=’f’). The other entry has no pid entry and this is the prepared transaction. The pid will always be empty for prepared transactions, so if you already know this, it might point you to the correct solution for this. If you don’t, then you are almost stuck. There is no session you can terminate, as nothing is reported about that in pg_stat_activity:

postgres=# select datid,datname,pid,wait_event_type,wait_event,state,backend_type from pg_stat_activity ;
 datid | datname  |  pid  | wait_event_type |     wait_event      | state  |         backend_type         
-------+----------+-------+-----------------+---------------------+--------+------------------------------
       |          | 10582 | Activity        | AutoVacuumMain      |        | autovacuum launcher
       |          | 10584 | Activity        | LogicalLauncherMain |        | logical replication launcher
 12969 | postgres | 10591 | Lock            | relation            | active | client backend
 12969 | postgres | 10593 |                 |                     | active | client backend
       |          | 10580 | Activity        | BgWriterHibernate   |        | background writer
       |          | 10579 | Activity        | CheckpointerMain    |        | checkpointer
       |          | 10581 | Activity        | WalWriterMain       |        | walwriter
(7 rows)

You will not see any blocking sessions (blocked_by=0):

postgres=# select pid
postgres-#      , usename
postgres-#      , pg_blocking_pids(pid) as blocked_by
postgres-#      , query as blocked_query
postgres-#   from pg_stat_activity
postgres-#   where cardinality(pg_blocking_pids(pid)) > 0;
  pid  | usename  | blocked_by |           blocked_query           
-------+----------+------------+-----------------------------------
 10591 | postgres | {0}        | alter table t1 add column b text;

Even if you restart the instance the issue will persist. The only solution to that is, to either commit or rollback the prepared transactions;

postgres=# select * from pg_prepared_xacts;
 transaction | gid |           prepared            |  owner   | database 
-------------+-----+-------------------------------+----------+----------
         564 | abc | 2021-02-26 11:28:37.362649+01 | postgres | postgres
(1 row)
postgres=# rollback prepared 'abc';
ROLLBACK PREPARED
postgres=# 

As soon this completed the other session will be able to complete it’s work:

postgres=# alter table t1 add column b text;
ALTER TABLE

Remember: When things look really weird, it might be, because you have ongoing prepared transactions.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure