Infrastructure at your Service

Daniel Westermann

Can you start two (or more) PostgreSQL instances against the same data directory?

As PostgreSQL does not know the concept of running multiple instances against the same files on disk (e.g. like Oracle RAC) it should not be possible to start two or more instances against the same data directory. If that would work the result can only be corruption. In this post we will look at how PostgreSQL is detecting that and what mechanism are build in to avoid the situation of having multiple instances working against the same files on disk.

To start with we create a new cluster:

[email protected]:/home/postgres/ [PGDEV] mkdir /var/tmp/pgtest
12:16:46 [email protected]:/home/postgres/ [PGDEV] initdb -D /var/tmp/pgtest/
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  en_US.utf8
  CTYPE:    en_US.utf8
  MESSAGES: en_US.utf8
  MONETARY: de_CH.UTF-8
  NUMERIC:  de_CH.UTF-8
  TIME:     en_US.UTF-8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/tmp/pgtest ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Europe/Zurich
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /var/tmp/pgtest/ -l logfile start

We use a dedicated port and then start it up:

[email protected]:/home/postgres/ [PGDEV] export PGPORT=8888
[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest start
waiting for server to start....2019-05-16 12:17:22.399 CEST [7607] LOG:  starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
2019-05-16 12:17:22.403 CEST [7607] LOG:  listening on IPv6 address "::1", port 8888
2019-05-16 12:17:22.403 CEST [7607] LOG:  listening on IPv4 address "127.0.0.1", port 8888
2019-05-16 12:17:22.409 CEST [7607] LOG:  listening on Unix socket "/tmp/.s.PGSQL.8888"
2019-05-16 12:17:22.446 CEST [7608] LOG:  database system was shut down at 2019-05-16 12:16:54 CEST
2019-05-16 12:17:22.455 CEST [7607] LOG:  database system is ready to accept connections
 done
server started

[email protected]:/home/postgres/ [PGDEV] psql -p 8888 -c "select version()" postgres
                                                  version                                                  
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
(1 row)

What happens when we want to start another instance against that data directory?

[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest start
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2019-05-16 12:18:26.252 CEST [7629] FATAL:  lock file "postmaster.pid" already exists
2019-05-16 12:18:26.252 CEST [7629] HINT:  Is another postmaster (PID 7607) running in data directory "/var/tmp/pgtest"?
 stopped waiting
pg_ctl: could not start server
Examine the log output.

When PostgreSQL is starting up it will look at a file called “postmaster.pid” which exists in the data directory once the instance is started. If that file exists PostgreSQL will not start up another instance against the same data directory. Once the instance is stopped the file is removed:

[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ stop
waiting for server to shut down....2019-05-16 12:48:50.636 CEST [7896] LOG:  received fast shutdown request
2019-05-16 12:48:50.641 CEST [7896] LOG:  aborting any active transactions
2019-05-16 12:48:50.651 CEST [7896] LOG:  background worker "logical replication launcher" (PID 7903) exited with exit code 1
2019-05-16 12:48:50.651 CEST [7898] LOG:  shutting down
2019-05-16 12:48:50.685 CEST [7896] LOG:  database system is shut down
 done
server stopped
[email protected]:/home/postgres/ [PGDEV] ls -al /var/tmp/pgtest/postmaster.pid
ls: cannot access '/var/tmp/pgtest/postmaster.pid': No such file or directory

At least by default this is not possible to start two or more instances as PostgreSQL checks if postmaster.pid already exists. Lets remove that file and try again:

[email protected]:/home/postgres/ [PGDEV] rm /var/tmp/pgtest/postmaster.pid
[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest start
waiting for server to start....2019-05-16 12:20:17.754 CEST [7662] LOG:  starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
2019-05-16 12:20:17.756 CEST [7662] LOG:  could not bind IPv6 address "::1": Address already in use
2019-05-16 12:20:17.756 CEST [7662] HINT:  Is another postmaster already running on port 8888? If not, wait a few seconds and retry.
2019-05-16 12:20:17.756 CEST [7662] LOG:  could not bind IPv4 address "127.0.0.1": Address already in use
2019-05-16 12:20:17.756 CEST [7662] HINT:  Is another postmaster already running on port 8888? If not, wait a few seconds and retry.
2019-05-16 12:20:17.756 CEST [7662] WARNING:  could not create listen socket for "localhost"
2019-05-16 12:20:17.756 CEST [7662] FATAL:  could not create any TCP/IP sockets
2019-05-16 12:20:17.756 CEST [7662] LOG:  database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.

Again, this does not work and even the initial instance was shutdown because PostgreSQL detected that the lock file is not there anymore:

2019-05-16 12:20:22.540 CEST [7607] LOG:  could not open file "postmaster.pid": No such file or directory
2019-05-16 12:20:22.540 CEST [7607] LOG:  performing immediate shutdown because data directory lock file is invalid
2019-05-16 12:20:22.540 CEST [7607] LOG:  received immediate shutdown request
2019-05-16 12:20:22.540 CEST [7607] LOG:  could not open file "postmaster.pid": No such file or directory
2019-05-16 12:20:22.544 CEST [7612] WARNING:  terminating connection because of crash of another server process
2019-05-16 12:20:22.544 CEST [7612] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-05-16 12:20:22.544 CEST [7612] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-05-16 12:20:22.549 CEST [7664] WARNING:  terminating connection because of crash of another server process
2019-05-16 12:20:22.549 CEST [7664] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

Lets start the first instance again:

[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ start
waiting for server to start....2019-05-16 12:22:20.136 CEST [7691] LOG:  starting PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
2019-05-16 12:22:20.140 CEST [7691] LOG:  listening on IPv6 address "::1", port 8888
2019-05-16 12:22:20.140 CEST [7691] LOG:  listening on IPv4 address "127.0.0.1", port 8888
2019-05-16 12:22:20.148 CEST [7691] LOG:  listening on Unix socket "/tmp/.s.PGSQL.8888"
2019-05-16 12:22:20.193 CEST [7693] LOG:  database system was interrupted; last known up at 2019-05-16 12:17:22 CEST
.2019-05-16 12:22:21.138 CEST [7693] LOG:  database system was not properly shut down; automatic recovery in progress
2019-05-16 12:22:21.143 CEST [7693] LOG:  redo starts at 0/15D3420
2019-05-16 12:22:21.143 CEST [7693] LOG:  invalid record length at 0/15D3458: wanted 24, got 0
2019-05-16 12:22:21.143 CEST [7693] LOG:  redo done at 0/15D3420
2019-05-16 12:22:21.173 CEST [7691] LOG:  database system is ready to accept connections
 done
server started

[email protected]:/home/postgres/ [PGDEV] psql -p 8888 -c "select version()" postgres
                                                  version                                                  
-----------------------------------------------------------------------------------------------------------
 PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.2.1 20180905 (Red Hat 8.2.1-3), 64-bit
(1 row)

Lets change the port for the second instance and then try again to start it against the same data directory:

[email protected]:/home/postgres/ [PGDEV] export PGPORT=8889
[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ start
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2019-05-16 12:24:41.700 CEST [7754] FATAL:  lock file "postmaster.pid" already exists
2019-05-16 12:24:41.700 CEST [7754] HINT:  Is another postmaster (PID 7741) running in data directory "/var/tmp/pgtest"?
 stopped waiting
pg_ctl: could not start server
Examine the log output.

Does not work as well, which is good. Lets be a bit more nasty and truncate the postmaster.pid file:

[email protected]:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid 
7790
/var/tmp/pgtest
1558002434
8888
/tmp
localhost
  8888001    819201
ready   
[email protected]:/home/postgres/ [PGDEV] cat /dev/null > /var/tmp/pgtest/postmaster.pid
[email protected]:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid

The pid file is now empty and right after emptying that file we can see this in the PostgreSQL log file:

019-05-16 12:30:14.140 CEST [7790] LOG:  lock file "postmaster.pid" contains wrong PID: 0 instead of 7790
2019-05-16 12:30:14.140 CEST [7790] LOG:  performing immediate shutdown because data directory lock file is invalid
2019-05-16 12:30:14.140 CEST [7790] LOG:  received immediate shutdown request
2019-05-16 12:30:14.149 CEST [7795] WARNING:  terminating connection because of crash of another server process
2019-05-16 12:30:14.149 CEST [7795] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-05-16 12:30:14.149 CEST [7795] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-05-16 12:30:14.160 CEST [7790] LOG:  database system is shut down

So even that case it is detected and PostgreSQL protects you from starting up another instance against the same data directory. Lets try something else and modify PGDATA in the postmaster.pid file:

[email protected]:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid 
7896
/var/tmp/pgtest
1558002751
8888
/tmp
localhost
  8888001    851969
ready   

[email protected]:/home/postgres/ [PGDEV] sed -i  's/\/var\/tmp\/pgtest/\/var\/tmp\/pgtest2/g' /var/tmp/pgtest/postmaster.pid 
[email protected]:/home/postgres/ [PGDEV] cat /var/tmp/pgtest/postmaster.pid
7896
/var/tmp/pgtest2
1558002751
8888
/tmp
localhost
  8888001    851969
ready   

Although we changed PGDATA PostgreSQL will not start up another instance against this data directory:

[email protected]:/home/postgres/ [PGDEV] pg_ctl -D /var/tmp/pgtest/ start
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2019-05-16 12:35:28.540 CEST [7973] FATAL:  lock file "postmaster.pid" already exists
2019-05-16 12:35:28.540 CEST [7973] HINT:  Is another postmaster (PID 7896) running in data directory "/var/tmp/pgtest"?
 stopped waiting
pg_ctl: could not start server
Examine the log output.

So by default you can not get PostgreSQL to start two or even more instances against the same data directory. There is an comment about this behaviour in src/backend/postmaster/postmaster.c in the source code:

                /*
                 * Once a minute, verify that postmaster.pid hasn't been removed or
                 * overwritten.  If it has, we force a shutdown.  This avoids having
                 * postmasters and child processes hanging around after their database
                 * is gone, and maybe causing problems if a new database cluster is
                 * created in the same place.  It also provides some protection
                 * against a DBA foolishly removing postmaster.pid and manually
                 * starting a new postmaster.  Data corruption is likely to ensue from
                 * that anyway, but we can minimize the damage by aborting ASAP.
                 */

“Once a minute” might be critical and we might be able to start a second one if we are fast enough, so lets try again. This time we start the first one, remove the lock file and immediately start another one using another port:

export PGPORT=8888
pg_ctl -D /var/tmp/pgtest start
rm -f /var/tmp/pgtest/postmaster.pid
export PGPORT=8889
pg_ctl -D /var/tmp/pgtest start

And here you have it:

[email protected]:/home/postgres/ [pg120] ps -ef | grep postgres
postgres  1445     1  0 May27 ?        00:00:00 /usr/lib/systemd/systemd --user
postgres  1456  1445  0 May27 ?        00:00:00 (sd-pam)
root      9780   786  0 06:09 ?        00:00:00 sshd: postgres [priv]
postgres  9783  9780  0 06:09 ?        00:00:00 sshd: [email protected]/1
postgres  9784  9783  0 06:09 pts/1    00:00:00 -bash
postgres 10302     1  0 06:19 ?        00:00:00 /u01/app/postgres/product/DEV/db_1/bin/postgres -D /var/tmp/pgtest
postgres 10304 10302  0 06:19 ?        00:00:00 postgres: checkpointer   
postgres 10305 10302  0 06:19 ?        00:00:00 postgres: background writer   
postgres 10306 10302  0 06:19 ?        00:00:00 postgres: walwriter   
postgres 10307 10302  0 06:19 ?        00:00:00 postgres: autovacuum launcher   
postgres 10308 10302  0 06:19 ?        00:00:00 postgres: stats collector   
postgres 10309 10302  0 06:19 ?        00:00:00 postgres: logical replication launcher   
postgres 10313     1  0 06:19 ?        00:00:00 /u01/app/postgres/product/DEV/db_1/bin/postgres -D /var/tmp/pgtest
postgres 10315 10313  0 06:19 ?        00:00:00 postgres: checkpointer   
postgres 10316 10313  0 06:19 ?        00:00:00 postgres: background writer   
postgres 10317 10313  0 06:19 ?        00:00:00 postgres: walwriter   
postgres 10318 10313  0 06:19 ?        00:00:00 postgres: autovacuum launcher   
postgres 10319 10313  0 06:19 ?        00:00:00 postgres: stats collector   
postgres 10320 10313  0 06:19 ?        00:00:00 postgres: logical replication launcher   
postgres 10327  9784  0 06:19 pts/1    00:00:00 ps -ef

Conclusion: PostgreSQL does some basic checks to avoid starting two instances against the same files on disk. But if you really want (and of course you should never do that) then you can achieve that => with all the consequences! Don’t do it!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure