Infrastructure at your Service

Daniel Westermann

Running PostgreSQL on ZFS on Linux – Fun with snapshots and clones

In the last post we looked at how to get a ZFS file system up and running on a CentOS 7 host and how to enable automatic mounting of the ZFS file systems. In this post we'll look at two of the features ZFS provides: snapshots and clones.

A ZFS snapshot is a read-only copy of a file system. How can we benefit from that when it comes to PostgreSQL? There are several scenarios where this can be useful. Imagine you are developing an application and you want to test the deployment of a new release on top of a previous release. What you probably want is a production-like PostgreSQL instance with lots of data so you can test the upgrade path. In addition it would be great if you could revert in seconds and start from scratch, just in case you run into trouble or missed an important point in the upgrade scripts. Using ZFS snapshots you can have all of this. Let's see.

Currently my PostgreSQL instance from the last post does not contain any user data, so let's generate some:

postgres=# create table my_app_table ( a int, b varchar(50) );
CREATE TABLE
postgres=# with aa as 
postgres-# ( select * 
postgres(#     from generate_series (1,1000000) a
postgres(# )
postgres-# insert into my_app_table
postgres-# select aa.a, md5(aa.a::varchar)
postgres-#   from aa;
INSERT 0 1000000

This is the release we want to test our upgrade scripts against, so let's create a snapshot of the current state of the instance:

[root@centos7 ~] zfs snapshot pgpool/pgdata@baserelease
[root@centos7 ~] zfs list -t snapshot
NAME                        USED  AVAIL  REFER  MOUNTPOINT
pgpool/pgdata@baserelease  16.6M      -   202M  -

Everything after the "@" is the name of the snapshot, "baserelease" in this case.

Are you worried about consistency? This should not be an issue: PostgreSQL fsyncs the WAL, so the instance will just start, replay the WAL records that are not yet in the data files, and you're fine. Anyway, this is a scenario for testing: as long as you have a consistent starting point you are fine.
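If you want to avoid WAL replay on startup altogether, you can take the snapshot while the instance is shut down. A minimal sketch, assuming the dataset and data directory names used in this post; the snapshot name `baserelease-cold` is hypothetical, and the `RUN=echo` guard makes this a dry run that only prints the commands (clear it to actually execute them):

```shell
# Cold snapshot sketch: stop the instance, take the snapshot, start again.
# RUN=echo makes this a dry run by default; set RUN= to really execute.
RUN=${RUN:-echo}

PGDATA=/pgpool/pgdata
SNAPSHOT=pgpool/pgdata@baserelease-cold   # hypothetical snapshot name

$RUN pg_ctl stop -D "$PGDATA" -m fast     # clean shutdown: no WAL replay on restart
$RUN zfs snapshot "$SNAPSHOT"
$RUN pg_ctl start -D "$PGDATA"
```

With the dry-run guard in place you can inspect the exact commands before running them for real.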

A simple upgrade script could be:

postgres=# alter table my_app_table add column c date;
ALTER TABLE
postgres=# update my_app_table set c = now();
UPDATE 1000000

What happened to the snapshot?

[root@centos7 ~] zfs list -t snapshot
NAME                        USED  AVAIL  REFER  MOUNTPOINT
pgpool/pgdata@baserelease  78.3M      -   202M  -

As soon as you modify data the snapshot will grow, no surprise.

So you ran your tests and discovered some things you could improve. Once the upgrade scripts are fixed, you want to start from the same point again. With a snapshot this is quite easy: just revert to it. Of course you'll need to stop your PostgreSQL instance first:

postgres@centos7:/home/postgres/ [PG1] pg_ctl stop -D /pgpool/pgdata/ -m fast
waiting for server to shut down....LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
LOG:  shutting down
 done
server stopped

As soon as the instance is down the snapshot can be reverted:

[root@centos7 ~] zfs rollback pgpool/pgdata@baserelease
[root@centos7 ~] zfs list -t snapshot
NAME                        USED  AVAIL  REFER  MOUNTPOINT
pgpool/pgdata@baserelease     1K      -   202M  -

When you check the data after starting the instance again, it is exactly as it was before:

postgres@centos7:/home/postgres/ [PG1] pg_ctl start -D /pgpool/pgdata/
postgres@centos7:/home/postgres/ [PG1] LOG:  database system was not properly shut down; automatic recovery in progress
postgres@centos7:/home/postgres/ [PG1] psql postgres
psql (9.5.4 dbi services build)
Type "help" for help.

postgres=# \d my_app_table
        Table "public.my_app_table"
 Column |         Type          | Modifiers 
--------+-----------------------+-----------
 a      | integer               | 
 b      | character varying(50) | 

Notice the message about automatic recovery: that is the WAL being replayed. Now you can run your upgrade script again, revert in case of issues, run it again, revert again, and so on.
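This stop/rollback/start cycle is easy to wrap in a small helper script. A sketch under the same assumptions as above (paths and snapshot name as used in this post; `RUN=echo` keeps it a dry run that only prints the commands):

```shell
# reset_to_baseline sketch: revert the instance to the baserelease snapshot.
RUN=${RUN:-echo}   # dry run by default; set RUN= to really execute

PGDATA=/pgpool/pgdata
SNAPSHOT=pgpool/pgdata@baserelease

$RUN pg_ctl stop -D "$PGDATA" -m fast   # the instance must be down before the rollback
$RUN zfs rollback "$SNAPSHOT"           # discard everything written since the snapshot
$RUN pg_ctl start -D "$PGDATA"          # crash recovery replays any pending WAL
```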

Another use case: rapid cloning of PostgreSQL instances. This is where clones come into play: clones are writable, snapshots are not. To create a clone you need a snapshot, as clones depend on snapshots. Another thing to keep in mind is that you cannot delete a snapshot while a clone still sits on top of it. Let's see how it works.

As said we need a snapshot:

[root@centos7 ~] zfs snapshot pgpool/pgdata@clonebase

On top of this snapshot we can now create a clone:

[root@centos7 ~] zfs create pgpool/clones
[root@centos7 ~] zfs clone pgpool/pgdata@clonebase pgpool/clones/1
[root@centos7 ~] zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
pgpool            170M  9.46G    21K  /pgpool
pgpool/clones    20.5K  9.46G  19.5K  /pgpool/clones
pgpool/clones/1     1K  9.46G   169M  /pgpool/clones/1
pgpool/pgdata     170M  9.46G   169M  /pgpool/pgdata

Using the new clone we can bring up another PostgreSQL instance in seconds, containing the exact data of the clone's source:

postgres@centos7:/home/postgres/ [PG1] rm /pgpool/clones/1/*.pid
postgres@centos7:/home/postgres/ [PG1] sed -i 's/#port = 5432/port=5433/g' /pgpool/clones/1/postgresql.conf
postgres@centos7:/home/postgres/ [PG1] pg_ctl start -D /pgpool/clones/1/
postgres@centos7:/home/postgres/ [PG1] psql -p 5433 postgres
psql (9.5.4 dbi services build)
Type "help" for help.

postgres=#

Quite cool and easy.
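The three steps (clone, remove the stale pid file, change the port) can be scripted to spin up several clones at once. A sketch, assuming a snapshot named `clonebase` exists and using the paths from this post; names and ports are illustrative, and `RUN=echo` keeps it a dry run:

```shell
# Spin up three clones of the pgdata dataset, each with its own PostgreSQL
# instance on its own port. Dry run by default; set RUN= to really execute.
RUN=${RUN:-echo}
SNAPSHOT=pgpool/pgdata@clonebase      # assumed snapshot name

for i in 1 2 3; do
  port=$((5432 + i))                  # 5433, 5434, 5435
  clone=pgpool/clones/$i
  $RUN zfs clone "$SNAPSHOT" "$clone"
  $RUN rm -f "/$clone/postmaster.pid"                                 # stale pid file from the source
  $RUN sed -i "s/#port = 5432/port=$port/" "/$clone/postgresql.conf"  # give each clone its own port
  $RUN pg_ctl start -D "/$clone/"
done
```

Each clone starts out occupying almost no extra space; blocks are only copied as the clones diverge from the snapshot.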

Conclusion: I am not sure I'd use ZFS for production databases on Linux because I have not tested it enough. But for development and testing purposes there are quite a few benefits, such as snapshots and clones. This can simplify your processes a lot. You could even use snapshots and clones as a basis for your backups, although I'd prefer barman or bart.

PS: To clean up:

[root@centos7 ~] zfs destroy pgpool/clones/1
[root@centos7 ~] zfs destroy pgpool/clones
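Note that the order matters: the snapshot a clone was created from cannot be destroyed while the clone still exists. Once the clones are gone the origin snapshot can go too; alternatively, `zfs promote` reverses the dependency if you want to keep a clone but get rid of its parent. A sketch (dry run by default; the snapshot name is an assumption):

```shell
RUN=${RUN:-echo}                  # dry run by default; set RUN= to really execute
SNAPSHOT=pgpool/pgdata@clonebase  # assumed snapshot name

# After the clones are destroyed the origin snapshot can be removed as well:
$RUN zfs destroy "$SNAPSHOT"

# Alternatively, 'zfs promote' reverses the dependency: the clone becomes the
# origin and the snapshot migrates to it, so the former parent dataset could
# be destroyed instead:
$RUN zfs promote pgpool/clones/1
```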


Daniel Westermann, Principal Consultant & Technology Leader Open Infrastructure