Infrastructure at your Service

Daniel Westermann

Rapid PostgreSQL cloning using AWS EBS snapshots

When you go for AWS EC2 instances to host your PostgreSQL deployments and you want to rapidly clone PostgreSQL instances for development or testing purposes you can make use of AWS EBS snapshots. In this post we’ll look at how you can quickly spin up EC2 instances that use these snapshots as a base for new PostgreSQL clusters. Although we’ll be using the AWS console to show how this can be done you should use some automation around this to make it really rapid. There are several options you can go for like the AWS command line interface, Terraform, AWS CloudFormation, Ansible and many more.

In the last posts about AWS we mainly used the CentOS, Red Hat or Amazon Linux 2 AMIs. In this post, just to make it a bit more colorful, we’ll be using SUSE Linux Enterprise 15. The base setup for SLES 15 in AWS is straight forward so not all screenshots will be provided, except this one:

The second EBS volume will be the one for PGDATA and this will also be the one we’ll be using as the source for our snapshots. The second volume will show up as xvdb and not as sdb as it is listed in the above screenshot:

ip-10-0-1-22:/home/ec2-user $ ls -la /dev/xv*
brw-rw---- 1 root disk 202,  0 Mar 30 05:50 /dev/xvda
brw-rw---- 1 root disk 202,  1 Mar 30 05:50 /dev/xvda1
brw-rw---- 1 root disk 202,  2 Mar 30 05:50 /dev/xvda2
brw-rw---- 1 root disk 202,  3 Mar 30 05:50 /dev/xvda3
brw-rw---- 1 root disk 202, 16 Mar 30 05:50 /dev/xvdb
ip-10-0-1-22:/home/ec2-user $ lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda    202:0    0  10G  0 disk 
├─xvda1 202:1    0   2M  0 part 
├─xvda2 202:2    0  20M  0 part /boot/efi
└─xvda3 202:3    0  10G  0 part /
xvdb    202:16   0  30G  0 disk 

After a new partition was created on the second device and a file system was created it is now mounted at /u02:

[email protected]:/home/postgres/ [pg122] df -h | grep u02
/dev/xvdb1       30G   63M   30G   1% /u02

Initializing and starting the PostgreSQL cluster:

[email protected]:/home/postgres/ [PG1] initdb -D /u02/pgdata/PG1
[email protected]:/home/postgres/ [PG1] pg_ctl -D /u02/pgdata/PG1/ start -l /dev/null
[email protected]:/home/postgres/ [PG1] psql postgres
psql (12.2)
Type "help" for help.

postgres=# select version();
                                      version                                       
------------------------------------------------------------------------------------
 PostgreSQL 12.2 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 7.5.0, 64-bit
(1 row)

postgres=# 

That’s our starting point. Let’s assume we already have an application installed consisting of a few objects and some data:

postgres=# create schema my_app;
CREATE SCHEMA
postgres=# create table my_app.t1 ( a int, b text, c date );
CREATE TABLE
postgres=# create view my_app.v1 as select a,c from my_app.t1;
CREATE VIEW
postgres=# insert into my_app.t1 (a,b,c) select i, i::text, now() from generate_series(1,1000000) i;
INSERT 0 1000000

Now we want to clone that for testing purposes, how can we do that? The first step is to create a snapshot of the volume that holds the PostgreSQL cluster. To identify the EBS volume you can follow the link in the AWS console:

Once you have identified and selected the EBS volume you can create a new snapshot:

Snapshots can be viewed under the Snapshots section:

Using that snapshot you can spin up a new instance and reference the snapshot in the storage screen:

Once the new instance is ready all the files in PGDATA are there:

[email protected]:/home/postgres/ [pg122] ls -la /u02/pgdata/PG1/
total 64
drwx------ 20 postgres postgres  4096 Mar 30 06:46 .
drwx------  3 postgres postgres    17 Mar 30 06:44 ..
-rw-------  1 postgres postgres     3 Mar 30 06:44 PG_VERSION
drwx------  6 postgres postgres    58 Mar 30 06:49 base
drwx------  2 postgres postgres  4096 Mar 30 06:46 global
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_commit_ts
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_dynshmem
-rw-------  1 postgres postgres  4513 Mar 30 06:44 pg_hba.conf
-rw-------  1 postgres postgres  1636 Mar 30 06:44 pg_ident.conf
drwxr-xr-x  2 postgres postgres     6 Mar 30 06:45 pg_log
drwx------  4 postgres postgres    68 Mar 30 06:53 pg_logical
drwx------  4 postgres postgres    36 Mar 30 06:44 pg_multixact
drwx------  2 postgres postgres    18 Mar 30 06:46 pg_notify
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_replslot
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_serial
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_snapshots
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_stat
drwx------  2 postgres postgres    63 Mar 30 07:00 pg_stat_tmp
drwx------  2 postgres postgres    18 Mar 30 06:44 pg_subtrans
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_tblspc
drwx------  2 postgres postgres     6 Mar 30 06:44 pg_twophase
drwx------  3 postgres postgres   188 Mar 30 06:53 pg_wal
drwx------  2 postgres postgres    18 Mar 30 06:44 pg_xact
-rw-------  1 postgres postgres    88 Mar 30 06:44 postgresql.auto.conf
-rw-------  1 postgres postgres 26599 Mar 30 06:44 postgresql.conf
-rw-------  1 postgres postgres    70 Mar 30 06:46 postmaster.opts
-rw-------  1 postgres postgres    81 Mar 30 06:46 postmaster.pid

Startup PostgreSQL and the new clone can be used for any purpose:

[email protected]:/home/postgres/ [pg122] pg_ctl -D /u02/pgdata/PG1/ start
[email protected]:/home/postgres/ [pg122] psql postgres
psql (12.2)
Type "help" for help.

postgres=# select count(*) from my_app.t1;
  count  
---------
 1000000
(1 row)

Quite easy and fast, especially when automated.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure