Daniel Westermann

Does the wal segment size matter in PostgreSQL?

In PostgreSQL you configure the size of the wal (write-ahead log) segments when you compile from source. If you use an installer or the packages provided by your OS distribution, the size of the wal segments is usually 16MB. Although 16MB seems very low, you don’t need to worry about that in most cases; it just works fine. However, there are cases where you might want to adjust this, e.g. when you have an application that generates thousands of transactions in a very short time and therefore forces PostgreSQL to generate a huge number of wal segments. In this post we’ll look at a specific case: usually you want to archive the wal segments so that you can do point-in-time recovery in case your server crashes for some reason. Does the size of the wal segments matter for archiving?
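
To get a feeling for why the segment size could matter here: archiving works per completed segment, so for the same amount of wal a bigger segment size means fewer files to hand over. The following is only a rough sketch of that arithmetic; the function name segments_for and the 1600MB figure are my own assumptions for illustration, not anything PostgreSQL provides.

```shell
#!/usr/bin/env bash
# segments_for (hypothetical helper): number of wal segments needed to hold
# a given amount of wal. Both arguments are in MB. The result is rounded up,
# because a partially filled last segment still counts as one file.
segments_for() {
    local wal_mb=$1 segment_mb=$2
    echo $(( (wal_mb + segment_mb - 1) / segment_mb ))
}

# Assumed example volume: 1600MB of wal.
for size in 16 64; do
    echo "${size}MB segments: $(segments_for 1600 "$size") files to archive"
done
```

For 1600MB of wal this prints 100 files for 16MB segments and 25 files for 64MB segments.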

Archiving of wal segments in PostgreSQL is done by specifying an archive_command. Whatever you put there will be executed by PostgreSQL once a new wal segment is completed. Usually you’ll find something like this in archive_command (from the documentation):

archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'  # Unix
archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows

Or something like this:

archive_command = 'rsync -a %p [email protected][SOME_OTHER_HOST]:/path/to/wal_archive/%f'

Or:

archive_command = 'scp %p [email protected][SOME_OTHER_HOST]:/path/to/wal_archive/%f'
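
If the one-liner gets too long, the “cp” variant from the documentation could also be moved into a small shell function. This is just a minimal sketch under my own assumptions: the name archive_wal is hypothetical, and copying to a temporary name before renaming is an addition of mine, not something the documentation example does. What matters is that it returns non-zero when the segment was not archived.

```shell
#!/usr/bin/env bash
# archive_wal (hypothetical helper): copy one wal segment into an archive
# directory. Returns non-zero if the target already exists or the copy fails.
archive_wal() {
    local src=$1 archive_dir=$2
    local dest
    dest="${archive_dir}/$(basename "$src")"

    # Never overwrite a segment that is already in the archive.
    [ ! -f "$dest" ] || return 1

    # Assumption of this sketch: copy to a temporary name first and rename
    # afterwards, so a half-written file never shows up under the final name.
    cp "$src" "${dest}.tmp" && mv "${dest}.tmp" "$dest"
}
```

Saved in a script that ends with a call such as archive_wal "$1" "$2", it could then be referenced from archive_command with %p and the archive directory as arguments.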

Let’s test how the size of the wal segments impacts the three ways of archiving outlined above. To begin with, let’s create 100 files of 16MB each (the same as the default wal segment size in PostgreSQL) and 25 files of 64MB each:

rm -rf /var/tmp/test16mb
mkdir /var/tmp/test16mb
for i in {1..100}; do
   dd if=/dev/zero of=/var/tmp/test16mb/${i} bs=1M count=16
done
ls -la /var/tmp/test16mb
rm -rf /var/tmp/test64mb
mkdir /var/tmp/test64mb
for i in {1..25}; do
   dd if=/dev/zero of=/var/tmp/test64mb/${i} bs=1M count=64
done
ls -la /var/tmp/test64mb
du -sh /var/tmp/test16mb
du -sh /var/tmp/test64mb

This gives us a total size of 1.6GB for each of the wal sizes (16MB and 64MB). Let’s start by testing the “cp” way:

echo 3 > /proc/sys/vm/drop_caches
mkdir -p /var/tmp/target
rm -rf /var/tmp/target/*
time for i in `ls /var/tmp/test16mb`; do
    cp /var/tmp/test16mb/${i} /var/tmp/target/
done

My result (on a VM local on my notebook):

real	0m17.444s
user	0m0.275s
sys	0m8.569s

The same test for the 64MB files:

echo 3 > /proc/sys/vm/drop_caches
mkdir -p /var/tmp/target
rm -rf /var/tmp/target/*
time for i in `ls /var/tmp/test64mb`; do
    cp /var/tmp/test64mb/${i} /var/tmp/target/
done

Copying the larger files is a little more than three times as fast as copying the smaller ones:

real	0m5.365s
user	0m0.065s
sys	0m1.835s

Of course, on production systems you would not copy the files locally but rather to, e.g., an NFS mount, and then the numbers will change.
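
The two “cp” loops differ only in the source directory, so they could be folded into one function that takes the directory as an argument. That way the file list and the copy source cannot drift apart. This is a sketch only; the name copy_all is hypothetical, and it uses a shell glob instead of parsing the output of ls.

```shell
#!/usr/bin/env bash
# copy_all (hypothetical helper): copy every regular file from src_dir into
# target_dir, one cp call per file, and print how many files were copied.
copy_all() {
    local src_dir=$1 target_dir=$2 count=0 f
    mkdir -p "$target_dir"
    for f in "$src_dir"/*; do
        # Skip anything that is not a regular file (also covers an empty dir).
        [ -f "$f" ] || continue
        cp "$f" "$target_dir"/ || return 1
        count=$((count + 1))
    done
    echo "$count"
}
```

It can be timed in the same way as the loops, e.g. time copy_all /var/tmp/test64mb /var/tmp/target, after dropping the caches and emptying the target directory.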

What are the numbers for scp? For the smaller files:

echo 3 > /proc/sys/vm/drop_caches
mkdir -p /var/tmp/target
rm -rf /var/tmp/target/*
time for i in `ls /var/tmp/test16mb`; do
    scp /var/tmp/test16mb/${i} [email protected]:/var/tmp/target/
done

The result:

real	2m51.708s
user	0m14.136s
sys	0m35.292s

Quite a huge overhead. What is the result with the 64MB files?

echo 3 > /proc/sys/vm/drop_caches
mkdir -p /var/tmp/target
rm -rf /var/tmp/target/*
time for i in `ls /var/tmp/test64mb`; do
    scp /var/tmp/test64mb/${i} [email protected]:/var/tmp/target/
done

Approximately twice as fast:

real	1m23.326s
user	0m10.353s
sys	0m30.814s

And finally rsync, for the smaller files:

echo 3 > /proc/sys/vm/drop_caches
mkdir -p /var/tmp/target
rm -rf /var/tmp/target/*
time for i in `ls /var/tmp/test16mb`; do
    rsync -a /var/tmp/test16mb/${i} [email protected]:/var/tmp/target/${i}
done

The result:

real	0m51.624s
user	0m4.488s
sys	0m10.247s

For the larger ones:

echo 3 > /proc/sys/vm/drop_caches
mkdir -p /var/tmp/target
rm -rf /var/tmp/target/*
time for i in `ls /var/tmp/test64mb`; do
    rsync -a /var/tmp/test64mb/${i} [email protected]:/var/tmp/target/${i}
done

The result:

real	0m34.342s
user	0m3.623s
sys	0m9.685s
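
To compare the three methods side by side, the elapsed (“real”) times from above can be turned into ratios. This is just a small sketch; the function name speedup is hypothetical, and the minutes have been converted to seconds by hand (2m51.708s is 171.708s, 1m23.326s is 83.326s).

```shell
#!/usr/bin/env bash
# speedup (hypothetical helper): ratio of two elapsed times in seconds,
# rounded to two decimals. LC_ALL=C keeps the decimal point a dot.
speedup() {
    LC_ALL=C awk -v small="$1" -v large="$2" \
        'BEGIN { printf "%.2f\n", small / large }'
}

# "real" times measured above: 16MB files first, 64MB files second.
echo "cp:    $(speedup 17.444 5.365)x"
echo "scp:   $(speedup 171.708 83.326)x"
echo "rsync: $(speedup 51.624 34.342)x"
```

This prints 3.25x for cp, 2.06x for scp and 1.50x for rsync.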

Conclusion: When you have applications with high transaction rates, it can make sense to increase the default wal segment size, as archiving will usually be faster when you use bigger segments (between 1.5 and a little more than 3 times faster in the tests above). Of course, you’ll need to test this on your specific hardware and for your specific workload. In a future post we’ll look at how bigger segments affect the performance of PostgreSQL.

Daniel Westermann
Principal Consultant & Technology Leader Open Infrastructure