Almost every PostgreSQL instance I come across is not configured to use huge pages, which is quite a surprise as they can give you a performance boost. Actually it is not the PostgreSQL instance you need to configure but the operating system, which has to provide them. PostgreSQL will use huge pages by default when they are configured and will fall back to normal pages otherwise. The parameter which controls that in PostgreSQL is huge_pages, which defaults to “try”, leading to the behavior just described: try to get them, otherwise use normal pages. Let’s see how you can do that on RedHat and CentOS. I’ll write another post about how you do that for Debian based distributions shortly.

What you need to know is that RedHat as well as CentOS come with tuned profiles by default. This means kernel parameters and other settings are managed dynamically through profiles and no longer by adjusting /etc/sysctl.conf (although that works as well). When you are in a virtualized environment (VirtualBox in my case) you will probably see something like this:

postgres@pgbox:/home/postgres/ [PG10] tuned-adm active
Current active profile: virtual-guest

Virtual guest is maybe not the best choice for a database server as it comes with these settings (especially vm.dirty_ratio and vm.swappiness):

postgres@pgbox:/home/postgres/ [PG10] cat /usr/lib/tuned/virtual-guest/tuned.conf  | egrep -v "^$|^#"
[main]
summary=Optimize for running inside a virtual guest
include=throughput-performance
[sysctl]
vm.dirty_ratio = 30
vm.swappiness = 30

What we do at dbi services is to provide our own profile with settings that are better suited for a database server:

postgres@pgbox:/home/postgres/ [PG10]  cat /etc/tuned/dbi-postgres/tuned.conf | egrep -v "^$|^#"
[main]
summary=dbi services tuned profile for PostgreSQL servers
[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100
[disk]
readahead=>4096
[sysctl]
vm.overcommit_memory=2
vm.swappiness=0
vm.dirty_ratio=2
vm.dirty_background_ratio=1
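To compare what two profiles set at the kernel level, it can help to look at the “[sysctl]” section only. The following is a minimal sketch, assuming a hypothetical helper called tuned_sysctl_settings (it is not part of tuned) that reads a tuned.conf from standard input:

```shell
# Hypothetical helper: print only the settings of the [sysctl] section
# of a tuned.conf that is passed on standard input.
tuned_sysctl_settings() {
    awk '
        # A section header switches the flag on for [sysctl], off for anything else.
        /^\[/ { in_sysctl = ($0 == "[sysctl]"); next }
        # Inside [sysctl]: print non-empty lines that are not comments.
        in_sysctl && NF && $0 !~ /^#/ { print }
    '
}

# Demo with a shortened copy of the profile shown above.
printf '[main]\nsummary=demo\n[sysctl]\nvm.swappiness=0\nvm.dirty_ratio=2\n' | tuned_sysctl_settings
```

On a real system you would feed it the file itself, for example: tuned_sysctl_settings < /etc/tuned/dbi-postgres/tuned.conf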

What has all this to do with huge pages, you might think. Well, tuning profiles can also be used to configure them and for us this is the preferred method because we can do it all in one file. But before we do that let’s look at the PostgreSQL instance:

postgres=# select version();
                                                          version                                                           
----------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 10.0 build on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

postgres=# show huge_pages;
 huge_pages 
------------
 try
(1 row)

As said at the beginning of this post the default behavior of PostgreSQL is to use them if available. The question now is: How can you check if you have huge pages configured on the operating system level? The answer is in the virtual /proc/meminfo file:

postgres=# \! cat /proc/meminfo | grep -i huge
AnonHugePages:      6144 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

All “HugePages” statistics report zero, so this system definitely is not configured to provide huge pages to PostgreSQL. AnonHugePages is for transparent huge pages and it is a common recommendation to disable them for database servers. So we have two tasks to complete:

  • Disable transparent huge pages
  • Configure the system to provide enough huge pages for our PostgreSQL instance

For disabling transparent huge pages we just need to add the following lines to our tuning profile:

postgres@pgbox:/home/postgres/ [PG10] echo "[vm]
> transparent_hugepages=never" | sudo tee -a /etc/tuned/dbi-postgres/tuned.conf > /dev/null

When transparent huge pages are enabled you can see that in the following file:

postgres@pgbox:/home/postgres/ [PG10] cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never

Once we switch the profile to our own profile:

postgres@pgbox:/home/postgres/ [PG10] sudo tuned-adm profile dbi-postgres
postgres@pgbox:/home/postgres/ [PG10] sudo tuned-adm active
Current active profile: dbi-postgres

… you’ll notice that it is disabled from now on:

postgres@pgbox:/home/postgres/ [PG10] cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
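The active mode is the word in square brackets. If you want to check that in a script, a small sketch could look like this, assuming a hypothetical helper called thp_active_mode that reads the content of the file from standard input:

```shell
# Hypothetical helper: print the active transparent huge page mode,
# which is the word enclosed in square brackets.
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Demo with the line shown above.
echo "always madvise [never]" | thp_active_mode
```

On a real system: thp_active_mode < /sys/kernel/mm/transparent_hugepage/enabled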

Task one completed. For configuring the operating system to provide huge pages for our PostgreSQL instance we need to know how many huge pages we require. How do we do that? The procedure is described in the PostgreSQL documentation. Basically you start your instance and then check how much memory it needs. In my case, to get the PID of the postmaster process:

postgres@pgbox:/home/postgres/ [PG10] head -1 $PGDATA/postmaster.pid
1640

To get the VmPeak for that process:

postgres@pgbox:/home/postgres/ [PG10] grep ^VmPeak /proc/1640/status
VmPeak:	  344340 kB

As the huge page size is 2MB on my system (which is the default on most x86_64 systems):

postgres@pgbox:/home/postgres/ [PG10] grep ^Hugepagesize /proc/meminfo
Hugepagesize:       2048 kB

… we will require at least 344340/2048 huge pages for this PostgreSQL instance:

postgres@pgbox:/home/postgres/ [PG10] echo "344340/2048" | bc
168
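Be aware that bc truncates the result here, so 168 pages are slightly less than what 344340 kB actually needs. A minimal sketch of the same calculation with rounding up, assuming a hypothetical helper called huge_pages_needed (the name is made up for this example):

```shell
# Hypothetical helper: number of huge pages needed to cover a VmPeak value.
# Both arguments are in kB, as reported by /proc/<pid>/status and /proc/meminfo.
huge_pages_needed() {
    vmpeak_kb=$1
    hugepage_kb=$2
    # Integer ceiling division, so the last partial page is covered as well.
    echo $(( (vmpeak_kb + hugepage_kb - 1) / hugepage_kb ))
}

# Values from the output above.
huge_pages_needed 344340 2048

# On a live system the inputs could be collected like this
# (assumes PGDATA is set and the instance is running):
#   pid=$(head -1 "$PGDATA/postmaster.pid")
#   vmpeak=$(awk '/^VmPeak:/ { print $2 }' "/proc/$pid/status")
#   pagesize=$(awk '/^Hugepagesize:/ { print $2 }' /proc/meminfo)
#   huge_pages_needed "$vmpeak" "$pagesize"
```

With the values from above this prints 169 instead of the truncated 168.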

All we need to do is to add this, rounded up to 170 to leave a little headroom, to our tuning profile in the “[sysctl]” section:

postgres@pgbox:/home/postgres/ [PG10] grep nr_hugepages /etc/tuned/dbi-postgres/tuned.conf 
vm.nr_hugepages=170

Re-set the profile and we’re done:

postgres@pgbox:/home/postgres/ [PG10] sudo tuned-adm profile dbi-postgres
postgres@pgbox:/home/postgres/ [PG10] cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:     170
HugePages_Free:      170
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

This confirms that we now have 170 huge pages, all of which are free. Now let’s configure PostgreSQL to only start when it can get the required amount of huge pages by switching the “huge_pages” parameter to “on” and restarting the instance:

postgres@pgbox:/home/postgres/ [PG10] psql -c "alter system set huge_pages=on" postgres
ALTER SYSTEM
Time: 0.719 ms
postgres@pgbox:/home/postgres/ [PG10] pg_ctl -D $PGDATA restart -m fast
waiting for server to shut down.... done
server stopped
waiting for server to start....2018-02-25 11:21:29.107 CET - 1 - 3170 -  - @ LOG:  listening on IPv4 address "0.0.0.0", port 5441
2018-02-25 11:21:29.107 CET - 2 - 3170 -  - @ LOG:  listening on IPv6 address "::", port 5441
2018-02-25 11:21:29.110 CET - 3 - 3170 -  - @ LOG:  listening on Unix socket "/tmp/.s.PGSQL.5441"
2018-02-25 11:21:29.118 CET - 4 - 3170 -  - @ LOG:  redirecting log output to logging collector process
2018-02-25 11:21:29.118 CET - 5 - 3170 -  - @ HINT:  Future log output will appear in directory "pg_log".
 done
server started

As the instance started, all should be fine, and we can confirm that by looking at the statistics in /proc/meminfo:

postgres@pgbox:/home/postgres/ [PG10] cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:     170
HugePages_Free:      162
HugePages_Rsvd:       64
HugePages_Surp:        0
Hugepagesize:       2048 kB

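You might be surprised that not all (actually only 8) huge pages are used right now. The 64 pages reported as HugePages_Rsvd are already reserved for the instance but have not been touched yet. This will change as soon as you put some load on the system: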

postgres=# create table t1 as select * from generate_series(1,1000000);
SELECT 1000000
postgres=# select count(*) from t1;
  count  
---------
 1000000
(1 row)

postgres=# \! cat /proc/meminfo | grep -i huge
AnonHugePages:      4096 kB
HugePages_Total:     170
HugePages_Free:      153
HugePages_Rsvd:       55
HugePages_Surp:        0
Hugepagesize:       2048 kB
postgres=# 
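The number of huge pages in use is simply HugePages_Total minus HugePages_Free. A small sketch to get that number, assuming a hypothetical helper called huge_pages_used that reads text in the format of /proc/meminfo from standard input:

```shell
# Hypothetical helper: huge pages currently in use (total minus free),
# computed from meminfo-formatted text on standard input.
huge_pages_used() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free = $2 }
        END { print total - free }
    '
}

# Demo with the values from the output above.
printf 'HugePages_Total:     170\nHugePages_Free:      153\n' | huge_pages_used
```

For the output above this gives 17 pages in use, compared to 8 right after the restart. On a real system: huge_pages_used < /proc/meminfo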

Hope this helps …