Infrastructure at your Service

Daniel Westermann

PostgreSQL 12: log_statement_sample_rate

A common way to identify long running queries in PostgreSQL is to set log_min_duration_statement to a value that is known to cause troubles. In other words: If you know most of your statements usually execute in under a second you could set log_min_duration_statement to “2s” (which means two seconds) so that whenever a statement takes longer than two seconds it will be reported in the log file. The issue with this however is, that this can cause a lot of activity in the server log file which of course is not good for performance as well. PostgreSQL 12 will probably come with a solution to that.

The default value of log_min_duration_statement is “-1″, which means disabled:

postgres=# select version();
                                                  version                                                   
------------------------------------------------------------------------------------------------------------
 PostgreSQL 12devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

postgres=# show log_min_duration_statement;
 log_min_duration_statement 
----------------------------
 -1
(1 row)

Lets set this to 5ms and then generate some queries:

postgres=# alter system set log_min_duration_statement = '5ms';
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# show log_min_duration_statement;
 log_min_duration_statement 
----------------------------
 5ms
(1 row)

When we execute some queries that take longer than 5ms they should get reported in the log file:

postgres=# select 'select pg_sleep(0.5)' from generate_series(1,10); \gexec
       ?column?       
----------------------
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
(10 rows)

 pg_sleep 
----------
 
(1 row)

...

Looking at the log file there are exactly 10 entries:

2018-11-28 03:13:32.240 CET - 20 - 28978 - [local] - postgres@postgres LOG:  duration: 529.825 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:32.770 CET - 21 - 28978 - [local] - postgres@postgres LOG:  duration: 529.904 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:33.273 CET - 22 - 28978 - [local] - postgres@postgres LOG:  duration: 501.729 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:33.783 CET - 23 - 28978 - [local] - postgres@postgres LOG:  duration: 509.532 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:34.305 CET - 24 - 28978 - [local] - postgres@postgres LOG:  duration: 520.946 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:34.809 CET - 25 - 28978 - [local] - postgres@postgres LOG:  duration: 502.624 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:35.315 CET - 26 - 28978 - [local] - postgres@postgres LOG:  duration: 505.043 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:35.817 CET - 27 - 28978 - [local] - postgres@postgres LOG:  duration: 502.034 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:36.321 CET - 28 - 28978 - [local] - postgres@postgres LOG:  duration: 503.417 ms  statement: select pg_sleep(0.5)
2018-11-28 03:13:36.824 CET - 29 - 28978 - [local] - postgres@postgres LOG:  duration: 501.790 ms  statement: select pg_sleep(0.5)

Imagine there are thousands of statements that exceed log_min_duration_statement: This will make the log file grow quite fast and performance will suffer because writing the log file takes resources as well. PostgreSQL 12 introduces a new parameter “log_statement_sample_rate” which helps with this:

postgres=# show log_statement_sample_rate;
 log_statement_sample_rate 
---------------------------
 1
(1 row)

The default is 1 which means log all statements that exceed the log_min_duration_statement threshold. When that is too much we can now go and say: We want only 10% of these statements getting logged:

postgres=# alter system set log_statement_sample_rate=0.1;
ALTER SYSTEM
postgres=# select pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# show log_statement_sample_rate;
 log_statement_sample_rate 
---------------------------
 0.1
(1 row)

Doing the same test as above again:

postgres=# select 'select pg_sleep(0.5)' from generate_series(1,10); \gexec
       ?column?       
----------------------
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
 select pg_sleep(0.5)
(10 rows)

 pg_sleep 
----------
 
(1 row)

...

… we only get one entry in the log file, instead of 10:

2018-11-28 03:20:36.218 CET - 34 - 28978 - [local] - postgres@postgres LOG:  duration: 504.845 ms  statement: select pg_sleep(0.5)

Nice. Btw: The commit is here.

Leave a Reply

Daniel Westermann
Daniel Westermann

Senior Consultant and Technology Leader Open Infrastructure