Infrastructure at your Service

Daniel Westermann

Active session history in PostgreSQL: Say hello to pgSentinel

There is new project, currently in beta, which aims to bring active session history (and probably more) to PostgreSQL: pgSentinel. Because PostgreSQL is highly extensible such projects are possible and usually are coming as an extension. pgSentinel is no exception to that so lets see how it can be installed. When you want to try the beta for your own, please connect with the project on twitter.

This is what I got:

postgres@pgbox:/home/postgres/beta/ [pg103] ll
total 120
-rw-r--r--. 1 postgres postgres   1108 Jul  8 22:13 pgsentinel--1.0.sql
-rw-r--r--. 1 postgres postgres    117 Jul  5 22:15 pgsentinel.control
-rwxr-xr-x. 1 postgres postgres 108000 Jul  9 11:12 pgsentinel.so
-rw-r--r--. 1 postgres postgres    623 Jul  9 11:22 readme.txt

You can already see from here that we probably need to load a library because of the pgsentinel.so file. Lets copy that to the correct locations, in my case:

postgres@pgbox:/home/postgres/beta/ [pg103] cp pgsentinel--1.0.sql pgsentinel.control /u01/app/postgres/product/10/db_3/share/extension/
postgres@pgbox:/home/postgres/beta/ [pg103] cp pgsentinel.so /u01/app/postgres/product/10/db_3/lib/

As I plan to run pgbench later to get some load onto the system I’ve created a separate database for installing the extension:

postgres@pgbox:/home/postgres/ [PG10] psql -c "create database bench" postgres
CREATE DATABASE
postgres@pgbox:/home/postgres/ [PG10] pgbench -i -s 10 bench

When we need to load a library we need to tell PostgreSQL about that by setting the shared_preload_libraries parameter. As pgsentinel depends on pg_stat_statements this needs to be installed as well.

postgres=# alter system set shared_preload_libraries='pg_stat_statements,pgsentinel';
ALTER SYSTEM

So once we have that set and the instance is restarted:

postgres@pgbox:/home/postgres/beta/ [PG10] pg_ctl -D $PGDATA restart -m fast

… we should see the new extension in the pg_available_extensions view:

postgres=# select * from pg_available_extensions where name = 'pgsentinel';
    name    | default_version | installed_version |        comment         
------------+-----------------+-------------------+------------------------
 pgsentinel | 1.0             |                   | active session history
(1 row)

Ready to install the extensions:

postgres=# create extension pg_stat_statements;
CREATE EXTENSION
postgres=# create extension pgsentinel ;
CREATE EXTENSION
postgres=# \dx
                                     List of installed extensions
        Name        | Version |   Schema   |                        Description                        
--------------------+---------+------------+-----------------------------------------------------------
 pg_stat_statements | 1.5     | public     | track execution statistics of all SQL statements executed
 pgsentinel         | 1.0     | public     | active session history
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

So what did we get? One solution is to look at the sql file:

cat pgsentinel--1.0.sql
-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION pgsentinel" to load this file. \quit

CREATE FUNCTION pg_active_session_history(
    OUT ash_time timestamptz,
    OUT datid Oid,
    OUT datname text,
    OUT pid integer,
    OUT usesysid Oid,
    OUT usename text,
    OUT application_name text,
    OUT client_addr text,
    OUT client_hostname text,
    OUT client_port integer,
    OUT backend_start timestamptz,
    OUT xact_start timestamptz,
    OUT query_start timestamptz,
    OUT state_change timestamptz,
    OUT wait_event_type text,
    OUT wait_event text,
    OUT state text,
    OUT backend_xid xid,
    OUT backend_xmin xid,
    OUT top_level_query text,
    OUT query text,
    OUT queryid bigint,
    OUT backend_type text
   
)
RETURNS SETOF record
AS 'MODULE_PATHNAME', 'pg_active_session_history'
LANGUAGE C STRICT VOLATILE PARALLEL SAFE;

-- Register a view on the function for ease of use.
CREATE VIEW pg_active_session_history AS
  SELECT * FROM pg_active_session_history();

GRANT SELECT ON pg_active_session_history TO PUBLIC;

The other solution is to ask PostgreSQL directly:

bench=# \dx+ pgsentinel 
  Objects in extension "pgsentinel"
          Object description          
--------------------------------------
 function pg_active_session_history()
 view pg_active_session_history
(2 rows)

Basically we got a function and a view over that function. Lets have a look at the view then:

postgres=# \d pg_active_session_history
                   View "public.pg_active_session_history"
      Column      |           Type           | Collation | Nullable | Default 
------------------+--------------------------+-----------+----------+---------
 ash_time         | timestamp with time zone |           |          | 
 datid            | oid                      |           |          | 
 datname          | text                     |           |          | 
 pid              | integer                  |           |          | 
 usesysid         | oid                      |           |          | 
 usename          | text                     |           |          | 
 application_name | text                     |           |          | 
 client_addr      | text                     |           |          | 
 client_hostname  | text                     |           |          | 
 client_port      | integer                  |           |          | 
 backend_start    | timestamp with time zone |           |          | 
 xact_start       | timestamp with time zone |           |          | 
 query_start      | timestamp with time zone |           |          | 
 state_change     | timestamp with time zone |           |          | 
 wait_event_type  | text                     |           |          | 
 wait_event       | text                     |           |          | 
 state            | text                     |           |          | 
 backend_xid      | xid                      |           |          | 
 backend_xmin     | xid                      |           |          | 
 top_level_query  | text                     |           |          | 
 query            | text                     |           |          | 
 queryid          | bigint                   |           |          | 
 backend_type     | text                     |           |          | 

Most of the columns are already in pg_stat_activity but there is more. Before going further lets generate some load:

postgres@pgbox:/home/postgres/ [PG10] pgbench -c 5 -j 4 -T 60 bench 
starting vacuum...end.
transaction type: 
scaling factor: 10
query mode: simple
number of clients: 5
number of threads: 4
duration: 60 s
number of transactions actually processed: 151397
latency average = 1.982 ms
tps = 2522.898859 (including connections establishing)
tps = 2523.280694 (excluding connections establishing)

Now we should see sampled data in the pg_active_session_history view:

bench=# select ash_time,top_level_query,query,queryid,wait_event_type,wait_event from pg_active_session_history limit 10;
           ash_time            |                               top_level_query                               |                                   query                                    |  queryid   | wait_event_type |  wait_event   
-------------------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------+------------+-----------------+---------------
 2018-07-09 14:51:48.883599+02 | create database bench                                                       | create database bench                                                      | 3222771996 | CPU             | CPU
 2018-07-09 14:52:37.291115+02 | copy pgbench_accounts from stdin                                            | copy pgbench_accounts from stdin                                           | 4164808321 | CPU             | CPU
 2018-07-09 14:52:38.292674+02 | alter table pgbench_accounts add primary key (aid)                          | alter table pgbench_accounts add primary key (aid)                         | 4164808321 | CPU             | CPU
 2018-07-09 14:55:51.111621+02 | UPDATE pgbench_branches SET bbalance = bbalance + 2228 WHERE bid = 4;       | UPDATE pgbench_branches SET bbalance = bbalance + 2228 WHERE bid = 4       |  553956422 | Lock            | transactionid
 2018-07-09 14:55:51.111621+02 | END;                                                                        | END                                                                        | 3376944276 | CPU             | CPU
 2018-07-09 14:55:51.111621+02 | UPDATE pgbench_accounts SET abalance = abalance + -2408 WHERE aid = 973208; | UPDATE pgbench_accounts SET abalance = abalance + -2408 WHERE aid = 973208 | 2992934481 | CPU             | CPU
 2018-07-09 14:55:52.112507+02 | UPDATE pgbench_tellers SET tbalance = tbalance + -4957 WHERE tid = 87;      | UPDATE pgbench_tellers SET tbalance = tbalance + -4957 WHERE tid = 87      | 3459630226 | Client          | ClientRead
 2018-07-09 14:55:52.112507+02 | END;                                                                        | END                                                                        | 3376944276 | LWLock          | WALWriteLock
 2018-07-09 14:55:52.112507+02 | END;                                                                        | END                                                                        | 3376944276 | CPU             | CPU
 2018-07-09 14:55:52.112507+02 | UPDATE pgbench_branches SET bbalance = bbalance + -3832 WHERE bid = 8;      | UPDATE pgbench_branches SET bbalance = bbalance + -3832 WHERE bid = 8      |  553956422 | Lock            | transactionid
(10 rows)

The important point here is that we have the queryid which we can use to map that back to pg_stat_statements. So if we want to know what the shared_blks_* statistics for the update statement with query id 553956422 are, we can do that (or write a join over the two views, of course):

bench=# select shared_blks_hit,shared_blks_read,shared_blks_dirtied,shared_blks_written from pg_stat_statements where queryid = 553956422;
 shared_blks_hit | shared_blks_read | shared_blks_dirtied | shared_blks_written 
-----------------+------------------+---------------------+---------------------
          453201 |               29 |                  37 |                   0
(1 row)

Really looks promising, automatic session sampling in PostgreSQL. More tests to come …

Leave a Reply

Daniel Westermann
Daniel Westermann

Senior Consultant and Technology Leader Open Infrastructure