Infrastructure at your Service

Daniel Westermann

When you execute the same statement thousands of times -> prepare it

Every time you send a SQL statement to PostgreSQL it must be parsed. Parsing is fast, yes, but parsing the same statement a thousand times can quickly sum up to quite some time the database could spend for something else. To avoid that PostgreSQL comes with the prepare statement. Using that you can avoid parsing of statements and only planning and executing will happen. Lets see how that works.

To generate some sample data I used pgbench with a scale factor of 100 which gives me 10’000’000 rows in the pgbench_accounts table:

postgres@pgbox:/home/postgres/ [PGDEV] pgbench -i -s 100 bench
postgres@pgbox:/home/postgres/ [PGDEV] psql bench

psql (11devel)
Type "help" for help.

bench=# select count(*) from pgbench_accounts;
  count   
----------
 10000000
(1 row)

bench=# \d pgbench_accounts
              Table "public.pgbench_accounts"
  Column  |     Type      | Collation | Nullable | Default 
----------+---------------+-----------+----------+---------
 aid      | integer       |           | not null | 
 bid      | integer       |           |          | 
 abalance | integer       |           |          | 
 filler   | character(84) |           |          | 
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)

Lets assume we have the following simple query:

bench=# select count(*) from pgbench_accounts where aid = 11111;
 count 
-------
     1
(1 row)

As said at the beginning of this post PostgreSQL will need to parse that statement. Using explain with the right options you are able to see how much time was spend on planning:

bench=# explain (analyze,buffers) select count(*) from pgbench_accounts where aid = 11111;
                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.46..4.46 rows=1 width=8) (actual time=0.060..0.063 rows=1 loops=1)
   Buffers: shared hit=4
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..4.45 rows=1 width=0) (actual time=0.034..0.039 rows=1 loops=1)
         Index Cond: (aid = 11111)
         Heap Fetches: 0
         Buffers: shared hit=4
 Planning Time: 0.150 ms
 Execution Time: 0.133 ms
(8 rows)

Planning this statement takes more time then executing it. Now assume you want to execute that very same statement one thousand times:

bench=# \t
bench=# select 'select count(*) from pgbench_accounts where aid = 11111;' from generate_series(1,1000) i; \g test.sql
bench=# \! cat test.sql | head
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
 select count(*) from pgbench_accounts where aid = 11111;
...

When we execute that we force PostgreSQL to parse all of those 1000 statements:

bench=# \timing
Timing is on.
bench=# \! time psql -f test.sql bench

real	0m0.148s
user	0m0.031s
sys	0m0.015s

What you can do to avoid that is to prepare the statement:

bench=# prepare tt as select count(*) from pgbench_accounts where aid = 11111;
PREPARE

Once it is prepared you can execute it:

bench=# execute tt;
 count 
-------
     1
(1 row)

… and you can also explain it:

bench=# explain(analyze,buffers) execute tt;
                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.46..4.46 rows=1 width=8) (actual time=0.066..0.069 rows=1 loops=1)
   Buffers: shared hit=4
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..4.45 rows=1 width=0) (actual time=0.037..0.043 rows=1 loops=1)
         Index Cond: (aid = 11111)
         Heap Fetches: 0
         Buffers: shared hit=4
 Planning Time: 0.021 ms
 Execution Time: 0.269 ms
(8 rows)

You’ll notice that the planning time went down quite a lot compared to the not prepared statement:

bench=# explain (analyze,buffers) select count(*) from pgbench_accounts where aid = 11111;
                                                                    QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4.46..4.46 rows=1 width=8) (actual time=0.167..0.172 rows=1 loops=1)
   Buffers: shared hit=4
   ->  Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..4.45 rows=1 width=0) (actual time=0.030..0.037 rows=1 loops=1)
         Index Cond: (aid = 11111)
         Heap Fetches: 0
         Buffers: shared hit=4
 Planning Time: 0.167 ms
 Execution Time: 0.248 ms
(8 rows)

When you do that one thousand times now:

bench=# \t
bench=# select 'execute tt;' from generate_series(1,1000) i; \g test.sql
bench=# \! sed -i '1s/^/prepare tt as select count(*) from pgbench_accounts where aid = 11111;/' test.sql
bench=# \! time psql -f test.sql bench

real	0m0.095s
user	0m0.031s
sys	0m0.010s

… execution time will come down. It is not much in that simple case but this is due to the simple statement. Btw: As prepared statements only life for the time of the session the sed command adds the prepare statement to the top of file, and preparing takes time as well. Without that execution time would be even less.

When your values change in the where clause you can do it like this:

bench=# prepare tt as select count(*) from pgbench_accounts where aid = $1;
PREPARE
Time: 0.571 ms
bench=# execute tt (123);
     1

For completeness: When you want to get rid of a prepared statement without losing your session there is deallocate:

bench=# deallocate tt;
DEALLOCATE
Time: 0.623 ms
 

Leave a Reply

Daniel Westermann
Daniel Westermann

Senior Consultant and Technology Leader Open Infrastructure