Over the last years I’ve seen many different applications running against PostgreSQL. The use cases vary from simple applications, which only use basic data types and a few tables and views, to complex applications with custom types, more specific data types like jsonb or range types, that use hundreds of schemas/tables/views/materialized views etc. Surprisingly only a few of them make use of functions and procedures in PostgreSQL. I’ve always told people to process the data where it is, and usually the data is in the database, as this is best for performance and enables you to make us of advanced features of PostgreSQL. This does, of course, not only apply to PostgreSQL but to all other databases systems as well. This usually leads to discussion about being independent of the underlying databases and then it gets religious. While I understand the developer/business owner’s point of view of being as much independent of the database as possible, makes it easier to migrate from one database system to another in the future, this decision closes many doors when it comes to getting the maximum out of the database.

The most simple function you can create in PostgreSQL is a SQL function. A SQL functions contain one or more SQL statements and either return nothing, one row or a set of rows. To get started lets create a standard pgbench schema with a couple of rows:

postgres=# ! pgbench -i -s 10 postgres
dropping old tables...
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
creating tables...
generating data (client-side)...
1000000 of 1000000 tuples (100%) done (elapsed 2.20 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 3.52 s (drop tables 0.00 s, create tables 0.04 s, client-side generate 2.27 s, vacuum 0.30 s, primary keys 0.90 s).
postgres=# d
              List of relations
 Schema |       Name       | Type  |  Owner   
--------+------------------+-------+----------
 public | pgbench_accounts | table | postgres
 public | pgbench_branches | table | postgres
 public | pgbench_history  | table | postgres
 public | pgbench_tellers  | table | postgres
(4 rows)

One of the most simple SQL functions is something like this:

postgres=# create or replace function get_bid_for_aid_1 () returns int
           as $$
           select bid from pgbench_accounts where aid = 1;
           $$ language SQL;
CREATE FUNCTION

This really is not rocket science but it is a good starting point to understand the concepts. PostgreSQL uses the so called dollar quoting. That means, everything between the first “$$” and the last “$$” is the body of the function. This makes it easier to write sting literals without escaping like, e.g. single or double quotes (a more detailed description can be found here). If you want to use a tag with dollar quoting you can do that as well:

postgres=# create or replace function get_bid_for_aid_1_tmp () returns int
           as $my_tag$
           select bid from pgbench_accounts where aid = 1;
           $my_tag$ language SQL;
CREATE FUNCTION

Side note: There is nothing like in “invalid object” as you might know that from Oracle. If your function body contains errors it will not be stored in the database, you’ll have to fix that first:

postgres=# create or replace function get_bid_for_aid_1_tmp_1 () returns int
           as $my_tag$
           select bid from pgbench_accountsXX where aid = 1;
           $my_tag$ language SQL;
ERROR:  relation "pgbench_accountsxx" does not exist
LINE 3: select bid from pgbench_accountsXX where aid = 1;

More information about this behavior can be found here. Coming back to our initial function: As soon as the function is stored in PostgreSQL you can reference it in your SQL queries:

postgres=# select * from get_bid_for_aid_1();
 get_bid_for_aid_1 
-------------------
                 1
(1 row)

Right now this function is quite limited as it returns the bid only for aid=1. To make that a bit more re-usable can you make use of parameters:

postgres=# drop function get_bid_for_aid_1();
DROP FUNCTION
postgres=# create or replace function get_bid_for_aid (int) returns int
           as $my_tag$
           select bid from pgbench_accounts where aid = $1;
           $my_tag$ language SQL;
CREATE FUNCTION
postgres=# select * from get_bid_for_aid(1);
 get_bid_for_aid 
-----------------
               1
(1 row)

Now the function is more flexible, as you can ask for any bid for a given aid. But there is even more we can do, to make the function more readable. Instead of specifying just the data type for the input parameter we can give the parameter a name:

postgres=# create or replace function get_bid_for_aid (pn_aid int) returns int
           as $my_tag$
           select bid from pgbench_accounts where aid = pn_aid;
           $my_tag$ language SQL;
CREATE FUNCTION
postgres=# select * from get_bid_for_aid(1);
 get_bid_for_aid 
-----------------
               1
(1 row)

Although this does not change the behavior of the function, it makes the function easier to read and we recommend to work with those named parameters instead of only specifying the data types. Especially when the body of a function becomes quite large and it uses many parameters, it saves you quite some scrolling forward and backward because you already know the data type from the parameter name (or at least you should be able to guess it).

You are not limited to use a function in the from clause of your statement. Actually you can use the function wherever you want, as long as the result is valid SQL, e.g. in the where clause:

postgres=# select count(*) 
             from pgbench_accounts 
            where bid = get_bid_for_aid (1);
 count  
--------
 100000
(1 row)

By now we used one simple select with our function, but you can also modify your data using SQL functions:

postgres=# create or replace function upd_pgbench_account_for_aid ( pn_aid int, pv_filler character ) returns void
           as $$
             update pgbench_accounts
                set filler = pv_filler
              where aid = pn_aid;
           $$ language SQL;
CREATE FUNCTION
postgres=# select * from upd_pgbench_account_for_aid (1,'dummy'); 
 upd_pgbench_account_for_aid 
-----------------------------
 
(1 row)

postgres=# select filler from pgbench_accounts where aid = 1;
                                        filler                                        
--------------------------------------------------------------------------------------
 dummy                                                                               
(1 row)
postgres=# select * from upd_pgbench_account_for_aid (pv_filler=>'dummy2',pn_aid=>1); 
 upd_pgbench_account_for_aid 
-----------------------------
 
(1 row)

postgres=# select filler from pgbench_accounts where aid = 1;
                                        filler                                        
--------------------------------------------------------------------------------------
 dummy2                                                                              
(1 row)

There are a couple of things to note here:

  • A function does not need to return anything and you can specify that by using “void” as the return type
  • You can use DML statements in SQL functions</p
  • You can use more than one parameter (100 by default)
  • You can use named parameters when calling the function as well, and if you do that, the order of the parameters does not matter. This again improves readability.

But there is more you can do, consider this simple example:

postgres=# create or replace function delete_and_add ( pn_aid int, pv_filler character ) returns int
           as $$
             delete
               from pgbench_accounts
              where aid = pn_aid;
             insert into pgbench_accounts 
                    select max(aid)+1, null, null, pv_filler
                      from pgbench_accounts
             returning aid;
           $$ language sql;
CREATE FUNCTION
postgres=# select * from delete_and_add (1,'xxxx');
 delete_and_add 
----------------
        1000001
(1 row)

postgres=# select * from pgbench_accounts where aid = 1000001;
   aid   | bid | abalance |                                        filler                                        
---------+-----+----------+--------------------------------------------------------------------------------------
 1000001 |     |          | xxxx                                                                                
(1 row)

A function is not restricted to a single statement, you can combine more than one statement, as you like and as it makes sense for you. Using the returning clause to get the new value(s) back from the function. This is actually not specific to functions but to the insert statement, but it is anyway good to know.

If you combine multiple statements into SQL function there are some rules that apply:

postgres=# create or replace function f_dummy() returns int
postgres-# as $$ 
postgres$#   select 1;
postgres$#   select 2;
postgres$#   select 3;
postgres$# $$ language sql;
CREATE FUNCTION
postgres=# select * from f_dummy();
 f_dummy 
---------
       3
(1 row)

Combining multiple statements into one function in such a way only gives you the last result. If the last statement of a SQL function does not return anything but the signature of the function specifies a return type you’ll run into an error (obviously):

postgres=# create or replace function f_dummy_2() returns int
postgres-# as $$ 
postgres$#   select 1;
postgres$#   select 2;
postgres$#   select 3;
postgres$#   insert into pgbench_accounts 
postgres$#          values ( -1,1,1,'aaaaa');
postgres$# $$ language sql;
CREATE FUNCTION
postgres=# select * from f_dummy_2();
ERROR:  return type mismatch in function declared to return integer
DETAIL:  Function's final statement must be SELECT or INSERT/UPDATE/DELETE RETURNING.
CONTEXT:  SQL function "f_dummy_2" during startup
postgres=# 

Note that transaction control is not allowed in functions, this needs to happen outside, e.g.:

postgres=# begin;
BEGIN
postgres=*# select delete_and_add ( pn_aid=>5, pv_filler=>'ddd' );
 delete_and_add 
----------------
        1000002
(1 row)

postgres=*# rollback;
ROLLBACK
postgres=# select * from pgbench_accounts where aid = 1000002;
 aid | bid | abalance | filler 
-----+-----+----------+--------
(0 rows)

The last point for today: If you want to edit/view your functions in psql there are the “df” and “ef” shortcuts. The first one will show you all your functions:

postgres-# df
                                          List of functions
 Schema |            Name             | Result data type |         Argument data types         | Type 
--------+-----------------------------+------------------+-------------------------------------+------
 public | delete_and_add              | integer          | pn_aid integer, pv_filler character | func
 public | f_dumm                      | integer          |                                     | func
 public | f_dummy                     | integer          |                                     | func
 public | f_dummy_2                   | integer          |                                     | func
 public | f_dummy_3                   | void             |                                     | func
 public | get_bid_for_aid             | integer          | pn_aid integer                      | func
 public | get_bid_for_aid_1_tmp       | integer          |                                     | func
 public | get_bid_for_aid_1_tmp_1     | integer          |                                     | func
 public | upd_pgbench_account_for_aid | void             | pn_aid integer, pv_filler character | func

The second one is used to directly edit the function:

postgres=# ef f_dummy
postgres=# CREATE OR REPLACE FUNCTION public.f_dummy()
 RETURNS integer
 LANGUAGE sql
AS $function$ 
  select 1;
  select 2;
  select 3;
  select 4;
$function$
postgres-# 
postgres-# ;
CREATE FUNCTION

In the next post we’ll look into more advanced topics when it comes to SQL functions.