Infrastructure at your Service

Daniel Westermann

Incremental materialized view maintenance for PostgreSQL 14?

Since PostgreSQL 9.3 there is the possibility to create materialized views in PostgreSQL. PostgreSQL 9.4 (one year later) brought concurrent refresh which already is a major step forward as this allowed querying the materialized view while it is being refreshed. What still is missing are materialized views which refresh themselves, as soon as there are changed to the underlying tables. This might change with PostgreSQL 14, as this patch is in active development (at least since middle of 2019). Lets have a look at how that currently works and what the limitations are. If you want to play with this for yourself and do not want to apply the patches: There is a Docker container you can use for your testing as well.

If you want to have a materialized view that is incrementally updated you need to specify this when the materialized view is created:

postgres=# \h create materialized view
Command:     CREATE MATERIALIZED VIEW
Description: define a new materialized view
Syntax:
CREATE [ INCREMENTAL ] MATERIALIZED VIEW [ IF NOT EXISTS ] table_name
    [ (column_name [, ...] ) ]
    [ USING method ]
    [ WITH ( storage_parameter [= value] [, ... ] ) ]
    [ TABLESPACE tablespace_name ]
    AS query
    [ WITH [ NO ] DATA ]

URL: https://www.postgresql.org/docs/devel/sql-creatematerializedview.html

If you skip “INCREMENTAL”, the materialized view will not be updated automatically and you get the behavior as it is now. As we want to have a look at the new feature lets create a base table and then add an incrementally updated materialized view on top of it:

postgres=# create table t1 ( a int, b text, c date );
CREATE TABLE
postgres=# insert into t1 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
postgres=# create incremental materialized view mv1 as select * from t1 with data;
SELECT 1000000
postgres=# 

“\d+” will show you that this materialized view is incrementally updated:

postgres=# \d+ mv1
                              Materialized view "public.mv1"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 a      | integer |           |          |         | plain    |              | 
 b      | text    |           |          |         | extended |              | 
 c      | date    |           |          |         | plain    |              | 
View definition:
 SELECT t1.a,
    t1.b,
    t1.c
   FROM t1;
Access method: heap
Incremental view maintenance: yes

If we update the underlying table, the materialized view gets updated automatically:

postgres=# insert into t1 (a,b,c) values(-1,'aaa',now());
INSERT 0 1
postgres=# select * from mv1 where a = -1;
 a  |  b  |     c      
----+-----+------------
 -1 | aaa | 2020-11-23
(1 row)

postgres=# update t1 set a = -2 where a = -1;
UPDATE 1
postgres=# select * from mv1 where a = -2;
 a  |  b  |     c      
----+-----+------------
 -2 | aaa | 2020-11-23
(1 row)

postgres=# 

That’s really cool but you need to be aware that this comes with a cost: Modifying (insert/update/delete) the underlying table(s) becomes more expensive. Lets compare a small bulk load into a table without a materialized view on top of it against the same load into a table with a materialized view on top:

postgres=# truncate table t1;
TRUNCATE TABLE
postgres=# create table t2 ( a int, b text, c date );
CREATE TABLE
postgres=# \timing
Timing is on.
postgres=# insert into t1 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
Time: 3214.712 ms (00:03.215)
postgres=# insert into t2 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
Time: 1285.578 ms (00:01.286)
postgres=# insert into t1 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
Time: 4117.097 ms (00:04.117)
postgres=# insert into t2 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
Time: 1511.681 ms (00:01.512)
postgres=# insert into t1 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
Time: 3844.273 ms (00:03.844)
postgres=# insert into t2 select x, x::text, now() from generate_series(1,1000000) x;
INSERT 0 1000000
Time: 1463.377 ms (00:01.463)

Without a materialized view, the load time is around 3 times faster, so you have to decide what is more important to you: Fast loading or up to date materialized views.

Finally: Here is the Wiki page that summarizes the feature and also lists some limitations.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure