Infrastructure at your Service

Daniel Westermann

PostgreSQL partitioning (8): Sub-partitioning

We are slowly coming to the end of this little series about partitioning in PostgreSQL. In the last post we had a look at indexing and constraints and today we will have a look at sub partitioning. Sub partitioning means you go one step further and partition the partitions as well. Although it is not required to read all the posts of this series to follow this one: If you want, here they are:

  1. PostgreSQL partitioning (1): Preparing the data set
  2. PostgreSQL partitioning (2): Range partitioning
  3. PostgreSQL partitioning (3): List partitioning
  4. PostgreSQL partitioning (4) : Hash partitioning
  5. PostgreSQL partitioning (5): Partition pruning
  6. PostgreSQL partitioning (6): Attaching and detaching partitions
  7. PostgreSQL partitioning (7): Indexing and constraints

Coming back to our range partitioned table this is how it looks like currently:

postgres=# \d+ traffic_violations_p
                                      Partitioned table "public.traffic_violations_p"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              | 
 date_of_stop            | date                   |           |          |         | plain    |              | 
 time_of_stop            | time without time zone |           |          |         | plain    |              | 
 agency                  | text                   |           |          |         | extended |              | 
 subagency               | text                   |           |          |         | extended |              | 
 description             | text                   |           |          |         | extended |              | 
 location                | text                   |           |          |         | extended |              | 
 latitude                | numeric                |           |          |         | main     |              | 
 longitude               | numeric                |           |          |         | main     |              | 
 accident                | text                   |           |          |         | extended |              | 
 belts                   | boolean                |           |          |         | plain    |              | 
 personal_injury         | boolean                |           |          |         | plain    |              | 
 property_damage         | boolean                |           |          |         | plain    |              | 
 fatal                   | boolean                |           |          |         | plain    |              | 
 commercial_license      | boolean                |           |          |         | plain    |              | 
 hazmat                  | boolean                |           |          |         | plain    |              | 
 commercial_vehicle      | boolean                |           |          |         | plain    |              | 
 alcohol                 | boolean                |           |          |         | plain    |              | 
 workzone                | boolean                |           |          |         | plain    |              | 
 state                   | text                   |           |          |         | extended |              | 
 vehicletype             | text                   |           |          |         | extended |              | 
 year                    | smallint               |           |          |         | plain    |              | 
 make                    | text                   |           |          |         | extended |              | 
 model                   | text                   |           |          |         | extended |              | 
 color                   | text                   |           |          |         | extended |              | 
 violation_type          | text                   |           |          |         | extended |              | 
 charge                  | text                   |           |          |         | extended |              | 
 article                 | text                   |           |          |         | extended |              | 
 contributed_to_accident | boolean                |           |          |         | plain    |              | 
 race                    | text                   |           |          |         | extended |              | 
 gender                  | text                   |           |          |         | extended |              | 
 driver_city             | text                   |           |          |         | extended |              | 
 driver_state            | text                   |           |          |         | extended |              | 
 dl_state                | text                   |           |          |         | extended |              | 
 arrest_type             | text                   |           |          |         | extended |              | 
 geolocation             | point                  |           |          |         | plain    |              | 
 council_districts       | smallint               |           |          |         | plain    |              | 
 councils                | smallint               |           |          |         | plain    |              | 
 communities             | smallint               |           |          |         | plain    |              | 
 zip_codes               | smallint               |           |          |         | plain    |              | 
 municipalities          | smallint               |           |          |         | plain    |              | 
Partition key: RANGE (date_of_stop)
Partitions: traffic_violations_p_2013 FOR VALUES FROM ('2013-01-01') TO ('2014-01-01'),
            traffic_violations_p_2014 FOR VALUES FROM ('2014-01-01') TO ('2015-01-01'),
            traffic_violations_p_2015 FOR VALUES FROM ('2015-01-01') TO ('2016-01-01'),
            traffic_violations_p_2016 FOR VALUES FROM ('2016-01-01') TO ('2017-01-01'),
            traffic_violations_p_2017 FOR VALUES FROM ('2017-01-01') TO ('2018-01-01'),
            traffic_violations_p_2018 FOR VALUES FROM ('2018-01-01') TO ('2019-01-01'),
            traffic_violations_p_2019 FOR VALUES FROM ('2019-01-01') TO ('2020-01-01'),
            traffic_violations_p_2020 FOR VALUES FROM ('2020-01-01') TO ('2021-01-01'),
            traffic_violations_p_2021 FOR VALUES FROM ('2021-01-01') TO ('2022-01-01'),
            traffic_violations_p_default DEFAULT

Lets assume that you expect that traffic violations will grow exponentially in 2022 because more and more cars will be on the road and when there will be more cars there will be more traffic violations. To be prepared for that you do not only want to partition by year but also by month. In other words: Add a new partition for 2022 but sub partition that by month. First of all you need a new partition for 2022 that itself is partitioned as well:

create table traffic_violations_p_2022
partition of traffic_violations_p
for values from ('2022-01-01') to ('2023-01-01') partition by range(date_of_stop);

Now we can add partitions to the just created partitioned partition:

create table traffic_violations_p_2022_jan
partition of traffic_violations_p_2022
for values from ('2022-01-01') to ('2022-02-01');

create table traffic_violations_p_2022_feb
partition of traffic_violations_p_2022
for values from ('2022-02-01') to ('2022-03-01');

create table traffic_violations_p_2022_mar
partition of traffic_violations_p_2022
for values from ('2022-03-01') to ('2022-04-01');

create table traffic_violations_p_2022_apr
partition of traffic_violations_p_2022
for values from ('2022-04-01') to ('2022-05-01');

create table traffic_violations_p_2022_may
partition of traffic_violations_p_2022
for values from ('2022-05-01') to ('2022-06-01');

create table traffic_violations_p_2022_jun
partition of traffic_violations_p_2022
for values from ('2022-06-01') to ('2022-07-01');

create table traffic_violations_p_2022_jul
partition of traffic_violations_p_2022
for values from ('2022-07-01') to ('2022-08-01');

create table traffic_violations_p_2022_aug
partition of traffic_violations_p_2022
for values from ('2022-08-01') to ('2022-09-01');

create table traffic_violations_p_2022_sep
partition of traffic_violations_p_2022
for values from ('2022-09-01') to ('2022-10-01');

create table traffic_violations_p_2022_oct
partition of traffic_violations_p_2022
for values from ('2022-10-01') to ('2022-11-01');

create table traffic_violations_p_2022_nov
partition of traffic_violations_p_2022
for values from ('2022-11-01') to ('2022-12-01');

create table traffic_violations_p_2022_dec
partition of traffic_violations_p_2022
for values from ('2022-12-01') to ('2023-01-01');

Looking at psql’s output when we describe the partitioned table not very much changed, just the keyword “PARTITIONED” is showing up beside our new partition for 2022:

postgres=# \d+ traffic_violations_p
                                      Partitioned table "public.traffic_violations_p"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              | 
 date_of_stop            | date                   |           |          |         | plain    |              | 
 time_of_stop            | time without time zone |           |          |         | plain    |              | 
 agency                  | text                   |           |          |         | extended |              | 
 subagency               | text                   |           |          |         | extended |              | 
 description             | text                   |           |          |         | extended |              | 
 location                | text                   |           |          |         | extended |              | 
 latitude                | numeric                |           |          |         | main     |              | 
 longitude               | numeric                |           |          |         | main     |              | 
 accident                | text                   |           |          |         | extended |              | 
 belts                   | boolean                |           |          |         | plain    |              | 
 personal_injury         | boolean                |           |          |         | plain    |              | 
 property_damage         | boolean                |           |          |         | plain    |              | 
 fatal                   | boolean                |           |          |         | plain    |              | 
 commercial_license      | boolean                |           |          |         | plain    |              | 
 hazmat                  | boolean                |           |          |         | plain    |              | 
 commercial_vehicle      | boolean                |           |          |         | plain    |              | 
 alcohol                 | boolean                |           |          |         | plain    |              | 
 workzone                | boolean                |           |          |         | plain    |              | 
 state                   | text                   |           |          |         | extended |              | 
 vehicletype             | text                   |           |          |         | extended |              | 
 year                    | smallint               |           |          |         | plain    |              | 
 make                    | text                   |           |          |         | extended |              | 
 model                   | text                   |           |          |         | extended |              | 
 color                   | text                   |           |          |         | extended |              | 
 violation_type          | text                   |           |          |         | extended |              | 
 charge                  | text                   |           |          |         | extended |              | 
 article                 | text                   |           |          |         | extended |              | 
 contributed_to_accident | boolean                |           |          |         | plain    |              | 
 race                    | text                   |           |          |         | extended |              | 
 gender                  | text                   |           |          |         | extended |              | 
 driver_city             | text                   |           |          |         | extended |              | 
 driver_state            | text                   |           |          |         | extended |              | 
 dl_state                | text                   |           |          |         | extended |              | 
 arrest_type             | text                   |           |          |         | extended |              | 
 geolocation             | point                  |           |          |         | plain    |              | 
 council_districts       | smallint               |           |          |         | plain    |              | 
 councils                | smallint               |           |          |         | plain    |              | 
 communities             | smallint               |           |          |         | plain    |              | 
 zip_codes               | smallint               |           |          |         | plain    |              | 
 municipalities          | smallint               |           |          |         | plain    |              | 
Partition key: RANGE (date_of_stop)
Partitions: traffic_violations_p_2013 FOR VALUES FROM ('2013-01-01') TO ('2014-01-01'),
            traffic_violations_p_2014 FOR VALUES FROM ('2014-01-01') TO ('2015-01-01'),
            traffic_violations_p_2015 FOR VALUES FROM ('2015-01-01') TO ('2016-01-01'),
            traffic_violations_p_2016 FOR VALUES FROM ('2016-01-01') TO ('2017-01-01'),
            traffic_violations_p_2017 FOR VALUES FROM ('2017-01-01') TO ('2018-01-01'),
            traffic_violations_p_2018 FOR VALUES FROM ('2018-01-01') TO ('2019-01-01'),
            traffic_violations_p_2019 FOR VALUES FROM ('2019-01-01') TO ('2020-01-01'),
            traffic_violations_p_2020 FOR VALUES FROM ('2020-01-01') TO ('2021-01-01'),
            traffic_violations_p_2021 FOR VALUES FROM ('2021-01-01') TO ('2022-01-01'),
            traffic_violations_p_2022 FOR VALUES FROM ('2022-01-01') TO ('2023-01-01'), PARTITIONED,
            traffic_violations_p_default DEFAULT

The is where the new functions in PostgreSQL 12 become very handy:

postgres=# select * from pg_partition_tree('traffic_violations_p');
             relid             |        parentrelid        | isleaf | level 
-------------------------------+---------------------------+--------+-------
 traffic_violations_p          |                           | f      |     0
 traffic_violations_p_default  | traffic_violations_p      | t      |     1
 traffic_violations_p_2013     | traffic_violations_p      | t      |     1
 traffic_violations_p_2014     | traffic_violations_p      | t      |     1
 traffic_violations_p_2015     | traffic_violations_p      | t      |     1
 traffic_violations_p_2016     | traffic_violations_p      | t      |     1
 traffic_violations_p_2017     | traffic_violations_p      | t      |     1
 traffic_violations_p_2018     | traffic_violations_p      | t      |     1
 traffic_violations_p_2019     | traffic_violations_p      | t      |     1
 traffic_violations_p_2020     | traffic_violations_p      | t      |     1
 traffic_violations_p_2021     | traffic_violations_p      | t      |     1
 traffic_violations_p_2022     | traffic_violations_p      | f      |     1
 traffic_violations_p_2022_jan | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_feb | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_mar | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_apr | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_may | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_jun | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_jul | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_aug | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_sep | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_oct | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_nov | traffic_violations_p_2022 | t      |     2
 traffic_violations_p_2022_dec | traffic_violations_p_2022 | t      |     2

To verify if data is routed correctly to the sub partitions let’s add some data for 2022:

insert into traffic_violations_p (date_of_stop)
       select * from generate_series ( date('01-01-2022')
                                     , date('12-31-2022')
                                     , interval '1 day' );

If we did the partitioning correctly we should see data in the new partitions:

postgres=# select count(*) from traffic_violations_p_2022_nov;
 count 
-------
    30
(1 row)

postgres=# select count(*) from traffic_violations_p_2022_dec;
 count 
-------
    31
(1 row)

postgres=# select count(*) from traffic_violations_p_2022_feb;
 count 
-------
    28
(1 row)

Here we go. Of course you could go even further and sub-partition the monthly partitions further by day or week. You can also partition by list and then sub-partition the list partitions by range. Or partition by range and then sub-partition by list, e.g.:

postgres=# create table traffic_violations_p_list_dummy partition of traffic_violations_p_list for values in ('dummy') partition by range (date_of_stop);
CREATE TABLE
postgres=# create table traffic_violations_p_list_dummy_2019 partition of traffic_violations_p_list_dummy for values from ('2022-01-01') to ('2023-01-01');
CREATE TABLE
postgres=# insert into traffic_violations_p_list (seqid, violation_type , date_of_stop) values (-1,'dummy',date('2022-12-01'));
INSERT 0 1
postgres=# select date_of_stop,violation_type from traffic_violations_p_list_dummy_2019;
 date_of_stop | violation_type 
--------------+----------------
 2022-12-01   | dummy
(1 row)

That’s it for sub-partitioning. In the final post we will look at some corner cases with partitioning in PostgreSQL.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure