Infrastructure at your Service

Daniel Westermann

PostgreSQL partitioning (3): List partitioning

In the last posts of this series we prepared the data set and had a look at range partitioning. In this post we look at another partitioning strategy: List partitioning. Instead of partitioning by a range (typically based on day, year, month) list partitioning is used to partition on an explicit list with key values that define the partitions.

Coming back to the materialized view which holds our initial data set:

postgres=# \d mv_traffic_violations 
                 Materialized view "public.mv_traffic_violations"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 

The application might want to ask for all traffic violations that match a specific violation type:

postgres=# select distinct violation_type from mv_traffic_violations;
 violation_type 
----------------
 Warning
 SERO
 Citation
 ESERO
(4 rows)

Partitioning by that would result in 4 partitions (plus one default partition for any other potential violation type in the future). The definition of the table and it’s partitions is pretty much the same as for range partitioning but this time we go for list partitioning:

create table traffic_violations_p_list
( seqid text
, date_of_stop date  
, time_of_stop time
, agency text
, subagency text
, description text 
, location text
, latitude numeric
, longitude numeric
, accident text
, belts boolean
, personal_injury boolean 
, property_damage boolean
, fatal boolean
, commercial_license boolean
, hazmat boolean
, commercial_vehicle boolean
, alcohol boolean
, workzone boolean
, state text
, vehicletype text 
, year smallint
, make text
, model text
, color text
, violation_type text 
, charge text
, article text
, contributed_to_accident boolean
, race text
, gender text
, driver_city text
, driver_state text
, dl_state text
, arrest_type text
, geolocation point
, council_districts smallint 
, councils smallint
, communities smallint
, zip_codes smallint
, municipalities smallint
)
partition by list (violation_type);

psql will show the list partitioning strategy as well:

postgres=# \d traffic_violations_p_list
               Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default 
-------------------------+------------------------+-----------+----------+---------
 seqid                   | text                   |           |          | 
 date_of_stop            | date                   |           |          | 
 time_of_stop            | time without time zone |           |          | 
 agency                  | text                   |           |          | 
 subagency               | text                   |           |          | 
 description             | text                   |           |          | 
 location                | text                   |           |          | 
 latitude                | numeric                |           |          | 
 longitude               | numeric                |           |          | 
 accident                | text                   |           |          | 
 belts                   | boolean                |           |          | 
 personal_injury         | boolean                |           |          | 
 property_damage         | boolean                |           |          | 
 fatal                   | boolean                |           |          | 
 commercial_license      | boolean                |           |          | 
 hazmat                  | boolean                |           |          | 
 commercial_vehicle      | boolean                |           |          | 
 alcohol                 | boolean                |           |          | 
 workzone                | boolean                |           |          | 
 state                   | text                   |           |          | 
 vehicletype             | text                   |           |          | 
 year                    | smallint               |           |          | 
 make                    | text                   |           |          | 
 model                   | text                   |           |          | 
 color                   | text                   |           |          | 
 violation_type          | text                   |           |          | 
 charge                  | text                   |           |          | 
 article                 | text                   |           |          | 
 contributed_to_accident | boolean                |           |          | 
 race                    | text                   |           |          | 
 gender                  | text                   |           |          | 
 driver_city             | text                   |           |          | 
 driver_state            | text                   |           |          | 
 dl_state                | text                   |           |          | 
 arrest_type             | text                   |           |          | 
 geolocation             | point                  |           |          | 
 council_districts       | smallint               |           |          | 
 councils                | smallint               |           |          | 
 communities             | smallint               |           |          | 
 zip_codes               | smallint               |           |          | 
 municipalities          | smallint               |           |          | 
Partition key: LIST (violation_type)
Number of partitions: 0

Now we can create the partitions:

create table traffic_violations_p_list_warning
partition of traffic_violations_p_list
for values in ('Warning');

create table traffic_violations_p_list_sero
partition of traffic_violations_p_list
for values in ('SERO');

create table traffic_violations_p_list_Citation
partition of traffic_violations_p_list
for values in ('Citation');

create table traffic_violations_p_list_ESERO
partition of traffic_violations_p_list
for values in ('ESERO');

create table traffic_violations_p_list_default
    partition of traffic_violations_p_list DEFAULT;

Again, psql will display the partitions nicely:

postgres=# \d+ traffic_violations_p_list
                                   Partitioned table "public.traffic_violations_p_list"
         Column          |          Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
-------------------------+------------------------+-----------+----------+---------+----------+--------------+-------------
 seqid                   | text                   |           |          |         | extended |              | 
 date_of_stop            | date                   |           |          |         | plain    |              | 
 time_of_stop            | time without time zone |           |          |         | plain    |              | 
 agency                  | text                   |           |          |         | extended |              | 
 subagency               | text                   |           |          |         | extended |              | 
 description             | text                   |           |          |         | extended |              | 
 location                | text                   |           |          |         | extended |              | 
 latitude                | numeric                |           |          |         | main     |              | 
 longitude               | numeric                |           |          |         | main     |              | 
 accident                | text                   |           |          |         | extended |              | 
 belts                   | boolean                |           |          |         | plain    |              | 
 personal_injury         | boolean                |           |          |         | plain    |              | 
 property_damage         | boolean                |           |          |         | plain    |              | 
 fatal                   | boolean                |           |          |         | plain    |              | 
 commercial_license      | boolean                |           |          |         | plain    |              | 
 hazmat                  | boolean                |           |          |         | plain    |              | 
 commercial_vehicle      | boolean                |           |          |         | plain    |              | 
 alcohol                 | boolean                |           |          |         | plain    |              | 
 workzone                | boolean                |           |          |         | plain    |              | 
 state                   | text                   |           |          |         | extended |              | 
 vehicletype             | text                   |           |          |         | extended |              | 
 year                    | smallint               |           |          |         | plain    |              | 
 make                    | text                   |           |          |         | extended |              | 
 model                   | text                   |           |          |         | extended |              | 
 color                   | text                   |           |          |         | extended |              | 
 violation_type          | text                   |           |          |         | extended |              | 
 charge                  | text                   |           |          |         | extended |              | 
 article                 | text                   |           |          |         | extended |              | 
 contributed_to_accident | boolean                |           |          |         | plain    |              | 
 race                    | text                   |           |          |         | extended |              | 
 gender                  | text                   |           |          |         | extended |              | 
 driver_city             | text                   |           |          |         | extended |              | 
 driver_state            | text                   |           |          |         | extended |              | 
 dl_state                | text                   |           |          |         | extended |              | 
 arrest_type             | text                   |           |          |         | extended |              | 
 geolocation             | point                  |           |          |         | plain    |              | 
 council_districts       | smallint               |           |          |         | plain    |              | 
 councils                | smallint               |           |          |         | plain    |              | 
 communities             | smallint               |           |          |         | plain    |              | 
 zip_codes               | smallint               |           |          |         | plain    |              | 
 municipalities          | smallint               |           |          |         | plain    |              | 
Partition key: LIST (violation_type)
Partitions: traffic_violations_p_list_citation FOR VALUES IN ('Citation'),
            traffic_violations_p_list_esero FOR VALUES IN ('ESERO'),
            traffic_violations_p_list_sero FOR VALUES IN ('SERO'),
            traffic_violations_p_list_warning FOR VALUES IN ('Warning'),
            traffic_violations_p_list_default DEFAULT

Loading the data and checking the amount of data for each partition:

postgres=# insert into traffic_violations_p_list
postgres-#        select * from mv_traffic_violations;
INSERT 0 1528078
postgres=# select count(*) from traffic_violations_p_list_warning;
 count  
--------
 759747
(1 row)

postgres=# select count(*) from traffic_violations_p_list_sero;
 count 
-------
   899
(1 row)

postgres=# select count(*) from traffic_violations_p_list_Citation;
 count  
--------
 693226
(1 row)

postgres=# select count(*) from traffic_violations_p_list_ESERO;
 count 
-------
 74206
(1 row)

postgres=# select count(*) from traffic_violations_p_list_default;
 count 
-------
     0
(1 row)

Done. This is list partitioning. Instead of partitioning by a range, list partitioning is for partitioning by key values. In the next post we will have a look at hash partitioning.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure