Infrastructure at your Service

Daniel Westermann

PGDay Amsterdam – follow up 2 – Where do null values go to in a hash partitioned table?

This is the second follow up which covers this question: When you hash partition a table in PostgreSQL 11 where do null values for the partitioned column go to? Lets go…

In the demo I used this little table:

postgres=# select version();
                                                            version                                                          
-----------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11beta1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bi
(1 row)
postgres=# create table part2 ( a int, list varchar(10) ) partition by hash (a);
CREATE TABLE
postgres=# create table part2_1 partition of part2 FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE
postgres=# create table part2_2 partition of part2 FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE
postgres=# create table part2_3 partition of part2 FOR VALUES WITH (MODULUS 3, REMAINDER 2);
CREATE TABLE
postgres=# \d+ part2
                                          Table "public.part2"
 Column |         Type          | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+-----------------------+-----------+----------+---------+----------+--------------+-------------
 a      | integer               |           |          |         | plain    |              | 
 list   | character varying(10) |           |          |         | extended |              | 
Partition key: HASH (a)
Partitions: part2_1 FOR VALUES WITH (modulus 3, remainder 0),
            part2_2 FOR VALUES WITH (modulus 3, remainder 1),
            part2_3 FOR VALUES WITH (modulus 3, remainder 2)

The data we played with was this:

postgres=# insert into part2 (a,list) values (1,'beer');
INSERT 0 1
postgres=# insert into part2 (a,list) values (2,'whine');
INSERT 0 1
postgres=# insert into part2 (a,list) values (3,'schnaps');
INSERT 0 1
postgres=# select * from only part2_1;
 a | list  
---+-------
 2 | whine
(1 row)

postgres=# select * from only part2_2;
 a |  list   
---+---------
 3 | schnaps
(1 row)

postgres=# select * from only part2_3;
 a | list 
---+------
 1 | beer
(1 row)

We have the data evenly distributed over the three partitions. When we insert a row which contains a NULL value for the column we partitioned on:

postgres=# insert into part2 (a,list) values (null,'cocktail');
INSERT 0 1

… where does that column go to?

postgres=# select * from only part2_1;
 a |   list   
---+----------
 2 | whine
   | cocktail
(2 rows)

postgres=# select * from only part2_2;
 a |  list   
---+---------
 3 | schnaps
(1 row)

postgres=# select * from only part2_3;
 a | list 
---+------
 1 | beer
(1 row)

It goes to the first partition and every following NULL key row gets there as well:

postgres=# insert into part2 (a,list) values (null,'rum');
INSERT 0 1
postgres=# select * from only part2_1;
 a |   list   
---+----------
 2 | whine
   | cocktail
   | rum
(3 rows)

I couldn’t find anything in the documentation about that so I did send a mail to the general mailing list and here is the answer: “The calculated hash value for the null value will be zero, therefore, it will fall to the partition having remainder zero.”

Leave a Reply

Daniel Westermann
Daniel Westermann

Senior Consultant and Technology Leader Open Infrastructure