Yesterday, while being at a customer, an interesting question popped up: What is the maximum of in list values in PostgreSQL? I couldn’t answer although I never read somewhere that there is a limit. The following is for fun only and I am not saying that creating huge in lists is a good idea. Lets go.
The version I tested is PostgreSQL 10:
1 2 3 4 5 6 7 8 | postgres=# select version(), now(); version | now ------------------------------------------------------------------------------------------------------------+------------------------------- PostgreSQL 10.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64- bit | 2018-03-21 18:29:50.269369+01 (1 row) postgres=# create table t1 ( a varchar (10)); CREATE TABLE |
We will use a very simple bash script to build the in list and execute the result in PostgreSQL:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | postgres@pgbox: /home/postgres/ [PG10] cat t.sh #!/bin/bash count=$1 statement= 'select * from t1 where a in (' for (( i=1; i<=$count; i++ )) do if [ "${i}" -lt "${count}" ]; then statement= "${statement} '${i}'," elif [ "${i}" == "${count}" ]; then statement= "${statement} '${i}');" fi done psql -c "${statement}" postgres |
Lets start with 100:
1 2 3 4 5 6 | postgres@pgbox: /home/postgres/ [PG10] . /t .sh 100 a --- (0 rows) Time: 0.983 ms |
1000:
1 2 3 4 5 6 | postgres@pgbox: /home/postgres/ [PG10] . /t .sh 1000 a --- (0 rows) Time: 1.525 ms |
10000:
1 2 3 4 5 6 | postgres@pgbox: /home/postgres/ [PG10] . /t .sh 10000 a --- (0 rows) Time: 11.549 ms |
… takes a much longer (because of the bash script which fully occupies my virtual core):
1 | . /t .sh: line 15: /u01/app/postgres/product/10/db_0/bin/psql : Argument list too long |
So there at least is a limit with psql. Lets try by creating a sql script and execute that instead:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | postgres@pgbox:/home/postgres/ [PG10] cat t.sh #!/bin/bash count =$1 echo 'select * from t1 where a in (' > ttt.sql for (( i=1; i<=$ count ; i++ )) do if [ "${i}" -lt "${count}" ]; then echo "'${i}'," >> ttt.sql elif [ "${i}" == "${count}" ]; then echo "'${i}');" >> ttt.sql fi done psql -f ttt.sql postgres |
This way of stringing together the statement is much more efficient than building the list by concatenating everything into one variable. Does it still work?
1 2 3 4 5 6 | postgres@pgbox:/home/postgres/ [PG10] ./t.sh 100000 a --- (0 rows ) Time : 155.986 ms |
Not a problem, one more:
1 2 3 4 5 6 | postgres@pgbox:/home/postgres/ [PG10] ./t.sh 1000000 a --- (0 rows ) Time : 14211.766 ms (00:14.212) |
Still works. So now we could say: lets stop, who in the world will pass one million values into an in list. On the other hand, lets have fun and double:
1 2 3 4 5 6 | postgres@pgbox:/home/postgres/ [PG10] ./t.sh 2000000 a --- (0 rows ) Time : 3993.091 ms (00:03.993) |
One more:
1 2 3 4 | postgres@pgbox:/home/postgres/ [PG10] ./t.sh 3000000 psql:ttt.sql:3000001: ERROR: out of memory DETAIL: Failed on request of size 524288. Time : 3026.925 ms (00:03.027) |
Ok, now I am hitting some limits, but probably not those of PostgreSQL. I’ll test further when I have more time for that