Infrastructure at your Service

Daniel Westermann

Working with files on the filesystem in PostgreSQL

PostgreSQL comes with various helper functions that support you with working with files on the filesystem on the host PostgreSQL is running on. You might ask yourself why that is important but there are use cases for that. Maybe you want to list the contents of a directory because new files that showed up since the last check do trigger something. Maybe you want to load a file into the database (which you also can (and event should) do using copy if it is text based and somehow well formatted, but that is not the scope of this post).

For listing files in a directory there is this one:

postgres=# select * from pg_ls_dir('.');
      pg_ls_dir       
----------------------
 pg_wal
 global
 pg_commit_ts
 pg_dynshmem
 pg_notify
 pg_serial
 pg_snapshots
 pg_subtrans
 pg_twophase
 pg_multixact
 base
 pg_replslot
 pg_tblspc
 pg_stat
 pg_stat_tmp
 pg_xact
 pg_logical
 PG_VERSION
 postgresql.conf
 postgresql.auto.conf
 pg_hba.conf
 pg_ident.conf
 pg_log
 postmaster.opts
 autoprewarm.blocks
 postmaster.pid
 current_logfiles
(27 rows)

By default the ‘.’ listings are omitted by you can control this:

postgres=# select * from pg_ls_dir('.',true,true);
      pg_ls_dir       
----------------------
 .
 ..
 pg_wal
 global
 pg_commit_ts
 pg_dynshmem
 pg_notify
 pg_serial
 pg_snapshots
 pg_subtrans
 pg_twophase
 pg_multixact
 base
 pg_replslot
 pg_tblspc
 pg_stat
 pg_stat_tmp
 pg_xact
 pg_logical
 PG_VERSION
 postgresql.conf
 postgresql.auto.conf
 pg_hba.conf
 pg_ident.conf
 pg_log
 postmaster.opts
 autoprewarm.blocks
 postmaster.pid
 current_logfiles
(29 rows)

There is no option to control sorting but of course you can add a where clause to do this:

postgres=# select * from pg_ls_dir('.',true,true) order by 1;
      pg_ls_dir       
----------------------
 .
 ..
 autoprewarm.blocks
 base
 current_logfiles
 global
 pg_commit_ts
 pg_dynshmem
 pg_hba.conf
 pg_ident.conf
 pg_log
 pg_logical
 pg_multixact
 pg_notify
 pg_replslot
 pg_serial
 pg_snapshots
 pg_stat
 pg_stat_tmp
 pg_subtrans
 pg_tblspc
 pg_twophase
 PG_VERSION
 pg_wal
 pg_xact
 postgresql.auto.conf
 postgresql.conf
 postmaster.opts
 postmaster.pid
(29 rows)

You could load that into an array and then do whatever you want to do with it for further processing:

postgres=# \x
Expanded display is on.
postgres=# with dirs as (select pg_ls_dir('.'::text,true,true) dir order by 1)
                select array_agg(dir) from dirs;
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
array_agg | {.,..,autoprewarm.blocks,base,current_logfiles,global,pg_commit_ts,pg_dynshmem,pg_hba.conf,pg_ident.conf,pg_log,pg_logical,pg_multixact,pg_notify,pg_replslot,pg_serial,pg_snapshots,pg_stat,pg_stat_tmp,pg_subtrans,pg_tblspc,pg_twophase,PG_VERSION,pg_wal,pg_xact,postgresql.auto.conf,postgresql.conf,postmaster.opts,postmaster.pid}

When you try to list the files of a directory you do not have the permissions to do so of course that fails:

postgres=# select pg_ls_dir('/root');
ERROR:  could not open directory "/root": Permission denied

All other directories the PostgreSQL operating system user has access to can be listed:

postgres=# \x
Expanded display is off.
postgres=# select pg_ls_dir('/var/tmp');
                                pg_ls_dir                                
-------------------------------------------------------------------------
 yum-postgres-uSpYMT
 systemd-private-f706224b798a404a8b1b7efbbb7137c9-chronyd.service-saK1Py
 systemd-private-bcd40d1946c94f1fbcb73d1047ee2fc2-chronyd.service-Fr7WgV
 systemd-private-798725e073664df6bbc5c6041151ef61-chronyd.service-kRvvJa
(4 rows)

When you need to get some statistics about a file there is pg_stat_file:

postgres=# select pg_stat_file('postgresql.conf');
                                     pg_stat_file                                      
---------------------------------------------------------------------------------------
 (26343,"2019-02-21 17:35:22+01","2019-02-05 15:41:11+01","2019-02-05 15:41:11+01",,f)
(1 row)
postgres=# select pg_size_pretty((pg_stat_file('postgresql.conf')).size);
 pg_size_pretty 
----------------
 26 kB
(1 row)

Loading a file into the database is possible as well:

postgres=# create table t1 ( a text );
CREATE TABLE
postgres=# insert into t1 select pg_read_file('postgresql.conf');
INSERT 0 1
postgres=# select * from t1;
                                                        a                                                        
-----------------------------------------------------------------------------------------------------------------
 # -----------------------------                                                                                +
 # PostgreSQL configuration file                                                                                +
 # -----------------------------                                                                                +
 #                                                                                                              +
 # This file consists of lines of the form:                                                                     +
 #                                                                                                              +
 #   name = value                                                                                               +
...

This works even with binary files (but do you really want to have binary files in the database?):

postgres=# create table t2 ( a bytea );
CREATE TABLE
postgres=# insert into t2 select pg_read_binary_file('/bin/cp');
INSERT 0 1
postgres=# select * from t2;
                                                                                                                                                                                                                   
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 \x7f454c4602010100000000000000000002003e0001000000293e4000000000004000000000000000c0560200000000000000000040003800090040001f001e000600000005000000400000000000000040004000000000004000400000000000f801000000000000
(1 row)
postgres=# drop table t1,t2;
DROP TABLE

As usual this is all very well documented in the PostgreSQL documentation.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure