A recent blog post from Franck and a tweet around that topic is the inspiration for this blog post, thanks Jan for requesting :). In short it is about how small you can get the binaries. Is that important? At least when it comes to Docker images it might get important as you usually try make the image as small as possible. Well, comparing PostgreSQL and Oracle in that field is unfair as Oracle comes with many stuff by default which PostgreSQL is just not shipping (e.g. Apex, SQL Developer, …), so please treat this more a as fun post, please.
The way we usually compile PostgreSQL is this (not in /var/tmp in real life):
postgres@pgbox:/home/postgres/ [pg103] cd /var/tmp/ postgres@pgbox:/var/tmp/ [pg103] wget https://ftp.postgresql.org/pub/source/v10.5/postgresql-10.5.tar.bz2 postgres@pgbox:/var/tmp/ [pg103] tar -axf postgresql-10.5.tar.bz2 postgres@pgbox:/var/tmp/ [pg103] cd postgresql-10.5/ postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] PGHOME=/var/tmp/pg105 postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] SEGSIZE=2 postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] BLOCKSIZE=8 postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] WALSEGSIZE=16 postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] ./configure --prefix=${PGHOME} --exec-prefix=${PGHOME} --bindir=${PGHOME}/bin --libdir=${PGHOME}/lib --sysconfdir=${PGHOME}/etc --includedir=${PGHOME}/include --datarootdir=${PGHOME}/share --datadir=${PGHOME}/share --with-pgport=5432 --with-perl --with-python --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --with-segsize=${SEGSIZE} --with-blocksize=${BLOCKSIZE} --with-wal-segsize=${WALSEGSIZE} --with-systemd postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make -j 4 all postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make install postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] cd contrib postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make -j 4 install
When we do this against the PostgreSQL 10.5 source code the result is this (without the documentation, of course, but containing all the extensions ):
postgres@pgbox:/var/tmp/postgresql-10.5/contrib/ [pg103] du -sh /var/tmp/pg105/ 28M /var/tmp/pg105/
Can we get that even smaller? Let’s try to skip the extensions:
postgres@pgbox:/var/tmp/postgresql-10.5/contrib/ [pg103] cd .. postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make clean postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] ./configure --prefix=${PGHOME} --exec-prefix=${PGHOME} --bindir=${PGHOME}/bin --libdir=${PGHOME}/lib --sysconfdir=${PGHOME}/etc --includedir=${PGHOME}/include --datarootdir=${PGHOME}/share --datadir=${PGHOME}/share --with-pgport=5432 --with-perl --with-python --with-openssl --with-pam --with-ldap --with-libxml --with-libxslt --with-segsize=${SEGSIZE} --with-blocksize=${BLOCKSIZE} --with-wal-segsize=${WALSEGSIZE} --with-systemd postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make -j 4 all postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] rm -rf /var/tmp/pg105/ postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make install
What do we have now?
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/ 25M /var/tmp/pg105/
We saved another 3MB. Can we do more? Let’s try to skip all the “–with” flags that enable perl and so on for the configure command:
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make clean postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] ./configure --prefix=${PGHOME} --exec-prefix=${PGHOME} --bindir=${PGHOME}/bin --libdir=${PGHOME}/lib --sysconfdir=${PGHOME}/etc --includedir=${PGHOME}/include --datarootdir=${PGHOME}/share --datadir=${PGHOME}/share --with-pgport=5432 --with-segsize=${SEGSIZE} --with-blocksize=${BLOCKSIZE} --with-wal-segsize=${WALSEGSIZE} --with-systemd postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make -j 4 all postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] rm -rf /var/tmp/pg105/ postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] make install
Do we see a change?
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/ 25M /var/tmp/pg105/
No, that does not change anything. Franck stripped the Oracle binaries and libraries, so lets try to do the same (although I am not sure right now if that is supported):
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/ 25M /var/tmp/pg105/ postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] strip /var/tmp/pg105/bin/* postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] strip /var/tmp/pg105/lib/* strip: Warning: '/var/tmp/pg105/lib/pkgconfig' is not an ordinary file strip: Warning: '/var/tmp/pg105/lib/postgresql' is not an ordinary file postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/ 24M /var/tmp/pg105/
So, another 1MB less. Can we still initialize and start PostgreSQL?
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] /var/tmp/pg105/bin/initdb -D /var/tmp/testpg The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locales COLLATE: en_US.utf8 CTYPE: en_US.utf8 MESSAGES: en_US.utf8 MONETARY: de_CH.UTF-8 NUMERIC: de_CH.UTF-8 TIME: en_US.UTF-8 The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. creating directory /var/tmp/testpg ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg -l logfile start postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ start waiting for server to start....2018-08-17 18:57:50.329 CEST [8528] LOG: listening on IPv6 address "::1", port 5432 2018-08-17 18:57:50.329 CEST [8528] LOG: listening on IPv4 address "127.0.0.1", port 5432 2018-08-17 18:57:50.334 CEST [8528] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2018-08-17 18:57:50.354 CEST [8529] LOG: database system was shut down at 2018-08-17 18:57:31 CEST 2018-08-17 18:57:50.358 CEST [8528] LOG: database system is ready to accept connections done server started
Looks good and we are able to connect:
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] /var/tmp/pg105/bin/psql -c "select version()" postgres version --------------------------------------------------------------------------------------------------------- PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit (1 row) Time: 1.428 ms
What else can we do? When you do not need the utilities on the server you could just remove them (as said, this is a fun post, don’t do this):
postgres@pgbox:/var/tmp/postgresql-10.5/ [pg103] cd /var/tmp/pg105/bin postgres@pgbox:/var/tmp/pg105/bin/ [pg103] rm clusterdb createdb createuser dropdb dropuser pg_archivecleanup pg_basebackup pg_dump pg_dumpall pg_isready pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb
We could probably even remove pgbench and psql but these two I will need to show that the server is still working. What do we have now?
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/ 21M /var/tmp/pg105/
Another 3MB less. Can we still restart and connect?
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ stop waiting for server to shut down....2018-08-17 19:08:49.588 CEST [9144] LOG: received fast shutdown request 2018-08-17 19:08:49.593 CEST [9144] LOG: aborting any active transactions 2018-08-17 19:08:49.597 CEST [9144] LOG: worker process: logical replication launcher (PID 9151) exited with exit code 1 2018-08-17 19:08:49.598 CEST [9146] LOG: shutting down 2018-08-17 19:08:49.625 CEST [9144] LOG: database system is shut down done server stopped postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ start waiting for server to start....2018-08-17 19:08:51.949 CEST [9368] LOG: listening on IPv6 address "::1", port 9999 2018-08-17 19:08:51.949 CEST [9368] LOG: listening on IPv4 address "127.0.0.1", port 9999 2018-08-17 19:08:51.953 CEST [9368] LOG: listening on Unix socket "/tmp/.s.PGSQL.9999" 2018-08-17 19:08:51.966 CEST [9369] LOG: database system was shut down at 2018-08-17 19:08:49 CEST 2018-08-17 19:08:51.969 CEST [9368] LOG: database system is ready to accept connections done server started postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/psql -c "select version()" postgres version --------------------------------------------------------------------------------------------------------- PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit (1 row) Time: 2.043 ms
Looks good. Now lets do the final step and remove the rest which is not required for the server, but before that we do an initdb as we can not do that afterwards:
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ stop waiting for server to shut down....2018-08-17 19:10:31.693 CEST [9368] LOG: received fast shutdown request 2018-08-17 19:10:31.696 CEST [9368] LOG: aborting any active transactions 2018-08-17 19:10:31.696 CEST [9368] LOG: worker process: logical replication launcher (PID 9375) exited with exit code 1 2018-08-17 19:10:31.697 CEST [9370] LOG: shutting down 2018-08-17 19:10:31.712 CEST [9368] LOG: database system is shut down done server stopped postgres@pgbox:/var/tmp/pg105/bin/ [pg103] rm -rf /var/tmp/testpg/ postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/initdb -D /var/tmp/testpg The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locales COLLATE: en_US.utf8 CTYPE: en_US.utf8 MESSAGES: en_US.utf8 MONETARY: de_CH.UTF-8 NUMERIC: de_CH.UTF-8 TIME: en_US.UTF-8 The default database encoding has accordingly been set to "UTF8". The default text search configuration will be set to "english". Data page checksums are disabled. creating directory /var/tmp/testpg ... ok creating subdirectories ... ok selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting dynamic shared memory implementation ... posix creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok WARNING: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg -l logfile start
So, remove the rest:
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] rm pg_config pg_controldata psql pgbench initdb ecpg pgbench pg_ctl postgres@pgbox:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/ 20M /var/tmp/pg105/
We are down to 20MB but we can still start the instance:
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/postgres -D /var/tmp/testpg/ & [1] 9486 postgres@pgbox:/var/tmp/pg105/bin/ [pg103] 2018-08-17 19:13:54.917 CEST [9486] LOG: listening on IPv6 address "::1", port 9999 2018-08-17 19:13:54.917 CEST [9486] LOG: listening on IPv4 address "127.0.0.1", port 9999 2018-08-17 19:13:54.924 CEST [9486] LOG: listening on Unix socket "/tmp/.s.PGSQL.9999" 2018-08-17 19:13:54.955 CEST [9487] LOG: database system was shut down at 2018-08-17 19:10:56 CEST 2018-08-17 19:13:54.960 CEST [9486] LOG: database system is ready to accept connections postgres@pgbox:/var/tmp/pg105/bin/ [pg103] ps -ef | grep postgres root 1061 941 0 18:26 ? 00:00:00 sshd: postgres [priv] postgres 1064 1061 0 18:26 ? 00:00:02 sshd: postgres@pts/0 postgres 1065 1064 0 18:26 pts/0 00:00:01 -bash postgres 9486 1065 0 19:13 pts/0 00:00:00 /var/tmp/pg105/bin/postgres -D /var/tmp/testpg/ postgres 9488 9486 0 19:13 ? 00:00:00 postgres: checkpointer process postgres 9489 9486 0 19:13 ? 00:00:00 postgres: writer process postgres 9490 9486 0 19:13 ? 00:00:00 postgres: wal writer process postgres 9491 9486 0 19:13 ? 00:00:00 postgres: autovacuum launcher process postgres 9492 9486 0 19:13 ? 00:00:00 postgres: stats collector process postgres 9493 9486 0 19:13 ? 00:00:00 postgres: bgworker: logical replication launcher postgres 9496 1065 0 19:14 pts/0 00:00:00 ps -ef postgres 9497 1065 0 19:14 pts/0 00:00:00 grep --color=auto postgres
Using another psql on that box we can confirm that we can connect:
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] /u01/app/postgres/product/10/db_4/bin/psql -c "select version()" postgres version --------------------------------------------------------------------------------------------------------- PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit (1 row)
Still too much? What else can we? What is consuming space:
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/* 6.6M /var/tmp/pg105/bin 5.9M /var/tmp/pg105/include 4.1M /var/tmp/pg105/lib 2.9M /var/tmp/pg105/share
We can not do more in the “bin” directory, nothing left to delete:
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] ls -l /var/tmp/pg105/bin total 6660 -rwxr-xr-x. 1 postgres postgres 6817480 Aug 17 18:56 postgres lrwxrwxrwx. 1 postgres postgres 8 Aug 17 18:54 postmaster -> postgres
Everything else will probably safe us a few bytes such as the sample files:
postgres@pgbox:/var/tmp/pg105/ [pg103] find . -name *sample* ./share/postgresql/tsearch_data/synonym_sample.syn ./share/postgresql/tsearch_data/thesaurus_sample.ths ./share/postgresql/tsearch_data/hunspell_sample.affix ./share/postgresql/tsearch_data/ispell_sample.affix ./share/postgresql/tsearch_data/ispell_sample.dict ./share/postgresql/tsearch_data/hunspell_sample_long.affix ./share/postgresql/tsearch_data/hunspell_sample_long.dict ./share/postgresql/tsearch_data/hunspell_sample_num.affix ./share/postgresql/tsearch_data/hunspell_sample_num.dict ./share/postgresql/pg_hba.conf.sample ./share/postgresql/pg_ident.conf.sample ./share/postgresql/postgresql.conf.sample ./share/postgresql/recovery.conf.sample ./share/postgresql/pg_service.conf.sample ./share/postgresql/psqlrc.sample
So how much space do we consume for the PostgreSQL installation and the files which make up the instance?
postgres@pgbox:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/ 20M /var/tmp/pg105/ postgres@pgbox:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/testpg/ 41M /var/tmp/testpg/
… 61MB. When we add the wal file Jan mentioned in his tweet we come the 77MB. Not much.
The final question is if PostgreSQL is still working. Let’s use pgbench from another installation on the same server against this:
postgres@pgbox:/var/tmp/pg105/ [pg103] /u01/app/postgres/product/10/db_3/bin/pgbench -i -s 10 postgres NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping creating tables... 100000 of 1000000 tuples (10%) done (elapsed 0.08 s, remaining 0.75 s) 200000 of 1000000 tuples (20%) done (elapsed 0.24 s, remaining 0.95 s) 300000 of 1000000 tuples (30%) done (elapsed 0.42 s, remaining 0.98 s) 400000 of 1000000 tuples (40%) done (elapsed 0.49 s, remaining 0.74 s) 500000 of 1000000 tuples (50%) done (elapsed 0.70 s, remaining 0.70 s) 600000 of 1000000 tuples (60%) done (elapsed 0.88 s, remaining 0.58 s) 700000 of 1000000 tuples (70%) done (elapsed 0.95 s, remaining 0.41 s) 800000 of 1000000 tuples (80%) done (elapsed 1.14 s, remaining 0.29 s) 900000 of 1000000 tuples (90%) done (elapsed 1.32 s, remaining 0.15 s) 1000000 of 1000000 tuples (100%) done (elapsed 1.41 s, remaining 0.00 s) vacuum... set primary keys... done. postgres@pgbox:/var/tmp/pg105/ [pg103] /u01/app/postgres/product/10/db_3/bin/pgbench -s 10 postgres scale option ignored, using count from pgbench_branches table (10) starting vacuum...end. transaction type: scaling factor: 10 query mode: simple number of clients: 1 number of threads: 1 number of transactions per client: 10 number of transactions actually processed: 10/10 latency average = 4.436 ms tps = 225.435296 (including connections establishing) tps = 285.860401 (excluding connections establishing)
Looks good. So you can come down to 20MB for the PostgreSQL installation and another 41Mb for the files you need to start the instance. You could even drop the postgres database to save another 7MB. But remember: Please don’t do that, you are still fine with around 30MB 🙂