Infrastructure at your Service

Daniel Westermann

The size of Oracle Home: from 9GB to 600MB – What about PostgreSQL?

A recent blog post from Franck and a tweet around that topic is the inspiration for this blog post, thanks Jan for requesting :). In short it is about how small you can get the binaries. Is that important? At least when it comes to Docker images it might get important as you usually try make the image as small as possible. Well, comparing PostgreSQL and Oracle in that field is unfair as Oracle comes with many stuff by default which PostgreSQL is just not shipping (e.g. Apex, SQL Developer, …), so please treat this more a as fun post, please.

The way we usually compile PostgreSQL is this (not in /var/tmp in real life):

[email protected]:/home/postgres/ [pg103] cd /var/tmp/
[email protected]:/var/tmp/ [pg103] wget https://ftp.postgresql.org/pub/source/v10.5/postgresql-10.5.tar.bz2
[email protected]:/var/tmp/ [pg103] tar -axf postgresql-10.5.tar.bz2
[email protected]:/var/tmp/ [pg103] cd postgresql-10.5/
[email protected]:/var/tmp/postgresql-10.5/ [pg103] PGHOME=/var/tmp/pg105
[email protected]:/var/tmp/postgresql-10.5/ [pg103] SEGSIZE=2
[email protected]:/var/tmp/postgresql-10.5/ [pg103] BLOCKSIZE=8
[email protected]:/var/tmp/postgresql-10.5/ [pg103] WALSEGSIZE=16
[email protected]:/var/tmp/postgresql-10.5/ [pg103] ./configure --prefix=${PGHOME} \
                                                             --exec-prefix=${PGHOME} \
                                                             --bindir=${PGHOME}/bin \
                                                             --libdir=${PGHOME}/lib \
                                                             --sysconfdir=${PGHOME}/etc \
                                                             --includedir=${PGHOME}/include \
                                                             --datarootdir=${PGHOME}/share \
                                                             --datadir=${PGHOME}/share \
                                                             --with-pgport=5432 \
                                                             --with-perl \
                                                             --with-python \
                                                             --with-openssl \
                                                             --with-pam \
                                                             --with-ldap \
                                                             --with-libxml \
                                                             --with-libxslt \
                                                             --with-segsize=${SEGSIZE} \
                                                             --with-blocksize=${BLOCKSIZE} \
                                                             --with-wal-segsize=${WALSEGSIZE}  \
                                                             --with-systemd
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make -j 4 all
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make install
[email protected]:/var/tmp/postgresql-10.5/ [pg103] cd contrib
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make -j 4 install

When we do this against the PostgreSQL 10.5 source code the result is this (without the documentation, of course, but containing all the extensions ):

[email protected]:/var/tmp/postgresql-10.5/contrib/ [pg103] du -sh /var/tmp/pg105/
28M	/var/tmp/pg105/

Can we get that even smaller? Let’s try to skip the extensions:

[email protected]:/var/tmp/postgresql-10.5/contrib/ [pg103] cd ..
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make clean
[email protected]:/var/tmp/postgresql-10.5/ [pg103] ./configure --prefix=${PGHOME} \
                                                             --exec-prefix=${PGHOME} \
                                                             --bindir=${PGHOME}/bin \
                                                             --libdir=${PGHOME}/lib \
                                                             --sysconfdir=${PGHOME}/etc \
                                                             --includedir=${PGHOME}/include \
                                                             --datarootdir=${PGHOME}/share \
                                                             --datadir=${PGHOME}/share \
                                                             --with-pgport=5432 \
                                                             --with-perl \
                                                             --with-python \
                                                             --with-openssl \
                                                             --with-pam \
                                                             --with-ldap \
                                                             --with-libxml \
                                                             --with-libxslt \
                                                             --with-segsize=${SEGSIZE} \
                                                             --with-blocksize=${BLOCKSIZE} \
                                                             --with-wal-segsize=${WALSEGSIZE}  \
                                                             --with-systemd
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make -j 4 all
[email protected]:/var/tmp/postgresql-10.5/ [pg103] rm -rf /var/tmp/pg105/
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make install

What do we have now?

[email protected]:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/
25M	/var/tmp/pg105/

We saved another 3MB. Can we do more? Let’s try to skip all the “–with” flags that enable perl and so on for the configure command:

[email protected]:/var/tmp/postgresql-10.5/ [pg103] make clean
[email protected]:/var/tmp/postgresql-10.5/ [pg103] ./configure --prefix=${PGHOME} \
                                                             --exec-prefix=${PGHOME} \
                                                             --bindir=${PGHOME}/bin \
                                                             --libdir=${PGHOME}/lib \
                                                             --sysconfdir=${PGHOME}/etc \
                                                             --includedir=${PGHOME}/include \
                                                             --datarootdir=${PGHOME}/share \
                                                             --datadir=${PGHOME}/share \
                                                             --with-pgport=5432 \
                                                             --with-segsize=${SEGSIZE} \
                                                             --with-blocksize=${BLOCKSIZE} \
                                                             --with-wal-segsize=${WALSEGSIZE}  \
                                                             --with-systemd
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make -j 4 all
[email protected]:/var/tmp/postgresql-10.5/ [pg103] rm -rf /var/tmp/pg105/
[email protected]:/var/tmp/postgresql-10.5/ [pg103] make install

Do we see a change?

[email protected]:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/
25M	/var/tmp/pg105/

No, that does not change anything. Franck stripped the Oracle binaries and libraries, so lets try to do the same (although I am not sure right now if that is supported):

[email protected]:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/
25M	/var/tmp/pg105/
[email protected]:/var/tmp/postgresql-10.5/ [pg103] strip /var/tmp/pg105/bin/*
[email protected]:/var/tmp/postgresql-10.5/ [pg103] strip /var/tmp/pg105/lib/*
strip: Warning: '/var/tmp/pg105/lib/pkgconfig' is not an ordinary file
strip: Warning: '/var/tmp/pg105/lib/postgresql' is not an ordinary file
[email protected]:/var/tmp/postgresql-10.5/ [pg103] du -sh /var/tmp/pg105/
24M	/var/tmp/pg105/

So, another 1MB less. Can we still initialize and start PostgreSQL?

[email protected]:/var/tmp/postgresql-10.5/ [pg103] /var/tmp/pg105/bin/initdb -D /var/tmp/testpg
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  en_US.utf8
  CTYPE:    en_US.utf8
  MESSAGES: en_US.utf8
  MONETARY: de_CH.UTF-8
  NUMERIC:  de_CH.UTF-8
  TIME:     en_US.UTF-8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/tmp/testpg ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg -l logfile start

[email protected]:/var/tmp/postgresql-10.5/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ start
waiting for server to start....2018-08-17 18:57:50.329 CEST [8528] LOG:  listening on IPv6 address "::1", port 5432
2018-08-17 18:57:50.329 CEST [8528] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2018-08-17 18:57:50.334 CEST [8528] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2018-08-17 18:57:50.354 CEST [8529] LOG:  database system was shut down at 2018-08-17 18:57:31 CEST
2018-08-17 18:57:50.358 CEST [8528] LOG:  database system is ready to accept connections
 done
server started

Looks good and we are able to connect:

[email protected]:/var/tmp/postgresql-10.5/ [pg103] /var/tmp/pg105/bin/psql -c "select version()" postgres
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

Time: 1.428 ms

What else can we do? When you do not need the utilities on the server you could just remove them (as said, this is a fun post, don’t do this):

[email protected]:/var/tmp/postgresql-10.5/ [pg103] cd /var/tmp/pg105/bin
[email protected]:/var/tmp/pg105/bin/ [pg103] rm clusterdb createdb createuser dropdb dropuser pg_archivecleanup pg_basebackup pg_dump pg_dumpall pg_isready pg_receivewal pg_recvlogical pg_resetwal pg_restore pg_rewind pg_test_fsync pg_test_timing pg_upgrade pg_waldump reindexdb vacuumdb

We could probably even remove pgbench and psql but these two I will need to show that the server is still working. What do we have now?

[email protected]:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/
21M	/var/tmp/pg105/

Another 3MB less. Can we still restart and connect?

[email protected]:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ stop
waiting for server to shut down....2018-08-17 19:08:49.588 CEST [9144] LOG:  received fast shutdown request
2018-08-17 19:08:49.593 CEST [9144] LOG:  aborting any active transactions
2018-08-17 19:08:49.597 CEST [9144] LOG:  worker process: logical replication launcher (PID 9151) exited with exit code 1
2018-08-17 19:08:49.598 CEST [9146] LOG:  shutting down
2018-08-17 19:08:49.625 CEST [9144] LOG:  database system is shut down
 done
server stopped
[email protected]:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ start
waiting for server to start....2018-08-17 19:08:51.949 CEST [9368] LOG:  listening on IPv6 address "::1", port 9999
2018-08-17 19:08:51.949 CEST [9368] LOG:  listening on IPv4 address "127.0.0.1", port 9999
2018-08-17 19:08:51.953 CEST [9368] LOG:  listening on Unix socket "/tmp/.s.PGSQL.9999"
2018-08-17 19:08:51.966 CEST [9369] LOG:  database system was shut down at 2018-08-17 19:08:49 CEST
2018-08-17 19:08:51.969 CEST [9368] LOG:  database system is ready to accept connections
 done
server started
[email protected]:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/psql -c "select version()" postgres
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

Time: 2.043 ms

Looks good. Now lets do the final step and remove the rest which is not required for the server, but before that we do an initdb as we can not do that afterwards:

[email protected]:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg/ stop
waiting for server to shut down....2018-08-17 19:10:31.693 CEST [9368] LOG:  received fast shutdown request
2018-08-17 19:10:31.696 CEST [9368] LOG:  aborting any active transactions
2018-08-17 19:10:31.696 CEST [9368] LOG:  worker process: logical replication launcher (PID 9375) exited with exit code 1
2018-08-17 19:10:31.697 CEST [9370] LOG:  shutting down
2018-08-17 19:10:31.712 CEST [9368] LOG:  database system is shut down
 done
server stopped
[email protected]:/var/tmp/pg105/bin/ [pg103] rm -rf /var/tmp/testpg/
[email protected]:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/initdb -D /var/tmp/testpg
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locales
  COLLATE:  en_US.utf8
  CTYPE:    en_US.utf8
  MESSAGES: en_US.utf8
  MONETARY: de_CH.UTF-8
  NUMERIC:  de_CH.UTF-8
  TIME:     en_US.UTF-8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory /var/tmp/testpg ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    /var/tmp/pg105/bin/pg_ctl -D /var/tmp/testpg -l logfile start

So, remove the rest:

[email protected]:/var/tmp/pg105/bin/ [pg103] rm pg_config pg_controldata psql pgbench initdb ecpg pgbench pg_ctl
[email protected]:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/
20M	/var/tmp/pg105/

We are down to 20MB but we can still start the instance:

[email protected]:/var/tmp/pg105/bin/ [pg103] /var/tmp/pg105/bin/postgres -D /var/tmp/testpg/ &
[1] 9486
[email protected]:/var/tmp/pg105/bin/ [pg103] 2018-08-17 19:13:54.917 CEST [9486] LOG:  listening on IPv6 address "::1", port 9999
2018-08-17 19:13:54.917 CEST [9486] LOG:  listening on IPv4 address "127.0.0.1", port 9999
2018-08-17 19:13:54.924 CEST [9486] LOG:  listening on Unix socket "/tmp/.s.PGSQL.9999"
2018-08-17 19:13:54.955 CEST [9487] LOG:  database system was shut down at 2018-08-17 19:10:56 CEST
2018-08-17 19:13:54.960 CEST [9486] LOG:  database system is ready to accept connections

[email protected]:/var/tmp/pg105/bin/ [pg103] ps -ef | grep postgres
root      1061   941  0 18:26 ?        00:00:00 sshd: postgres [priv]
postgres  1064  1061  0 18:26 ?        00:00:02 sshd: [email protected]/0
postgres  1065  1064  0 18:26 pts/0    00:00:01 -bash
postgres  9486  1065  0 19:13 pts/0    00:00:00 /var/tmp/pg105/bin/postgres -D /var/tmp/testpg/
postgres  9488  9486  0 19:13 ?        00:00:00 postgres: checkpointer process   
postgres  9489  9486  0 19:13 ?        00:00:00 postgres: writer process   
postgres  9490  9486  0 19:13 ?        00:00:00 postgres: wal writer process   
postgres  9491  9486  0 19:13 ?        00:00:00 postgres: autovacuum launcher process   
postgres  9492  9486  0 19:13 ?        00:00:00 postgres: stats collector process   
postgres  9493  9486  0 19:13 ?        00:00:00 postgres: bgworker: logical replication launcher  
postgres  9496  1065  0 19:14 pts/0    00:00:00 ps -ef
postgres  9497  1065  0 19:14 pts/0    00:00:00 grep --color=auto postgres

Using another psql on that box we can confirm that we can connect:

[email protected]:/var/tmp/pg105/bin/ [pg103] /u01/app/postgres/product/10/db_4/bin/psql -c "select version()" postgres
                                                 version                                                 
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit
(1 row)

Still too much? What else can we? What is consuming space:

[email protected]:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/*
6.6M	/var/tmp/pg105/bin
5.9M	/var/tmp/pg105/include
4.1M	/var/tmp/pg105/lib
2.9M	/var/tmp/pg105/share

We can not do more in the “bin” directory, nothing left to delete:

[email protected]:/var/tmp/pg105/bin/ [pg103] ls -l /var/tmp/pg105/bin
total 6660
-rwxr-xr-x. 1 postgres postgres 6817480 Aug 17 18:56 postgres
lrwxrwxrwx. 1 postgres postgres       8 Aug 17 18:54 postmaster -> postgres

Everything else will probably safe us a few bytes such as the sample files:

[email protected]:/var/tmp/pg105/ [pg103] find . -name *sample*
./share/postgresql/tsearch_data/synonym_sample.syn
./share/postgresql/tsearch_data/thesaurus_sample.ths
./share/postgresql/tsearch_data/hunspell_sample.affix
./share/postgresql/tsearch_data/ispell_sample.affix
./share/postgresql/tsearch_data/ispell_sample.dict
./share/postgresql/tsearch_data/hunspell_sample_long.affix
./share/postgresql/tsearch_data/hunspell_sample_long.dict
./share/postgresql/tsearch_data/hunspell_sample_num.affix
./share/postgresql/tsearch_data/hunspell_sample_num.dict
./share/postgresql/pg_hba.conf.sample
./share/postgresql/pg_ident.conf.sample
./share/postgresql/postgresql.conf.sample
./share/postgresql/recovery.conf.sample
./share/postgresql/pg_service.conf.sample
./share/postgresql/psqlrc.sample

So how much space do we consume for the PostgreSQL installation and the files which make up the instance?

[email protected]:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/pg105/
20M	/var/tmp/pg105/
[email protected]:/var/tmp/pg105/bin/ [pg103] du -sh /var/tmp/testpg/
41M	/var/tmp/testpg/

… 61MB. When we add the wal file Jan mentioned in his tweet we come the 77MB. Not much.

The final question is if PostgreSQL is still working. Let’s use pgbench from another installation on the same server against this:

[email protected]:/var/tmp/pg105/ [pg103] /u01/app/postgres/product/10/db_3/bin/pgbench -i -s 10 postgres
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 1000000 tuples (10%) done (elapsed 0.08 s, remaining 0.75 s)
200000 of 1000000 tuples (20%) done (elapsed 0.24 s, remaining 0.95 s)
300000 of 1000000 tuples (30%) done (elapsed 0.42 s, remaining 0.98 s)
400000 of 1000000 tuples (40%) done (elapsed 0.49 s, remaining 0.74 s)
500000 of 1000000 tuples (50%) done (elapsed 0.70 s, remaining 0.70 s)
600000 of 1000000 tuples (60%) done (elapsed 0.88 s, remaining 0.58 s)
700000 of 1000000 tuples (70%) done (elapsed 0.95 s, remaining 0.41 s)
800000 of 1000000 tuples (80%) done (elapsed 1.14 s, remaining 0.29 s)
900000 of 1000000 tuples (90%) done (elapsed 1.32 s, remaining 0.15 s)
1000000 of 1000000 tuples (100%) done (elapsed 1.41 s, remaining 0.00 s)
vacuum...
set primary keys...
done.
[email protected]:/var/tmp/pg105/ [pg103] /u01/app/postgres/product/10/db_3/bin/pgbench -s 10 postgres
scale option ignored, using count from pgbench_branches table (10)
starting vacuum...end.
transaction type: 
scaling factor: 10
query mode: simple
number of clients: 1
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
latency average = 4.436 ms
tps = 225.435296 (including connections establishing)
tps = 285.860401 (excluding connections establishing)

Looks good. So you can come down to 20MB for the PostgreSQL installation and another 41Mb for the files you need to start the instance. You could even drop the postgres database to save another 7MB. But remember: Please don’t do that, you are still fine with around 30MB 🙂

2 Comments

  • i'm not a robot says:

    Debian lib required:

    sudo apt-get install libreadline-dev
    sudo apt-get install libpam0g-dev
    sudo apt-get install libxml2-dev
    sudo apt-get install libxslt1-dev
    sudo apt-get install libldap2-dev
    sudo apt-get install tcl-dev
    sudo apt-get install libperl-dev
    sudo apt-get install python-dev

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure