While giving the last PostgreSQL DBA Essentials workshop this question came up: When we create a table or an index: are the statistics available automatically? To be more precise: When we create and load a table in one step, create an index on that table afterwards: Do we have the statistics available by default or do we need to wait for autovacuum to kick in or analyze manually? Lets see …

First of all lets disable autovacuum so it does not kick off analyze in the background:

postgres=# ! ps -ef | grep autov | grep -v grep
postgres  1641  1635  0 07:08 ?        00:00:00 postgres: MY_CLUSTER: autovacuum launcher process   
postgres=# alter system set autovacuum=off;
ALTER SYSTEM
postgres=# select * from pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)

postgres=# ! ps -ef | grep autov | grep -v grep

Create and populate the table:

postgres=# ! cat a.sql
drop table if exists t;
create table t
as select a.*, md5(a::varchar) from generate_series(1,5000000) a;
postgres=# i a.sql
psql:a.sql:1: NOTICE:  table "t" does not exist, skipping
DROP TABLE
SELECT 5000000

Create an index:

postgres=# create index i1 on t(a);
CREATE INDEX
postgres=# d+ t
                                     Table "public.t"
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description 
--------+---------+-----------+----------+---------+----------+--------------+-------------
 a      | integer |           |          |         | plain    |              | 
 md5    | text    |           |          |         | extended |              | 
Indexes:
    "i1" btree (a)

Do we have statistics already? Lets check:

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

No, at least not for the table. What about the index?

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i1'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

No. Lets analyze:

postgres=# analyze t;
ANALYZE
postgres=# analyze i1;
WARNING:  skipping "i1" --- cannot analyze non-tables or special system tables
ANALYZE

Apparently we can not analyze an index. What do we see now?

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
           0 |        4 |          -1
           0 |       33 |          -1
(2 rows)

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i1'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

We do see statistics for the table but not for the index. The reason is that “analyze” works on the tables, but not on the indexes. For regular indexes there will be nothing in pg_statistic because that information would be redundant with the underlying table columns. But there will be statistics for function based indexes:

postgres=# create index i2 on t(lower(a::text));
CREATE INDEX
postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i2'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

postgres=# analyze t;
ANALYZE
postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 'i2'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
           0 |       10 |          -1
(1 row)

So, when autovacuum is off we do not get statistics when we do not kick off a manual analyze (which is not a surprise). What happens when autovacuum is on?

postgres=# alter system set autovacuum=on;
ALTER SYSTEM
postgres=# select * from pg_reload_conf();
 pg_reload_conf 
----------------
 t
(1 row)
postgres=# i a.sql
DROP TABLE
SELECT 5000000
postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
(0 rows)

Nope, same picture here. But some seconds later:

postgres=# select stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 't'::regclass;
 stanullfrac | stawidth | stadistinct 
-------------+----------+-------------
           0 |        4 |          -1
           0 |       33 |          -1
(2 rows)

… statistics are there. Conclusion: When you require current statistics directly after loading a table you’d better kick of a manual analyze right after. Otherwise autovacuum will take care about that, but not immediately.