Infrastructure at your Service

Creating a standard B-tree index in PostgreSQL is nothing special and usually we do it without thinking about what happens in the background. For the standard data types like e.g. int or text we do not need to know anything about operators or operator classes. In those cases the standard operator class for the data type is applied and all is fine. But sometimes we need to know about the possibilities PostgreSQL comes with, and this is the topic of this post.

As usual, let’s start with a simple setup:

postgres=# create table t1 ( a int, b text );
CREATE TABLE
postgres=# insert into t1 select i, md5(i::text) from generate_series(1,1000000) i;
INSERT 0 1000000
postgres=# select * from t1 limit 5;
 a |                b                 
---+----------------------------------
 1 | c4ca4238a0b923820dcc509a6f75849b
 2 | c81e728d9d4c2f636f067f89cc14862c
 3 | eccbc87e4b5ce2fe28308fd9f2a7baf3
 4 | a87ff679a2f3e71d9181a67b7542122c
 5 | e4da3b7fbbce2345d7772b0674a318d5
(5 rows)

We now have a simple table with two columns and can create standard B-tree indexes like this:

postgres=# create index i1 on t1(a);
CREATE INDEX
postgres=# create index i2 on t1(b);
CREATE INDEX
postgres=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 a      | integer |           |          | 
 b      | text    |           |          | 
Indexes:
    "i1" btree (a)
    "i2" btree (b)

These indexes work perfectly fine, when you ask for rows like e.g. this:

postgres=# explain select * from t1 where a = 5;
                          QUERY PLAN                          
--------------------------------------------------------------
 Index Scan using i1 on t1  (cost=0.42..8.44 rows=1 width=37)
   Index Cond: (a = 5)
(2 rows)

postgres=# explain select * from t1 where b = 'e4da3b7fbbce2345d7772b0674a318d5';
                          QUERY PLAN                          
--------------------------------------------------------------
 Index Scan using i2 on t1  (cost=0.42..8.44 rows=1 width=37)
   Index Cond: (b = 'e4da3b7fbbce2345d7772b0674a318d5'::text)
(2 rows)

But how can we know which operators are supported by a specific index and data type? This can easily be answered by asking the PostgreSQL catalog:

postgres=# select am.amname AS index_method
                , opc.opcname AS opclass_name
                , opc.opcintype::regtype AS indexed_type
                , opc.opcdefault AS is_default
             from pg_am am
                , pg_opclass opc
            where opc.opcmethod = am.oid 
              and am.amname = 'btree'
         order by index_method, opclass_name;
 index_method |    opclass_name     |        indexed_type         | is_default 
--------------+---------------------+-----------------------------+------------
 btree        | array_ops           | anyarray                    | t
 btree        | bit_ops             | bit                         | t
 btree        | bool_ops            | boolean                     | t
 btree        | bpchar_ops          | character                   | t
 btree        | bpchar_pattern_ops  | character                   | f
 btree        | bytea_ops           | bytea                       | t
 btree        | char_ops            | "char"                      | t
 btree        | cidr_ops            | inet                        | f
 btree        | date_ops            | date                        | t
 btree        | enum_ops            | anyenum                     | t
 btree        | float4_ops          | real                        | t
 btree        | float8_ops          | double precision            | t
 btree        | inet_ops            | inet                        | t
 btree        | int2_ops            | smallint                    | t
 btree        | int4_ops            | integer                     | t
 btree        | int8_ops            | bigint                      | t
 btree        | interval_ops        | interval                    | t
 btree        | jsonb_ops           | jsonb                       | t
 btree        | macaddr8_ops        | macaddr8                    | t
 btree        | macaddr_ops         | macaddr                     | t
 btree        | money_ops           | money                       | t
 btree        | name_ops            | name                        | t
 btree        | numeric_ops         | numeric                     | t
 btree        | oid_ops             | oid                         | t
 btree        | oidvector_ops       | oidvector                   | t
 btree        | pg_lsn_ops          | pg_lsn                      | t
 btree        | range_ops           | anyrange                    | t
 btree        | record_image_ops    | record                      | f
 btree        | record_ops          | record                      | t
 btree        | text_ops            | text                        | t
 btree        | text_pattern_ops    | text                        | f
 btree        | tid_ops             | tid                         | t
 btree        | time_ops            | time without time zone      | t
 btree        | timestamp_ops       | timestamp without time zone | t
 btree        | timestamptz_ops     | timestamp with time zone    | t
 btree        | timetz_ops          | time with time zone         | t
 btree        | tsquery_ops         | tsquery                     | t
 btree        | tsvector_ops        | tsvector                    | t
 btree        | uuid_ops            | uuid                        | t
 btree        | varbit_ops          | bit varying                 | t
 btree        | varchar_ops         | text                        | f
 btree        | varchar_pattern_ops | text                        | f
 btree        | xid8_ops            | xid8                        | t
(43 rows)

A B-tree index, as you can see above, supports all the data types listed in the “indexed_type” column. If you want to know which operators are supported by a operator family you can also ask the PostgreSQL catalog, e.g. for integer and text:

postgres=# select am.amname as index_method
                , opf.opfname as opfamily_name
                , amop.amopopr::regoperator as opfamily_operator
           from pg_am am
              , pg_opfamily opf
              , pg_amop amop
          where opf.opfmethod = am.oid 
            and amop.amopfamily = opf.oid 
            and am.amname = 'btree'
            and opf.opfname in ('integer_ops','text_ops')
          order by 1,2,3;
 index_method | opfamily_name |   opfamily_operator   
--------------+---------------+-----------------------
 btree        | integer_ops   | =(integer,bigint)
 btree        | integer_ops   | (integer,bigint)
 btree        | integer_ops   | =(integer,bigint)
 btree        | integer_ops   | =(smallint,smallint)
 btree        | integer_ops   | <(smallint,smallint)
 btree        | integer_ops   | =(integer,integer)
 btree        | integer_ops   | <(integer,integer)
 btree        | integer_ops   | =(bigint,bigint)
 btree        | integer_ops   | (bigint,bigint)
 btree        | integer_ops   | =(bigint,bigint)
 btree        | integer_ops   | =(bigint,integer)
 btree        | integer_ops   | (bigint,integer)
 btree        | integer_ops   | =(bigint,integer)
 btree        | integer_ops   | >(smallint,smallint)
 btree        | integer_ops   | >(integer,integer)
 btree        | integer_ops   | <=(smallint,smallint)
 btree        | integer_ops   | =(smallint,smallint)
 btree        | integer_ops   | >=(integer,integer)
 btree        | integer_ops   | =(smallint,integer)
 btree        | integer_ops   | =(integer,smallint)
 btree        | integer_ops   | <(smallint,integer)
 btree        | integer_ops   | (smallint,integer)
 btree        | integer_ops   | >(integer,smallint)
 btree        | integer_ops   | <=(smallint,integer)
 btree        | integer_ops   | =(smallint,integer)
 btree        | integer_ops   | >=(integer,smallint)
 btree        | integer_ops   | =(smallint,bigint)
 btree        | integer_ops   | (smallint,bigint)
 btree        | integer_ops   | =(smallint,bigint)
 btree        | integer_ops   | =(bigint,smallint)
 btree        | integer_ops   | (bigint,smallint)
 btree        | integer_ops   | =(bigint,smallint)
 btree        | text_ops      | =(name,name)
 btree        | text_ops      | =(text,text)
 btree        | text_ops      | =(name,text)
 btree        | text_ops      | <(name,text)
 btree        | text_ops      | =(name,text)
 btree        | text_ops      | >(name,text)
 btree        | text_ops      | =(text,name)
 btree        | text_ops      | <(text,name)
 btree        | text_ops      | =(text,name)
 btree        | text_ops      | >(text,name)
 btree        | text_ops      | <(name,name)
 btree        | text_ops      | (name,name)
 btree        | text_ops      | >=(name,name)
 btree        | text_ops      | <(text,text)
 btree        | text_ops      | (text,text)
 btree        | text_ops      | >=(text,text)
(65 rows)

Using a B-tree index for these data types the index will support “>”, “=”, “<=" and "=". This can be annoying if you want to query a text column like this, as the index cannot support this:

postgres=# explain select * from t1 where b like 'e4da3b7fbbce2345d77%';
                              QUERY PLAN                               
-----------------------------------------------------------------------
 Gather  (cost=1000.00..14552.33 rows=100 width=37)
   Workers Planned: 2
   ->  Parallel Seq Scan on t1  (cost=0.00..13542.33 rows=42 width=37)
         Filter: (b ~~ 'e4da3b7fbbce2345d77%'::text)
(4 rows)

For “text” PostgreSQL comes with another build-in operator class you can use exactly for this purpose:

postgres=# select am.amname AS index_method
                , opc.opcname AS opclass_name
                , opc.opcintype::regtype AS indexed_type
                , opc.opcdefault AS is_default
             from pg_am am
                , pg_opclass opc
            where opc.opcmethod = am.oid 
              and am.amname = 'btree' 
              and opc.opcintype::regtype::text = 'text'
         order by index_method, opclass_name;
 index_method |    opclass_name     | indexed_type | is_default 
--------------+---------------------+--------------+------------
 btree        | text_ops            | text         | t
 btree        | text_pattern_ops    | text         | f
 btree        | varchar_ops         | text         | f
 btree        | varchar_pattern_ops | text         | f
(4 rows)

The default for “text ” is “text_ops”, but there is also “text_pattern_ops”, which you can use like this:

postgres=# select am.amname as index_method
                , opf.opfname as opfamily_name
                , amop.amopopr::regoperator as opfamily_operator
             from pg_am am
                , pg_opfamily opf
                , pg_amop amop
            where opf.opfmethod = am.oid 
              and amop.amopfamily = opf.oid 
              and am.amname = 'btree'
              and opf.opfname = 'text_pattern_ops'
        order by 1,2,3;
 index_method |  opfamily_name   | opfamily_operator 
--------------+------------------+-------------------
 btree        | text_pattern_ops | =(text,text)
 btree        | text_pattern_ops | ~<~(text,text)
 btree        | text_pattern_ops | ~=~(text,text)
 btree        | text_pattern_ops | ~>~(text,text)
(5 rows)

postgres=# create index i3 on t1(b text_pattern_ops);
CREATE INDEX
postgres=# explain select * from t1 where b like 'e4da3b7fbbce2345d77%';
                                          QUERY PLAN                                          
----------------------------------------------------------------------------------------------
 Index Scan using i3 on t1  (cost=0.42..8.45 rows=100 width=37)
   Index Cond: ((b ~>=~ 'e4da3b7fbbce2345d77'::text) AND (b ~<~ 'e4da3b7fbbce2345d78'::text))
   Filter: (b ~~ 'e4da3b7fbbce2345d77%'::text)
(3 rows)

Here is more detailed description in the documentation. Finally, if you know what you need, you can create your own operators, operator families and operator classes:

postgres=# \h create operator
Command:     CREATE OPERATOR
Description: define a new operator
Syntax:
CREATE OPERATOR name (
    {FUNCTION|PROCEDURE} = function_name
    [, LEFTARG = left_type ] [, RIGHTARG = right_type ]
    [, COMMUTATOR = com_op ] [, NEGATOR = neg_op ]
    [, RESTRICT = res_proc ] [, JOIN = join_proc ]
    [, HASHES ] [, MERGES ]
)

URL: https://www.postgresql.org/docs/13/sql-createoperator.html

postgres=# \h create operator family
Command:     CREATE OPERATOR FAMILY
Description: define a new operator family
Syntax:
CREATE OPERATOR FAMILY name USING index_method

URL: https://www.postgresql.org/docs/13/sql-createopfamily.html

postgres=# \h create operator class
Command:     CREATE OPERATOR CLASS
Description: define a new operator class
Syntax:
CREATE OPERATOR CLASS name [ DEFAULT ] FOR TYPE data_type
  USING index_method [ FAMILY family_name ] AS
  {  OPERATOR strategy_number operator_name [ ( op_type, op_type ) ] [ FOR SEARCH | FOR ORDER BY sort_family_name ]
   | FUNCTION support_number [ ( op_type [ , op_type ] ) ] function_name ( argument_type [, ...] )
   | STORAGE storage_type
  } [, ... ]

URL: https://www.postgresql.org/docs/13/sql-createopclass.html

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Daniel Westermann
Daniel Westermann

Principal Consultant & Technology Leader Open Infrastructure