By Franck Pachot

Various methods of table compression have been introduced over the releases. Some require a specific storage system, some require specific options, and some are only for static data. And it’s not always very clear, for the simple reason that their names have changed.

The names changed for technical reasons (the ROW/COLUMN STORE wording was added when columnar compression was introduced) or for marketing reasons (COMPRESS FOR OLTP gave the idea that the other – Exadata – compression levels may not be suited for OLTP).

Of course that brings a lot of ambiguity such as:

  • HCC is called ‘COLUMN STORE’ even if it has nothing to do with the In-Memory column store
  • COMPRESS ADVANCED is only one part of Advanced Compression Option
  • EHCC (Exadata Hybrid Columnar Compression) is not only for Exadata
  • COMPRESS FOR OLTP is no longer called that, but it is still the only compression suitable for OLTP
  • HCC Row-Level Locking is not for ROW STORE but for COLUMN STORE. It’s suited for DML operations but is different from FOR OLTP. Anyway, COLUMN STORE compression can be transformed to ROW STORE compression during updates. And that locking feature is licensed with the Advanced Compression Option, and available in Exadata only…
  • When do you need ACO (Advanced Compression Option) or not?

Let’s make it clear here.

NOCOMPRESS is the only possibility in Standard Edition

With NOCOMPRESS – the default – each datatype takes its full length. A number takes about one byte per two significant digits, a date takes 7 bytes (8 with the length byte), a character string takes one byte per character (or more in multibyte charactersets) plus the length byte, and a null takes 1 byte. The only compression that happens is that nulls at the end of a row are not stored.
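
You can check the stored size of a value with the VSIZE() function (just a quick illustration of those lengths, not part of the test case that follows):

SQL> select vsize(sysdate) date_bytes,    -- a DATE is stored in 7 bytes
            vsize(123456) number_bytes,   -- one exponent byte plus one byte per two significant digits
            vsize('ABC') string_bytes     -- one byte per character in a single-byte characterset
     from dual;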

I’ve created a few tables storing characters, numbers and dates:

SQL> create table TEST_NUM_22 NOCOMPRESS
     as select dbms_random.value(1,1e6) x , dbms_random.value(1,1e6) y ...
SQL> create table TEST_NUM_05 NOCOMPRESS
     as select round(dbms_random.value(1,1e6)) x , round(dbms_random.value(1,1e6)) y ...
SQL> create table TEST_STR_U4 NOCOMPRESS
     as select dbms_random.string('U',3) x , dbms_random.string('U',3) y ...
SQL> create table TEST_STR_A4 NOCOMPRESS
     as select dbms_random.string('A',3) x , dbms_random.string('A',3) y ...
SQL> create table TEST_STR_50 NOCOMPRESS
     as select dbms_random.string('P',49) x , dbms_random.string('P',49) y ...
SQL> create table TEST_DAT_08 NOCOMPRESS
     as select sysdate+rownum x, sysdate+rownum y ...
SQL> create table TEST_DAT_V8 NOCOMPRESS
     as select sysdate+mod(rownum,5) x, sysdate+mod(rownum,5) y ...
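
The row source is truncated in those statements. For the record, the full statement for TEST_DAT_08 would look like this, with a simple connect-by row generator (the 100,000 rows here is only an assumption to make the example complete):

SQL> create table TEST_DAT_08 NOCOMPRESS
     as select sysdate+rownum x, sysdate+rownum y
     from dual connect by level <= 1e5;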

Here are the tables I have:

  • TEST_STR_A4 has random 3-character strings; TEST_STR_U4 has the same but uppercase only (so more duplicates).
  • TEST_STR_50 has random 49-character strings of any printable characters – low duplicate probability
  • TEST_NUM_05 has integers and TEST_NUM_22 has decimal numbers, both between 1 and one million
  • TEST_DAT_08 has all-different dates and TEST_DAT_V8 has only 5 distinct dates

Here is the size in blocks (blocks are 8k):

SQL> select table_name,blocks,compression,compress_for from user_tables order by 1;

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08        305 DISABLED
TEST_DAT_V8        305 DISABLED
TEST_NUM_05        219 DISABLED
TEST_NUM_22        701 DISABLED
TEST_STR_50       1503 DISABLED
TEST_STR_A4        192 DISABLED
TEST_STR_U4        192 DISABLED

About licensing, NOCOMPRESS is available in all editions:

 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE: ADVANCED Table Compression Not Detected
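
These ‘FEATURE USAGE’ lines, here and in the rest of the post, reflect what is recorded in DBA_FEATURE_USAGE_STATISTICS. You can check it yourself with a query like this one (the exact feature names depend on the version):

SQL> select name, detected_usages, currently_used
     from dba_feature_usage_statistics
     where name like '%Compression%' or name like '%In-Memory%'
     order by name;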

BASIC COMPRESSION for bulk load in Enterprise Edition

This is the compression that we have known as ‘COMPRESS’ since 9i. It is a block-level compression where duplicate values are not repeated: the value is stored once in a symbol table in the block header and each column stores only a pointer to it. Deduplication is not limited to a single column. That compression occurs only when filling blocks with bulk load (direct-path load, insert append). As soon as there is an update, the block is decompressed.

In 12c the syntax is: ROW STORE COMPRESS BASIC
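
To get the figures below, the tables just have to be rebuilt compressed, for example with a move (and statistics re-gathered, since BLOCKS in USER_TABLES comes from them):

SQL> alter table TEST_NUM_22 move row store compress basic;
SQL> exec dbms_stats.gather_table_stats(user,'TEST_NUM_22')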

SQL> select table_name,blocks,compression,compress_for from user_tables order by 1;

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08        276 ENABLED  BASIC
TEST_DAT_V8        151 ENABLED  BASIC
TEST_NUM_05        199 ENABLED  BASIC
TEST_NUM_22        628 ENABLED  BASIC
TEST_STR_50       1346 ENABLED  BASIC
TEST_STR_A4        174 ENABLED  BASIC
TEST_STR_U4        174 ENABLED  BASIC

Here I don’t have a lot of duplicates for large columns, so the compression ratio is not very good.

About licensing, COMPRESS BASIC is available in Enterprise Edition only, and does not require any option:

 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE: ADVANCED Table Compression Not Detected

ADVANCED COMPRESSION for OLTP with Advanced Compression Option

Because basic compression does not work with conventional DML, 11g introduced the same kind of compression but automatically triggered when a block becomes full. No need to schedule regular reorganizations to compress the blocks that are full. 12c removed the ‘FOR OLTP’ and renamed it after the option.

The full syntax is: ROW STORE COMPRESS ADVANCED
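
The move is the same, only the level changes. And unlike BASIC, blocks that are later filled by conventional inserts will be compressed as well:

SQL> alter table TEST_NUM_22 move row store compress advanced;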

It gives about the same compression ratio as basic compression:

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08        305 ENABLED  ADVANCED
TEST_DAT_V8        167 ENABLED  ADVANCED
TEST_NUM_05        219 ENABLED  ADVANCED
TEST_NUM_22        701 ENABLED  ADVANCED
TEST_STR_50       1503 ENABLED  ADVANCED
TEST_STR_A4        192 ENABLED  ADVANCED
TEST_STR_U4        192 ENABLED  ADVANCED

But it is licensed with an additional option, the Advanced Compression Option:

 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE:  ADVANCED Table Compression Feature Usage:  Tables compressed for ADVAN...

Note that in 12c a lot of features have been introduced as part of ACO. You can use Automatic Data Optimization for ILM only with that option – even if you don’t compress, and even if what you compress to is not ADVANCED compression.

EXADATA COMPRESSION for queries and archiving with Oracle-provided storage

Exadata introduced Hybrid Columnar Compression. Multiple blocks are grouped into Compression Units where table data is stored column by column, and then a compression algorithm is applied. There are 4 levels. This compression is not suited for updates. The QUERY levels are suited for data that is read frequently: the gain in I/O (because of the reduced size) compensates for the CPU overhead of decompression. At the ‘ARCHIVE’ levels, the CPU usage makes it not optimal for queries, so they are only for infrequently accessed data.

Those EHCC levels are available with Oracle-supplied storage: Exadata database machine, Oracle SuperCluster, ZFS storage appliances, Pillar Axiom arrays. But not on ODA. Note that this is not a technical choice but a marketing one: the compression library is in the Oracle database binaries, and the feature is enabled depending on the storage type (visible through SNMP).

The compression levels are detailed by @KovachevPro on the progeeking blog. I’ll only show the 12c syntax here and the compression obtained on the previous tables:

LZO algorithm

In 12c the syntax is: COLUMN STORE COMPRESS FOR QUERY LOW
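
The tables are moved to this level the same way, for example:

SQL> alter table TEST_NUM_22 move column store compress for query low;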

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08        110 ENABLED  QUERY LOW
TEST_DAT_V8          7 ENABLED  QUERY LOW
TEST_NUM_05        116 ENABLED  QUERY LOW
TEST_NUM_22        552 ENABLED  QUERY LOW
TEST_STR_50       1260 ENABLED  QUERY LOW
TEST_STR_A4         87 ENABLED  QUERY LOW
TEST_STR_U4         87 ENABLED  QUERY LOW

Compression ratio is good especially when we have a lot of repeated values.

 FEATURE USAGE: Number of Hybrid Columnar Compressed Segments: 7,  Segments Analyzed: 7
 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE: ADVANCED Table Compression Not Detected

ZLIB algorithm

In 12c the syntax is: COLUMN STORE COMPRESS FOR QUERY HIGH

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08          7 ENABLED  QUERY HIGH
TEST_DAT_V8          7 ENABLED  QUERY HIGH
TEST_NUM_05         71 ENABLED  QUERY HIGH
TEST_NUM_22        461 ENABLED  QUERY HIGH
TEST_STR_50       1052 ENABLED  QUERY HIGH
TEST_STR_A4         51 ENABLED  QUERY HIGH
TEST_STR_U4         38 ENABLED  QUERY HIGH

 FEATURE USAGE: Number of Hybrid Columnar Compressed Segments: 7,  Segments Analyzed: 7
 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE: ADVANCED Table Compression Not Detected

In 12c the syntax for the higher ZLIB level is: COLUMN STORE COMPRESS FOR ARCHIVE LOW

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08          7 ENABLED  ARCHIVE LOW
TEST_DAT_V8          7 ENABLED  ARCHIVE LOW
TEST_NUM_05         70 ENABLED  ARCHIVE LOW
TEST_NUM_22        475 ENABLED  ARCHIVE LOW
TEST_STR_50       1073 ENABLED  ARCHIVE LOW
TEST_STR_A4         51 ENABLED  ARCHIVE LOW
TEST_STR_U4         38 ENABLED  ARCHIVE LOW

In this example, I don’t get a higher ratio, but that depends on the data pattern.

 FEATURE USAGE: Number of Hybrid Columnar Compressed Segments: 7,  Segments Analyzed: 7
 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE: ADVANCED Table Compression Not Detected

BZIP2 algorithm

In 12c the syntax is: COLUMN STORE COMPRESS FOR ARCHIVE HIGH

TABLE_NAME      BLOCKS COMPRESS COMPRESS_FOR
----------- ---------- -------- ------------------------------
TEST_DAT_08          7 ENABLED  ARCHIVE HIGH
TEST_DAT_V8          7 ENABLED  ARCHIVE HIGH
TEST_NUM_05         61 ENABLED  ARCHIVE HIGH
TEST_NUM_22        478 ENABLED  ARCHIVE HIGH
TEST_STR_50       1066 ENABLED  ARCHIVE HIGH
TEST_STR_A4         42 ENABLED  ARCHIVE HIGH
TEST_STR_U4         32 ENABLED  ARCHIVE HIGH

This level compresses better, but requires a lot of CPU both to compress and to decompress.

 FEATURE USAGE: Number of Hybrid Columnar Compressed Segments: 0,  Segments Analyzed: 7
 FEATURE USAGE: In-Memory Column Store Not Detected
 FEATURE USAGE: ADVANCED Table Compression Not Detected

Note that I don’t know why HCC is not flagged by feature usage here. It’s probably a bug in catfusrg.sql, but anyway you don’t need to buy a license for that – just buy your storage from Oracle.

One important thing to know is that even at the QUERY levels, it’s optimized for bulk reads. Reading row-by-row from any COLUMN STORE is not very efficient, especially when the whole compression unit has to be decompressed.

Information Lifecycle Management and licensing

In 12c you have ADO (Automatic Data Optimization) to automatically compress cold data.

Without the ACO option you cannot use it. Compression must then be done manually, and you will probably use ALTER TABLE … MOVE PARTITION … COMPRESS BASIC ONLINE. This is always possible in Enterprise Edition. If you have Oracle storage, then you can use EHCC compression. This does not need any additional option when you do it manually.
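
For example, on a partitioned table (table and partition names are placeholders here):

SQL> alter table SALES move partition P_2015
     row store compress basic online;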

If you want to do it automatically – the A of ADO – you need to buy the ACO option. But once again, EHCC is available only with Oracle storage, so without it you can use only BASIC compression. (COMPRESS ADVANCED is not needed here because ADO does bulk loads anyway.) So the ‘Advanced’ in ‘Advanced Compression Option’ rather means ‘Automatic’ when related to ILM…
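
With the option, and the heat map enabled, the policy is declared once and the compression happens automatically during maintenance windows. A minimal sketch, with a placeholder table name (and an HCC level, which again requires Oracle storage):

SQL> alter system set heat_map=on;
SQL> alter table SALES ilm add policy
     compress for archive high segment
     after 12 months of no access;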