Infrastructure at your Service

Oracle Team

From 8.0 extended rowid to 12c pluggable db: Why Oracle Database is still a great software

By Franck Pachot

.
Do you know why Oracle Database is a leader in the database market since 30 years? Do you know any other software that is still the leading solution after decades? I think that it is because Oracle has been a good software from the get-go. Many early decisions in the software architecture have revealed themselves later to be the right decision. Several decisions, such as the C language that made it portable to all platforms that were relevant during those 30 years, or the parallel server that has brought RAC when standalone servers were not able to scale anymore. Here, I will illustrate a decision made 15 years ago that has made the whole 12c pluggable databases stuff possible.

Oracle 7 rowid

The ROWID is the physical address of a row in the database. Physically, a database is just a set of files where data is stored as rows in blocks. So, at the time of Oracle 7, the ROWID was just:

  • the file number
  • the block number in that file
  • the row number in that block
That was printed in hexadecimal when we selected the ROWID pseudo-column:

CaptureOracle7Rowid.PNG

Here my rows are numeroted from 0 to 3 and are all in the file 2 at block offset 58 (0x3a)
At that time, the maximum number of datafiles in a database was 1022. That was enough until the datawarehouse days brought the need for Very Large DataBases.

Relative file number

In Oracle 8 they wanted to increase the possible number of datafile without changing the rowid (which would have involved updating all blocks during an upgrade). And here is how they did:

1. they introduced the ‘relative file number’. The file identification part is not unique for the database but only for the tablespace. That means that you can have 1022 datafiles per tablespace instead of 1022 datafiles per database.
2. they kept the relative_fno equal to the absolute file_id for the first 1022 datafiles, so that an Oracle 7 datafile is still compatible: the number that was the absolute file_id being now the relative_fno.
3. when going over the 1022 datafiles per database, the file_id can go beyond (it is not stored in the rowid) but the relative_fno just recycles to 1.

The consequence is that a rowid is not unique anymore. When we have to look for a row by its rowid, we must know on which tablespace it is, because the rowid identification is related to the tablespace.

No problem. Where are the rowid used? In chained rows, to get the other piece. A table cannot store parts of its row in different tablespace, so we know that if we are in tablespace USERS and have a pointer to a chained row, it has to be in the USER tablespace as well. They are used in indexes as well. And that’s the same: if we unique scan the index EMP_PK to get the rowid of the EMP row, we know that it is in the USERS tablespace because we know that EMP table is in the USERS tablespace.

However, a new feature appeared at that time: a table can be partitioned. We cannot say anymore that EMP rows are in USERS tablespace because EMP may be partitioned over several tablespaces. That does not concern chained rows. That does not concern non-partitioned tables. And with partitioned tables, that does not concern local indexes because local index partitions always address rows in one table partition.

This is where Oracle has introduced an extended rowid, for global indexes on partitioned tables, which is larger and is able to give the tablespace information in addition to the relative file number. It is called extended as opposite to ‘restricted’ rowid which is restricted to cases where we know the tablespace.

They could have choosen to store only the tablespace number. But they choose to store the object id instead, which – from the dictionary – can be used to know the tablespace. People were puzzled by that choice at that time, but it has been revealed later as a great software architecture decision because this is what allows us to have pluggable databases 15 years later.

But before that, I have to introduce another 8.0 change, which is the data object id.

Data object id

In Oracle 7, there is only one object_id. Because there is a one-to-one relation between the logical object (table) and the physical object (segment). That has changed with the introduction of partitioning, where one table may have several partitions.

Partitions are also logical objects, and each of them has a unique object_id. But once again, the software architects made a very good decision by separating the identification of the physical object: they introduced the data_object_id. When you create a table, the data_object_id of the segment is equal to the object_id of the table. But that can change.

For example, what happens when you truncate a table? You just reset the high water mark without going to all the blocks. And you can insert new rows that may go into the same place. But how are you sure that concurrent users will not mix old blocks with new ones ? Remember that the reader do not lock anything, so they didn’t notice your truncate. The truncate changes the data object_id as if it were a new segment, but still related to the same logical table.

And with partitioning you can exchange partition: the logical object_id changes but the physical data_object_id is still the same because it is still the same segment.

It’s not always easy to know if we are dealing with an object_id or a data_object_id. Basically, things that are logical (for example lock table or lock partition) show the object_id and they are – depending on the place – called OBJECT_ID, OBJ# or OBJN. When it is related to the physical segment (blocks in buffer cache, block corruption) we see DATA_OBJECT_ID, DATAOBJ# or OBJD or even OBJ or OBJECT_NUMBER.

When I’m not sure about what to use in my scripts, I test it on a table that has been truncated at least once, so that both values are different. I advise you to always test your scripts on a table that has been truncated and on a tablespace that has been transported.

Oracle 8 extended rowid

So I said that the extended rowid does not store the tablespace id. It stores the data_object_id, which is associated with one and only one tablespace. The format of the extended rowid is:

  • the data object id (which gives the tablespace from the data dictionary)
  • the relative file number (relative to the tablespace)
  • the block number in that file
  • the row number in that block

and it is needed only when we don’t navigate from an object which can be used to define exactly which tablespace can contain the segment.

Now let’s understand why the Oracle software architects have decided to store the data_object_id instead of the tablespace number. By doing that, they made the tablespaces physically independant of the database.

Pluggable tablespaces

Yes, I know it’s called transportable tablespaces. But it was the premise of pluggable database. Anyway, pluggable databases are just transportable tablespaces that include the system tablespace (which contain the metadata for the other tablespaces).

You can transport a tablespace from another database and plug it in you database. Of course, the absolute file number will change as it is assigned by your database. The tablespace number will change. But the relative file numbers – relative to the tablespace – do not have to change.

And of course the object_id will change: a new one will be used when importing the metadata. But the data_object_id do not have to change. The reason is that data_object_id is not expected to be unique in the database. It must be unique only whithin the object (two partitions of the same table cannot have the same data_object_id).

Oracle 8 extended rowid was designed for pluggable databases

And this is where those early decisions have all their meaning. You can plug a tablespace and the rowid of the rows in that tablespace do not have to be updated. This is what makes it a quick operation because only the dictionary and the file headers have to be updated. The time depends on the number of objects, but not on the volume of data. The agility brought by pluggable databases in 2013 were actually designed in 1997.

rowid is not unique

So I said that rowid is not unique? Let’s prove it. I create a table DEMO2 in tablespace DEMO1, export that tablespace, rename the table to DEMO2 to DEMO1, import that tablespace as DEMO2, so that I have now two tables DEMO1 and DEMO2 respectively in tablespaces DEMO1 and DEMO1.

Here is my table:

 

SQL> select dname,rowid
  2   ,dbms_rowid.rowid_object(rowid) data_object_id
  3   ,dbms_rowid.rowid_relative_fno(rowid) relative_fno
  4   ,dbms_rowid.rowid_block_number(rowid) block_id
  5   ,dbms_rowid.rowid_row_number(rowid) row_number
  6   from DEMO2
  7  /

DNAME          ROWID              DATA_OBJECT_ID RELATIVE_FNO   BLOCK_ID ROW_NUMBER
-------------- ------------------ -------------- ------------ ---------- ----------
ACCOUNTING     AAAX7uAACAAAACEAAA          98030            2        132          0
RESEARCH       AAAX7uAACAAAACEAAB          98030            2        132          1
SALES          AAAX7uAACAAAACEAAC          98030            2        132          2
OPERATIONS     AAAX7uAACAAAACEAAD          98030            2        132          3

I export the tablespace with transportable tablespaces:

SQL> host expdp '"/ as sysdba"' transport_tablespaces=DEMO1
Export: Release 12.1.0.2.0 - Production on Fri Oct 24 15:33:35 2014
Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Starting "SYS"."SYS_EXPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" transport_tablespaces=DEMO1
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Master table "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for SYS.SYS_EXPORT_TRANSPORTABLE_01 is:
  /u02/app/oracle/admin/DEMO/dpdump/expdat.dmp
******************************************************************************
Datafiles required for transportable tablespace DEMO1:
  /tmp/demo1.dbf
Job "SYS"."SYS_EXPORT_TRANSPORTABLE_01" successfully completed at Fri Oct 24 15:34:35 2014 elapsed 0 00:00:59
SQL> host cp '/tmp/demo1.dbf' '/tmp/demo2.dbf'
SQL> alter tablespace DEMO1 read write;
Tablespace altered.

Then import it to DEMO2 tablespace (after renaming my previous table)

SQL> rename DEMO2 to DEMO1;
SQL> host impdp '"/ as sysdba"' transport_datafiles='/tmp/demo2.dbf ' remap_tablespace='DEMO1:DEMO2';
Import: Release 12.1.0.2.0 - Production on Fri Oct 24 15:34:35 2014

Copyright (c) 1982, 2014, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options
Master table "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_TRANSPORTABLE_01":  "/******** AS SYSDBA" transport_datafiles=/tmp/demo2.dbf  remap_tablespace=DEMO1:DEMO2
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
Processing object type TRANSPORTABLE_EXPORT/TABLE
Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
Processing object type TRANSPORTABLE_EXPORT/STATISTICS/MARKER
Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
Job "SYS"."SYS_IMPORT_TRANSPORTABLE_01" successfully completed at Fri Oct 24 15:34:49 2014 elapsed 0 00:00:13
SQL> alter tablespace DEMO2 read write;
Tablespace altered.

Now I have 2 tables with different object_id but same data_object_id:

SQL> select object_name,object_id,data_object_id from user_objects where object_name like 'DEMO_';

OBJECT_NAM  OBJECT_ID DATA_OBJECT_ID
---------- ---------- --------------
DEMO2           98076          98030
DEMO1           98029          98030

And 2 segments in different files (file_id) but same relative_fno:

SQL> select segment_name,tablespace_name,file_id,relative_fno,block_id from dba_extents where segment_name like 

SEGMENT_NAME    TABLESPACE    FILE_ID RELATIVE_FNO   BLOCK_ID
--------------- ---------- ---------- ------------ ----------
DEMO1           DEMO1               2            2        128
DEMO2           DEMO2               4            2        128

I update the rows so that I be sure to select on different tables

SQL> update DEMO1 set dname=upper(dname);
4 rows updated.
SQL> update DEMO2 set dname=lower(dname);
4 rows updated.

And now showing the ROWID from both tables:

SQL> select dname,rowid
  2   ,dbms_rowid.rowid_object(rowid) data_object_id
  3   ,dbms_rowid.rowid_relative_fno(rowid) relative_fno
  4   ,dbms_rowid.rowid_block_number(rowid) block_id
  5   ,dbms_rowid.rowid_row_number(rowid) row_number
  6   from DEMO1
  7  /

DNAME          ROWID              DATA_OBJECT_ID RELATIVE_FNO   BLOCK_ID ROW_NUMBER
-------------- ------------------ -------------- ------------ ---------- ----------
ACCOUNTING     AAAX7uAACAAAACEAAA          98030            2        132          0
RESEARCH       AAAX7uAACAAAACEAAB          98030            2        132          1
SALES          AAAX7uAACAAAACEAAC          98030            2        132          2
OPERATIONS     AAAX7uAACAAAACEAAD          98030            2        132          3

SQL> select dname,rowid
  2   ,dbms_rowid.rowid_object(rowid) data_object_id
  3   ,dbms_rowid.rowid_relative_fno(rowid) relative_fno
  4   ,dbms_rowid.rowid_block_number(rowid) block_id
  5   ,dbms_rowid.rowid_row_number(rowid) row_number
  6   from DEMO2
  7  /

DNAME          ROWID              DATA_OBJECT_ID RELATIVE_FNO   BLOCK_ID ROW_NUMBER
-------------- ------------------ -------------- ------------ ---------- ----------
accounting     AAAX7uAACAAAACEAAA          98030            2        132          0
research       AAAX7uAACAAAACEAAB          98030            2        132          1
sales          AAAX7uAACAAAACEAAC          98030            2        132          2
operations     AAAX7uAACAAAACEAAD          98030            2        132          3

Conclusion: I have in my database two different tables with sames rowid because it is a physical copy. Only the data dictionary makes the difference.

4 Comments

  • Olivier Berthommé says:

    Hi Frank,

    Very interesting post ! Congratulations for starting an Oracle 7 ; did you virtualize an HP 3000 🙂 ?

    Why not name data_object_id : segment_id ? Is it not the same concept ?

  • Hi Olivier,
    That’s a good question. I think the reason is that data_object_id cannot identify a segment without the tablespace. Another reason would be that a segment type can change without changing the data_object_id (as with exchange partitions).
    As far as I know, ‘segment id’ is used only by dbv (to pass tsn.relfile.block parameter) and in statspack comments for STATS$SEG_STAT_OBJ (key: dataobj#, obj#, ts#, dbid)

  • Hemant K Chitale says:

    Cool ! (that’s all I can say !)

  • Martin Preiss says:

    Hi Franck,
    almost missed that article… – thank you! Having studied history for a lot of years I like this kind of overviews.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Oracle Team
Oracle Team