Infrastructure at your Service

By Franck Pachot

.
Can we shutdown abort a PDB? Let’s try:

SQL> show con_id
CON_ID
------------------------------
3SQL> shutdown abort;
Pluggable Database closed.

But is it really a shutdown abort?

The shutdown abort we know in non-CDB – or in a CDB from CDB$ROOT – is used when the instance cannot checkpoint before closing the file. It leaves the files fuzzy. And shutdown abort crashes the instance – loosing all changes made in buffer cache and not yet checkpointed. This is why we have online redo logs so that an instance recovery can be used to recover those changes.

But if I shutdown abort only a PDB the instance is still there. Let’s see from alert.log what it did:

2015-03-16 06:59:51.487000 +01:00
 ALTER PLUGGABLE DATABASE CLOSE ABORT
 ALTER SYSTEM: Flushing buffer cache inst=0 container=3 local
 Pluggable database PDB closed
 Completed: ALTER PLUGGABLE DATABASE CLOSE ABORT

Ok. even if it’s called ‘close abort’ a checkpoint occured. Look at the file fuzziness:

 SQL> select file#,con_id,fuzzy from v$datafile_header;
     FILE#     CON_ID FUZ
---------- ---------- ---
         4          0 YES
         8          3 NO
         9          3 NO
        10          3 NO

Except for the UNDO which is in CDB$ROOT, none of my PDB files are fuzzy. There were closed cleanly.

This is fine when we can write into the datafiles in order to checkpoint them. But if the file is not there we can’t checkpoint it. Then what happens?

On a non-CDB – or on the CDB$ROOT of a CDB – it’s simple. The instance crashes. By default, since 11.2.0.2, and without changing the “_datafile_write_errors_crash_instance” default value, the instance crashes as soon as a file is missing at checkpoint. If “_datafile_write_errors_crash_instance” is set to true, the instance crashes only when the datafiles is a system one.

Okay, we have to think about that behaviour change (and that’s for a future blog post) but usually the application is down when a datafile is missing, so better to stop everything.

But the problem is in multitenant. As we have seen in previous posts, the whole instance crashes at checkpoint, even when a PDB only datafile is missing. You can set “_datafile_write_errors_crash_instance” but the issue is still there for system datafiles. I had open a bug about it and it’s fixed in latest patchset, but that is not sufficient.

The problem is that offlining a file without checkpoint leaves it in a ‘need recovery’ state. But if the instance is still running then the online redo logs can be overwriten and all changes are lost. Of course, in archivelog mode you should be able to restore them. But by default crashing the instance is chosen.

I’m in archivelog mode and I can shutdown abort the PDB as soon as I know that the datafile is missing (and before next checkpoint):

RMAN> host "rm /u02/oradata/CDB/datafile/o1_mf_system_bjdz9ldm_.dbf";

host command complete

RMAN> alter pluggable database pdb close abort;

Statement processed

and here is the alert.log:

2015-03-16 08:04:37.194000 +01:00
alter pluggable database pdb close abort
**********************************************************
Pluggable Database PDB with id - 3 has a missing or corrupt
datafile belonging to the pluggable database. It will be
closed abnormally. The pdb needs to be dropped or recovered
**********************************************************
Checker run found 1 new persistent data failures
Pluggable database PDB closed
Completed: alter pluggable database pdb close abort

So I’m able to shutdown abort a PDB when I’m in archivelog mode. If I’m not in archivelog mode can’t do it (except if “_enable_pdb_close_noarchivelog”=true but that is very risky ).

But I want that shutdown to occur automatically because consolidation should not decrease databases availability and I’m not sure to be able to manually close abort before the next checkpoint. Then I have to set “_enable_pdb_close_abort”=true and here is what happens in alert.log:

2015-03-16 08:55:18.833000 +01:00
Errors in file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ckpt_1698.trc:
ORA-01243: system tablespace file suffered media failure
ORA-01116: error in opening database file 8
ORA-01110: data file 8: '/u02/oradata/CDB/datafile/o1_mf_system_bjf31v1m_.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Internal PDB shutdown abort of PDB (container=3)

 Conclusion

I order to have only the PDB shut down in case of the loss of a system file – instead of the whole CDB – you need:

  • to be in archivelog mode
  • to have latest patchset (with fix for bug 19001390)
  • to set “_enable_pdb_close_abort”=true

As you are in archivelog mode, you should be able to recover, but I suppose you will not wait too long to restore and recover the missing datafiles.

3 Comments

  • Foued says:

    Thanks Franck for the post.

  • Hi Franck

    When I shutdown abort, my pluggable is damaged. Always.

    Sometimes I can save it, once it fails with ORA-600 kcpSetPDBIncompleteRecoverySCN-1 (could not reproduce)


    SQL> grant sysoper to u identified by pw;
    Grant succeeded.
    SQL> conn u/[email protected] as sysoper
    Connected.
    SQL> shu abort
    Pluggable Database closed.
    SQL> startup
    ORA-01113: file 26 needs media recovery
    ORA-01110: data file 26: '/u02/oradata/pdb01/pdb/users01.dbf'
    SQL> conn / as sysdba
    Connected.
    SQL> recover datafile 26;
    Media recovery complete.
    SQL> recover datafile 25;
    Media recovery complete.
    SQL> recover datafile 24;
    Media recovery complete.
    SQL> recover datafile 23;
    Media recovery complete.
    SQL> recover datafile 22;
    Media recovery complete.
    SQL> recover datafile 21;
    Media recovery complete.
    SQL> recover datafile 20;
    Media recovery complete.
    SQL> recover datafile 19;
    Media recovery complete.
    SQL> recover datafile 18;
    Media recovery complete.
    SQL> alter pluggable database pdb01 open;
    Pluggable database altered.

    Is this a bug or expected behavior?

    Thanks
    Laurent

    • Hi Laurent.
      The need for recovery is normal: the abort just closes the files (without checkpoint) and removes PDB objects from SGA. This is why (without an underscore parameter) you can do this only in archivelog mode.
      However, the ORA-600 is not normal. Does that means that you were no able to recover it?
      Regards,
      Franck.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Franck Pachot
Franck Pachot

Principal Consultant / Database Evangelist
Oracle ACE Director, Oracle Database OCM 12c
AWS Database Specialty certified, AWS Data Hero
Oak Table member

RSS for this blog: feed
Twitter: @FranckPachot
LinkedIn : www.linkedin.com/in/franckpachot
Podcast en français: DBPod