By Franck Pachot

.
Can we shutdown abort a PDB? Let’s try:

SQL> show con_id
CON_ID
------------------------------
3SQL> shutdown abort;
Pluggable Database closed.

But is it really a shutdown abort?

The shutdown abort we know in non-CDB – or in a CDB from CDB$ROOT – is used when the instance cannot checkpoint before closing the file. It leaves the files fuzzy. And shutdown abort crashes the instance – loosing all changes made in buffer cache and not yet checkpointed. This is why we have online redo logs so that an instance recovery can be used to recover those changes.

But if I shutdown abort only a PDB the instance is still there. Let’s see from alert.log what it did:

2015-03-16 06:59:51.487000 +01:00
 ALTER PLUGGABLE DATABASE CLOSE ABORT
 ALTER SYSTEM: Flushing buffer cache inst=0 container=3 local
 Pluggable database PDB closed
 Completed: ALTER PLUGGABLE DATABASE CLOSE ABORT

Ok. even if it’s called ‘close abort’ a checkpoint occured. Look at the file fuzziness:

 SQL> select file#,con_id,fuzzy from v$datafile_header;
     FILE#     CON_ID FUZ
---------- ---------- ---
         4          0 YES
         8          3 NO
         9          3 NO
        10          3 NO

Except for the UNDO which is in CDB$ROOT, none of my PDB files are fuzzy. There were closed cleanly.

This is fine when we can write into the datafiles in order to checkpoint them. But if the file is not there we can’t checkpoint it. Then what happens?

On a non-CDB – or on the CDB$ROOT of a CDB – it’s simple. The instance crashes. By default, since 11.2.0.2, and without changing the “_datafile_write_errors_crash_instance” default value, the instance crashes as soon as a file is missing at checkpoint. If “_datafile_write_errors_crash_instance” is set to true, the instance crashes only when the datafiles is a system one.

Okay, we have to think about that behaviour change (and that’s for a future blog post) but usually the application is down when a datafile is missing, so better to stop everything.

But the problem is in multitenant. As we have seen in previous posts, the whole instance crashes at checkpoint, even when a PDB only datafile is missing. You can set “_datafile_write_errors_crash_instance” but the issue is still there for system datafiles. I had open a bug about it and it’s fixed in latest patchset, but that is not sufficient.

The problem is that offlining a file without checkpoint leaves it in a ‘need recovery’ state. But if the instance is still running then the online redo logs can be overwriten and all changes are lost. Of course, in archivelog mode you should be able to restore them. But by default crashing the instance is chosen.

I’m in archivelog mode and I can shutdown abort the PDB as soon as I know that the datafile is missing (and before next checkpoint):

RMAN> host "rm /u02/oradata/CDB/datafile/o1_mf_system_bjdz9ldm_.dbf";

host command complete

RMAN> alter pluggable database pdb close abort;

Statement processed

and here is the alert.log:

2015-03-16 08:04:37.194000 +01:00
alter pluggable database pdb close abort
**********************************************************
Pluggable Database PDB with id - 3 has a missing or corrupt
datafile belonging to the pluggable database. It will be
closed abnormally. The pdb needs to be dropped or recovered
**********************************************************
Checker run found 1 new persistent data failures
Pluggable database PDB closed
Completed: alter pluggable database pdb close abort

So I’m able to shutdown abort a PDB when I’m in archivelog mode. If I’m not in archivelog mode can’t do it (except if “_enable_pdb_close_noarchivelog”=true but that is very risky ).

But I want that shutdown to occur automatically because consolidation should not decrease databases availability and I’m not sure to be able to manually close abort before the next checkpoint. Then I have to set “_enable_pdb_close_abort”=true and here is what happens in alert.log:

2015-03-16 08:55:18.833000 +01:00
Errors in file /u01/app/oracle/diag/rdbms/cdb/CDB/trace/CDB_ckpt_1698.trc:
ORA-01243: system tablespace file suffered media failure
ORA-01116: error in opening database file 8
ORA-01110: data file 8: '/u02/oradata/CDB/datafile/o1_mf_system_bjf31v1m_.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
Internal PDB shutdown abort of PDB (container=3)

 Conclusion

I order to have only the PDB shut down in case of the loss of a system file – instead of the whole CDB – you need:

  • to be in archivelog mode
  • to have latest patchset (with fix for bug 19001390)
  • to set “_enable_pdb_close_abort”=true

As you are in archivelog mode, you should be able to recover, but I suppose you will not wait too long to restore and recover the missing datafiles.