Infrastructure at your Service

Franck Pachot

List all RMAN backups that are needed to recover

This blog post is something I had in draft and Laurent Schneider blog post reminds me to publish it. With the right RMAN configuration you should not have to managed backup files yourself. The RMAN catalog knows them and RMAN should be able to access them. If you want to keep a backup for a long time, you just tell RMAN to keep it.
But sometimes, RMAN is not connected to your tape backup software, or the backups are not shared on all sites, and you have to restore or copy the set of files that is needed for a restore database or a duplicate database.

A customer was in that case, identifying the required files from their names because they are all timestamped with the beginning of the backup job. It’s our DMK default. In order to rely on that, the ‘backup database plus archivelog’ was run. And in order to be sure to have all archived logs in those backup sets, any concurrent RMAN job are blocked during that database backup. Because if a concurrent job is doing some archivelog backups, they will be timestamped differently.

RPO and availability

I don’t like that. I don’t want that anything can block the backup of archived logs.
They are critical for two reasons:

  • The Recovery Point Objective is not fulfilled if some archivelog backups are delayed
  • The frequency of archivelog backup is also defined to prevent a full FRA

But if we allow concurrent backup of archived logs, we need something else to be able to identify the whole set of files that are needed to restore the database at that point in time. then my suggestion was to generate the list of those files after each database backup, and keep that list. When we need to restore that backup, then we can send the list to the backup team ans ask them to restore them.

The script

Here is my script, I’ll explain later:

echo "restore controlfile preview; restore database preview;" | NLS_DATE_FORMAT='yyyymmddhh24miss' rman target / | awk '
/Finished restore at /{timestamp=$4}
/Recovery must be done beyond SCN /{if ($7>scn) scn=$7 }
/^ *(Piece )Name: / { sub(/^ *(Piece )Name: /,"") ; files[$0]=1 }
END{ for (i in files) print i > "files-"timestamp"-SCN-"scn".txt" }
' 

this script generate the following file:

files-20150519013910-SCN-47682382860.txt

which list the files needed to do a RESTORE/RECOVER UNTIL SCN 47682382860

the content of the file is:

oracle@dbzhorap01:/home/oracle/ [DB01PP1] sort files-20150519019910-SCN-47682382860.txt
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168279_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168280_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168281_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168282_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168283_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168284_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169462_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169463_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169464_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169465_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169466_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169467_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169468_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169469_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169470_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169471_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169472_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169482_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169473_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169474_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169475_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169476_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169477_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169478_p1.bck
/u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck

and lists the backup pieces for the incremental 0, incremental 1 and archivelogs needed to recover to a consistent state that can be opened. The script lists only backup sets so we are supposed have have backed up the latest archived logs (with backup database plus archivelog for example).

You can put an ‘until scn’^but my primary goal was to run it just after a backup database in order to know which files have to be restored to get that backup (restore or duplicate).

Restore preview

The idea is to rely on RMAN to find the files that are needed to restore and recover rather than doing it ourselves from the recovery catalog. RMAN provides the PREVIEW restore for that:

RMAN> restore database preview
Starting restore at 20150501390436
using channel ORA_DISK_1
using channel ORA_DISK_2
using channel ORA_DISK_3
using channel ORA_DISK_4

List of Backup Sets
===================

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
166388  Incr 0  10.53G     DISK        00:52:56     20150516031010
        BP Key: 166388   Status: AVAILABLE  Compressed: YES  Tag: WEEKLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150516_023003_inc0_DB01PP_961537327_s168278_p1.bck
  List of Datafiles in backup set 166388
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  1    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/system.329.835812499
  2    0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/undotbs1.525.835803187
  10   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.676.835815153
  17   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.347.835815677
  23   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.277.835814327
  25   0  Incr 47581173986 20150516023945 +U01/DB01Pp/datafile/cpy.342.835811161
...
BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
167586  Incr 1  216.09M    DISK        00:01:34     20150519012830
        BP Key: 167586   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_inc1_DB01PP_961537327_s169479_p1.bck
  List of Datafiles in backup set 167586
  File LV Type Ckp SCN    Ckp Time       Name
  ---- -- ---- ---------- -------------- ----
  43   1  Incr 47681921440 20150519012700 +U01/DB01Pp/datafile/cpy_idx.346.835815097

List of Backup Sets
===================

BS Key  Size       Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
167594  105.34M    DISK        00:00:23     20150519015400
        BP Key: 167594   Status: AVAILABLE  Compressed: YES  Tag: DAYLY
        Piece Name: /u00/app/oracle/admin/DB01PP/backup/20150519_010013_arc_DB01PP_961537327_s169481_p1.bck

  List of Archived Logs in backup set 167594
  Thrd Seq     Low SCN    Low Time       Next SCN   Next Time
  ---- ------- ---------- -------------- ---------- ---------
  3    59406   47681333097 20150519010239 47682617820 20150519014652
  4    46800   47681333143 20150519010240 47682617836 20150519014652
  1    76382   47681333188 20150519010240 47682618254 20150519014655
  2    60967   47681333315 20150519010242 47682385651 20150519013711
...

Media recovery start SCN is 47681637369
Recovery must be done beyond SCN 47682382860 to clear datafile fuzziness
Finished restore at 20150501390440

You see the list of datafiles backupsets and archivelog backupsets and at the end you have information about SCN. Let me explain what are those SCNs.

Recovery SCN

Because it is online backup the datafiles are fuzzy. We need to apply redo generaed during backup.
The ‘media recovery start SCN’ is the begining of the archivelog to be applied:

SQL> select scn_to_timestamp(47681637369) from dual;

SCN_TO_TIMESTAMP(47681637369)
---------------------------------------------------------------------------
19-MAY-15 01.10.38.000000000 AM

The ‘recovery must be done beyond SCN’ is the last redo that must be applied to have datafiles consistent:

SQL> select scn_to_timestamp(47682382860) from dual;

SCN_TO_TIMESTAMP(47682382860)
---------------------------------------------------------------------------
19-MAY-15 01.35.58.000000000 AM

In my example, the backup (incremental level 1 + archivelog) started at 01:00:00 and was completed at 01:35:00

Conclusion

And I have a file with the list of backups that are needed to restore or duplicate the database at that point in time. Why do I need that when RMAN is supposed to be able to retrieve them itself? Because sometimes we backup to disk and the disk is backed up to tape without RMAN knowing it. Of course RMAN can connect directly to the tape backup software but that is not for free. Or we want to duplicate to another site where backups are not shared. We need to know which files we have to bring there. And that sometimes requires a request to another team so it’s better to have the list of all files we need.

As usual, don’t hesitate to comment if you see something to improve in my small script.

 

11 Comments

  • Gascard T says:

    Hello Franck, very interesting as usual.
    For a duplicate can I use the restore preview with the clone database ?
    Thank you

     
  • Hi Gascard,
    Thanks for your feedbacl.
    You need to be connected to the target database, so the answer is no.
    There is no ‘duplicate preview’ but I expect that a ‘restore preview’ on the target database lists all the files required for a duplicate at the same SCN.
    Regards,
    Franck.

     
  • Rishi says:

    Hi frank, when I run your script I get the following error:

    awk: cmd. line:5: (FILENAME=- FNR=369) fatal: can’t redirect to `files-31/03/2016-SCN-112305523.txt’ (No such file or directory)
    I am running in bash shell

    any help, thanks Rish

     
  • Hi Rishi,
    Yes you’re right. If your NLS_DATA_FORMAT contains ‘/’ it cannot build a file name with timestamp.
    I updated the script in the blog post to add: NLS_DATE_FORMAT=’yyyymmddhh24miss’ in front on the RMAN command. Should be better.
    Thanks a lot for your comment.
    Regards,
    Franck.

     
  • phani says:

    Hi frank,

    Nice post and very useful.

    i am beginner can you explain the script used to extract the list of the files needed .

    Regards
    phani

     
  • Idris says:

    Can you please provide the full script for below

    List all RMAN backups that are needed to recover

    echo “restore controlfile preview; restore database preview;” | NLS_DATE_FORMAT=’yyyymmddhh24miss’ rman target / | awk ‘
    /Finished restore at /{timestamp=$4}
    /Recovery must be done beyond SCN /{if ($7>scn) scn=$7 }
    /^ *(Piece )Name: / { sub(/^ *(Piece )Name: /,””) ; files[$0]=1 }
    END{ for (i in files) print i > “files-“timestamp”-SCN-“scn”.txt” }

     
  • Idris says:

    Suppose if we want to do pointing time recovery till last 2 days.
    How to use your script for getting the required backup pieces.

     
  • pianz says:

    try run the script and get these error message

    :/home/oracle:>86948:(pid 1831198): Failed to initialize ports using nsrexecd on : Service not available.

    86939:(pid 1831198): Unable to get port range from local nsrexecd: Failed to initialize ports using nsrexecd on Service not available.

    :/home/oracle:>86948:(pid 2195466): Failed to initialize ports using nsrexecd on Service not available.

    86939:(pid 2195466): Unable to get port range from local nsrexecd: Failed to initialize ports using nsrexecd on : Service not available.

    ^Xcc

     

Leave a Reply


six − 1 =

Franck Pachot
Franck Pachot

Technology Leader