Infrastructure at your Service

Lazhar Felahi

Near Zero Downtime Migration and failback with GoldenGate

Oracle GoldenGate allows to migrate Oracle database with Near Zero Downtime and with failback capability.

Near Zero Downtime migration means very minimal application switchover downtime.

The failback consists to rollback the migration from 19c to 12c.

The goal of this blog is to describe how to migrate an oracle database from 12c to 19c via Oracle GoldenGate with Near Zero Downtime and with failback capability.

 

STEP 1 – CHECK GOLDENGATE SYNCHRONIZATION

The Oracle GoldenGate processes must be up and running (EXTRACT and PUMP on source, REPLICAT on target) and the source and target databases must be synchronized (same data between source and target).

Since GoldenGate is a logical replication, comparison of data must be done regularly via Oracle Veridata (under license) or manually (my preferred method):

  • Compare the number of rows between Source and Target via the sql function COUNT(*) –> must return the same number of rows
  • Compare the data between Source and Target via the sql function MINUS –> must return “No rows selected”
    • MINUS does not support CLOB/BLOB columns
  • Compare the data for CLOB/BLOB between Source and Target via dbms_lob.compare –> must return “0”
--COMPARE NB OF ROWS
SQL>
select count(*) from [email protected]_source
union all
select count(*) from TABLE_A;

COUNT(*)
----------
3754
3754

--COMPARE DATA BETWEEN SOURCE AND TARGET
SQL>
select * from [email protected]_source
minus
select * from TABLE_A;

no rows selected

--COMPARE DATA BETWEEN TARGET AND SOURCE
SQL>
select * TABLE_A
minus
select * from [email protected]_source;

no rows selected

--COMPARE DATA FOR CLOB/BLOB COLUMNS
CREATE OR REPLACE FUNCTION compare_clob(clob_src IN clob, clob_trg IN clob)
RETURN NUMBER
IS v_result number;
clob_src1 long;
clob_trg1 long;
BEGIN
clob_src1 := dbms_lob.substr( clob_src, 32000, 1 );
clob_trg1 := dbms_lob.substr( clob_trg, 32000, 1 );

select dbms_lob.compare(clob_src1,clob_trg1)
into v_result
from dual;

RETURN(v_result);
END;
/

STEP 2 – STOP THE APPLICATION ON SOURCE

At this step, the downtime starts…

  • Stop the Application on the source database (12c)
  • If the application cannot be stopped :
    • Lock the applicatives schemas
    • Check there is no more transaction int v$transaction/gv$transaction
    • Kill session
  • Stop the listener
  • Save the job_queue_processes value (needed later when we will start the application on 19c)
  • Disable the job_queue_processes (“alter system set job_queue_processes = 0 scope = both)

To be sure there is no transaction running, create a dummy table on 12c database and insert 1 row, later we will check into the GoldenGate trail files that this transaction is the last one, that will ensure us that no “real” transaction are lost from Source database before the switchover.

SQL> create table source_schema.zz_dummy2 (SCN varchar2(100), d date, comment_txt varchar2(50));

Table created.

SQL>

--on target
SQL> desc source_schema.zz_dummy2
Name Null? Type
----------------------------------------- -------- ----------------------------
SCN VARCHAR2(100)
D DATE
COMMENT_TXT VARCHAR2(50)

SQL>
alter session set nls_date_format='dd.mm.yyyy hh24:mi:ss';
insert into source_schema.zz_dummy2 select (select to_char(current_scn) from v$database),sysdate, 'RECORD TEST' from dual;

1 row created.

SQL> commit;

Commit complete.

SQL> select * from source_schema.zz_dummy2;

SCN
--------------------------------------------------------------------------------
D
-------------------
COMMENT_TXT
--------------------------------------------------------------------------------
3386436290352
13.04.2022 14:36:41
RECORD TEST

SQL>

Check the table is synchronized by GoldenGate on the target database (19c).

STEP 3 – CREATE RESTORE POINT

On the source database, create a restore point :

SQL> CREATE RESTORE POINT rp_before_switch GUARANTEE FLASHBACK DATABASE;

Restore point created.

If we want to rollback the migration, we will replicate the 12c database from the 19c database via GoldenGate (with new transaction existing since the start of the Application on 19c). If the rollback with GoldenGate fails (for any reason), this restore point will allow to restore the database in point in time.

STEP 4 – CHECK EXTRACT HAS NO MORE TRANSACTION TO PROCESS

Check the EXTRACT process has no record to process.

GGSCI (source_server as [email protected]_source) 42> send extract e1 logend

Sending LOGEND request to EXTRACT E1 ...
YES

GGSCI (source_server) 8> send extract e1 showtrans

Sending SHOWTRANS request to EXTRACT E1...
No transactions found.

GGSCI (source_server) 16> send extract e1 logend
    Sending LOGEND request to EXTRACT E1 ...
    YES

The command “send extract e1 logend” goes to the end of the current trailfile and check the last transaction.

If the command “send extract e1 logend”  send YES, that means the last record has been handled.

If the command “send extract e1 logend” send NO, that means there is always record to handle by the EXTRACT process.

If there is no more transaction to manage by GoldenGate, we are ready to stop the EXTRACT process:

GGSCI (source_server) 9> stop extract e1
Sending STOP request to EXTRACT E1...
Request processed.

GGSCI (source_server) 9> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT STOPPED E1 00:00:02 00:00:02

STEP 5 – CHECK REPLICAT HAS NO MORE TRANSACTION TO PROCESS

Check the REPLICAT process has no record to process:

GGSCI (target_server) 32> send replicat R1 logend

Sending LOGEND request to REPLICAT R1...
YES

The command “send replicat R1 logend” must return “YES”

STEP 6 – CHECK THE LAST TRANSACTION INTO TRAIL FILE

From logdump utility, check into the current trailfile (last trail file used by the EXTRACT or the REPLICAT) if the last transaction concerns the Insert into the dummy table (remember the step2) :

--got the trail file name with the command “view report E1”
Logdump 25 >ghdr on
detail data
ggstoken detailLogdump 26 >Logdump 27 >
Logdump 28 >
Logdump 28 >
Logdump 28 >open /u01/xs/ogg/xs000000000
Current LogTrail is /u01/xs/ogg/xs000000000
Logdump 29 >pos last
Reading forward from RBA 348547
Logdump 30 >pos rev
Reading in reverse from RBA 348547
Logdump 31 >n
___________________________________________________________________
Hdr-Ind    :     E  (x45)     Partition  :     .  (x0c)
UndoFlag   :     .  (x00)     BeforeAfter:     A  (x41)
RecLength  :    65  (x0041)   IO Time    : 2022/04/13 14:36:44.000.000
IOType     :     5  (x05)     OrigNode   :   255  (xff)
TransInd   :     .  (x03)     FormatType :     R  (x52)
SyskeyLen  :     0  (x00)     Incomplete :     .  (x00)
AuditRBA   :      61367       AuditPos   : 522820
Continued  :     N  (x00)     RecCount   :     1  (x01)

2022/04/13 14:36:44.000.000 Insert Len 65 RBA 348360
Name: SOURCE_SCHEMA.ZZ_DUMMY2 (TDR Index: 3)
After Image: Partition x0c G s
0000 1100 0000 0d00 3333 3836 3433 3632 3930 3335 | ........338643629035
3201 0015 0000 0032 3032 322d 3034 2d31 333a 3134 | 2......2022-04-13:14
3a33 363a 3431 0200 0f00 0000 0b00 5245 434f 5244 | :36:41........RECORD
2054 4553 54 | TEST
Column 0 (x0000), Len 17 (x0011)
0000 0d00 3333 3836 3433 3632 3930 3335 32 | ....3386436290352
Column 1 (x0001), Len 21 (x0015)
0000 3230 3232 2d30 342d 3133 3a31 343a 3336 3a34 | ..2022-04-13:14:36:4
31 | 1
Column 2 (x0002), Len 15 (x000f)
0000 0b00 5245 434f 5244 2054 4553 54 | ....RECORD TEST

GGS tokens:
TokenID x52 'R' ORAROWID Info x00 Length 20
4141 4347 5159 4141 4541 4141 4154 4d41 4141 0001 | AACGQYAAEAAAATMAAA..
TokenID x4c 'L' LOGCSN Info x00 Length 13
3333 3836 3433 3632 3930 3336 36 | 3386436290366
TokenID x36 '6' TRANID Info x00 Length 15
302e 3130 2e31 392e 3437 3631 3931 39 | 0.10.19.4761919
TokenID x69 'i' ORATHREADID Info x01 Length 2
0001 | ..

Logdump 32 >

The last transaction must be the INSERT INTO SOURCE_SCHEMA.ZZ_DUMMY2. If this is the case, stop the REPLICAT PROCESS:

GGSCI (target_server) 2> stop replicat R1

Sending STOP request to REPLICAT R1...
Request processed.

GGSCI (target_server) 3> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
REPLICAT STOPPED R1 00:00:00 00:00:11

STEP 7 – DISABLE AUTOSTART/AUTORESTART INTO MANAGER PARAMETER FILE

Remove the AUTOSTART/AUTORESTART parameters into the MANAGER parameter file on source and target:

--FOR EXTRACT
AUTORESTART EXTRACT E1, RETRIES 3, WAITMINUTES 1, RESETMINUTES 60
AUTOSTART EXTRACT E1

--FOR REPLICAT
AUTORESTART EXTRACT R1, RETRIES 3, WAITMINUTES 1, RESETMINUTES 60
AUTOSTART EXTRACT R1

STEP 8 – CREATE THE REVERSE EXTRACT (NEEDED IN CASE OF FAILBACK)

This step is needed in case we want to rollback the migration after the switchover. Rollback means go-back to 12c database without lose data.

Because new transactions has created new rows into the target database after the switchover, we must re-synchronize the 12c database from the first transaction on 19c database. If we rollback the migration and if the application go back to 12c, no transaction will be lost.

Configure the 19c database to be ready to capture data via the REVERSE EXTRACT:

  • Add schematrandata on the schema to be replicated

Register the REVERSE EXTRACT:

GGSCI (target_server) 10> alter credentialstore add user [email protected]_target alias target
Password:

Credential store altered.

GGSCI (target_server) 11> dblogin useridalias target
Successfully logged into database db_target.

GGSCI (target_server as [email protected]_target) 2>

GGSCI (target_server as [email protected]_target) 2>
register extract E2 database

2022-04-13 18:31:19 INFO OGG-02003 Extract E2 successfully registered with database at SCN 3386436450080

 

Save the SCN related the “REGISTER EXTRACT” command and create the REGISTER EXTRACT :

[[email protected]_server:/u01/app/oracle/product/19.1.0.0.4/gg_1]$ mkdir -p /u01/gs_x/ogg/

GGSCI (target_server) 7> add extract E2, integrated tranlog, begin now
EXTRACT (Integrated) added.

GGSCI (target_server) 8> INFO ALL

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT STOPPED E2 00:00:02 00:00:08

The extract E2 captures data from 19c database and the parameter file must be configured to extract data from the schema on 19c database.

GGSCI (target_server as [email protected]_target) 13> edit param e2 
Extract E2 
useridalias target 
Exttrail /u01/gs_x/ogg/gz 
LOGALLSUPCOLS 
UPDATERECORDFORMAT COMPACT 
DDL & INCLUDE MAPPED OBJNAME target_schema.* 
Sequence target_schema.*; 
Table target_schema.* ;

Start the EXTRACT with the SCN got after the REGISTER EXTRACT step:

GGSCI (target_server as [email protected]_target) 12> 
START EXTRACT E2 atcsn 3386436450080

Sending START request to MANAGER ...
EXTRACT E2 starting

GGSCI (TARGET_SERVER as [email protected]_db) 15> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING E2 00:00:05 00:00:03

 

STEP 9 – START THE APPLICATION ON 19C DATABASE

Change the parameter job_queue_processes to the value we had in 12c database (step 2).

Start the application on 19c database by changing the DNS alias.

At this step, the downtime ends…

STEP 10 – GOT THE FIRST TRANSACTION ON 19C DATABASE

View the report file related to the REVERSE EXTRACT:

GGSCI (target_server as [email protected]_target) 71> 
view report e2

. . .
2022-04-13 18:38:40 INFO OGG-01872 Transaction delivery commencing with Transaction ID 482517513.14.30.1073,
CSN 3386436508051, 0 transaction(s) skipped.

. .

SCN --> CSN 3386436508051

The SCN related to the first transaction after the switchover is 3386436508051

STEP 11 – CREATE THE REVERSE REPLICAT – (NEEDED IN CASE OF FAILBACK)

The REVERSE REPLICAT replicates the data from 19c database to 12c database from the first transaction occuring on 19c database.

The SCN value is very important because :

  • Before the SCN 3386436508051, we will have duplicate data on 12c database
  • After the SCN 3386436508051, we risk to have missing data on 12c database
GGSCI (source_server as [email protected]_db) 7> 
edit param r2

Replicat R2
DBOPTIONS INTEGRATEDPARAMS ( parallelism 6 )
DISCARDFILE /u01/app/oracle/product/19.1.0.0.4/gg_1/dirrpt/R2_discard.txt, append, megabytes 10
USERIDALIAS ggsource
MAP target_schema.*, TARGET source_schema.*;

add replicat R2 integrated exttrail /u01/gs_x/ogg/gz

GGSCI (source_server as [email protected]_db) 6>
add replicat R2 integrated exttrail /u01/gs_x/ogg/gz
REPLICAT (Integrated) added.

GGSCI (source_server as [email protected]_db) 7> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT STOPPED E1 00:00:02 00:00:02
EXTRACT STOPPED P1 00:00:02 00:00:02
EXTRACT STOPPED R2 00:00:02 00:00:02

Start the REVERSE REPLICAT:

GGSCI (source_server as [email protected]_source) 52> 
start replicat r2 atcsn 3386436508051

GGSCI (source_server as [email protected]_source) 57> info all

Program Status Group Lag at Chkpt Time Since Chkpt

MANAGER RUNNING
EXTRACT STOPPED E1 00:00:02 00:00:00
EXTRACT STOPPED P1 00:00:02 00:00:00
REPLICAT RUNNING R2 00:00:02 00:00:00

From now the 12c database is syncrhonized from 19c database

STEP 12 – CHECK THE SYNCHRONIZATION – 19C –> 12C

Compare the rows like we do in the step 1.

STEP 13 – REMOVE OR NOT ALL GOLDENGATE PROCESS

If the migration has been validated (application is working as expected on 19c), remove all GoldenGate process on source and target server.

If the migration has not been validated (application is not working as expected), and you want to rollback the migration, replay all the step from step1 but in the direction 19c –> 12c, and start the application on 12c.

 

CONCLUSION

Near Zero Downtime migration means very minimal application switchover downtime, so the step from 1 to 9 must be done as faster as possible in order to minimize the application downtime.

It’s difficult to give a downtime estimation because it depends of the complexity of the application and there are a lot of components outside the database and outside GoldenGate which can increase the downtime (jdbc/odbc driver compatibility, dependent application, LDAP connection configuration,…)

Migrate an oracle database in near zero downtime with GoldenGate implies several steps where some are very important :

  • Check no more transaction is running before to switchover, otherwise we will lose data in 19c databse
  • In case of rollback, choose the correct SCN when we create the REVERSE EXTRACT (SCN got after the REGISTER REVERSE EXTRACT on 19c) and the REVERSE REPLICAT (SCN of the 1st transaction on 19c)
  • If we want a minimum of downtime, of course this procedure must be tested on a non production environment as close as possible of PROD in term of activity and architecture to be ready when you will execute it on PROD.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Lazhar Felahi
Lazhar Felahi

Consultant