DR test with Snapshot Standby

This blog post contains the steps required to convert the physical standby to an open (read/write) database and allow the application to perform a DR test without impacting the primary database and the users.

This test was conducted on a four-node primary RAC with a four-node physical standby, both running Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 – 64bit Production.

We did not make any changes to the primary. As a result, users were able to work on the primary and it was able to ship logs to that standby. However, as we reserved all the space on the standby for DB_RECOVERY_FILE_DEST_SIZE the primary was unable to ship and issued the below error messages. For next time, DO NOT reserve all the space for DB_RECOVERY_FILE_DEST_SIZE. Instead, maybe use 50% or 60% of the available space and allow the archive logs from the primary to continue to land on the standby.

These errors were reported in the alert logs of the primary database because we had reserved all the space on the standby:

Tue Jun 09 12:33:44 2020
TT00: Standby redo logfile selected for thread 4 sequence 144397 for destination LOG_ARCHIVE_DEST_2
Tue Jun 09 12:35:13 2020
TT00: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (3135)
TT00: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Tue Jun 09 12:35:13 2020
Errors in file /oracle_home/product/diag/rdbms/prprimy/PRPRIMY4/trace/PRPRIMY4_tt00_53355.trc:
ORA-03135: connection lost contact
Error 3135 for archive log file 8 to 'PRSTNBY'

All of the below commands/SQLs were run on the standby RAC.

Check the recovery file destination:

SELECT  NAME,FLOOR (space_limit/1024/1024) "Size MB",CEIL (space_used/1024/1024) "Used MB"
FROM    v$recovery_file_dest
;

NAME                              Size MB    Used MB
------------------------------ ---------- ----------
+DG_ARCHIVE                          5535        408

Confirm space in ASM:

ASMCMD [+] > lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576    208896   204136                0          204136              0             N  DG_ARCHIVE/

Modify the below parameters:

alter system set DB_RECOVERY_FILE_DEST_SIZE= '190G' scope=both sid='*';
alter system set DB_RECOVERY_FILE_DEST='+DG_ARCHIVE' scope=both sid='*';
alter system set undo_retention = 21600 scope=both sid='*';

Stop managed recovery process:

alter database recover managed standby database cancel;

Stop the database via srvctl:

srvctl stop database –d PRSTNBY –o immediate

In older versions of Oracle, we used to turn on flashback and activate the standby database. This is no longer required because the CONVERT TO SNAPSHOT STANDBY will turn on flashback. At this point the value of flashback will be:

select FLASHBACK_ON from gv$database;

FLASHBACK_ON
------------------
NO
NO
NO
NO

alter database flashback on;    <--- this was deemed not needed, so we didn't execute this
select flashback_on from gv$database;
select * from gv$tablespace where FLASHBACK_ON='NO';

On the standby, bring up one node in mount status to convert to snapshot:

startup nomount
alter database mount standby database;
select name, guarantee_flashback_database from gv$restore_point;

At this point, the above SQL will say no rows selected

Convert to snapshot:

ALTER DATABASE CONVERT TO SNAPSHOT STANDBY;

Check the flashback status:

select name, guarantee_flashback_database from gv$restore_point;

NAME                                            GUA
---------------------------------------------   ---
SNAPSHOT_STANDBY_REQUIRED_06/09/2020 12:35:05   YES


select flashback_on from gv$database;

FLASHBACK_ON
------------------
RESTORE POINT ONLY

Sanity check - Stop here !! -- verify that restore point EXISTs, then open database

Try to open the one node that we are working on to make sure that the open works successfully:

ALTER DATABASE OPEN;

select status, instance_name, database_role, open_mode from gv$database, gv$instance;
STATUS       INSTANCE_NAME    DATABASE_ROLE    OPEN_MODE
------------ ---------------- ---------------- --------------------
MOUNTED      PRSTNBY4         SNAPSHOT STANDBY MOUNTED

Shutdown immediate as we had only started one node to convert to snapshot

SHUTDOWN IMMEDIATE;

Bring up the database via SRVCTL in mount mode

srvctl start database –d PRSTNBY -o mount

Log on to each node (instance) and open that instance

ALTER DATABASE OPEN;

Inform users to point their application server to the former standby database and begin the DR Exercise.

After the applications inform us that they have completed their DR Exercise, continue with the below steps.

Stop the database via SRVCTL:

srvctl stop database –d PRSTNBY –o immediate

Confirm all nodes are down

srvctl status database –d PRSTNBY

Bring up one node to convert back to physical standby (Reduce the DB_RECOVERY_FILE_DEST_SIZE else we will face the error:

Waiting for all non-current ORLs to be archived...
Tue Jun 09 17:50:17 2020
Waiting for the ORL for thread 1 sequence 1 to be archived...
ARC3: Archiving not possible: error count exceeded

as the archive destination is being reserved for the DB_RECOVERY_FILE_DEST_SIZE):

startup mount
ALTER DATABASE CONVERT TO PHYSICAL STANDBY;
alter database mount standby database;
alter system set DB_RECOVERY_FILE_DEST_SIZE= '5535M' scope=both sid='*';
alter system set undo_retention = 14400 scope=both sid='*';
shutdown immediate;

Start the standby via SRVCTL

srvctl start database –d PRSTNBY

Confirm that MRP is running and log apply is progressing.

Author: Dean Capps

Database consultant at Amazon Web Services.

One thought on “DR test with Snapshot Standby”

Comments are closed.