One of our standby RACs had fallen behind in the log apply process:
DESTINATION THREAD# APPLIED NOTAPPLD DELTA ------------------ ---------- ---------- ---------- ---------- 1 - From Primary 1 1187 1231 44 2 1520 1572 52 ****************** ---------- ---------- ---------- Hash 2707 2803 96
MRP was waiting on the below log number
INST_ID PROCESS STATUS THREAD# SEQUENCE# BLOCK# ---------- --------- ------------ ---------- ---------- ---------- 2 MRP0 WAIT_FOR_GAP 2 1521 0
This log file was no longer available on the primary and hence I had to use an incremental copy to get the standby in synch with the primary. At this point I cancelled MRP on the standby, shut down the RAC via SRVCTL and started node 2 as a standalone in mount mode.
On the primary, I identified the first change number of the n-1th log that the standby had identified was missing. Using this change number, I created an incremental backup:
backup incremental from scn 14521210213 database format '/backup/dean/backup_%U.bak';
I also created a backup of the control file:
backup current controlfile for standby format '/backup/dean/stnd_ctl_file.ctl';
These files were transferred to the standby via FTP commands.
After the files are transferred to the standby, sign into SQLPLUS and stop MRP with
alter database recover managed standby database cancel;
Stop all nodes of the RAC with
srvctl status databs -d PRODDB srvctl stop database -d PRODDB -o immediate
Bring up the database as a single node in SQLPLUS
startup nomount; Alter database mount standby database;
Exit from SQLPLUS and go start RMAN and catalog the directory where the backups were FTPed with:
catalog start with '/backup/dean';
And then recovered the database with:
recover database noredo;
After the recover was complete, I restored the control file:
shutdown immediate; startup nomount; restore standby controlfile from '/backupb/dean/stnd_ctl_file.ctl';
Since the data file locations on the standby were different from the primary, the standby control file has to be updated. Start the database in mount mode:
alter database mount standby database;
At this point, the instance thinks the data file locations are those of the primary as the control file that was restored was from the primary. Switch the data file locations in RMAN with the below command (The catalog command should point to the data file locations on the standby.):
catalog start with '+DG_CLU_ORA1/D99Xsn'; switch database to copy;
In my experience, the “switch database to copy;” will switch the undo, sysaux, users and system data files successfully. The others (i.e. application-specific database files) tend to fail with the below error:
RMAN> switch database to copy; RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of switch to copy command at 08/27/2018 17:40:53 RMAN-06571: datafile 41 does not have recoverable copy
Check with REPORT SCHEMA to ensure that the data file exists on the standby with:
RMAN> report schema; RMAN-06139: WARNING: control file is not current for REPORT SCHEMA Report of database schema for database with db_unique_name IPTSDV List of Permanent Datafiles =========================== File Size(MB) Tablespace RB segs Datafile Name ---- -------- -------------------- ------- ------------------------ 1 0 SYSTEM *** +DG_CLU_ORA1/D99Xsn/system.259.841589723 2 0 SYSAUX *** +DG_CLU_ORA1/D99Xsn/sysaux.260.841589731 . . . 41 1024 emp_stats_DAT_001_DC *** +DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc.2405.983050989
If the file does exist, catalog it with:
RMAN> catalog datafilecopy '+DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc.2405.983050989'; cataloged datafile copy datafile copy file name=+DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc RECID=197 STAMP=985283856
The switch database to copy should work after that
RMAN> switch database to copy; datafile 1 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/system.1853.983049983" datafile 2 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/sysaux.1793.983050037" datafile 3 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/undotbs1.1805.983050783" . . . datafile 41 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc.2405.983050989"
After this, I started MRP on the last node with:
alter database recover managed standby database using current logfile disconnect from session;
The standby database was now current with the primary
DESTINATION THREAD# APPLIED NOTAPPLD DELTA ------------------ ---------- ---------- ---------- ---------- 1 - From Primary 1 1233 1234 1 2 1575 1575 0 ****************** ---------- ---------- ---------- Hash 2808 2809 1
To make sure everything was perfect, I cut a log on the primary and monitored to ensure that was applied on the standby.
Another issue I faced was that the log files were pointing to the primary’s log file location. This can be changed by running the output of the below SQL:
set line 200 select 'ALTER DATABASE RENAME FILE ' || '''' || member || '''' || ' to ' || '''' || replace(member,'primary_log_location','standby_log_location') || '''' || ';' from v$logfile where member like '%primary db name%';
Output will look like:
ALTER DATABASE RENAME FILE '+DG1/primary/ONLINELOG/group_2_a.log' to '+DG3/standby/ONLINELOG/group_2_a.log';