May 2018 – A DBA's blog

One of our standby RACs had fallen behind in the log apply process:

DESTINATION           THREAD#    APPLIED   NOTAPPLD      DELTA
------------------ ---------- ---------- ---------- ----------
1 - From Primary            1       1187       1231         44
                            2       1520       1572         52
******************            ---------- ---------- ----------
Hash                                2707       2803         96

MRP was waiting on the below log number

   INST_ID PROCESS   STATUS          THREAD#  SEQUENCE#     BLOCK#
---------- --------- ------------ ---------- ---------- ----------
         2 MRP0      WAIT_FOR_GAP          2       1521          0

This log file was no longer available on the primary and hence I had to use an incremental copy to get the standby in synch with the primary. At this point I cancelled MRP on the standby, shut down the RAC via SRVCTL and started node 2 as a standalone in mount mode.

On the primary, I identified the first change number of the n-1th log that the standby had identified was missing. Using this change number, I created an incremental backup:

backup incremental from scn 14521210213 database format '/backup/dean/backup_%U.bak';

I also created a backup of the control file:

backup current controlfile for standby format '/backup/dean/stnd_ctl_file.ctl';

These files were transferred to the standby via FTP commands.

After the files are transferred to the standby, sign into SQLPLUS and stop MRP with

alter database recover managed standby database cancel;

Stop all nodes of the RAC with

srvctl status databs -d PRODDB
srvctl stop database -d PRODDB -o immediate

Bring up the database as a single node in SQLPLUS

startup nomount;
Alter database mount standby database;

Exit from SQLPLUS and go start RMAN and catalog the directory where the backups were FTPed with:

catalog start with '/backup/dean';

And then recovered the database with:

recover database noredo;

After the recover was complete, I restored the control file:

shutdown immediate;
startup nomount;
restore standby controlfile from '/backupb/dean/stnd_ctl_file.ctl';

Since the data file locations on the standby were different from the primary, the standby control file has to be updated. Start the database in mount mode:

alter database mount standby database;

At this point, the instance thinks the data file locations are those of the primary as the control file that was restored was from the primary. Switch the data file locations in RMAN with the below command (The catalog command should point to the data file locations on the standby.):

catalog start with '+DG_CLU_ORA1/D99Xsn';
switch database to copy;

In my experience, the “switch database to copy;” will switch the undo, sysaux, users and system data files successfully. The others (i.e. application-specific database files) tend to fail with the below error:

RMAN> switch database to copy;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of switch to copy command at 08/27/2018 17:40:53
RMAN-06571: datafile 41 does not have recoverable copy

Check with REPORT SCHEMA to ensure that the data file exists on the standby with:

RMAN>  report schema;

RMAN-06139: WARNING: control file is not current for REPORT SCHEMA
Report of database schema for database with db_unique_name IPTSDV

List of Permanent Datafiles
===========================
File Size(MB) Tablespace           RB segs Datafile Name
---- -------- -------------------- ------- ------------------------
1    0        SYSTEM               ***     +DG_CLU_ORA1/D99Xsn/system.259.841589723
2    0        SYSAUX               ***     +DG_CLU_ORA1/D99Xsn/sysaux.260.841589731
.
.
.
41   1024     emp_stats_DAT_001_DC  ***     +DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc.2405.983050989

If the file does exist, catalog it with:

RMAN> catalog datafilecopy '+DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc.2405.983050989';

cataloged datafile copy
datafile copy file name=+DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc RECID=197 STAMP=985283856

The switch database to copy should work after that

RMAN> switch database to copy;

datafile 1 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/system.1853.983049983"
datafile 2 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/sysaux.1793.983050037"
datafile 3 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/undotbs1.1805.983050783"
.
.
.
datafile 41 switched to datafile copy "+DG_CLU_ORA1/D99Xsn/emp_stats_dat_001_dc.2405.983050989"

After this, I started MRP on the last node with:

alter database recover managed standby database using current logfile disconnect from session;

The standby database was now current with the primary

DESTINATION           THREAD#    APPLIED   NOTAPPLD      DELTA
------------------ ---------- ---------- ---------- ----------
1 - From Primary            1       1233       1234          1
                            2       1575       1575          0
******************            ---------- ---------- ----------
Hash                                2808       2809          1

To make sure everything was perfect, I cut a log on the primary and monitored to ensure that was applied on the standby.

Another issue I faced was that the log files were pointing to the primary’s log file location. This can be changed by running the output of the below SQL:

set line 200
select 'ALTER DATABASE RENAME FILE ' || '''' || member || '''' || ' to ' || '''' || replace(member,'primary_log_location','standby_log_location') || '''' || ';'
from v$logfile where member like '%primary db name%';

Output will look like:

ALTER DATABASE RENAME FILE '+DG1/primary/ONLINELOG/group_2_a.log' to '+DG3/standby/ONLINELOG/group_2_a.log';

Encountered a strange setup wherein a table was created as partitioned but only with a max partition resulting in basically a single partition table. Extract of the ORIGINAL DDL was as follows:

  CREATE TABLE “SCHEMA”.”TABLE_NAME” 
   (	"VERSION_NUM" VARCHAR2(5 BYTE), 
       .
       .
       .    
	"PARTITION_COLUMN" DATE, 
	.
      .
 (PARTITION "PARTITION_MAX" 
  TABLESPACE "DE_SUSP_IND_001" )  ENABLE
   ) PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 
  STORAGE(
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "TABLE_DAT_001" 
  PARTITION BY RANGE ("PARTITION_COLUMN") 
 (PARTITION "PARTITION_MAX"  VALUES LESS THAN (MAXVALUE) SEGMENT CREATION IMMEDIATE
  ENABLE ROW MOVEMENT ;

I needed to split this into two partitions such that all ros with a value less than 2017-11-12 in a separate partition. I split the partitions as shown:

ALTER TABLE “SCHEMA”.”TABLE_NAME” 
      SPLIT PARTITION "PARTITION_MAX" AT (TO_DATE('2017-11-12','YYYY-MM-DD')) 
      INTO (PARTITION "PARTITION_20171111", PARTITION "PARTITION_MAX") UPDATE INDEXES PARALLEL
;

The above resulted in two partitions,

PARTITION "PARTITION_20171111"  VALUES LESS THAN (TO_DATE(' 2017-11-12 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))

And

PARTITION "PARTITION_MAX"  VALUES LESS THAN (MAXVALUE) SEGMENT CREATION IMMEDIATE

After this was complete, I gathered statistics to check that the row count in the partitions matched the actual distribution of data.

Month: May 2018

Applying an incremental copy to make a physical standby current

Splitting a partition