ORA-01157 while activating a physical standby database

We had previously created a 2 node RAC physical standby database. Today, we were attempting to bring it up as a primary by issuing the commands:

startup mount;
ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;

In the alert log we saw the error:

ORA-01157: cannot identify/lock data file 2 - see DBWR trace file
ORA-01110: data file 2: '+DG_OLD_SERVER/dbname/ts_name_data_file_01.dbf'
ORA-17503: ksfdopn:2 Failed to open file +DG_OLD_SERVER/dbname/ts_name_data_file_01.dbf
ORA-15001: diskgroup "+DG_OLD_SERVER" does not exist or is not mounted

The database was attempting to lock a file with the name of the old primary’s ASM file systems. I checked in v$datafile:

 
select    a.name
from      v$datafile a
order by  a.file#
;

and all the file names were prefixed by the name of the old primary’s ASM file systems. A check in ASM on the new server revealed that the files existed in ASM prefixed with the new server’s ASM file system names. I suspect that while building the standby the below command was not issued after the recover database:

switch datafile all; 

We brought up the database in mount mode and issued commands to rename the files from the old ASM file system to the new ASM file system:

ALTER DATABASE RENAME FILE '+DG_OLD_SERVER/dbname/ts_name_data_file_01.dbf'        TO '+DG_NEW_SERVER/dbname/ts_name_data_file_01.dbf';

After all the files were renamed, I checked in v$datafile to confirm the new names. After this we were able to activate the physical standby database as a primary.

Update – February 05, 2018
I built another standby and checked the names of the data files after the recover database and the file names were correctly changed as per the DB_FILE_NAME_CONVERT parameter and hence I cannot explain what went wrong in the original build. However, there is no need to issue the “switch datafile all” command.

Tracing TNSPING to identify where the failure is occurring (TNS-12535)

If you need to trace the TNSPING command, add the 2 lines below to the sqlnet.ora file:

TNSPING.TRACE_LEVEL=ADMIN
TNSPING.TRACE_DIRECTORY=/apps/opt/oracle/product/12.1.0.2/db_1/network/admin

Then run the tnsping command:

[oracle@server]$ tnsping DEANDR
 
TNS Ping Utility for Linux: Version 12.1.0.2.0 - Production on 30-JAN-2018 19:57:30
 
Copyright (c) 1997, 2014, Oracle.  All rights reserved.
 
Used parameter files:
oracle_home/network/admin/sqlnet.ora
 
 
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = server.domain.com)(PORT = 2340)) (CONNECT_DATA = (SERVER = DEDICATED) (SID = DEANDR)))
TNS-12535: TNS:operation timed out

In this case I had set the trace files to be written to the ORACLE_HOME/network/admin directory. A file named tnsping.trc will be created and will contain information as to the details of the trace command.

CRS failure – failure occurred at: sskgpcreates (ORA-27302, ORA-27301, ORA-27300, ORA-27154, CRS-5017)

I was trying to bring up a new RAC physical standby node and encountered the below error:

[racnode2-DEVDBTPN2] srvctl start database -d DEVDBTPN 
PRCR-1079 : Failed to start resource ora.devdbtpn.db
CRS-5017: The resource action "ora.devdbtpn.db start" encountered the following error: 
ORA-27154: post/wait create failed
ORA-27300: OS system dependent operation:semget failed with status: 28
ORA-27301: OS failure message: No space left on device
ORA-27302: failure occurred at: sskgpcreates
. For details refer to "(:CLSN00107:)" in "/oracle/base/diag/crs/racnode1/crs/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.devdbtpn.db' on 'racnode1' failed
CRS-2632: There are no more servers to try to place resource 'ora.devdbtpn.db' on that would satisfy its placement policy

The OS information was

Linux racnode1 2.6.32-696.13.2.el6.x86_64 #1 SMP Fri Sep 22 12:32:14 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

There were many other instances running on this RAC node. The issue turned out to be a system parameter setting for the semaphores.

Edit the sysctl.conf file as follows:

vi /etc/sysctl.conf

and change this line

kernel.sem=250 32000 100 128

to this

kernel.sem=250 64000 100 256

Basically double the 2nd and 4th parameter. Then to make the changes go into effect issue the command

sysctl –p

Stop and start the RAC DB with srvctl commands.

Instructions to run the SQL Tuning Advisor

A brief write up on running the SQL Tuning Advisor. It seems that every time, I run an ADDM report I see output that includes “Run SQL Tuning Advisor on the SELECT statement with SQL_ID”

          Findings and Recommendations
          ----------------------------
 
Finding 1: Top SQL Statements
Impact is .22 active sessions, 95% of total activity.
-----------------------------------------------------
SQL statements consuming significant database time were found. These
statements offer a good opportunity for performance improvement.
 
   Recommendation 1: SQL Tuning
   Estimated benefit is .22 active sessions, 95% of total activity.
   ----------------------------------------------------------------
   Action
      Run SQL Tuning Advisor on the SELECT statement with SQL_ID
      "gbwan1836hj3c".
      Related Object
         SQL statement with SQL_ID gbwan1836hj3c.
         SELECT time_key + (:"SYS_B_00" / :"SYS_B_01") period,
         in_value,
         out_value
         FROM  (SELECT time_key,

Create an SQL Tuning task as follows:

DECLARE
  l_sql_tune_task_id  VARCHAR2(100);
BEGIN
  l_sql_tune_task_id := DBMS_SQLTUNE.create_tuning_task (
                          sql_id      => 'gbwan1836hj3c',
                          scope       => DBMS_SQLTUNE.scope_comprehensive,
                          time_limit  => 500,
                          task_name   => 'gbwan1836hj3c_tuning_task',
                          description => 'Tuning task1 for statement gbwan1836hj3c');
  DBMS_OUTPUT.put_line('l_sql_tune_task_id: ' || l_sql_tune_task_id);
END;
/

Run the tuning task

EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => 'gbwan1836hj3c_tuning_task');

The above will take a couple of minutes. Depending on the complexity of the query, the time_limit may need to be increased.

After the task is complete, query the output:

set long 65536
set longchunksize 65536
set linesize 100
select dbms_sqltune.report_tuning_task('gbwan1836hj3c_tuning_task11') from dual;

I was using SQL Developer and the output will look a bit strange. Copy and paste it into Notepad to see it properly.

RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process

I was in the process of building a physical standby that had to be put on hold for a few days while other more pressing priorities were handled. I thought it would be sufficient to change the below parameter:

NAME                                 TYPE        VALUE
log_archive_dest_state_2             string      DEFER

However, I then noticed that the time taken by the archived log backup script had increased from a couple of minutes to over three hours. This was because the archived logs were no longer being deleted as RMAN was displaying the below message:

channel disk2: backup set complete, elapsed time: 00:01:35
channel disk2: deleting archived log(s)
RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process
archived log file name=/archlogs/1_3426_959714474.arc thread=1 sequence=3426
RMAN-08137: WARNING: archived log not deleted, needed for standby or upstream capture process
archived log file name=/archlogs/1_3427_959714474.arc thread=1 sequence=3427

In order to resolve this issue I had to remove the entry for the physical standby in the below parameter:

NAME                                 TYPE        VALUE
log_archive_dest_2                   string

After this was done, the next run of the archived log backup process completed in a few minutes and subsequent runs returned to the expected couple of minutes execution time.