ORA-10458, ORA-01196, ORA-01110 errors on a physical standby

This morning we had an issue with a physical standby database. The first notification was that log apply was falling behind. It turned out that the standby server had been rebooted and neither the listener nor the database had been restarted afterwards. When we started the database, we saw the following errors in the alert log:

Standby crash recovery aborted due to error 16016.
Errors in file /opt/app/oracle/diag/rdbms/dr/PRODDR/trace/PRODDR_ora_21598.trc:
ORA-16016: archived log for thread 1 sequence# 16661 unavailable
Recovery interrupted!
Some recovered datafiles maybe left media fuzzy
Media recovery may continue but open resetlogs may fail
Completed standby crash recovery.
Errors in file /opt/app/oracle/diag/rdbms/dr/PRODDR/trace/PRODDR_ora_21598.trc:
ORA-10458: standby database requires recovery
ORA-01196: file 1 is inconsistent due to a failed media recovery session
ORA-01110: data file 1: '/appl_ware/oradata/data/PRODDR/system01.dbf'
ORA-10458 signalled during: ALTER DATABASE OPEN...
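
The ORA-16016 during crash recovery suggests the standby simply could not find an archived log it needed, rather than anything being corrupt. With the database mounted, a couple of generic checks along these lines (not queries I captured at the time) show what the standby thinks it is missing and what the recovery-related processes are doing:

-- Any gap in the archived logs received by the standby?
SELECT thread#, low_sequence#, high_sequence#
FROM   v$archive_gap;

-- What the standby recovery-related processes are doing
SELECT process, status, thread#, sequence#
FROM   v$managed_standby
ORDER  BY process;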

Many of the posts we found on Google indicated that a recovery was needed. Before going down that route, we decided to try starting the managed recovery process (MRP) to see if that alone would resolve the issue. We issued

ALTER DATABASE MOUNT STANDBY DATABASE;

followed by the command to start MRP

alter database recover managed standby database using current logfile disconnect from session;

and the problem was resolved.
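
To confirm that managed recovery really was catching up rather than silently stuck, a quick sanity check (again, generic queries rather than ones from my notes) is to watch MRP and the reported lag:

-- MRP0 should cycle through WAIT_FOR_LOG / APPLYING_LOG once it is healthy
SELECT process, status, thread#, sequence#
FROM   v$managed_standby
WHERE  process LIKE 'MRP%';

-- Apply and transport lag as seen by the standby
SELECT name, value, time_computed
FROM   v$dataguard_stats
WHERE  name IN ('apply lag', 'transport lag');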

ORA-15032, ORA-15017 and ORA-15040 issues with an ASM disk group

While creating a new Oracle database with ASM on an AWS EC2 instance, I encountered the errors below while attempting to mount one of the ASM disk groups:

ALTER DISKGROUP VOTE_DISK mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "VOTE_DISK" cannot be mounted
ORA-15040: diskgroup is incomplete

When I checked v$asm_diskgroup, I saw:

SQL> select NAME,TOTAL_MB,FREE_MB from v$asm_diskgroup;  

NAME                             TOTAL_MB    FREE_MB
------------------------------ ---------- ----------
ASMDATA1                            716796     716726
DBVOT                                    0          0
ASMDATA2                            716796     716726
ASMARCH                             716796     716726
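
The DBVOT row showing 0 MB simply reflects that the group is not mounted; the ORA-15040 means ASM cannot find all of the group's member disks. A look at v$asm_disk (a generic check, not output I kept from the incident) would normally show whether the underlying disk is visible to the ASM instance at all:

-- Disks visible to ASM; a disk that carries an ASM header but is not
-- part of a mounted group shows up with GROUP_NUMBER 0
SELECT group_number, name, path, header_status, mount_status, state
FROM   v$asm_disk
ORDER  BY group_number, name;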

The goal was to create the VOTE_DISK disk group on the DBVOT ASM disk, which in turn was mapped to the /dev/bqdx device. The AWS console indicated that /dev/bqdx was attached to the EC2 instance, but for some reason it was not visible in the OS. I issued the command:

/usr/bin/echo -e "o\nn\np\n1\n\n\nt\n8e\nw" | /usr/sbin/fdisk /dev/bqdx

which resulted in:

Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): Building a new DOS disklabel with disk identifier 0x29ba6b8d.

Command (m for help): Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): Partition number (1-4, default 1): First sector (2048-104857599, default 2048): Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-104857599, default 104857599): Using default value 104857599
Partition 1 of type Linux and of size 50 GiB is set

Command (m for help): Selected partition 1
Hex code (type L to list all codes): Changed type of partition 'Linux' to 'Linux LVM'

Command (m for help): The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
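
Before handing the new partition to ASMLib it is worth confirming that the kernel actually sees it. A quick check along these lines (not part of the original session) avoids chasing the wrong device:

# Does the kernel see the new partition?
lsblk /dev/bqdx

# Ask ASMLib to rescan for marked disks
/usr/sbin/oracleasm scandisks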

However, when I attempted to create the ASM disk, I got the following error:

root@10-118-134-71:/root # /usr/sbin/oracleasm createdisk DBVOT /dev/bqdx1
Device "/dev/bqdx1" is already labeled for ASM disk "DBVOT"

I then attempted to drop the disk group

SQL> drop diskgroup DBVOT;
drop diskgroup DBVOT
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15001: diskgroup "DBVOT" does not exist or is not mounted

I then tried with the FORCE option and succeeded

SQL> drop diskgroup DBVOT force including contents;

Diskgroup dropped.

I then deleted the disk

root@10-118-134-71:/root # /usr/sbin/oracleasm deletedisk DBVOT 
Clearing disk header: done
Dropping disk: done

and checked to make sure it was no longer visible

[oracle@10-118-134-71 ~]$ /usr/sbin/oracleasm listdisks
DB1
DG10
DG11
DG12
DG2
DG3
DG4
DG5
DG6
DG7
DG8
DG9

Then I recreated the disk

root@10-118-134-71:/root # /usr/sbin/oracleasm createdisk DBVOT /dev/bqdx1
Writing disk header: done
Instantiating disk: done

and listed the disks to confirm

root@10-118-134-71:/root # /usr/sbin/oracleasm listdisks
DG1
DG10
DG11
DG12
DG2
DG3
DG4
DG5
DG6
DG7
DG8
DG9
DBVOT

Then I created the disk group in ASM

SQL> CREATE DISKGROUP DBVOT EXTERNAL REDUNDANCY DISK '/dev/oracleasm/disks/DBVOT';

Diskgroup created.
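
As a final verification (the same view as earlier), the new group should now show as mounted with its full size:

SELECT name, state, total_mb, free_mb
FROM   v$asm_diskgroup
WHERE  name = 'DBVOT';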

TNS-12535, ns secondary err code: 12560 and users unable to log on

This afternoon, users of one of my applications reported that they were unable to sign on using a particular database account. The account in question was open and not expired, and the same users were able to sign on with other accounts without any issues. Meanwhile, the alert logs on all nodes of the RAC were being flooded with messages such as:

Fatal NI connect error 12170.

  VERSION INFORMATION:
        TNS for Linux: Version 11.2.0.4.0 - Production
        Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.4.0 - Production
  Time: 15-JUN-2018 19:17:55
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12535
    
TNS-12535: TNS:operation timed out
    ns secondary err code: 12560
    nt main err code: 505
    
TNS-00505: Operation timed out
    nt secondary err code: 110
    nt OS err code: 0
  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=xxx.xxx.xxx.xx)(PORT=nnn))
Fri Jun 15 19:18:37 2018
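
The Fatal NI messages only identify the client end of a timed-out connection, so the first thing to rule out was the account itself. Confirming it was open and unexpired looked something like this (APP_USER is a placeholder; the real account name is omitted):

-- Confirm the account is open and not expired
SELECT username, account_status, lock_date, expiry_date
FROM   dba_users
WHERE  username = 'APP_USER';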

Despite bouncing all nodes of the RAC, flushing the shared pool, and so on, users were still unable to connect with this particular account. Curiously, existing connections under the same account continued to work.
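
A quick count of existing sessions for the account, per instance, is one way to see that split between working connections and failing logons (again, APP_USER is just a placeholder):

-- Existing sessions for the affected account on each RAC instance
SELECT inst_id, status, COUNT(*) AS sessions
FROM   gv$session
WHERE  username = 'APP_USER'
GROUP  BY inst_id, status
ORDER  BY inst_id, status;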

After some false leads, a colleague noticed that there were a number of library cache locks with the user account showing as “null”. These were being generated by a set of new servers that the application was attempting to configure using the very account that was experiencing the issue. After these new servers were shut down and the database bounced, normal functionality was restored.
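
For reference, the waiters my colleague spotted can be found with a query along these lines (a generic library cache lock check; the exact query used that day was not kept):

-- Sessions waiting on library cache locks across the RAC,
-- together with whatever is blocking them
SELECT inst_id, sid, serial#, username, event,
       blocking_instance, blocking_session, machine, program
FROM   gv$session
WHERE  event = 'library cache lock'
ORDER  BY inst_id, sid;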