Upgrade failure (11.2.0.4 to 19c) followed by ORA-29702: error occurred in Cluster Group Service operation & ORA-00704, ORA-00604 and ORA-00904

Update – April 10, 2021
We encountered a failure during an upgrade from 12.1 to 19c. The SYSTEM tablespace ran out of space, causing the upgrade to fail. When we attempted to mount the database from the 12c home, we encountered the errors

 ORA-00704: bootstrap process failure
 ORA-00604: error occurred at recursive SQL level 1
 ORA-00904: "SPARE10": invalid identifier

and

ORA-29702: error occurred in Cluster Group Service operation
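Since the root cause was SYSTEM filling up, it is worth confirming the tablespace has headroom before retrying. A minimal check (generic queries, nothing specific to our environment):

-- free space currently left in SYSTEM, in megabytes
select round(sum(bytes)/1024/1024) free_mb
from dba_free_space
where tablespace_name = 'SYSTEM';

-- whether the SYSTEM datafiles can still autoextend, and up to what size
select file_name, autoextensible, round(maxbytes/1024/1024) max_mb
from dba_data_files
where tablespace_name = 'SYSTEM';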

We mounted the database in the Oracle home and performed the following steps:

startup nomount;
alter database mount;
alter database open;
flashback database to restore point PRE_UPGRADE;
alter database open resetlogs;
shutdown immediate;
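For context, the PRE_UPGRADE restore point we flashed back to is the kind of guaranteed restore point you create immediately before starting an upgrade. The name is ours; the syntax is standard:

create restore point PRE_UPGRADE guarantee flashback database;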

After that, we continued at “After consulting Google, we found the new process is to run a manual upgrade as follows:” in the original post below. The upgrade lasted about 90 minutes and ran 107 phases. Most phases ran between 10 and 150 seconds, except for phase 53 (1,135 seconds) and phase 98 (1,388 seconds).
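If you want to see where the time went, the per-phase timings are recorded in the catctl logs in the upgrade log directory. Assuming the usual 19c format, where each phase line ends in “Time: <seconds>s” (check your own logs first), a quick scan is enough to spot the slow phases:

cd /path/to/upgrade_log_directory   # the directory reported by dbupgrade
grep -h "Time:" catupgrd*.log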

Original post continues below:
We were attempting to upgrade an Oracle database from 11.2.0.4 to 19c on an Oracle Exadata cluster in Oracle Cloud Infrastructure. Approximately 44% into the upgrade, the process failed and performed a rollback. We suspect some invalid objects caused this. However, when we attempted to start the database at 11.2.0.4 (the old Oracle home), we encountered the error message “ORA-29702: error occurred in Cluster Group Service operation”. Oracle suggested that we bounce CRS across all nodes together (i.e. non-rolling) to resolve this. However, the database still did not start and displayed the message:

[actpsrvr-ACPTDB1] srvctl start database -d ACPTDB
PRCR-1079 : Failed to start resource ora.ACPTDB.db
CRS-5017: The resource action "ora.ACPTDB.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Process ID: 251124
Session ID: 775 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/grid19/crs/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.ACPTDB.db' on 'acptsrvr' failed
CRS-2632: There are no more servers to try to place resource 'ora.ACPTDB.db' on that would satisfy its placement policy
CRS-5017: The resource action "ora.ACPTDB.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Process ID: 135914
Session ID: 775 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/grid19/crs/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.ACPTDB.db' on 'actpsrvr' failed

At this point, any attempt to start the database resulted in it having to be opened in upgrade mode. So we switched the database to the new 19c home, issued a STARTUP UPGRADE on node 01, and then attempted to run catupgrd.sql, which displayed:

SQL> @?/rdbms/admin/catupgrd.sql
DOC>######################################################################
DOC>######################################################################
DOC>                                 ERROR
DOC>
DOC>
DOC>    As of 12.2, customers must use the parallel upgrade utility, catctl.pl,
DOC>    to invoke catupgrd.sql when upgrading the database dictionary.
DOC>    Running catupgrd.sql directly from SQL*Plus is no longer supported.
DOC>
DOC>    For Example:
DOC>
DOC>
DOC>          catctl
DOC>
DOC>          or
DOC>
DOC>          cd $ORACLE_HOME/rdbms/admin
DOC>          $ORACLE_HOME/perl/bin/perl catctl.pl catupgrd.sql
DOC>
DOC>    Refer to the Oracle Database Upgrade Guide for more information.
DOC>
DOC>
DOC>######################################################################
DOC>######################################################################
DOC>#
Disconnected from Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
Version 19.8.0.0.0
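For reference, switching the session over to the 19c home before that STARTUP UPGRADE is just environment plumbing; the paths below are illustrative, not our actual ones:

export ORACLE_HOME=/u01/app/oracle/product/19.0.0.0/dbhome_1   # example path only
export ORACLE_SID=ACPTDB1
export PATH=$ORACLE_HOME/bin:$PATH
sqlplus / as sysdba
SQL> -- on RAC, cluster_database is normally set to FALSE before an upgrade
SQL> startup upgrade;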

After consulting Google, we found the new process is to run a manual upgrade as follows:

cd $ORACLE_HOME/bin
./dbupgrade

This kicks off a Perl script that performs the upgrade. The log files are in:

$ORACLE_HOME/product/19.0.0.0/dbhome_9/cfgtoollogs/ACPTDB/upgrade20210226194351

The upgrade takes about 90 minutes, and the active log file in the above directory keeps switching, so you have to keep changing which one you are tailing.
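A small convenience while watching it live is to simply re-tail whichever log was written to most recently (assuming the catupgrd*.log naming in that directory):

cd /path/to/upgrade20210226194351   # the log directory shown above
tail -f "$(ls -t catupgrd*.log | head -1)"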

After the upgrade finished, we attempted to start the database with srvctl and got the error below:

[actpsrvr-ACPTDB1] srvctl status database -d ACPTDB
PRCD-1229 : An attempt to access configuration of database ACPTDB was rejected because its version 11.2.0.4.0 differs from the program version 19.0.0.0.0. Instead run the program from /11.2.0/dbhome_2.

This is documented by Oracle in the note “After manual database upgrade, srvctl commands fail with PRCD-1027, PRCD-1229 (Doc ID 1281852.1)”:

CAUSE
Oracle Clusterware keys for the database still refer to the old ORACLE_HOME.
 
SOLUTION
1. Upgrade the Oracle Clusterware keys for the database by running the "srvctl upgrade database" command.
 
Run srvctl from the new release $ORACLE_HOME to upgrade the database keys. For example:
 
$ORACLE_HOME/bin/srvctl upgrade database -d DB_NAME -o NEW_ORACLE_DB_HOME
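In our case that meant running the 19c srvctl against our database and the new home, then confirming the registered Oracle home had actually changed (the home path below is illustrative):

$ORACLE_HOME/bin/srvctl upgrade database -d ACPTDB -o /u01/app/oracle/product/19.0.0.0/dbhome_1
srvctl config database -d ACPTDB | grep -i "oracle home"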

After we issued the above command, the database started normally with srvctl. Oddly, it cut about 100 logs in quick succession before it calmed down and settled into reporting routine errors on objects. We then recompiled all the objects and the upgrade was successful.
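The recompile itself was nothing exotic; the usual approach is the stock utlrp.sql script followed by a check that nothing was left invalid (shown here as a generic sketch, not a transcript of our session):

sqlplus / as sysdba
SQL> @?/rdbms/admin/utlrp.sql
SQL> select count(*) from dba_objects where status = 'INVALID';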