Oracle Active Data Guard on AWS RDS

On Aug 23, 2022, AWS announced support for managed Oracle Data Guard Switchover and Automated Backups for replicas. See the announcement at:

AWS RDS for Oracle now supports managed Oracle Data Guard

Oracle DBAs are familiar with creating a physical standby in an on-premises environment and opening it in read-only (i.e. Active Data Guard) mode. AWS has now automated both the creation of the standby and the switchover process in the context of RDS (i.e. a managed database service). All the manual tasks of taking an RMAN backup, transferring it to the standby host, restoring and recovering the database, setting up Data Guard Broker, etc. can be accomplished with a few clicks on the AWS console. This blog describes the steps required to create and test this scenario.

Prerequisites
1. The database instance size must be db.t3.large at the minimum.
2. The primary database should be using an option group that is exclusive to this database.
3. Ensure you have the required KMS key in both regions.
4. Create a parameter group in the target region that exactly matches the parameter group in the source region.

Creating a replica
After you have created a primary database in any region:
1. Select the primary database and, under the “Actions” menu, select “Create replica”.
2. Choose Multi-AZ on the target as well.
3. Click on the “Create replica” button. Replica creation takes approximately 1.5 hours; this may vary based on the size of the database.
4. Change the backup option on the replica. The replica is created with the backup retention period set to “0”, in line with the standard practice of not backing up a Data Guard standby. However, for the replica to be able to switch over to the primary role at some point in the future, the retention period must be changed to a value greater than “0”.
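Once the replica is available, its Data Guard role and open mode can be verified from SQL*Plus. A minimal check, run on the replica:

-- Run on the replica; expect PHYSICAL STANDBY.
-- The open mode depends on whether the replica is mounted or opened
-- read-only (Active Data Guard).
select database_role, open_mode from v$database;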

Performing a switchover across regions
A switchover is a role reversal between the primary database and one of its standby databases. A switchover guarantees no data loss. This is typically done for planned maintenance of the primary system. During a switchover, the primary database transitions to a standby role, and the standby database transitions to the primary role.

1. Confirm there is no lag. This can be done via the console (the “Lag” column under the “Replication” section) or via SQL:

SELECT ARCH.thread#                        "Thread",
       ARCH.sequence#                      "Last Sequence Received",
       APPL.sequence#                      "Last Sequence Applied",
       ( ARCH.sequence# - APPL.sequence# ) "Difference"
FROM   (SELECT thread#,
               sequence#
        FROM   v$archived_log
        WHERE  ( thread#, first_time ) IN (SELECT thread#,
                                                  Max(first_time)
                                           FROM   v$archived_log
                                           GROUP  BY thread#)) ARCH,
       (SELECT thread#,
               sequence#
        FROM   v$log_history
        WHERE  ( thread#, first_time ) IN (SELECT thread#,
                                                  Max(first_time)
                                           FROM   v$log_history
                                           GROUP  BY thread#)) APPL
WHERE  ARCH.thread# = APPL.thread#
ORDER  BY 1;
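If the standby is open read-only (Active Data Guard), roughly the same information is available more directly from v$dataguard_stats; a simpler alternative, run on the replica:

-- Lag values are reported as interval strings, e.g. +00 00:00:00
select name, value
from   v$dataguard_stats
where  name in ('transport lag', 'apply lag');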

2. Confirm that there is no pending maintenance on either the primary or the replica.

3. Initiate the switchover from the replica (not the primary!).

4. Select the RDS instance, click on “Switch over replica”, and agree to the warning panel that is displayed.

5. The replica will display the message “The Switchover to the read replica started” under the “Logs & events”, “Recent events” section. (Remember to sort the messages so that the latest ones are displayed first.)

6. After the switchover is complete, both databases will display “The Switchover to the read replica finished successfully”

7. Applications will have to reconnect to the databases

8. You can confirm that the data is replicating from the new primary via the console or SQL; a sketch follows below.
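A quick SQL sanity check is to create a throwaway row on the new primary and confirm it appears on the new replica. The table below is purely illustrative:

-- On the new primary
create table repl_test (id number, created date);
insert into repl_test values (1, sysdate);
commit;

-- On the new replica, after allowing time for transport and apply
select * from repl_test;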

Test reboot with multi-AZ failover – Primary
Initiate a reboot with failover on the current primary.

1. Select the RDS instance and, under “Actions”, click on “Reboot”. Check the option “Reboot With Failover?”

2. After about 8 to 10 minutes, the database will complete the multi-AZ failover. Under “Logs & events”, “Recent events” section, look for messages “Multi-AZ instance failover started” and “Multi-AZ instance failover completed”.

A similar test can be done on the Replica.

Observations
1. After a switchover, both databases may show as read-only in the console for a while. This can be ignored.

2. You cannot temporarily stop the database for the customary maximum of 7 days, as an instance with a read replica cannot be stopped.

3. Ensure that there is no pending maintenance before issuing a switchover.

4. If you are using a modified character set on your primary, the console may show the replica’s character set as “AL32UTF8”. However, the character set inside the database matches the primary. This can be confirmed with:

select property_value
from   database_properties
where  property_name = 'NLS_CHARACTERSET';

5. The console will display a message that the switchover is complete. However, this is displayed prematurely; the actual switchover takes additional time. Monitor the status of the two database instances under the “Status” column.

Trigger with a subprogram in PL/SQL

Last week I had a requirement to create a trigger on a table that had to perform some logic and then run additional updates based on the outcome. The logic ran into multiple IF statements, resulting in long, messy code. Additionally, there was a second requirement to execute this chunk of code from two places within the trigger. I moved the code to a subprogram within the trigger’s PL/SQL block, resulting in a modular structure. This is a simplified example of the trigger code to demonstrate the technique.

The below lines declare a subprogram with some logic that creates an update statement. The EXECUTE IMMEDIATE is commented out for testing and demonstration:

  PROCEDURE perform_calculations(cust_num NUMBER) IS
  BEGIN
    v_upd_sql := '';

    IF :OLD.CUST_ZIP <> :NEW.CUST_ZIP THEN
      v_upd_sql := ' UPDATE DEAN_CUSTOMER SET CUST_ZIP_OLD = ' || :OLD.CUST_ZIP || ' WHERE CUST_NUM = ''' || cust_num || '''';
    END IF;

    --EXECUTE IMMEDIATE v_upd_sql;
    dbms_output.put_line('SQL --->' || v_upd_sql);
  END;

The below line of code calls the subprogram:

        perform_calculations(:old.cust_num);

In the real-world scenario I was working on, the code issued a BULK COLLECT similar to the one below and ran the subprogram for each row collected:

        v_main_sql := ' SELECT ROWID FROM ' || v_table_name ||
                      ' WHERE COL01 = ' || :OLD.COL01;

        EXECUTE IMMEDIATE v_main_sql BULK COLLECT INTO var01;
        FOR i IN 1..var01.COUNT
        LOOP
            perform_calculations(var01(i));
        END LOOP;

“var01” was defined as shown below:

TYPE var01_type IS TABLE OF ROWID;
var01 var01_type;

Code

create or replace TRIGGER UPDATE_TRIGGER AFTER UPDATE ON DEAN_CUSTOMER FOR EACH ROW
DECLARE
  v_upd_sql         VARCHAR2(8000);

  table_not_found   EXCEPTION;
  PRAGMA EXCEPTION_INIT (table_not_found, -942);

  -- Subprogram declared in the trigger's DECLARE section;
  -- it can reference :OLD/:NEW and the trigger's local variables
  PROCEDURE perform_calculations(cust_num NUMBER) IS
  BEGIN
    v_upd_sql := '';

    IF :OLD.CUST_ZIP <> :NEW.CUST_ZIP THEN
      v_upd_sql := ' UPDATE DEAN_CUSTOMER SET CUST_ZIP_OLD = ' || :OLD.CUST_ZIP || ' WHERE CUST_NUM = ''' || cust_num || '''';
    END IF;

    --EXECUTE IMMEDIATE v_upd_sql;
    dbms_output.put_line('SQL --->' || v_upd_sql);
  END;

BEGIN
  dbms_output.put_line(' ');
  dbms_output.put_line('********** Trigger DC invoked **********');
  dbms_output.put_line(' ');
  perform_calculations(:old.cust_num);
END;

Table definition

CREATE TABLE DEAN_CUSTOMER
(
  "CUST_NUM"     NUMBER(8,0),
  "CUST_NAME"    VARCHAR2(10 BYTE), 
  "CUST_ZIP"     NUMBER(12,0),
  "CUST_ZIP_OLD" NUMBER(12,0)
) SEGMENT CREATION DEFERRED 
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 
  NOCOMPRESS LOGGING
  TABLESPACE "USERS"
;
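To exercise the trigger, assuming the table and trigger above are in place, an update that changes CUST_ZIP should print the generated statement:

set serveroutput on;

insert into dean_customer values (1, 'SMITH', 12345, null);
update dean_customer set cust_zip = 54321 where cust_num = 1;

-- Expected DBMS_OUTPUT (blank lines omitted):
-- ********** Trigger DC invoked **********
-- SQL ---> UPDATE DEAN_CUSTOMER SET CUST_ZIP_OLD = 12345 WHERE CUST_NUM = '1'

Note that if the EXECUTE IMMEDIATE were enabled exactly as written, updating DEAN_CUSTOMER from its own row-level trigger would raise ORA-04091 (mutating table); in a real implementation the dynamic update would target a different table, as in the BULK COLLECT variant above.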

Finding the sizes of LOBs and listing the associated table and columns

I created the below SQL to identify the largest LOB objects and associate them back to the respective table and column:

set pages 99
set lines 200
col owner           format a12
col table_name      format a25
col column_name     format a25
col segment_name    format a30
col partition_name  format a15
SELECT      tc.owner,
            tc.table_name,
            tc.column_name,
            lb.segment_name,
            sg.partition_name,
            Round(( sg.bytes / (1024*1024*1024)),2) AS size_gb
FROM        dba_tab_cols tc
inner join  dba_lobs lb
        ON  lb.owner        =  tc.owner
       AND  lb.table_name   =  tc.table_name
       AND  lb.column_name  =  tc.column_name
inner join  dba_segments sg
        ON  sg.owner        =  lb.owner
       AND  sg.segment_name =  lb.segment_name
WHERE       tc.owner        = 'schema owner here'
  AND       tc.data_type LIKE '%LOB%'
  and       Round(( sg.bytes / (1024*1024*1024)),2) > 50
ORDER  BY 
            size_gb DESC,
            tc.owner,
            tc.table_name,
            tc.column_name,
            lb.segment_name
;          

ORA-00059: maximum number of DB_FILES exceeded

I got paged over the weekend for an Enterprise Edition Release 19.0.0.0.0 RAC database running out of space. When I attempted to add a new data file with the command:

alter tablespace USERS add datafile '+DATA' size 31G;

the SQL failed with:

ERROR at line 1:
ORA-00059: maximum number of DB_FILES exceeded

I checked the DB_FILES parameter with the command:

SQL> show parameter db_files;
 
NAME           TYPE        VALUE
-------------- ----------- -----
db_files       integer     100

I then checked the number of data files with the below SQL:

SQL> select count(*) from dba_data_files;
 
  COUNT(*)
----------
       100

This parameter is not dynamically modifiable, so I issued the command:

alter system set db_files=125 scope=spfile sid='*';

and attempted to do a rolling bounce with:

srvctl stop instance -d PRODDB -i PRODDB4 -o immediate

This resulted in the error

CRS-2974: unable to act on resource 'ora.PRODDB.db' on server 'prodbox87' because that would require stopping or relocating resource 'ora.PRODDB.PRODDB_svc.svc' but the appropriate force flag was not specified

I checked the status of the services with:

srvctl status service -d PRODDB
Service PRODDB_svc is running on instance(s) PRODDB1,PRODDB2,PRODDB3,PRODDB4

Since the service was configured to run on the other nodes, I stopped the service on node 4 with:

srvctl stop service -d PRODDB -i PRODDB4

and retried the rolling bounce. The instance shut down successfully but did not come back up because of the below error:

srvctl start instance -d PRODDB -i PRODDB4
PRCR-1013 : Failed to start resource ora.PRODDB.db
PRCR-1064 : Failed to start resource ora.PRODDB.db on node prodbox87
CRS-5017: The resource action "ora.PRODDB.db start" encountered the following error:
ORA-01105: mount is incompatible with mounts by other instances
ORA-01174: DB_FILES is 125 buts needs to be 100 to be compatible
. For details refer to "(:CLSN00107:)" in "/oracle/product/diag/crs/prodbox87/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.PRODDB.db' on 'prodbox87' failed

At this point I was forced to perform a non-rolling bounce of all nodes of the RAC, after which I was able to add the new data file.
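A periodic check that compares the current data file count against the limit would have flagged this before it became a page; a minimal sketch:

-- Alert when datafile_count approaches db_files_limit
select (select count(*) from v$datafile)                       as datafile_count,
       (select value from v$parameter where name = 'db_files') as db_files_limit
from   dual;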

Moving a table from one user/schema to another

I had a requirement to move a table from one user (i.e. schema) to another. As this is not possible with an ALTER, the usual approach is CREATE TABLE AS SELECT or an export/import. Depending on the volume of data, this may take a long time. I found this interesting approach on Google and decided to try it:

Can you move a table between schemas?

Many thanks to the person that identified this approach.

My test was as follows:

As sysdba, I created the below users. The premise is that user JIM needs to be dropped, but his table JIMS_TABLE needs to be transferred to user BOB. User MIKE has select on JIM.JIMS_TABLE. The grant on DBA_TABLES to JIM is only to facilitate the creation of a test table; it has no other relevance to the test.

create user jim identified by N0t_Jim#01#  TEMPORARY TABLESPACE TEMP DEFAULT TABLESPACE PRODTSPACE;
grant connect to jim;
GRANT UNLIMITED TABLESPACE TO jim;
grant create table to jim;
grant select on dba_tables to jim;
 
 
create user bob identified by N0t_Bob#01#  TEMPORARY TABLESPACE TEMP DEFAULT TABLESPACE PRODTSPACE;
grant connect to bob;
GRANT UNLIMITED TABLESPACE TO bob;
grant create table to bob;
grant select on dba_tables to bob;
 
 
create user mike identified by N0t_Mike#01#  TEMPORARY TABLESPACE TEMP DEFAULT TABLESPACE PRODTSPACE;
grant connect to mike;

As user JIM, create a table and grant select to user MIKE:

create table jim.jims_table
tablespace PRODTSPACE
as
select * from dba_tables
;
 
GRANT SELECT ON JIM.JIMS_TABLE TO MIKE;

The table now exists

col owner format a5;
col table_name format a10;
select
owner, table_name, tablespace_name, partitioned, row_movement
from dba_tables
where owner in ('JIM','BOB')
;
 
OWNER TABLE_NAME TABLESPACE_NAME                PAR ROW_MOVE
----- ---------- ------------------------------ --- --------
JIM   JIMS_TABLE PRODTSPACE                     NO  DISABLED

With the following grants:

set pages 99;
col grantee  format a10;
col owner format a10;
col table_name format a10;
col grantor  format a10;
col PRIVILEGE format a17;
 
SELECT GRANTEE, OWNER, TABLE_NAME, GRANTOR, PRIVILEGE
FROM DBA_TAB_PRIVS
WHERE  owner in ('JIM','BOB')
order by grantee
;
 
GRANTEE    OWNER      TABLE_NAME GRANTOR    PRIVILEGE
---------- ---------- ---------- ---------- -----------------
BOB        JIM        JIMS_TABLE JIM        ALTER
BOB        JIM        JIMS_TABLE JIM        DELETE
BOB        JIM        JIMS_TABLE JIM        INDEX
BOB        JIM        JIMS_TABLE JIM        INSERT
BOB        JIM        JIMS_TABLE JIM        SELECT
BOB        JIM        JIMS_TABLE JIM        UPDATE
BOB        JIM        JIMS_TABLE JIM        FLASHBACK
BOB        JIM        JIMS_TABLE JIM        REFERENCES
BOB        JIM        JIMS_TABLE JIM        READ
BOB        JIM        JIMS_TABLE JIM        ON COMMIT REFRESH
BOB        JIM        JIMS_TABLE JIM        QUERY REWRITE
BOB        JIM        JIMS_TABLE JIM        DEBUG
MIKE       JIM        JIMS_TABLE JIM        SELECT

In preparation to move the table, user JIM grants all on the table to the future new owner BOB.

GRANT ALL ON JIM.JIMS_TABLE TO BOB;

Next, JIM generates the DDL of the table

set long 9999999 pages 0 heading off pagesize 0;
SELECT dbms_metadata.get_ddl('TABLE','JIMS_TABLE') FROM dual;

Give the DDL to user BOB, change the table owner, and add a partitioning clause (see the last line of the DDL). To reduce the size of the DDL shown here, I have deleted many of the columns.

  CREATE TABLE "BOB"."JIMS_TABLE"
   (    "OWNER" VARCHAR2(128) NOT NULL ENABLE,
        "TABLE_NAME" VARCHAR2(128) NOT NULL ENABLE,
        "TABLESPACE_NAME" VARCHAR2(30),
        .
        .
        .
        Columns deleted
        .
        .
        .
        "DATA_LINK_DML_ENABLED" VARCHAR2(3),
        "LOGICAL_REPLICATION" VARCHAR2(8)
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "PRODTSPACE"
  partition BY hash (OWNER) (partition p1);

We now have two tables

col owner format a5;
col table_name format a10;
select
owner, table_name, tablespace_name, partitioned, row_movement
from dba_tables
where owner in ('JIM','BOB')
;
 
OWNER TABLE_NAME TABLESPACE_NAME                PAR ROW_MOVE
----- ---------- ------------------------------ --- --------
JIM   JIMS_TABLE PRODTSPACE                     NO  DISABLED
BOB   JIMS_TABLE                                YES DISABLED

Now to swap the data from JIM’s table into the new BOB table. EXCHANGE PARTITION swaps the segments in the data dictionary rather than copying rows, so it completes almost instantly regardless of data volume:

ALTER TABLE BOB.JIMS_TABLE exchange partition p1 WITH TABLE JIM.JIMS_TABLE;

The data is now in the new table

SELECT COUNT(*) FROM JIM.JIMS_TABLE;
SELECT COUNT(*) FROM BOB.JIMS_TABLE;
 
SQL> SELECT COUNT(*) FROM JIM.JIMS_TABLE;
 
  COUNT(*)
----------
         0
 
SQL> SELECT COUNT(*) FROM BOB.JIMS_TABLE;
 
  COUNT(*)
----------
      2228

The existing grants still reference JIM.JIMS_TABLE, so they will have to be re-created against the new table BOB.JIMS_TABLE:

GRANTEE    OWNER      TABLE_NAME GRANTOR    PRIVILEGE
---------- ---------- ---------- ---------- -----------------
BOB        JIM        JIMS_TABLE JIM        ALTER
BOB        JIM        JIMS_TABLE JIM        DELETE
BOB        JIM        JIMS_TABLE JIM        INDEX
BOB        JIM        JIMS_TABLE JIM        INSERT
BOB        JIM        JIMS_TABLE JIM        SELECT
BOB        JIM        JIMS_TABLE JIM        UPDATE
BOB        JIM        JIMS_TABLE JIM        REFERENCES
BOB        JIM        JIMS_TABLE JIM        READ
BOB        JIM        JIMS_TABLE JIM        ON COMMIT REFRESH
BOB        JIM        JIMS_TABLE JIM        QUERY REWRITE
BOB        JIM        JIMS_TABLE JIM        DEBUG
BOB        JIM        JIMS_TABLE JIM        FLASHBACK
MIKE       JIM        JIMS_TABLE JIM        SELECT
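For example, the select privilege for MIKE would be re-issued against the new owner:

GRANT SELECT ON BOB.JIMS_TABLE TO MIKE;

If there are many grants, DBMS_METADATA can regenerate them from the old table for editing:

SELECT dbms_metadata.get_dependent_ddl('OBJECT_GRANT','JIMS_TABLE','JIM') FROM dual;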

Large number of deletes causing table fragmentation resulting in poor query performance

I was asked to look into a nightly script whose elapsed time had increased significantly. The script was rather simple in that it deleted rows older than (today’s date – 3 months) from multiple tables. I ran an ADDM report on the database (12.1.0.2.0 – 64bit Production), and one of the deletes popped up in the report:

   Action
      Run SQL Tuning Advisor on the DELETE statement with SQL_ID
      "75csungpxhvv6".
      Related Object
         SQL statement with SQL_ID 75csungpxhvv6.
         delete from PROD_SCHEMA.PROD_TABLE where ORDER_DT<20201212
   Rationale
      The SQL spent 100% of its database time on CPU, I/O and Cluster waits.
      This part of database time may be improved by the SQL Tuning Advisor.
   Rationale
      Database time for this SQL was divided as follows: 100% for SQL
      execution, 0% for parsing, 0% for PL/SQL execution and 0% for Java
      execution.
   Rationale
      I/O and Cluster wait for INDEX "PROD_SCHEMA.PROD_TABLE_ORDER_DT" with object ID
      322364 consumed 100% of the database time spent on this SQL statement.

The table in question was not large; a few hundred thousand rows with a reasonable row length, no large varchars, etc. The obvious issues were quickly eliminated: the table had current stats, there was an index exactly matching the predicate, and the optimizer was choosing that index. However, I did notice that the table was occupying 28 GB of space. A quick calculation of the number of rows multiplied by the average row length indicated that it should be occupying approximately 1 GB or less.

I ran the below query and confirmed that most of the space was empty

set line 200;
set pages 9999;
col owner                format a15               heading "Owner"
col table_name           format a30               heading "Table Name"
col avg_row_len          format 999,999,999       heading "Avg Row Len"
col num_rows             format 999,999,999,999   heading "Number of Rows"
col TOTAL_SIZE_MB        format 999,999,999       heading "Total|Size|(MB)"
col ACTUAL_SIZE_MB       format 999,999,999       heading "Actual|Size|(MB)"
col FRAGMENTED_SPACE_MB  format 999,999,999       heading "Frag|Size|(MB)"
col percentage           format 999.99            heading "Per|Cent|Frag"
SELECT   owner
        ,table_name
        ,avg_row_len
        ,num_rows
        ,Round((( blocks * 8 / 1024 )), 2) "TOTAL_SIZE_MB"  -- assumes an 8 KB block size
        ,Round(( num_rows * avg_row_len / 1024 / 1024 ), 2)  "ACTUAL_SIZE_MB"
        ,Round(( ( blocks * 8 / 1024 ) - ( num_rows * avg_row_len / 1024 / 1024 ) ), 2) "FRAGMENTED_SPACE_MB"
        ,Round(( Round(( ( blocks * 8 / 1024 ) - ( num_rows * avg_row_len / 1024
                                                  / 1024
                                                ) ), 2
                     )
                     / Round((( blocks * 8 / 1024 )), 2) ) * 100, 2)  "percentage"
FROM     dba_tables
WHERE    num_rows > 0
  and    table_name = 'PROD_TABLE'
  and    owner = 'PROD_SCHEMA'
order by FRAGMENTED_SPACE_MB desc 
;

We were able to ask the application for a small outage to export (Data Pump), truncate, and import the data in the table. After this, the query that used to take almost 8 minutes started executing in less than a second.
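Where a brief outage is acceptable, an alternative to the export/truncate/import round trip (not what we ran, but a commonly used technique) is to rebuild the segment in place:

-- Rebuilds the table segment, releasing the empty space below the high-water mark
alter table prod_schema.prod_table move;

-- A move marks the table's indexes UNUSABLE, so rebuild them afterwards
alter index prod_schema.prod_table_order_dt rebuild;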

ORA-38882 on standby database

We ran into an interesting situation recently when MRP failed on a standby with the error:

ORA-38882: Cannot drop tablespace UNDOTBS2 on standby database due to guaranteed restore points.

The background was that we had to remove a node from our primary RAC to upgrade the server. We did not remove any nodes from the standby. After the node was removed from the primary, MRP failed on the standby with the above error. As far as we could remember or check, we did not have a guaranteed restore point on the database. A Google search led us to Mike Dietrich’s blog, where he had responded to a similar error in a different context. The link is:

Mike Dietrich’s Blog About Oracle Database

We were not using a PDB, but since we did not have any other ideas, we decided to try Mike Dietrich’s solution and issued the following alter:

alter system set "_allow_drop_ts_with_grp"=true;

After the above alter, the database was bounced, and MRP started successfully.
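For reference, any existing guaranteed restore points can be listed with a query like:

-- GUARANTEE_FLASHBACK_DATABASE is YES for guaranteed restore points
select name, guarantee_flashback_database, time
from   v$restore_point;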

Upgrade failure (11.2.0.4 to 19c) followed by ORA-29702: error occurred in Cluster Group Service operation & ORA-00704, ORA-00604 and ORA-00904

Update – April 10, 2021
We encountered a failure during an upgrade from 12.1 to 19c: the SYSTEM tablespace ran out of space, causing the upgrade to fail. When we attempted to mount the database on the 12c home, we encountered the errors

 ORA-00704: bootstrap process failure
 ORA-00604: error occurred at recursive SQL level 1
 ORA-00904: "SPARE10": invalid identifier

and

ORA-29702: error occurred in Cluster Group Service operation

We mounted the database in the Oracle home and performed the following steps (note that FLASHBACK DATABASE requires the database to be mounted, not open):

startup mount;

flashback database to restore point PRE_UPGRADE;

alter database open resetlogs;

shutdown immediate;

After that, we continued at “After consulting Google, we found the new process is to run a manual upgrade as follows:” in the original post below. The upgrade lasted about 90 minutes and performed 107 phases. Most phases ran between 10 and 150 seconds, except for phase 53 (1,135 seconds) and phase 98 (1,388 seconds).

Original post continues below:
We were attempting to upgrade an Oracle database from 11.2.0.4 to 19c on an Oracle Exadata cluster in Oracle Cloud Infrastructure. Approximately 44% into the upgrade, the process failed and performed a rollback. We suspect some invalid objects caused this. However, when we attempted to start the database at 11.2.0.4 (the old Oracle home), we encountered the error message “ORA-29702: error occurred in Cluster Group Service operation”. Oracle suggested that we bounce CRS across all nodes together (i.e. non-rolling) to resolve this. However, the database still did not start and displayed the message:

[actpsrvr-ACPTDB1] srvctl start database -d ACPTDB
PRCR-1079 : Failed to start resource ora.ACPTDB.db
CRS-5017: The resource action "ora.ACPTDB.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Process ID: 251124
Session ID: 775 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/grid19/crs/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.ACPTDB.db' on 'acptsrvr' failed
CRS-2632: There are no more servers to try to place resource 'ora.ACPTDB.db' on that would satisfy its placement policy
CRS-5017: The resource action "ora.ACPTDB.db start" encountered the following error:
ORA-01092: ORACLE instance terminated. Disconnection forced
ORA-00704: bootstrap process failure
ORA-39700: database must be opened with UPGRADE option
Process ID: 135914
Session ID: 775 Serial number: 1
. For details refer to "(:CLSN00107:)" in "/grid19/crs/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.ACPTDB.db' on 'actpsrvr' failed

At this point, any attempt to start the database resulted in it starting in upgrade mode. So we switched the database to the new 19c home, issued a STARTUP UPGRADE on node 01, and then attempted to run catupgrd.sql, which displayed:

SQL> @?/rdbms/admin/catupgrd.sql
DOC>######################################################################
DOC>######################################################################
DOC>                                 ERROR
DOC>
DOC>
DOC>    As of 12.2, customers must use the parallel upgrade utility, catctl.pl,
DOC>    to invoke catupgrd.sql when upgrading the database dictionary.
DOC>    Running catupgrd.sql directly from SQL*Plus is no longer supported.
DOC>
DOC>    For Example:
DOC>
DOC>
DOC>          catctl
DOC>
DOC>          or
DOC>
DOC>          cd $ORACLE_HOME/rdbms/admin
DOC>          $ORACLE_HOME/perl/bin/perl catctl.pl catupgrd.sql
DOC>
DOC>    Refer to the Oracle Database Upgrade Guide for more information.
DOC>
DOC>
DOC>######################################################################
DOC>######################################################################
DOC>#
Disconnected from Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
Version 19.8.0.0.0

After consulting Google, we found the new process is to run a manual upgrade as follows:

cd $ORACLE_HOME/bin
./dbupgrade

This kicks off a Perl script that performs the upgrade. The log files are in:

$ORACLE_HOME/product/19.0.0.0/dbhome_9/cfgtoollogs/ACPTDB/upgrade20210226194351

The upgrade takes about 90 minutes, and the log files keep switching in the above directory, so keep changing which one you are tailing.

After the upgrade was finished, I attempted to start with SRVCTL and got the below error:

[actpsrvr-ACPTDB1] srvctl status database -d ACPTDB
PRCD-1229 : An attempt to access configuration of database ACPTDB was rejected because its version 11.2.0.4.0 differs from the program version 19.0.0.0.0. Instead run the program from /11.2.0/dbhome_2.

This is documented by Oracle in note “After manual database upgrade, srvctl commands fail with PRCD-1027, PRCD-1229 (Doc ID 1281852.1)”

CAUSE
Oracle Clusterware keys for the database still refer to the old ORACLE_HOME.
 
SOLUTION
1. Upgrade the Oracle Clusterware keys for the database by running the "srvctl upgrade database" command.
 
Run srvctl from the new release $ORACLE_HOME to upgrade the database keys. For example:
 
/bin/srvctl upgrade database -d DB_NAME -o NEW_ORACLE_DB_HOME

After we issued the above command, the database started normally with SRVCTL commands. Oddly, it performed roughly 100 log switches in quick succession before it calmed down and started reporting routine errors on invalid objects. We then recompiled all the objects, and the upgrade was complete.
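A typical way to recompile everything is the stock utlrp.sql script, run as sysdba:

SQL> @?/rdbms/admin/utlrp.sql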

ORA-31604: invalid transform NAME parameter “MODIFY” for object type PROCACT_INSTANCE in function ADD_TRANSFORM

While performing a Data Pump import into an Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 – 64bit Production database, we encountered the below error:

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Master table "SYS"."SYS_IMPORT_FULL_15" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_FULL_15":  "/******** AS SYSDBA" directory=exp_dir dumpfile=resolvecode.dmp remap_table=myuser.accounting_codes:accounting_codes_prod 
Processing object type TABLE_EXPORT/TABLE/PROCACT_INSTANCE
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.LOAD_MD_TRANSFORMS [] 
ORA-31604: invalid transform NAME parameter "MODIFY" for object type PROCACT_INSTANCE in function ADD_TRANSFORM
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 86
ORA-06512: at "SYS.KUPW$WORKER", line 9710

This issue is documented on Oracle support at

Impdp Fails With ORA-39126 ORA-31604 Iinvalid Transform NAME Parameter "MODIFY" For Object Type PROCACT_INSTANCE in function ADD_TRANSFORM (Doc ID 1596495.1)

The solution is to change the import data pump command from:

impdp \'/ as sysdba \' dumpfile=resolvecode.dmp directory=exp_dir remap_table=myuser.accounting_codes:accounting_codes_prod 

to:

impdp \'/ as sysdba \' dumpfile=resolvecode.dmp directory=exp_dir remap_table=myuser.accounting_codes:accounting_codes_prod exclude=PROCACT_INSTANCE

The exclude=PROCACT_INSTANCE parameter excludes the instance-level procedural actions, such as the resource manager objects (resource plans and groups), that raised the error.

There is also a patch available, but adding exclude=PROCACT_INSTANCE solved my issue faster than patching.

CRS-6706: Oracle Clusterware Release patch level ('xxxxxxxx') does not match Software patch level ('xxxxxxxxx'). Oracle Clusterware cannot be started

After applying the October 2020 patch to a 12c GI, we were attempting to bring up CRS and got the below error message:

CRS-6706: Oracle Clusterware Release patch level ('xxxxxxxx') does not match Software patch level ('xxxxxxxxx'). Oracle Clusterware cannot be started

This is described in Oracle support document 1639285.1

We stopped CRS on all nodes and performed the below on the RAC node that reported the error:

Sudo to root, navigate to the grid home/install directory, and run:

./rootcrs.pl -unlock

The output will look like:

Using configuration parameter file: ./crsconfig_params
2021/01/15 00:52:44 CLSRSC-4012: Shutting down Oracle Trace File Analyzer (TFA) Collector.
 
2021/01/15 00:52:55 CLSRSC-4013: Successfully shut down Oracle Trace File Analyzer (TFA) Collector.
 
2021/01/15 00:52:56 CLSRSC-347: Successfully unlock /grid_home/12.2.0.2

After the above, run the patch command:

./rootcrs.pl -patch

The output will look like:

Using configuration parameter file: ./crsconfig_params
2021/01/15 00:53:12 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.
 
2021/01/15 00:53:12 CLSRSC-4005: Failed to patch Oracle Trace File Analyzer (TFA) Collector. Grid Infrastructure operations will continue.
 
2021/01/15 00:53:24 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.conf'
 
CRS-4123: Oracle High Availability Services has been started.
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'pdcc04-d03'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'pdcc04-d03'
CRS-2677: Stop of 'ora.drivers.acfs' on 'pdcc04-d03' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'pdcc04-d03' has completed
CRS-4133: Oracle High Availability Services has been stopped.

After this, we were able to start CRS on all nodes of the RAC.