ORA-04031 in the middle of upgrade to 11.2.0.4

A colleague of mine was performing an upgrade of a four node Oracle RAC from 11.2.0.3 to 11.2.0.4 using DBUA silent mode. About halfway through the upgrade the following error was reported and DBUA failed:

Upgrading Oracle interMedia 
UPGRADE_PROGRESS : 46% 
ORA-00604: error occurred at recursive SQL level 1 
ORA-04031: unable to allocate 4096 bytes of shared memory ("java

I suspected that the solution was to increase the below parameters and restart the upgrade.

ALTER SYSTEM SET SGA_TARGET=2048M scope=spfile sid='*';
ALTER SYSTEM SET pga_aggregate_target=500M  scope=spfile sid='*';
ALTER SYSTEM SET sga_max_size =2048M scope=spfile sid='*';

Also set cluster to false for the upgrade

alter system set cluster_database=true scope=spfile;

However, because this was a production system, I opened a severity one service request with our Oracle to get their opinion before we continued. Oracle agreed with the with the above. The steps we followed were:

01.     Set Oracle home to 11.2.0.4
02.     Copy initora file from 11.2.0.3 home to 11.2.0.4 home
        (file contains 1 line pointing to spfile in asm)
03.     Some DB processes were running on node 4. Killed 
        these via Unix kill commands
04.     Startup no mount on node 4 via SQL (not srvctl)
05.     Make above alters
06.     Shutdown immediate
07.     Startup no mount on node 4 via SQL (not srvctl)
08.     Confirm above parameter changes were in place
09.     Shutdown immediate
10.     startup upgrade
11.     Exit this session
12.     cd to oracle home
13.     Start a new session as sqlplus " / as sysdba " 
14.     spool /tmp/upgrade.log 
15.     set echo on 
16.     @$ORACLE_HOME/rdbms/admin/catupgrd.sql; 
17.     spool off 
18.     Shutdown immediate (seemed to have been issued 
        in the catupgrd.sql)
19.     Check catupgrd.sql spool file for errors
20.     Restart the database in normal mode
21.     @$ORACLE_HOME/rdbms/admin/catuppst.sql; 
22.     @$ORACLE_HOME/rdbms/admin/utlrp.sql; 
23.     Execute dbupgdiag.sql

After the above is complete, upgrade the Oracle Clusterware Configuration

srvctl upgrade database -d  -o 

Our time zone was already current so we did not have to upgrade to the latest version.

After all of the above was complete, the database was ready to use. We had to complete some routine tasks such as turning of flash back, confirming that all scripts had been updated to use the new oracle home etc.

Refreshing a database from one LPAR/Subsystem to another

I am often tasked with cloning a production DB2 database from one LPAR/Subsystem into another LPAR/Subsystem. This has to be accomplished with minimum impact to the production database. In order to accomplish this, I created a process based on REXX generating the required JCL. In summary, my approach is:

Generate a list of the latest image copy files for the production database via SQL/SPUFI

A REXX routine reads in the above list and generates one unload JCL per table space (we have 1 table per table space) that
– unloads the Image copy
– edits the punch file to change the load parameters
– submits MFT processes for the punch and data
– submits the respective load job on the target

The REXX also creates the required load JCL that has logic to
– display the status of the database before and after the load
– Load the data
– Repair any check pending
– Perform an image copy (There is a requirement to recover the table multiple times during the testing cycle)

The PDS containing the load JCL is copied to the target before starting any of the unloads. As each unload finishes, it starts the transfer process about the data and punch files. After the transfer of the files is complete the process automatically starts the load.

The process also contains logic to throttle the unload and load processes via the use of jes agents. LOB tables are handle via cursor loads.

Using this process I am able to copy approximately 600 tables containing approximately 2.25 billion rows of data (about 1 TB compressed) in about 10 hours without impacting the production system.

Upgrading an Oracle database with Oracle Wallet Manager

Today we upgraded a 4 node RAC from 11.2.0.3 to 11.2.0.4. Unfortunately, we did not know the database was using encryption and the Oracle Wallet Manager. The users began to notice issues accessing the encrypted table. I ran the below SQL and the status indicated there was an issue:

SQL> select wrl_type wallet,status,wrl_parameter wallet_location from v$encryption_wallet;
 
WALLET  STATUS              WALLET_LOCATION 
------  ------------------  ------------------------
file    OPEN_NO_MASTER_KEY /opt/oracle/product/cost

The solution for this was to

back up the sqlnet.ora file in the ora_home_11.2.0.4/network/admin directory
 
On each node of the RAC, copy the sqlnet.ora file 
from the ora_home_11.2.0.3/network/admin directory 
to   the ora_home_11.2.0.4/network/admin directory
 
bounce the RAC with srvctl commands

I am not sure if a rolling bounce would have resolved the issue or not. We were in a hurry to get the application back into service and hence bounced the database.

Oracle – Performing silent patching

As part of the process to automate the creation of an Oracle AMI based on an Amazon Linux AMI, I had to create a yml that would install Oracle and patch it. Since this is being done via Jenkins/Ansible/yml etc., all the patching had to be performed in silent mode. In order to run opatch in silent mode, you need a response file. This is created as follows:

change directory to any directory that you want to create the response file output

cd /tmp

Invoke the command to create the file

oracle_home/OPatch/ocm/bin/emocmrsp

You will be prompted as follow:

OCM Installation Response Generator 10.3.7.0.0 - Production
Copyright (c) 2005, 2012, Oracle and/or its affiliates.  All rights reserved.
 
Provide your email address to be informed of security issues, install and
initiate Oracle Configuration Manager. Easier for you if you use your My
Oracle Support Email address/User Name.
Visit http://www.oracle.com/support/policies.html for details.
Email address/User Name: 
 
You have not provided an email address for notification of security issues.
Do you wish to remain uninformed of security issues ([Y]es, [N]o) [N]:  y
The OCM configuration response file (ocm.rsp) was successfully created.

A file named ocm.rsp will be create in /tmp.

You can use this file to perform silent patching. As an example, I am applying patch 16188701

cd /software/11204/11204_160416_RDBMS_ONEOFF/03_parallel/16188701
opatch apply -local -silent -ocmrf /software/ocm.rsp

Export/Import from a database containing Object Types

If you are performing and Export/Import from a source database that contains types, you may encounter the error:

ORA-39083: Object type TYPE failed to create with error:
ORA-02304: invalid object identifier literal
Failing sql is:
CREATE TYPE ""."object_type_name"   OID '8FD43617684FA1F6E033A625D854A1F6' AS OBJECT
(
   parm_01             varchar2(18),
   parm_02             varchar2(50)
);

Each type has an OID. If you create an object type and do not specify an OID, Oracle generates an OID and assigns it to the type.

To resolve this error, you need to transform the OID by adding the below option to you IMPDP command:

transform=OID:n:type

The type object will now get a new OID in the target database.

Unusual variation in elapsed times on a SQL

Today we had users complaining that a query would run fast the first time and then fail to complete the second time. The SQL used bind parameters and we (DBAs) suspected that the issue was caused by the user changing the bind parameters. However, when we ran the SQL ourselves with hard coded values we also experienced the same behavior. The first execution completed in < 5 seconds and the second execution was killed at > 400 seconds of elapsed time. After some searching on the web I ran across an excellent explanation at the web site:

http://orcasoracle.squarespace.com/oracle-rdbms/2012/12/18/when-a-query-runs-slower-on-second-execution-a-possible-side.html

We added

/*+ OPT_PARAM('_OPTIMIZER_USE_FEEDBACK' 'FALSE') */

and the SQL began to perform consistently every time.

This was experienced on Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 – 64bit Production.

Python/Selenium example

My daughter asked me to create a process that would log on to her high school grade website, collect her current grades and calculate the current grade-point average. I used this as an opportunity to become familiar with Selenium. You can find more information about Selenium here:

http://www.seleniumhq.org/

I coded this in Python and the below blog posting contains the relevant parts of the script with explanation. I was running this on Kali Linux as root.

Assuming that you already have Python installed on your system, you can get Selenium as follows:

install selenium

After you have installed Selenium, you have to install the correct driver for the browser that you intend to use. For this example, I was using the Firefox browser which needs the geckodriver. Additional information on this can be found at:
https://developer.mozilla.org/en-US/docs/Mozilla/QA/Marionette/WebDriver

The geckodriver location has to be added to the PATH variable. As an example:

export PATH=$PATH:/path/to/geckodriver

At the top of the program, after the usual imports for Python, add in the Selenium commands:

#!/usr/bin/python
#
import sys
import time
import string

from decimal import *

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

This routine will set up the username and password and then open the browser:

def read_web_page():
    user_name = 'user name'
    pass_word = 'password'
        
    #
    ## Open the browser
    #
    driver = webdriver.Firefox()

Navigate to the web page with:

    driver.get("https:website name.jsp?status=login")

Confirm that you at the correct web page with:

    assert "Campus Parent Portal Login" in driver.title
    temp = driver.current_url
    print 'Now on page ' + temp

Now that we are on the login page, find the elements username and password and supply the information from the variables created previously:

    elem = driver.find_element_by_name("username")
    elem.send_keys(user_name)

    elem = driver.find_element_by_name("password")
    elem.send_keys(pass_word)

Click on the sign in button and wait for 30 seconds for the website to respond:

    driver.find_element_by_css_selector("input[src*=btn_sign_in]").click()
    time.sleep(30)

Confirm that we are at the grades web page:

    #
    ## click on grades
    #
    temp = driver.current_url
    print 'Now on page ' + temp

This web page had multiple frames. The following command switches to the frame detail page, finds the element grades and clicks on it. After the click is complete wait for 20 seconds for the website to respond:

    driver.switch_to_frame("frameDetail")
    driver.find_element_by_link_text("Grades").click()
    time.sleep(20)

After the website has responded (i.e. the grades are being displayed), grab the contents of the web page (i.e. page source) and convert all characters to “printable characters”:

    #
    ## get the grades
    #
    temp = driver.page_source
    printable = set(string.printable)
    temp2 = filter(lambda x: x in printable, temp)

Write the page source to a file as text:

    grades_file = open('/dean/python/grades_file_raw.txt','w')
    grades_file.write(temp2)
    grades_file.close()

Sign out of the web site:

    #
    ## Sign out
    #
    #driver.switch_to_frame("frameDetail")
    driver.find_element_by_link_text("Sign Out").click()
    time.sleep(10)

This particular website had an alert window that pops up with the question “Do you really want to log off?”. The below command switch to the alert window and accept the alert indicating consent to log off:

    alert = driver.switch_to_alert()
    alert.accept()
    time.sleep(10)
    driver.quit()

All of the above processing involves the selenium driver. Now that the information is available as text in a file, I was able to parse it with the regular Python commands. Some of these are shown below as examples:

    
    grades_file_raw = open('/dean/python/grades_file_raw.txt','r')
    grades_file_fin = open('/dean/python/grades_file_fin.txt','w')
    prev_teacher = ""
    for raw_line in grades_file_raw:
        printable = set(string.printable)
        temp = filter(lambda x: x in printable, raw_line)
        raw_line = temp

        if  'div id="studentName"' in raw_line: 
             student = cut_str(2, '>', '<', raw_line) 
             out_line = student grades_file_fin.write('%-60s\n' %(out_line)) 
             out_line = '-' * len(student) grades_file_fin.write('%-60s\n' %(out_line)) 

Additional code removed for brevity.

Finding the row that returned the -803

One of our users created a somewhat complex update statement that was intended to update about 1600 rows in a table. The update statement was coded to eliminate the possibility of duplicate rows, however we continued to encounter -803 responses.

SQLCODE = -803, ERROR:  AN INSERTED OR UPDATED VALUE IS INVALID       
BECAUSE INDEX IN INDEX SPACE UNQNDX1 CONSTRAINS COLUMNS OF THE TABLE 
SO NO TWO ROWS CAN CONTAIN DUPLICATE VALUES IN THOSE COLUMNS.         
RID OF EXISTING ROW IS X'00000FC90F'.

We were somewhat at a loss to identify which row was causing this error. The below SQL was able to identify the row that was causing the issue. This also identified the logic error in the update SQL.

SELECT  RID(owner.table_name), owner.table_name.*  
  FROM  owner.table_name                            
WHERE   RID(owner.table_name) = 1034511             
;

The hex value reported above was X’00000FC90F’. Use the below calculator to convert this to decimal:

http://www.rapidtables.com/convert/number/hex-to-decimal.htm

Simulating an Oracle error into the alert log

In an Oracle AWS RDS instance we do not have direct access to the server and hence we could not use our regular CRON scripts. I was testing an Oracle Stored procedure to replace this functionality and needed to simulate an error in the alert log. I found the below web site

Simulating ORA-errors

(nice site with a good explanation)

and used these commands:

alter session set events '942 incident(SIMULATED_ERROR)';
drop table tablethatdoesnotexist;
alter session set events '942 trace name context off';

The alert log contained:

ORA-00700: soft internal error, arguments: [EVENT_CREATED_INCIDENT], [942], [SIMULATED_ERROR], [], [], [], [], [], [], [], [], []
ORA-00942: table or view does not exist

This was exactly what I needed for my testing.

Unix script to monitor the alert log for errors using x$dbgalertext

A simple script to monitor the alert log for errors via x$dbgalertext rather than the text error log. This script assumes a four node RAC. The script runs on each of the four nodes via CRON. The instance name is passed in as a parameter.

Call with
Check_alert_log.ksh instance_name1/2/3/4

The script will mail out the last 24 hours of the alert log at the 15:00 hour. All other executions will only look back 90 minutes and mail only if an error is found.

#!/bin/ksh
#
#-----------------------------------------------------------------------------------------
#
# Name       : Monitor_alert_log.ksh
# Purpose    : check the alert log
#
#-----------------------------------------------------------------------------------------
export PATH=$PATH:Oracle_Home/bin
export ORACLE_BASE=/opt/oracle/product
export ORACLE_HOME=Oracle_Home

function run_sql
{
	echo "Entering function run_sql `date`"	
	sqlplus -s / as sysdba << EOF > $ReportFile
	
	  set pagesize 100
	  set linesize 200
	  set tab off
	  set wrap off
	            
        col mt              for a16 head 'Time (MT)'
        col gmt             for a16 head 'Time (GMT)'
        col problem_key     for a10 head 'Error'
        col message_text    for a40 head 'Message Text' wrap
		select        TO_CHAR(new_time(originating_timestamp,'GMT','MDT'), 'MM-DD-YYYY HH24:MI') as MT
		             ,TO_CHAR(originating_timestamp,'MM-DD-YYYY HH24:MI') as GMT
		             ,problem_key
		             ,MESSAGE_TEXT
		from          x\$dbgalertext a
		where         originating_timestamp > sysdate - interval '$look_back' minute
		  and         MESSAGE_TEXT LIKE '%ORA-%'
		order by      originating_timestamp desc
		;
    exit
EOF

	echo "Exiting  function run_sql `date`"		
}


function decide_to_email
{
	egrep -i "no rows selected" $ReportFile > /dev/null
	if  [ $? -eq 0 ]
	then
	    if  [ $mandatory_mail == "Y" ]
	    then
	        echo 'Mailing out report based on time'
	    	mail -s "Alert Log monitoring $ORACLE_SID" "Email_id" < $ReportFile
	    fi
    else
	        echo 'Mailing out report based on errors found'
	   		mail -s "Alert Log monitoring $ORACLE_SID" "Email_id" < $ReportFile
	fi
}

#
## start of script
#
sid_parm=$1
export ORACLE_SID=$sid_parm
case $ORACLE_SID in
	 instance_name1)
		export ReportFile=/opt/oracle/product/diag/rdbms/database_name/instance_name1/database_name_monitor_alert_log.report
 		;;
	 instance_name2)
		export ReportFile=/opt/oracle/product/diag/rdbms/database_name/instance_name2/database_name_monitor_alert_log.report
 		;;
	 instance_name3)
		export ReportFile=/opt/oracle/product/diag/rdbms/database_name/instance_name3/database_name_monitor_alert_log.report
 		;;
	 instance_name4)
		export ReportFile=/opt/oracle/product/diag/rdbms/database_name/instance_name4/database_name_monitor_alert_log.report
 		;;
esac

echo 'Oracle base is : ' $ORACLE_BASE
echo 'Oracle home is : ' $ORACLE_HOME
echo 'Oracle SID is  : ' $ORACLE_SID
echo 'Path is        : ' $PATH
echo 'Report file is : ' $ReportFile

#
## If the hour is "15" then look back 24 hours and mail out unconditionally 
# 
TimeOfDay=`date +"%H"` if  [ $TimeOfDay == 15 ] 
	then
		mandatory_mail="Y"
		look_back=1440
	else
		mandatory_mail="N"
		look_back=90
fi		
echo 'Mandatory mail is set to   : ' $mandatory_mail
echo 'Look back period is set to : ' $look_back

#
## Call routine to run SQLs
#
echo 'Script started at ' `date`
run_sql

#
## Decide if an email is required or not # decide_to_email

#
## End of script
#
echo 'Script ended   at ' `date`
142
51
73
77