Installing the Oracle Client 12c on an AWS Linux EC2 – Silent Mode

After the EC2 instance is built, run the below commands as root. Depending on your network security, you may have to either set or unset proxy variables before yum can reach the repositories.
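If the instance must reach the yum repositories through a proxy, a minimal sketch (the proxy host and port are assumptions):

# Route yum and other tools through a proxy (hypothetical host/port)
export http_proxy=http://proxy.example.com:8080
export https_proxy=http://proxy.example.com:8080

# Or, if the instance has direct access, make sure the variables are unset
unset http_proxy https_proxy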

/usr/bin/yum install -y binutils.x86_64 compat-libcap1.x86_64 compat-libstdc++-33.i686 \
compat-libstdc++-33.x86_64 gcc.x86_64 gcc-c++.x86_64 glibc-devel.i686 glibc-devel.x86_64 glibc.i686 \
glibc.x86_64 libXtst.x86_64 libXtst.i686 ksh.x86_64 libaio.x86_64 libaio.i686 libaio-devel.i686 \
libaio-devel.x86_64 libgcc.i686 libgcc.x86_64 libstdc++.x86_64 libstdc++.i686 libstdc++-devel.i686 \
libstdc++-devel.x86_64 libXi.i686 libXi.x86_64 make.x86_64 sysstat.x86_64

The above command installs all of the prerequisites required for Oracle Client 12c.

Create the install group

/usr/sbin/groupadd oinstall

Create the dba group

/usr/sbin/groupadd dba

Create the oracle user and add to the groups

/usr/sbin/useradd -g oinstall -G dba oracle

Set the oracle password

passwd oracle

Respond to the above prompt with the oracle password that you want to use.

Make a directory for the software

mkdir -p /apps/opt/oracle

Change ownership of the directory to the oracle user and the oinstall group

chown oracle:oinstall /apps/opt/oracle

Change to oracle user

su - oracle

Copy the software into the above directory via ftp, sftp, scp, etc. The directory will then contain the file

linuxamd64_12102_client.zip
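For example, a minimal scp sketch (the staging host and source path are assumptions):

# Copy the client zip from a staging host onto the EC2 instance
scp /staging/linuxamd64_12102_client.zip oracle@ec2-host:/apps/opt/oracle/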

Unzip the software

unzip linuxamd64_12102_client.zip

Change directory to the client directory that was unzipped.

cd /apps/opt/oracle/client

Run the installer in silent mode

./runInstaller -silent \
 -responseFile /apps/opt/oracle/client/response/client_install.rsp    \
   oracle.install.client.installType=Runtime           \
   UNIX_GROUP_NAME=dba                                 \
   INVENTORY_LOCATION=/apps/opt/oracle/oraInventory            \
   SELECTED_LANGUAGES=en                               \
   ORACLE_HOME=/apps/opt/oracle/base/rtclient64            \
   ORACLE_BASE=/apps/opt/oracle/base                           \
   waitForCompletion=true

At the end of the installation, you should see the message

The installation of Oracle Client 12c was successful.
Please check '/apps/opt/oracle/oraInventory/logs/silentInstall2017-08-30_06-00-25PM.log' for more details

Exit from oracle back to root

exit

As the root user, execute the following script

/apps/opt/oracle/oraInventory/orainstRoot.sh

The above script will display messages similar to

Changing permissions of /apps/opt/oracle/oraInventory.
Adding read,write permissions for group.
Removing read,write,execute permissions for world.

Changing groupname of /apps/opt/oracle/oraInventory to dba.
The execution of the script is complete.

Set up environment variables for the oracle user. Go to the oracle user’s home

su - oracle
cd

Edit the oracle user’s profile

vi .bash_profile

Set up the below variables to make it easier for users to access sqlplus and the other client tools.

# Oracle client variables
export ORACLE_HOME=/apps/opt/oracle/base/rtclient64
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$LD_LIBRARY_PATH
export PATH=$ORACLE_HOME/bin:$PATH

Source the profile and confirm that sqlplus is on the PATH

. ~/.bash_profile
which sqlplus

Test sqlplus

sqlplus user_id

SQL*Plus: Release 12.1.0.2.0 Production on Wed Aug 30 18:23:27 2017

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Enter password: 

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining and Real Application Testing options

Moving a database to a new server while attempting to use the old remote catalog

As part of a hardware upgrade, I moved a database to a new RAC. I was attempting to use my backup scripts from the old RAC. The scripts contained connections to a remote catalog that I wanted to reuse. On the new RAC, I attempted to register and resync the catalog and got the below error messages:

[new_server-DBNAME2] rman catalog=rman_cat/rman_password@remote_catalog target=/ 

Recovery Manager: Release 11.2.0.4.0 - Production on Fri Sep 8 10:47:33 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DBNAME (DBID=599767446)
connected to recovery catalog database

RMAN> register database;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of register command on default channel at 09/08/2017 10:47:44
RMAN-20002: target database already registered in recovery catalog

RMAN> RESYNC CATALOG;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of resync command on default channel at 09/08/2017 10:47:51
RMAN-20020: database incarnation not set

To resolve this issue, I executed the commands:

RMAN> unregister database;

database name is "DBNAME" and DBID is 599767446

Do you really want to unregister the database (enter YES or NO)? yes
database unregistered from the recovery catalog

After the database was unregistered, I was able to register the database:

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

RMAN> RESYNC CATALOG;

starting full resync of recovery catalog
full resync complete

The resync catalog command may be redundant, but it never hurts to make sure.

AWS – EC2 with Oracle for an I/O intensive application – Part 02

This week one of our applications is benchmarking the performance of an Oracle database running on an Exadata machine versus the same database running in the AWS cloud on an x1.16xlarge EC2 instance. Our initial thought was to define the storage as two ASM disk groups, each consisting of four 400 GB disks provisioned at 6,000 IOPS apiece. Assuming that a given data file was spread across all four disks in a disk group, we could see speeds approaching 20,000 IOPS per disk group.

During testing, Amazon recommended that the ASM data be placed on single disks provisioned at 20,000 IOPS, which is the current maximum that can be provisioned per volume. Additionally, AWS requires that the IOPS:GB ratio not exceed 50:1. To meet these requirements, we created two mount points, each of 1,000 GB at 20,000 IOPS. An additional 1,000 GB volume at 20,000 IOPS was added for the archive logs.

We created a JSON template to spin up the three disks and attach them to the existing EC2 instance. Before attaching, make sure that the device names are not already in use by currently attached disks. After the volumes were attached and mounted, we added them to ASM and waited for the rebalance to complete. Once the rebalance finished, we dropped the old disks that were provisioned at 6,000 IOPS. The application saw a 67% decrease in elapsed time on some of its queries.
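We drove this from the JSON template, but the equivalent AWS CLI calls look roughly like the sketch below (the availability zone, volume and instance IDs, and device name are assumptions):

# Create a 1,000 GB provisioned-IOPS (io1) volume at 20,000 IOPS
aws ec2 create-volume \
    --availability-zone us-east-1a \
    --volume-type io1 \
    --iops 20000 \
    --size 1000

# Attach the new volume to the existing EC2 instance (IDs are hypothetical)
aws ec2 attach-volume \
    --volume-id vol-0123456789abcdef0 \
    --instance-id i-0123456789abcdef0 \
    --device /dev/sdf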

RMAN – Mismatched objects

One of my production databases was scripted to write backups to /backup01. For reasons not germane to this post, this mount point became full and I had to move the backups to /backup02. I copied the old backups and modified the script to write to /backup02 starting with the next run. However, the next execution gave me the below error.

RMAN-06207: WARNING: 1 objects could not be deleted for DISK channel(s) due
RMAN-06208:          to mismatched status.  Use CROSSCHECK command to fix status
RMAN-06210: List of Mismatched objects
RMAN-06211: ==========================
RMAN-06212:   Object Type   Filename/Handle
RMAN-06213: --------------- ---------------------------------------------------
RMAN-06214: Backup Piece    /backup01/GPn4sbh1bm_1_1.bus

This was a direct result of moving the backups from one location to another outside of RMAN. To resolve the error, I issued a crosscheck

RMAN> crosscheck backuppiece '/backup01/GPn4sbh1bm_1_1.bus';
 
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1011 instance=instance name device type=DISK
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/backup01/GPn4sbh1bm_1_1.bus RECID=125099 STAMP=951616886
Crosschecked 1 objects

followed by a delete obsolete

RMAN> delete obsolete;                         
 
RMAN retention policy will be applied to the command
RMAN retention policy is set to recovery window of 7 days
using channel ORA_DISK_1
Deleting the following obsolete backups and copies:
Type                 Key    Completion Time    Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set           125099 10-AUG-17         
  Backup Piece       125099 10-AUG-17          /backup01/GPn4sbh1bm_1_1.bus
 
Do you really want to delete the above objects (enter YES or NO)? y
deleted backup piece
backup piece handle=/backup01/GPn4sbh1bm_1_1.bus RECID=125099 STAMP=951616886
Deleted 1 objects

This resolved the issue.

Resolving a persistent locking issue hours after it occurred

A fellow DBA supported an application with a process that compiled a number of packages every night as part of a scheduled script. Sample command and the error it threw:

ALTER PACKAGE schema.package_name COMPILE PACKAGE;
 
ERROR at line 1:
ORA-04021: timeout occurred while waiting to lock object
 

Over the last couple of days, this script began to fail every night. I used the below SQL to identify the blocking session:

select   *
from     gv$active_session_history
where    sample_time > sysdate - (interval '1' day)
  and    sql_opname = 'ALTER PACKAGE'
  and    blocking_session is not null
order by sample_time
;

This SQL returned the information identifying the blocks, with the sample time exactly matching the script’s scheduled run time. I took the sample time, blocking session id, session serial# and instance id from that output and ran the below SQL (the values from the first query go after the equal signs):

select * 
from   gv$active_session_history
where sample_time between 
                  to_date('07-AUG-17 08.01 PM',
                          'dd-mon-yy hh.mi pm') 
                  and 
                  to_date('07-AUG-17 09.32 PM',
                          'dd-mon-yy hh.mi pm')
  and  session_id      = 
  and session_serial# = 
  and inst_id         = 
order by sample_time
;

This provided me with the last sql_id that the blocking session had executed.

select *
from   gv$sql
where  inst_id = 
  and  sql_id  = 'sql id'
;

The last SQL that the blocking session had executed happened to be a SELECT. However, working on the assumption that it had previously performed uncommitted DML, I killed the session, and the ALTER PACKAGE script was then able to execute successfully.
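A minimal sketch of terminating the blocker, with placeholder sid, serial# and instance values taken from the queries above (the @inst_id form targets a specific RAC instance):

# Kill the blocking session as SYSDBA (1234, 56789 and @1 are placeholders)
sqlplus -s / as sysdba <<'EOF'
alter system kill session '1234,56789,@1' immediate;
EOF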

AWS – EC2 with Oracle for an I/O intensive application

This morning we received a request to create an EC2 instance in AWS for an Oracle database. The application intending to benchmark on this instance tended to be very I/O intensive. In the past we had been using general-purpose disks rather than provisioned IOPS. To ensure that this application got the best I/O performance, the team wanted the disks provisioned at the highest possible IOPS rate. Amazon has a restriction that the IOPS:GB ratio cannot exceed 30:1. We were also striping the disks to improve I/O, such that each ASM disk group consisted of four disks. We did some math and came up with the below structure:

Group   IOPS per disk   # of disks   Size per disk (GB)   IOPS:GB ratio
Arch    3,000           4            100                  30:1
ASM1    6,000           4            200                  30:1
ASM2    6,000           4            200                  30:1

Total size = 2,000 GB
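A quick shell sanity check of the ratios and sizes in the table:

# Verify the IOPS:GB ratio and size of each group (numbers from the table above)
total=0
for spec in "Arch 3000 4 100" "ASM1 6000 4 200" "ASM2 6000 4 200"; do
    set -- $spec
    echo "$1: ratio $(( $2 / $4 )):1, group size $(( $3 * $4 )) GB"
    total=$(( total + $3 * $4 ))
done
echo "Total size = ${total} GB"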

Missing an hourly snapshot while running AWR?

Have you ever tried to run an AWR report only to find that your hourly snapshots did not get created? A colleague of mine ran into this and worked with Oracle to identify a solution:

select snap_id,BEGIN_INTERVAL_TIME,END_INTERVAL_TIME 
from dba_hist_snapshot 
where BEGIN_INTERVAL_TIME > sysdate - 1 
and instance_number = 1 
order by 1; 

SNAP_ID BEGIN_INTERVAL_TIME END_INTERVAL_TIME 
---------- ------------------------------ ------------------------------ 
239690 2017-06-28-01.17.23.246000 2017-06-28-02.00.34.302000 
239691 2017-06-28-02.00.34.302000 2017-06-28-03.00.28.007000 
239692 2017-06-28-03.00.28.007000 2017-06-28-04.00.39.460000 
239693 2017-06-28-04.00.39.460000 2017-06-28-05.00.03.694000 
239694 2017-06-28-05.00.03.694000 2017-06-28-06.00.13.452000 <--missing 
239696 2017-06-28-07.00.10.342000 2017-06-28-08.00.20.270000 
239697 2017-06-28-08.00.20.270000 2017-06-28-09.00.09.222000 
239698 2017-06-28-09.00.09.222000 2017-06-28-10.00.39.143000 
239699 2017-06-28-10.00.39.143000 2017-06-28-11.00.55.633000 
239700 2017-06-28-11.00.55.633000 2017-06-28-12.00.02.801000 
239701 2017-06-28-12.00.02.801000 2017-06-28-13.00.12.972000 
239702 2017-06-28-13.00.12.972000 2017-06-28-19.21.20.515000 <--missing
12 rows selected.        
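Rather than eyeballing the output for gaps, a sketch that flags them by comparing each snapshot to the previous one with LAG (the 5-minute and 90-minute thresholds are assumptions; tune them to your snapshot interval):

# Report snapshots that start late or span an abnormally long interval
sqlplus -s / as sysdba <<'EOF'
select snap_id, begin_interval_time, end_interval_time
from  (select snap_id, begin_interval_time, end_interval_time,
              lag(end_interval_time) over (order by snap_id) as prev_end
       from   dba_hist_snapshot
       where  instance_number = 1)
where  begin_interval_time > prev_end + interval '5' minute
   or  end_interval_time   > begin_interval_time + interval '90' minute;
EOF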

The issue is caused by MMON being suspended – confirmed by this oradebug command (run in SQL*Plus as SYSDBA):

oradebug unit_test kebm_dmp_slv_attrs kewrmafsa_ 
Status: 3 
Flags: 0 
Runtime limit: 900 
CPU time limit: 300 
Violations: 4 
Suspended until: 1498817730  <-- non-zero indicates MMON is suspended

To clear the suspension, issue this command:

oradebug unit_test kebm_set_slv_attrs kewrmafsa_ retain retain retain retain 0 0
Modified attributes of kewrmafsa_ (slave id 12)

And now it shows zero and snapshots are being generated again:

oradebug unit_test kebm_dmp_slv_attrs kewrmafsa_
Status: 3
Flags: 0
Runtime limit: 900
CPU time limit: 300
Violations: 0
Suspended until: 0  <-- zero indicates MMON is running normally

In this case, the missing snapshots were caused by SYSAUX cleanup work that locked some of the tables snapshot generation needs, but the same symptom can occur for a variety of reasons.

Automate the move of a single instance Oracle database to a new EC2

This week we were challenged to come up with a solution that would allow a database administrator to move a single-instance Oracle database from one AWS EC2 machine to another. We were tasked with coming up with as much automation as possible to accomplish this goal.

A colleague had already written the required CloudFormation template to create the new EC2 instance and to install Oracle on it via an Ansible playbook. All of this was executed via Jenkins. The challenge now was to restore the database on the new EC2 instance. To automate the backup and restore part of the process, I created a Python script that accepted the following parameters as input:

  • Name of the database
  • Location of the script directory that contained maintenance scripts for the database
  • An AWS S3 bucket to be used for transporting the data

The Python script could be downloaded from an S3 bucket that we had set up as a code repository. Once the script had been downloaded onto the source server, the database administrator could run it with the above parameters. It would then create the required backup and restore scripts, copy the crontab entries, and stage the maintenance scripts for transfer to the new server.
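A hypothetical invocation on the source server (the script name and argument style are illustrations, not the actual interface):

# Database name, maintenance-script directory, and transport bucket
python move_oracle_db.py DBNAME /apps/oracle/scripts s3://transport-bucket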

The backup script backs up the database via RMAN (the database itself, archive logs, spfile and control file) and then uploads all of it to the S3 bucket.
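A condensed sketch of what the generated backup script does, assuming ORACLE_SID is set and the AWS CLI is configured (the backup path and bucket name are placeholders):

#!/bin/bash
# Back up the database, archive logs, spfile and control file
rman target / <<'EOF'
backup database plus archivelog;
backup spfile;
backup current controlfile;
EOF

# Upload the backup pieces to the transport bucket (hypothetical names)
aws s3 sync /backup01/DBNAME s3://transport-bucket/DBNAME/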

The next step would be to run the CloudFormation template (to create the new EC2 instance) and the Ansible playbook (to install Oracle, ASM, etc.). We added a parameter to the Ansible playbook to accept the name of the S3 bucket that holds the backups. At the end of the creation and install, the playbook would

  • Copy the script generated on the source server from the S3 bucket
  • Run the restore script, which would (see the sketch after this list)
    • Download all the files from the S3 bucket
    • Create the adump directory
    • Create a spfile from the backup
    • Create a control file
    • Restore and recover the database
    • Register the database with the listener
    • Add entry in /etc/oratab
    • Restore cron entries to the crontab
    • Copy the scripts from the source server to the target server into the same directory
    • Open the database in read/write mode
    • Log all of the above to a file for review

All of this worked as expected. The one weak link is the amount of time it takes to transfer the data to and from the S3 bucket. An alternative would be to FTP the data directly between the servers, but I wanted to try using an S3 bucket.
