CloudFormation 101 – Part 01

Some quick blog posts on using CloudFormation to provision AWS infrastructure as code. I have added descriptive comments inline to explain what each sub-unit of code is doing.

This YAML code creates a VPC, an internet gateway, and attaches the gateway to the VPC.

Some definitions:
VPC
Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network that you’ve defined. This virtual network closely resembles a traditional network that you’d operate in your own data center, with the benefits of using the scalable infrastructure of AWS.

Internet Gateway
An internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between your VPC and the internet.

VPC Gateway Attachment
Attaches an internet gateway or a virtual private gateway to a VPC, enabling connectivity between the internet and the VPC.

The code is below:

AWSTemplateFormatVersion: '2010-09-09'
#
## The Description section (optional) enables you to include comments about your template.
#
Description:  
  Create VPC, an internet gateway, and attach the internet gateway to the VPC
#
## Parameters section to customize your templates
#
Parameters:
  VPCName:
    Description: Name of the VPC
    Type: String
    Default: "MyVPC"    
    MinLength: '1'
    MaxLength: '30'
    AllowedPattern: '^[a-zA-Z]+[0-9a-zA-Z\-]*$'
    ConstraintDescription: Must start with a letter and contain only letters, numbers, and hyphens.

  VpcCIDR:
    Description: Please enter the IP range (CIDR notation) for this VPC
    Type: String
    Default: 10.0.0.0/16    
    MinLength: '10'
    MaxLength: '18'
    AllowedPattern: "(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})"
    ConstraintDescription: Must be a valid CIDR range of the form x.x.x.x/x.

#
## Resources created by the stack
#
Resources:
  #
  ## Create the VPC
  ##
  ## Uses the intrinsic function Ref to get the value of the VPC Name
  ## from parameters above
  #
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCIDR
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Ref VPCName
  #
  ## Create the IGW
  #
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Ref VPCName
  #
  ## Connect the IGW to the VPC
  #
  InternetGatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      InternetGatewayId: !Ref InternetGateway
      VpcId: !Ref VPC

#
## Outputs returned by the stack
##
## Uses the intrinsic function Sub with the AWS::StackName
## pseudo parameter to build the exported names
#          
Outputs:
  VPC:
    Description: Name of the VPC
    Value: !Ref VPC
    Export:
      Name: !Sub '${AWS::StackName}'

  InternetGateway:
    Description: Internet Gateway 
    Value: !Ref InternetGateway
    Export:
      Name: !Sub '${AWS::StackName}-InternetGateway'
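
As a quick sanity check before deploying, the two AllowedPattern expressions above can be exercised locally. Below is an illustrative Python sketch, not part of the template; it assumes CloudFormation matches AllowedPattern against the whole value, which re.fullmatch emulates:

```python
import re

# Patterns copied verbatim from the template's Parameters section.
VPC_NAME_PATTERN = r'^[a-zA-Z]+[0-9a-zA-Z\-]*$'
VPC_CIDR_PATTERN = r'(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})'

def valid_vpc_name(name):
    """Mimic the VPCName constraint: starts with a letter, then letters/digits/hyphens."""
    return re.fullmatch(VPC_NAME_PATTERN, name) is not None and 1 <= len(name) <= 30

def valid_cidr(cidr):
    """Mimic the VpcCIDR constraint (format only; it does not range-check the octets)."""
    return re.fullmatch(VPC_CIDR_PATTERN, cidr) is not None and 10 <= len(cidr) <= 18

print(valid_vpc_name("MyVPC"), valid_cidr("10.0.0.0/16"))   # True True
print(valid_vpc_name("1BadName"), valid_cidr("10.0.0.0"))   # False False
```

Note the regex is a format check only: a value like 999.0.0.0/16 would pass, since CloudFormation AllowedPattern cannot range-check octets.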
      

Finding the sizes of LOBs and listing the associated table and columns

I created the below SQL to identify the largest LOB objects and associate them back to the respective table and column:

set pages 99
set lines 200
col owner           format a12
col table_name      format a25
col column_name     format a25
col segment_name    format a30
col partition_name  format a15
SELECT      tc.owner,
            tc.table_name,
            tc.column_name,
            lb.segment_name,
            sg.partition_name,
            Round(( sg.bytes / (1024*1024*1024)),2) AS size_gb
FROM        dba_tab_cols tc
inner join  dba_lobs lb
        ON  lb.owner        =  tc.owner
       AND  lb.table_name   =  tc.table_name
       AND  lb.column_name  =  tc.column_name
inner join  dba_segments sg
        ON  sg.segment_name =  lb.segment_name
WHERE       tc.owner        = 'schema owner here'
  AND       tc.data_type LIKE '%LOB%'
  and       Round(( sg.bytes / (1024*1024*1024)),2) > 50
ORDER  BY 
            size_gb DESC,
            tc.owner,
            tc.table_name,
            tc.column_name,
            lb.segment_name
;          
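
The size arithmetic in the query is just bytes scaled to gigabytes. A tiny Python sketch of the same rounding, using the 50 GB threshold from the WHERE clause (illustrative numbers, not real segment sizes):

```python
def bytes_to_gb(num_bytes):
    """Same conversion as Round(sg.bytes / (1024*1024*1024), 2) in the SQL."""
    return round(num_bytes / (1024 * 1024 * 1024), 2)

# A segment just over the query's 50 GB filter threshold.
seg_bytes = 55 * 1024**3
print(bytes_to_gb(seg_bytes))        # 55.0
print(bytes_to_gb(seg_bytes) > 50)   # True
```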

ORA-00059: maximum number of DB_FILES exceeded

I got paged over the weekend for an Enterprise Edition Release 19.0.0.0.0 RAC database running out of space. When I attempted to add a new data file with the command:

alter tablespace USERS add datafile '+DATA' size 31G;

the SQL failed with the error:

ERROR at line 1:
ORA-00059: maximum number of DB_FILES exceeded

I checked the DB_FILES parameter with the command:

SQL> show parameter db_files;
 
NAME           TYPE        VALUE
-------------- ----------- -----
db_files       integer     100

I checked the number of data files with the below SQL:

SQL> select count(*) from dba_data_files;
 
  COUNT(*)
----------
       100

This parameter is not dynamically modifiable, so I issued the command:

alter system set db_files=125 scope=spfile sid='*';

and attempted to do a rolling bounce with:

srvctl stop instance -d PRODDB -i PRODDB4 -o immediate

This resulted in the error

CRS-2974: unable to act on resource 'ora.PRODDB.db' on server 'prodbox87' because that would require stopping or relocating resource 'ora.PRODDB.PRODDB_svc.svc' but the appropriate force flag was not specified

I checked the status of the services with:

srvctl status service -d PRODDB
Service PRODDB_svc is running on instance(s) PRODDB1,PRODDB2,PRODDB3,PRODDB4

Since the service was configured to run on the other nodes, I stopped the service on node 4 with:

srvctl stop service -d PRODDB -i PRODDB4

and retried the rolling bounce. The instance shut down successfully but did not come back up because of the below error:

srvctl start instance -d PRODDB -i PRODDB4
PRCR-1013 : Failed to start resource ora.PRODDB.db
PRCR-1064 : Failed to start resource ora.PRODDB.db on node prodbox87
CRS-5017: The resource action "ora.PRODDB.db start" encountered the following error:
ORA-01105: mount is incompatible with mounts by other instances
ORA-01174: DB_FILES is 125 buts needs to be 100 to be compatible
. For details refer to "(:CLSN00107:)" in "/oracle/product/diag/crs/prodbox87/trace/crsd_oraagent_oracle.trc".
 
CRS-2674: Start of 'ora.PRODDB.db' on 'prodbox87' failed

At this point I was forced to perform a bounce of all nodes of the RAC after which I was able to add a new data file.
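
A simple guard could have caught this before the page: compare the count from DBA_DATA_FILES against DB_FILES and alert when headroom runs low. A hypothetical Python sketch (the function names and the 10% threshold are my own, not from any monitoring product):

```python
def datafile_headroom(db_files, datafile_count):
    """Number of datafiles that can still be added before ORA-00059."""
    return db_files - datafile_count

def needs_attention(db_files, datafile_count, pct_threshold=10.0):
    """True when fewer than pct_threshold percent of DB_FILES slots remain."""
    return datafile_headroom(db_files, datafile_count) < db_files * pct_threshold / 100

# The values from the incident: db_files=100 and 100 datafiles already allocated.
print(datafile_headroom(100, 100))   # 0
print(needs_attention(100, 100))     # True
```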

Moving a table from one user/schema to another

I had a requirement to move a table from one user (i.e. schema) to another. As this is not possible with an ALTER, the usual approach is CREATE TABLE AS or an export-import. Depending on the volume of data, this may take a long time. I found the interesting approach below via a Google search and decided to try it:

Can you move a table between schemas ?

Many thanks to the person that identified this approach.

My test was as follows:

As sysdba, I created the below users. The assumption is that user JIM needs to be dropped, but his table JIMS_TABLE needs to be transferred to user BOB. User MIKE has SELECT on JIM.JIMS_TABLE. The grant on DBA_TABLES to JIM exists only to facilitate the creation of a test table; it has no other relevance to the test.

create user jim identified by N0t_Jim#01#  TEMPORARY TABLESPACE TEMP DEFAULT TABLESPACE PRODTSPACE;
grant connect to jim;
GRANT UNLIMITED TABLESPACE TO jim;
grant create table to jim;
grant select on dba_tables to jim;
 
 
create user bob identified by N0t_Bob#01#  TEMPORARY TABLESPACE TEMP DEFAULT TABLESPACE PRODTSPACE;
grant connect to bob;
GRANT UNLIMITED TABLESPACE TO bob;
grant create table to bob;
grant select on dba_tables to bob;
 
 
create user mike identified by N0t_Mike#01#  TEMPORARY TABLESPACE TEMP DEFAULT TABLESPACE PRODTSPACE;
grant connect to mike;

As user JIM, create a table and grant SELECT to user MIKE.

create table jim.jims_table
tablespace PRODTSPACE
as
select * from dba_tables
;
 
GRANT SELECT ON JIM.JIMS_TABLE TO MIKE;

The table now exists

col owner format a5;
col table_name format a10;
select
owner, table_name, tablespace_name, partitioned, row_movement
from dba_tables
where owner in ('JIM','BOB')
;
 
OWNER TABLE_NAME TABLESPACE_NAME                PAR ROW_MOVE
----- ---------- ------------------------------ --- --------
JIM   JIMS_TABLE PRODTSPACE                     NO  DISABLED

With the following grants:

set pages 99;
col grantee  format a10;
col owner format a10;
col table_name format a10;
col grantor  format a10;
col PRIVILEGE format a17;
 
SELECT GRANTEE, OWNER, TABLE_NAME, GRANTOR, PRIVILEGE
FROM DBA_TAB_PRIVS
WHERE  owner in ('JIM','BOB')
order by grantee
;
 
GRANTEE    OWNER      TABLE_NAME GRANTOR    PRIVILEGE
---------- ---------- ---------- ---------- -----------------
BOB        JIM        JIMS_TABLE JIM        ALTER
BOB        JIM        JIMS_TABLE JIM        DELETE
BOB        JIM        JIMS_TABLE JIM        INDEX
BOB        JIM        JIMS_TABLE JIM        INSERT
BOB        JIM        JIMS_TABLE JIM        SELECT
BOB        JIM        JIMS_TABLE JIM        UPDATE
BOB        JIM        JIMS_TABLE JIM        FLASHBACK
BOB        JIM        JIMS_TABLE JIM        REFERENCES
BOB        JIM        JIMS_TABLE JIM        READ
BOB        JIM        JIMS_TABLE JIM        ON COMMIT REFRESH
BOB        JIM        JIMS_TABLE JIM        QUERY REWRITE
BOB        JIM        JIMS_TABLE JIM        DEBUG
MIKE       JIM        JIMS_TABLE JIM        SELECT

In preparation to move the table, user JIM grants all on the table to the future new owner BOB.

GRANT ALL ON JIM.JIMS_TABLE TO BOB;

Next, JIM generates the DDL of the table

set long 9999999 pages 0 heading off pagesize 0;
SELECT dbms_metadata.get_ddl('TABLE','JIMS_TABLE') FROM dual;

Give the DDL to user BOB, change the table owner, and add a partitioning clause (see the last line of the DDL). To reduce the size of the DDL shown here, I have deleted many of the columns.

  CREATE TABLE "BOB"."JIMS_TABLE"
   (    "OWNER" VARCHAR2(128) NOT NULL ENABLE,
        "TABLE_NAME" VARCHAR2(128) NOT NULL ENABLE,
        "TABLESPACE_NAME" VARCHAR2(30),
        .
        .
        .
        Columns deleted
        .
        .
        .
        "DATA_LINK_DML_ENABLED" VARCHAR2(3),
        "LOGICAL_REPLICATION" VARCHAR2(8)
   ) SEGMENT CREATION IMMEDIATE
  PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
 NOCOMPRESS LOGGING
  STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
  PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
  BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
  TABLESPACE "PRODTSPACE"
  partition BY hash (OWNER) (partition p1);

We now have two tables

col owner format a5;
col table_name format a10;
select
owner, table_name, tablespace_name, partitioned, row_movement
from dba_tables
where owner in ('JIM','BOB')
;
 
OWNER TABLE_NAME TABLESPACE_NAME                PAR ROW_MOVE
----- ---------- ------------------------------ --- --------
JIM   JIMS_TABLE PRODTSPACE                     NO  DISABLED
BOB   JIMS_TABLE                                YES DISABLED

Now to swap the data from JIM's table into the new BOB table

ALTER TABLE BOB.JIMS_TABLE exchange partition p1 WITH TABLE JIM.JIMS_TABLE;

The data is now in the new table

SELECT COUNT(*) FROM JIM.JIMS_TABLE;
SELECT COUNT(*) FROM BOB.JIMS_TABLE;
 
SQL> SELECT COUNT(*) FROM JIM.JIMS_TABLE;
 
  COUNT(*)
----------
         0
 
SQL> SELECT COUNT(*) FROM BOB.JIMS_TABLE;
 
  COUNT(*)
----------
      2228

The grants, which still reference the old table as shown below, will have to be re-issued against the new table BOB.JIMS_TABLE:

GRANTEE    OWNER      TABLE_NAME GRANTOR    PRIVILEGE
---------- ---------- ---------- ---------- -----------------
BOB        JIM        JIMS_TABLE JIM        ALTER
BOB        JIM        JIMS_TABLE JIM        DELETE
BOB        JIM        JIMS_TABLE JIM        INDEX
BOB        JIM        JIMS_TABLE JIM        INSERT
BOB        JIM        JIMS_TABLE JIM        SELECT
BOB        JIM        JIMS_TABLE JIM        UPDATE
BOB        JIM        JIMS_TABLE JIM        REFERENCES
BOB        JIM        JIMS_TABLE JIM        READ
BOB        JIM        JIMS_TABLE JIM        ON COMMIT REFRESH
BOB        JIM        JIMS_TABLE JIM        QUERY REWRITE
BOB        JIM        JIMS_TABLE JIM        DEBUG
BOB        JIM        JIMS_TABLE JIM        FLASHBACK
MIKE       JIM        JIMS_TABLE JIM        SELECT
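
Since the privileges above still name JIM as owner and grantor, each one has to be re-issued against BOB.JIMS_TABLE. A hypothetical Python sketch that turns (grantee, privilege) rows like the DBA_TAB_PRIVS output into the new GRANT statements (the sample rows are abbreviated for illustration):

```python
# A few (grantee, privilege) pairs as they appear in the DBA_TAB_PRIVS output above.
old_grants = [
    ("BOB", "SELECT"),
    ("BOB", "UPDATE"),
    ("MIKE", "SELECT"),
]

def regrant_statements(grants, new_owner="BOB", table="JIMS_TABLE"):
    """Build GRANT statements against the new owner's copy of the table."""
    return [
        f"GRANT {priv} ON {new_owner}.{table} TO {grantee};"
        for grantee, priv in grants
        if grantee != new_owner  # the new owner needs no grant on his own table
    ]

for stmt in regrant_statements(old_grants):
    print(stmt)
# GRANT SELECT ON BOB.JIMS_TABLE TO MIKE;
```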

Commemorative Air Force Event – Colorado Springs

This week I had the opportunity to see two amazing airplanes restored to flying condition: “Fifi”, a Boeing B-29 Superfortress, and “Diamond Lil”, a B-24 Liberator. Each is one of only two of its type still flying anywhere in the world. These two very special aircraft were accompanied by a P-51 Mustang and a T-6 Texan.

B-29 Engine start

B-29 Taxiing

B-24 Taxiing

P-51 Mustang Taxiing

Executing 7-Zip from the command line and protecting the archive with a password

My past practice for backing up my personal files was to copy them to a couple of USB drives and store the drives in a fireproof safe. However, with the continued drought and increased risk of fire in the Western States, I began evaluating a cloud-based backup solution. I must confess I am very wary of such a solution, as I am placing my data (files) into the hands of a third party.

In my first attempt, I turned on passwords in all the software packages that I use so that the underlying data files would be password protected. However, I continued to feel uneasy about this and wanted another layer of security. I also needed the ability to pick and choose certain files from different directories for the cloud backup. I decided to use 7-Zip as it allowed me to:

a. create a Windows shell script that I could run with a mouse click
b. add files from multiple locations
c. protect the encrypted archive with a password

You can download 7-Zip from here

Download 7-Zip

After you install 7-Zip, you can use it via its GUI. I, however, wanted to do this via a shell script. On Windows 10, under the c:\Program Files (x86)\7-Zip directory you will find the executable 7z.exe. You can either add this location to your Windows path variable or directly reference the executable as I have done below.

The commands to add files to an archive are:

"c:\Program Files (x86)\7-Zip\7z.exe" a C:\7zip-archive-location\output-archive.7z "C:\location-01\file-01.txt" -pYour_Super_Secret_Password

"c:\Program Files (x86)\7-Zip\7z.exe" a C:\7zip-archive-location\output-archive.7z "C:\location-02\file-02.txt" -pYour_Super_Secret_Password

"c:\Program Files (x86)\7-Zip\7z.exe" a C:\7zip-archive-location\output-archive.7z "C:\location-03\file-03.txt" -pYour_Super_Secret_Password

The command is broken down as follows:

"c:\Program Files (x86)\7-Zip\7z.exe" 
The location of the executable. Since the location has an embedded space, the entire text is enclosed in double-quotes. For simplicity, I have enclosed all the file locations in double-quotes.

a
Parameter to create (add to) an archive (zip file) 

C:\7zip-archive-location\output-archive.7z
Location and name of the archive  

"C:\location-01\file-01.txt"
The file(s) you want to be included in the archive. Wildcards are permitted.

-pYour_Super_Secret_Password
The password for the file. Note that this parameter is preceded by a hyphen and the password follows the “p” parameter without a space.

After the archive is created, I added the below command to list all the files in it:

"c:\Program Files (x86)\7-Zip\7z.exe" l C:\7zip-archive-location\output-archive.7z

The command is broken down as follows:

"c:\Program Files (x86)\7-Zip\7z.exe" 
The location of the executable. Since the location has an embedded space, the entire text is enclosed in double-quotes. For simplicity, I have enclosed all the file locations in double-quotes.

l
Parameter to list the files within an archive (zip file) 

C:\7zip-archive-location\output-archive.7z
Location and name of the archive  
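
To script several of these calls without repeating the long paths, the command can also be assembled programmatically. An illustrative Python sketch (the paths and password are placeholders from the examples above; the command is only constructed and printed here, though subprocess.run(cmd) would execute it):

```python
SEVEN_ZIP = r"c:\Program Files (x86)\7-Zip\7z.exe"

def build_add_command(archive, source, password):
    """Assemble the '7z a' command; passing a list avoids manual quoting of spaces."""
    return [SEVEN_ZIP, "a", archive, source, f"-p{password}"]

cmd = build_add_command(
    r"C:\7zip-archive-location\output-archive.7z",
    r"C:\location-01\file-01.txt",
    "Your_Super_Secret_Password",
)
print(cmd[1])    # a
print(cmd[-1])   # -pYour_Super_Secret_Password
```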

CRON not executing scripts – Time zone issue

I ran into a strange issue today on a SunOS 5.11 11.4.24.75.2 sun4v sparc sun4v server. When I used the “date” command, I would see the current date and time displayed as (for example) 15:25 ET. I then created a CRON entry for 15:35 ET. However, the script would not execute at the time defined in CRON. I checked the usual permissions etc., but nothing worked. After some trial and error, a colleague identified that the user’s profile had the following entries:

TZ="EST5EDT"
export TZ

The server was actually running GMT, but the time was being displayed in ET. The scripts I was scheduling at the displayed time (ET) were not executing because CRON was scheduling the jobs in GMT.

After the above two lines were commented out, and the CRON entries defined in GMT, the scripts executed at the expected times.
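
The effect is easy to reproduce: the same instant in time renders differently depending on the TZ variable, which is exactly the gap between what date showed and when CRON fired. A small Python sketch (Unix only, since time.tzset is not available on Windows):

```python
import os
import time

def local_hour(tz, epoch=0.0):
    """Render the same epoch instant under a given TZ setting, as date/CRON would see it."""
    os.environ["TZ"] = tz
    time.tzset()  # re-read the TZ variable; Unix only
    return time.localtime(epoch).tm_hour

utc_hour = local_hour("UTC")       # 0  -> 1970-01-01 00:00 UTC
est_hour = local_hour("EST5EDT")   # 19 -> 1969-12-31 19:00 EST (UTC-5 in winter)
print(utc_hour, est_hour)
# The 5-hour wall-clock gap is why jobs scheduled in displayed (ET) time fired "late".
print((utc_hour - est_hour) % 24)  # 5
```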

Large number of deletes causing table fragmentation resulting in poor query performance

I was asked to look into a nightly script that had significantly increased in elapsed time. The script was rather simple in that it deleted rows older than (today’s date – 3 months) from multiple tables. I ran an ADDM report on the database (12.1.0.2.0 – 64bit Production), and one of the deletes popped up in the report:

   Action
      Run SQL Tuning Advisor on the DELETE statement with SQL_ID
      "75csungpxhvv6".
      Related Object
         SQL statement with SQL_ID 75csungpxhvv6.
         delete from PROD_SCHEMA.PROD_TABLE where ORDER_DT<20201212
   Rationale
      The SQL spent 100% of its database time on CPU, I/O and Cluster waits.
      This part of database time may be improved by the SQL Tuning Advisor.
   Rationale
      Database time for this SQL was divided as follows: 100% for SQL
      execution, 0% for parsing, 0% for PL/SQL execution and 0% for Java
      execution.
   Rationale
      I/O and Cluster wait for INDEX "PROD_SCHEMA.PROD_TABLE_ORDER_DT" with object ID
      322364 consumed 100% of the database time spent on this SQL statement.

The table in question was not large: a few hundred thousand rows with a reasonable row length and no large VARCHAR columns. The obvious issues were quickly eliminated: the table had current stats, there was an index exactly matching the predicate, and the optimizer was choosing the index. However, I did notice that the table was occupying 28 GB of space. A quick calculation of the number of rows multiplied by the average row length indicated that it should occupy approximately 1 GB or less.

I ran the below query and confirmed that most of the space was empty

set line 200;
set pages 9999;
col owner                format a15               heading "Owner"
col table_name           format a30               heading "Table Name"
col avg_row_len          format 999,999,999       heading "Avg Row Len"
col num_rows             format 999,999,999,999   heading "Number of Rows"
col TOTAL_SIZE_MB        format 999,999,999       heading "Total|Size|(MB)"
col ACTUAL_SIZE_MB       format 999,999,999       heading "Actual|Size|(MB)"
col FRAGMENTED_SPACE_MB  format 999,999,999       heading "Frag|Size|(MB)"
col percentage           format 999.99            heading "Per|Cent|Frag"
SELECT   owner
        ,table_name
        ,avg_row_len
        ,num_rows
        ,Round((( blocks * 8 / 1024 )), 2) "TOTAL_SIZE_MB"
        ,Round(( num_rows * avg_row_len / 1024 / 1024 ), 2)  "ACTUAL_SIZE_MB"
        ,Round(( ( blocks * 8 / 1024 ) - ( num_rows * avg_row_len / 1024 / 1024 ) ), 2) "FRAGMENTED_SPACE_MB"
        ,Round(( Round(( ( blocks * 8 / 1024 ) - ( num_rows * avg_row_len / 1024
                                                  / 1024
                                                ) ), 2
                     )
                     / Round((( blocks * 8 / 1024 )), 2) ) * 100, 2)  "percentage"
FROM     dba_tables
WHERE    num_rows > 0
  and    table_name = 'PROD_TABLE'
  and    owner = 'PROD_SCHEMA'
order by FRAGMENTED_SPACE_MB desc 
;
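
The arithmetic in that query assumes an 8 KB block size (the blocks * 8 / 1024 term). A Python sketch of the same math, with illustrative numbers chosen to resemble the case described (about 28 GB allocated against well under 1 GB of live data; they are not the real stats):

```python
def fragmentation(blocks, num_rows, avg_row_len, block_kb=8):
    """Mirror the query's math: allocated vs. actual size in MB, plus percent wasted."""
    total_mb = round(blocks * block_kb / 1024, 2)                 # allocated blocks
    actual_mb = round(num_rows * avg_row_len / 1024 / 1024, 2)    # estimated live data
    frag_mb = round(total_mb - actual_mb, 2)
    pct = round(frag_mb / total_mb * 100, 2)
    return total_mb, actual_mb, frag_mb, pct

total, actual, frag, pct = fragmentation(blocks=3_670_016, num_rows=300_000, avg_row_len=2_000)
print(total, actual, pct)   # 28672.0 572.2 98.0 -> ~98% of the allocated space is empty
```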

We were able to get a small outage from the application team to export (via Data Pump), truncate, and re-import the data into the table. After this, the query that used to take almost 8 minutes started executing in less than a second.

ORA-38882 on standby database

We ran into an interesting situation recently when MRP failed on a standby with the error:

ORA-38882: Cannot drop tablespace UNDOTBS2 on standby database due to guaranteed restore points.

The background was that we had to remove a node from our primary RAC to upgrade the server. We did not remove any nodes from the standby. After the node was removed from the primary, MRP failed on the standby with the above error. As far as we could remember or check, we did not have a guaranteed restore point on the database. A Google search led us to Mike Dietrich's blog, where he had responded to a similar error in a different context. The link is:

Mike Dietrich’s Blog About Oracle Database

We were not using a PDB, but since we did not have any other ideas, we decided to try Mike Dietrich’s solution and issued the following alter:

alter system set "_allow_drop_ts_with_grp"=true;

After the above alter, the database was bounced, and MRP started successfully.
