Recovering a tablespace with a LOB tablespace/auxiliary table

Due to a miscommunication with the development team, the DBA team accidentally refreshed the data in an environment that was being used for a test cycle. The users asked us to recover the database back to the most recent copy, which we had taken earlier that morning. I created a REXX routine to generate the approximately 600 recover jobs needed to perform this as quickly as possible. Some of the jobs failed because the base table had a LOB column pointing to an auxiliary table in a LOB tablespace.

The table structure was defined as follows (the DDL has been edited to remove options that are not relevant):

     CREATE TABLE                                             
       PRODA.MY_TABLE_01                                      
        (                                                     
        ACCOUNT_NAME   CHAR(50) NOT NULL WITH DEFAULT          
         FOR SBCS DATA                                        
       ,ACCOUNT_TXT   CLOB(1 M) NOT NULL WITH DEFAULT          
         FOR SBCS DATA                                        
        )                                                     
         IN PRODAB00.TSPACE57  
         CCSID EBCDIC                                         
         NOT VOLATILE                                         
         APPEND NO                                                                           
     ;                                
     CREATE                           
       LOB TABLESPACE TSPACE58        
          IN PRODAB00
         USING STOGROUP PRODASMS      
         PRIQTY -1                    
         SECQTY -1                    
         ERASE NO                         
         GBPCACHE SYSTEM                  
         NOT LOGGED                       
         DSSIZE 4 G                       
         BUFFERPOOL BP32K1                
         LOCKSIZE ANY                     
         LOCKMAX SYSTEM                   
         CLOSE NO                         
     ;                                    
     CREATE AUXILIARY TABLE               
         PRODA.MY_TABLE_01_AUX_01         
         IN PRODAB00.TSPACE58             
         STORES PRODA.MY_TABLE_01         
         APPEND NO                        
         COLUMN ACCOUNT_TXT                
         PART 1                           
     ;                                    

The initial recover was coded as follows:

//SYSIN    DD  *                                   
  RECOVER  TABLESPACE  PRODAB00.TSPACE57           
           TORBA       X'0000000015D79FB6B7E0'     
/*                                                 

This generated the error:

DSNUGUTC -  RECOVER TABLESPACE PRODAB00.TSPACE57 TORBA X'0000000015D79FB6B7E0'
DSNU1316I -DBXC 102 15:41:03.83 DSNUCAIN - THE FOLLOWING TABLESPACES ARE MISSING FROM THE RECOVERY LIST
PRODAB00.TSPACE58

I changed the JCL as follows:

//SYSIN    DD  *                                  
  RECOVER  TABLESPACE  PRODAB00.TSPACE57          
           TABLESPACE  PRODAB00.TSPACE58          
           TORBA       X'0000000015D79FB6B7E0'    
/*                                                

And the recover completed:

DSNUCBMD - RECOVER TABLESPACE PRODAB00.TSPACE57   START
DSNUCBAL - THE IMAGE COPY DATA SET HLQ.PRODAB00.TSPACE57.NVE.F1010801 WITH DATE=20180411 AND TIME=071133 IS PARTICIPATING IN RECOVERY OF TABLESPACE PRODAB00.TSPACE57

DSNUCBMD - RECOVER TABLESPACE PRODAB00.TSPACE58   START                    
DSNUCBAL - THE IMAGE COPY DATA SET HLQ.PRODAB00.TSPACE58.NVE.F1010801 WITH DATE=20180411 AND TIME=071133 IS PARTICIPATING IN RECOVERY OF TABLESPACE PRODAB00.TSPACE58

Moving temp table spaces on a RAC with users online

Please see the blog post entitled “Moving undo table spaces on a RAC with users online” for some additional background. In summary, we had to free up space on a certain ASM disk group. We had about 175 GB allocated to temp and decided to move it with users actively accessing the database.

First, take the file offline:

ALTER DATABASE TEMPFILE '+ASM_DG_01/PRODPRM/temp01.dbf' OFFLINE;

Copy the file from the old ASM disk group to the new disk group. Set your Oracle home to the ASM instance and then start ASMCMD.
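If you do not have a helper script for switching environments, it looks something like the following (the ASM instance name and Grid Infrastructure home below are assumptions for a typical installation; adjust for your environment). The cp below is then issued at the ASMCMD prompt.

```
export ORACLE_SID=+ASM1                  # assumed ASM instance name
export ORACLE_HOME=/u01/app/12.1.0/grid  # assumed Grid Infrastructure home
export PATH=$ORACLE_HOME/bin:$PATH
asmcmd
```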

cp +ASM_DG_01/PRODPRM/temp01.dbf +ASM_DG_04/PRODPRM/temp01.dbf

(Note that this is a copy, not a move.)

Rename the data file in the database in SQL*Plus. If you changed your Oracle home to the ASM instance, remember to change back to your RAC instance’s Oracle home:

ALTER DATABASE RENAME FILE '+ASM_DG_01/PRODPRM/temp01.dbf' TO '+ASM_DG_04/PRODPRM/temp01.dbf';

Bring it back online:

ALTER DATABASE TEMPFILE '+ASM_DG_04/PRODPRM/temp01.dbf' ONLINE;

Now delete the file from the old ASM disk group. Set your Oracle home to the ASM instance and then start ASMCMD.

rm +ASM_DG_01/PRODPRM/temp01.dbf

The space utilized in the ASM disk group should now be reduced.

Moving undo table spaces on a RAC with users online

We were running out of space on an ASM disk group and had to free up space ASAP. Since our UNDO tablespaces are sized at about 200 GB and were on the ASM disk group that was full, we decided to move them.

This was the primary database with a physical (Data Guard) standby, so I first needed to turn off standby file management:

ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='MANUAL';

Create a new undo tablespace for RAC node 01, with its data file in the new ASM disk group:

create undo tablespace UNDOTBS1A datafile '+ASM_DG_01/PRODPRM/undotbs1a.dbf' size 20G;

Assign the new undo tablespace to RAC node 01:

ALTER SYSTEM SET UNDO_TABLESPACE=UNDOTBS1A SCOPE=BOTH SID='PRODPRM1';

Since this is a multi-node RAC, we can take one node down to disconnect all users from the old undo tablespace on RAC node 01:

srvctl stop instance -d PRODPRM -i PRODPRM1 -o immediate

To be safe, make sure the RAC node is down with:

srvctl status database -d PRODPRM

Also, check the alert log on this particular node to make sure it was a clean shutdown.
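A quick way to eyeball the tail of the alert log is shown below; the ADR path is an assumption based on the default diagnostic destination layout, so adjust it for your environment. Look for an “Instance shutdown complete” message.

```
tail -100 /u01/app/oracle/diag/rdbms/prodprm/PRODPRM1/trace/alert_PRODPRM1.log
```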

With the RAC node down, drop the old undo tablespace:

drop tablespace UNDOTBS1 including contents and datafiles;

Bring up RAC node 01, and it will be using the new undo tablespace:

srvctl start instance -d PRODPRM -i PRODPRM1

Turn standby file management back on:

ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='AUTO';

Recovering in the event a user issues incorrect SQL in DB2 for z/OS

If a user inadvertently updates, inserts, or deletes data in a table during the day and you need to recover the tablespace to a prior point in time, you can recover to a checkpoint rather than an image copy or a quiesce point to reduce the amount of data loss. For example, if the user issued a “bad” statement at 17:18 on March 29, 2018, you can print the checkpoint queue with the DB2 utility DSNJU004. The output will look similar to:

                    CHECKPOINT QUEUE                
                18:40:10 MARCH 29, 2018             
TIME OF CHECKPOINT       18:30:04 MARCH 29, 2018    
BEGIN CHECKPOINT RBA            0000000023FDD4187405
END CHECKPOINT RBA              0000000023FDD4194000
END CHECKPOINT STCK             00D41A198B683F8E0A00
TIME OF CHECKPOINT       18:15:04 MARCH 29, 2018    
BEGIN CHECKPOINT RBA            0000000023FDCE6AA615
END CHECKPOINT RBA              0000000023FDCE6B73E7
END CHECKPOINT STCK             00D41A16311B02B53A00
TIME OF CHECKPOINT       18:02:44 MARCH 29, 2018    
BEGIN CHECKPOINT RBA            0000000023FDC976C806
END CHECKPOINT RBA              0000000023FDC9785E4F
END CHECKPOINT STCK             00D41A136F4ACE910800
TIME OF CHECKPOINT       17:47:44 MARCH 29, 2018    
BEGIN CHECKPOINT RBA            0000000023FDC3A02EF2
END CHECKPOINT RBA              0000000023FDC3A11000
END CHECKPOINT STCK             00D41A1014FBDC3B0800
TIME OF CHECKPOINT       17:32:44 MARCH 29, 2018    
BEGIN CHECKPOINT RBA            0000000023FDBDA9FBDC
END CHECKPOINT RBA              0000000023FDBDAAD967
END CHECKPOINT STCK             00D41A0CBAAD71430800
TIME OF CHECKPOINT       17:17:44 MARCH 29, 2018    
BEGIN CHECKPOINT RBA            0000000023FDB7EAA020
END CHECKPOINT RBA              0000000023FDB7EC4622
END CHECKPOINT STCK             00D41A096060B5B53000
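For reference, the checkpoint queue above can be printed with DSNJU004 JCL along these lines (the STEPLIB and BSDS data set names are illustrative; use your own subsystem's names):

```
//PRTLOG   EXEC PGM=DSNJU004
//STEPLIB  DD  DSN=DB2PROD.SDSNLOAD,DISP=SHR
//SYSUT1   DD  DSN=DB2PROD.BSDS01,DISP=SHR
//SYSPRINT DD  SYSOUT=*
```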

There is a checkpoint at 17:17 on March 29, 2018 that may be a good recovery point.

The tablespace can be recovered with:

//SYSIN    DD  *                        
  RECOVER  TABLESPACE proddb00.prodts33 
           TORBA      X'23FDB7EC4622'   
/*                                      

If you have an old-style tablespace containing multiple tables, they will all be recovered to the same RBA. Rebuild all the associated indexes after the recovery completes.
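A minimal rebuild step for the example above might look like the following, in the same utility JCL style as the recover; REBUILD INDEX (ALL) rebuilds every index on every table in the tablespace:

```
//SYSIN    DD  *
  REBUILD INDEX (ALL) TABLESPACE proddb00.prodts33
/*
```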

Delete a backup file from RMAN

Yesterday I had a requirement to remove a backup file from the RMAN catalog because we did not want to recover to it. After some “research” on Google and in the RMAN documentation, I found the below method:

Connect to RMAN and issue the command:

list backup;

This will list all the backups cataloged in RMAN. Find the one that you want to “uncatalog”. For example, I wanted to remove the below file:

BS Key  Type LV Size       Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
27249   Full    3.75G      DISK        00:00:00     28-MAR-18      
        BP Key: 27249   Status: AVAILABLE  Compressed: NO  Tag: TAG20180328
        Piece Name: /backup/move/dean/PRODDBbtsuuqm6_1_1.dbc
  List of Datafiles in backup set 27249
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- --------- ----
  120     Full 132807363910 29-MAR-18 +DG_ORA64/proddb/perfx_52.dbf

I then issued the below command in RMAN:

CHANGE BACKUPPIECE 27249 UNCATALOG;

The response was:

uncataloged backup piece
backup piece handle=/backup/move/dean/PRODDBbtsuuqm6_1_1.dbc RECID=27249 STAMP=972007418
Uncataloged 1 objects

The file was no longer part of the subsequent RECOVER and RESTORE commands.
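Note that UNCATALOG only removes the piece from the RMAN repository; the physical file remains on disk. If you want RMAN to remove the file as well, the DELETE command can be used instead, for example:

```
DELETE BACKUPPIECE 27249;
```

In our case, uncataloging was sufficient, since the goal was only to keep RMAN from using the piece in subsequent RESTORE and RECOVER operations.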

ORA-31693 and ORA-06502 on an Import Datapump (IMPDP)

While importing data via an IMPDP job we encountered the error:

Import: Release 11.2.0.4.0 - Production on Tue Mar 20 13:59:30 2018
 
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
;;; 
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, Oracle Label Security,
OLAP, Data Mining and Real Application Testing options
Master table "SYS"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Starting "SYS"."SYS_IMPORT_FULL_01":  "/******** AS SYSDBA" directory=dpump dumpfile=dpump_data.dmp 
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . 
ORA-31693: Table data object "SCHEMA_NAME"."TABLE_NAME" failed to load/unload and is being skipped due to error:
ORA-06502: PL/SQL: numeric or value error
. . 
. . 
. . 
. . 
. . 
Job "SYS"."SYS_IMPORT_FULL_01" completed with 1 error(s) at Tue Mar 20 13:59:32 2018 elapsed 0 00:00:02

Per Oracle Support, this was documented in “Bug 18665004 – ORA-31693/ORA-6502 from Datapump import (impdp) into an existing table having a LONG or LONG RAW column (Doc ID 18665004.8)”. This table did have a “LONG” column.

We were not in a position to install the suggested fix and instead decided to try the workaround. We dropped the only index on the table and reran the import successfully. After the import, we recreated the index.

ORA-16047: DGID mismatch

I was building a new standby, and after I enabled log_archive_dest_state_2 on the primary, the below error message was reported in the primary’s alert log:

ALTER SYSTEM SET log_archive_dest_state_2='ENABLE' SCOPE=BOTH SID='*';
Tue Mar 20 00:20:08 2018
Errors in file /oracle/home/trace/PRODDB_tt01_438.trc:
ORA-16047: DGID mismatch between destination setting and target database
Error 16047 for archive log file 6 to 'STDBYDB'
Tue Mar 20 00:20:08 2018
Errors in file /oracle/home/trace/PRODDB_tt01_438.trc:
ORA-16047: DGID mismatch between destination setting and target database
Tue Mar 20 00:20:08 2018
Errors in file /oracle/home/trace/PRODDB_tt01_438.trc:
ORA-16047: DGID mismatch between destination setting and target database
Tue Mar 20 00:20:08 2018
Errors in file /oracle/home/trace/PRODDB_arc1_8368.trc:
ORA-16047: DGID mismatch between destination setting and target database
Tue Mar 20 00:20:08 2018
PING[ARC1]: Heartbeat failed to connect to standby 'STDBYDB'. Error is 16047.

After a few minutes of consternation, I realized that I had an incorrect parameter on the standby:

SQL> show parameter db_unique_name;
 
NAME                                 TYPE        VALUE
db_unique_name                       string      PRODDB

Basically, I had inadvertently set the db_unique_name to PRODDB on both the primary and standby database. I changed the standby as follows:

alter system set db_unique_name=STDBYDB scope=spfile sid='*';

And bounced the standby. This resolved the issue.

Testing Oracle’s Database Security Assessment Tool

I received an email suggesting that I run a “Security Assessment” using Oracle’s Database Security Assessment Tool. The tool promises that it

“analyzes database configurations, users, their entitlements, security policies and identifies where sensitive data resides to uncover security risks and improve the security posture of Oracle Databases within your organization.”

The tool can be downloaded for free from support.oracle.com; see Oracle Database Security Assessment Tool (DBSAT) (Doc ID 2138254.1).

After you agree to the license, a zip file will be downloaded to your laptop. I chose to unzip the file and then FTP the contents to a directory on my database server. (I suggest creating a separate directory.)

The contents of the directory will be:

[snoopy_test-TSTRAC2] ls -l
total 352
-rw-r--r-- 1 oracle dba  12433 Mar 15 18:11 dbsat
-rw-r--r-- 1 oracle dba  12579 Mar 15 18:11 dbsat.bat
drwxr-xr-x 5 oracle dba     40 Mar 15 18:15 Discover
-rw-r--r-- 1 oracle dba  28216 Mar 15 18:11 sat_analysis.py
-rw-r--r-- 1 oracle dba  43181 Mar 15 18:11 sat_collector.sql
-rw-r--r-- 1 oracle dba 247465 Mar 15 18:11 sat_reporter.py
drwxr-xr-x 2 oracle dba   4096 Mar 15 18:15 xlsxwriter

chmod the files to add execute permissions.

The tool has 3 components:
Collector: collects data on the system being assessed
Reporter: analyzes the above output
Discoverer: generates the Database Sensitive Data Assessment Report

Running the Collector
Run the collector with the below command. In my example, I ran this as SYSDBA, with my ORACLE_HOME set to the RAC instance on which I intended to execute the DBSAT tool:

./dbsat collect -n "/ as sysdba" /TSTRAC2/dean/dbsat/dbsat_collect
                 |               |                   | 
No encryption ---+               |                   |
                                 |                   | 
location for the output ---------+                   |
                                                     |
name of report file ---------------------------------+

The output will be written to a file named dbsat_collect.json.

Running the Reporter
The reporter needs Python version 2.6 or later. This can be verified with:

[snoopy_test-TSTRAC2] python -V
Python 2.7.5

Run the report with the below command:

./dbsat report -a -n /TSTRAC2/dean/dbsat/dbsat_collect
 
Notes:
a. Paste the above command into Notepad first. I used Word to write this blog entry, and Word sometimes replaces the “-” with a non-Unix-compatible character.
b. The file name from the Collector is specified without the extension.

The below files will be generated:
dbsat_collect_report.txt
dbsat_collect_report.html
dbsat_collect_report.xlsx

The html file is probably the nicest format to review and will contain a large amount of information classified by level of risk.

Using the Discoverer
To run this part of the tool, you have to set up the

/TSTRAC2/dean/dbsat/Discover/conf/sample_dbsat.config

file, which contains the information on the database connection and the location of the

sensitive_en.ini

file, which contains the patterns that will be used to query database column names, column comments, etc. These patterns are specified as regexes and logical operators, as documented in Section 1.4.3.1.2, Pattern File Configuration, of the Database Security Assessment Tool User Guide.

Based on the parameters specified in the config files, the tool will then scan the database looking for columns, comments, etc. that match the patterns specified.
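The Discoverer is invoked in the same style as the other components. In my environment the command would look something like the below; the config file path and output name are assumptions based on the samples shipped with the tool:

```
./dbsat discover -c /TSTRAC2/dean/dbsat/Discover/conf/sample_dbsat.config dbsat_discover
```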

Some useful scheduler related commands

If you want to list all the jobs that are in the scheduler:

SELECT  A.job_name
       ,a.last_start_date
       ,a.next_run_date
       ,a.enabled
       ,a.*
FROM    dba_scheduler_jobs A
WHERE   job_name like ('%DAILY_JOB%')
;

To find out if a job ran successfully (or not):

select   a.job_name
        ,a.log_date
        ,a.status
        ,a.*
from     dba_SCHEDULER_JOB_LOG a
WHERE    a.job_name like ('%DAILY_JOB%')
order by
         a.job_name
        ,a.log_date desc
;

If you need to see the complete DDL of the job:

select dbms_metadata.get_ddl('PROCOBJ','SOME_APP_JOB','DEAN') from dual;

The parameters are:

PROCOBJ      = procedural object, which covers all objects in the scheduler
SOME_APP_JOB = Name of the job
DEAN         = Schema that contains the job

It does not seem to be possible to get the DDL for a job in the SYS schema. For example:

select dbms_metadata.get_ddl('PROCOBJ','FILE_WATCHER','SYS') from dual;

Will result in:

ORA-31603: object "FILE_WATCHER" of type PROCOBJ not found in schema "SYS"
ORA-06512: at "SYS.DBMS_METADATA", line 6069
ORA-06512: at "SYS.DBMS_METADATA", line 8666
ORA-06512: at line 1
31603. 00000 -  "object \"%s\" of type %s not found in schema \"%s\""
*Cause:    The specified object was not found in the database.
*Action:   Correct the object specification and try the call again.

One way to get around this is to copy the job to another schema as follows:

exec dbms_scheduler.copy_job('SYS.FILE_WATCHER','DEAN.FILE_WATCHER');

The parameters in the above call are ‘old_schema.job_name’ and ‘new_schema.job_name’.

As an FYI, a copied job will be created in (i.e. copied into) the new schema as a DISABLED job.
After you get the DDL, you can drop the job with:

BEGIN
    DBMS_SCHEDULER.DROP_JOB(job_name => '"SYS"."FILE_WATCHER"',
                                defer => false,
                                force => true);
END;
/

As mentioned above, the copied job will be created in the new schema as a DISABLED job. You can enable it with:

exec dbms_scheduler.enable('DEAN.SOME_APP_JOB');

Stuck in a loop between restore and recover database

This week a colleague of mine was building a physical standby. The method she had chosen was to copy the backups from the primary server to the standby server, then restore the database, followed by a recover. The database was on the larger side, at approximately 4 TB. After the restore completed successfully, she issued the RECOVER DATABASE command, and Oracle responded that data file 114 needed to be restored. At this point I got involved and issued a RESTORE DATAFILE 114 command, which completed successfully.

We then tried the RECOVER DATABASE command and were informed that data file 177 needed to be restored. We issued a RESTORE DATAFILE 177 command, which also completed successfully. The next RECOVER DATABASE command pointed to yet another file that needed to be restored. This continued, alternating among a series of six data files.

At this point we looked at the V$DATAFILE view with the below SQL:

select name, count(*) from v$datafile group by name having count(*) > 1;

The SQL indicated that there were three pairs of data files, each pair sharing an identical file name. This occurred because the source database had many more ASM disk groups than the target (i.e., the standby database). As a result of this mismatch in the number of ASM disk groups, three pairs of files were being assigned identical names in the same ASM disk groups on the standby. Each time we restored one file of a duplicate pair, the other file of that pair would then need to be restored, causing us to loop continuously between restores and recovers.

We resolved the issue by issuing SET NEWNAME FOR DATAFILE commands to direct each file to a distinct location, and then restoring and recovering the database.
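A sketch of the fix in RMAN follows. The file numbers 114 and 177 are two of the six affected files mentioned above; the target disk group and file names are illustrative, and the remaining duplicate files would be handled the same way:

```
RUN {
  SET NEWNAME FOR DATAFILE 114 TO '+ASM_DG_05/PRODPRM/datafile114.dbf';
  SET NEWNAME FOR DATAFILE 177 TO '+ASM_DG_05/PRODPRM/datafile177.dbf';
  RESTORE DATAFILE 114, 177;
  SWITCH DATAFILE ALL;
  RECOVER DATABASE;
}
```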