REORG and contention – Forcibly terminating threads?

One of the challenges in performing online REORGs (i.e. SHRLEVEL CHANGE) is successfully completing the switch phase of the process. This problem seems to compound in direct relation to the number of distributed threads that connect to the database objects. I often see a pattern wherein a distributed thread acquires a lock or claim on an object and then does not perform a commit to release the lock or claim. This results in the REORG failing with a page out to the DBA.

I initially coded the REROG jobs with

REORG TABLESPACE   LIST CLUSTTS                            
      LOG           NO                        
      KEEPDICTIONARY                          
      ROWFORMAT BRF                           
      SORTDATA                                
      SORTKEYS                                
      NOSYSREC                                
      COPYDDN      (LPCOPY)                   
      RECOVERYDDN  (RBCOPY)                   
      SHRLEVEL      CHANGE                    
      DRAIN_WAIT    10 RETRY 3  RETRY_DELAY 10
      DEADLINE      NONE                      
      MAXRO         60                        
      DRAIN         WRITERS   <---
      LONGLOG       CONTINUE                  
      DELAY         0                         
      TIMEOUT       TERM                      
      FASTSWITCH    YES                       
      UNLOAD        CONTINUE                  
      STATISTICS    TABLE INDEX ALL           
      UPDATE ALL                

Assuming that this would be the least invasive approach. This was based on the below information from the Utility Guide:

WRITERS
Specifies that Db2 drains only the writers during the 
log phase after the MAXRO threshold is reached and 
then issues DRAIN ALL on entering the switch phase.

I changed the options to include DRAIN ALL FORCE ALL:

REORG TABLESPACE   LIST CLUSTTS                            
      LOG           NO                        
      KEEPDICTIONARY                          
      ROWFORMAT BRF                           
      SORTDATA                                
      SORTKEYS                                
      NOSYSREC                                
      COPYDDN      (LPCOPY)                   
      RECOVERYDDN  (RBCOPY)                   
      SHRLEVEL      CHANGE                    
      DRAIN_WAIT    10 RETRY 3  RETRY_DELAY 10
      DEADLINE      NONE                      
      MAXRO         60 
      DRAIN         ALL    <---                   
      FORCE         ALL    <---           
      LONGLOG       CONTINUE                  
      DELAY         0                         
      TIMEOUT       TERM                      
      FASTSWITCH    YES                       
      UNLOAD        CONTINUE                  
      STATISTICS    TABLE INDEX ALL           
      UPDATE ALL                

The results were as follow:

If the DDF thread had only performed a select without a commit, it was cancelled with the below message in the DB2 master address space:

DSNL027I  -PR01 SERVER DISTRIBUTED AGENT WITH  828                   
           LUWID=JF0C80D3.DC4E.D8C6D35F5B78=129996                   
                                                                     
THREAD-INFO=ABC123Y:67.19.138.241:ABC123y:db2jcc_application:DYNAMIC:
5719:*:<::67.19.138.241.55378.D8C6D35F5B78>                          
           RECEIVED ABEND=04E                                        
           FOR REASON=00E50013

If the DDF thread had performed an update without a commit, it was not cancelled. Instead, the REORG fails with the below message in the job output:

DSNURDRN - RESOURCE NOT AVAILABLE, REASON=X'00C200EA', ON 
DBNAME.TSNAME PROHIBITS PROCESSING

I would have preferred that the DRAIN ALL FORCE ALL options would have cancelled the thread in both cases.

Author: Dean Capps

Database consultant at Amazon Web Services.

2 thoughts on “REORG and contention – Forcibly terminating threads?”

Comments are closed.