Solved

iSeries object lock monitor and report

Posted on 2009-07-01
Last Modified: 2013-12-06
Is there a way to set up a collection service and/or report that will provide object lock information for objects that generate contention errors? We have a program that runs unattended and periodically hits an allocation failure on a variety of files, typically because an end user holds exclusive access. By the time we see the error in the joblog, the contention no longer exists and we have no way of following up with the party that caused the problem. We use WRKOBJLCK to investigate issues that come up during our attended monitoring, but there is no way to go back in time to see what caused the contention for an event that happened during our unattended hours.
Question by:keykjexpert
6 Comments
 
LVL 27

Expert Comment

by:tliotta
ID: 24758969
keykjexpert:

There are many, many possibilities...

Is this "program that runs unattended" one which your site has full control over? I.e., did your company create it, or is it part of a 3rd-party package or otherwise out of your hands?

What do you see in the joblog when the contention causes an error? It might be necessary to show one or more messages before the error message.

And, are you a programmer or are you looking for some operations automation?

Tom
 
LVL 35

Expert Comment

by:Gary Patterson
ID: 24759341
Do you own Performance Tools?  (GO LICPGM, option 10)

If you own the Performance Tools licensed program, and you capture performance data while the problems occur, you can run the Lock Report and see where your contention is.  If you don't have Performance Tools, you can still capture performance data, but you will have to query the data yourself, or send it to a service provider or IBM for analysis.

Might be an even easier way, though:  

What specific CPF message are you getting? If you check the job log in the job that gets the error, you will often see a diagnostic or escape message that includes the name of the locking job. All you have to do is look at the second-level help text (cursor on the message and press F1). If you are seeing this in a printed job log, change the job description, SBMJOB command, or schedule entry that starts the job to LOG(4 0 *SECLVL) to force second-level message text to be printed to the spooled job log.
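
As a rough sketch of that logging change: the job description MYLIB/MYJOBD and the job name NIGHTLY below are placeholders, and calling ORCOP/FCT002 directly is only an assumption. Substitute whatever actually starts the packaged job on your system.

```cl
/* Option 1: change the job description the unattended job runs under */
CHGJOBD JOBD(MYLIB/MYJOBD) LOG(4 0 *SECLVL)

/* Option 2: set the logging level on the submit itself               */
/* (program and job name here are illustrative only)                  */
SBMJOB CMD(CALL PGM(ORCOP/FCT002)) JOB(NIGHTLY) LOG(4 0 *SECLVL)
```

The three LOG values are message logging level, severity, and text type; *SECLVL is the part that adds the second-level text to the spooled job log.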

- Gary Patterson
 

Author Comment

by:keykjexpert
ID: 24764169
Gary,
The programs in question are out of our control, part of a packaged solution.
I am in the operations group, looking to help the programming team diagnose the root cause of the lock issue.   The problem is trying to figure out what happened after the fact.

Thanks,

Ken
The contention issue appears in the joblog as a "Not able to allocate objects needed for file FCTHFDL1 in library ORCFP member or program device FCTHDL1"
Followed by RPG MSGID RPG1216
                         Additional Message Information                        
 Message ID . . . . . . :   RPG1216                                            
 Date sent  . . . . . . :   07/01/09      Time sent  . . . . . . :   06:05:46  
                                                                               
 Message . . . . :   Error message CPF4128 appeared during OPEN (C S D F).      
                                                                               
 Cause . . . . . :   The RPG program FCT002 in library ORCOP received the      
   message CPF4128 while doing an implicit OPEN to file FCTHFDL1. See the job  
   log for a complete description of message CPF4128. If the file has a device  
   type of SPECIAL, there may be no message in the job log.                    
 Recovery  . . . :   Enter C to cancel, S to obtain a printout of system        
   storage, D to obtain an RPG formatted printout of system storage, or F to    
   obtain a full formatted printout of system storage.                          
 Possible choices for replying to message . . . . . . . . . . . . . . . :      
   D -- Obtain RPG formatted printout of system storage.                        
   S -- Obtain printout of system storage.                                      
   F -- Obtain full formatted printout of system storage.                      
                                                     
                                                                               

Author Comment

by:keykjexpert
ID: 24764277
Gary and Tliotta... The object lock error is CPF4128. We do have Performance Tools, and I do recall a Lock Report, but I don't remember how to set up the collection or run the query.
 
LVL 35

Accepted Solution

by:
Gary Patterson earned 125 total points
ID: 24765811
Performance Tools Lock Report will get you what you need, and is particularly good if this is happening widely across multiple programs and jobs.  

See example Lock Report below from Performance Tools manual.

Complete instructions for installing, configuring, and using Performance Tools are at the link below. Consult the Information Center for your OS version, but the process is similar in recent versions:

http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=/rzahx/rzahxperftoolsdesc.htm

You will need to capture trace data.

Capturing performance data, especially trace data, can be performance and disk-intensive.  Start small, and monitor system performance until you understand the impact it has on your system.  Or get a performance pro in to help you - I've seen operators create real disasters with improperly configured Performance Tools data collection.
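
For orientation only, the trace-and-report sequence looks roughly like the sketch below. The member and library names are borrowed from the sample report in this thread, and the command parameters are written from memory of V5R4-era systems, so treat them as assumptions: prompt each command with F4 and check the Information Center for your release before running anything.

```cl
/* Start the performance trace shortly before the problem window */
STRPFRTRC SIZE(*CALC)

/* ... let the unattended job run through the problem window ... */

/* Stop the trace and dump it to a member for reporting */
ENDPFRTRC DMPTRC(*YES) MBR(LCKTRC1) LIB(TRACESVT)

/* Print the Seize/Lock Wait report from that member */
PRTLCKRPT MBR(LCKTRC1) LIB(TRACESVT) RPTTYPE(*ALL)
```

Keeping the trace window as short as possible is one way to limit the performance and disk impact described above.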

Of course, if you catch it while the lock is occurring (job just went into MSGW and other job is still holding the lock), you can use WRKOBJLCK command with the object name and type from the CPF4128 message on the job log and see who is holding the lock.
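
Using the file and library from the CPF4128 text quoted earlier in this thread, that check might look like the following sketch (adjust the names to whatever your own message reports):

```cl
/* Object-level locks on the file named in the CPF4128 message */
WRKOBJLCK OBJ(ORCFP/FCTHFDL1) OBJTYPE(*FILE)

/* Same file, including member-level locks */
WRKOBJLCK OBJ(ORCFP/FCTHFDL1) OBJTYPE(*FILE) MBR(*ALL)
```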

Bottom line: these errors are the result of sloppy programming - the result of programs that hold long locks on files, like locking a file and then waiting on input from an interactive user - inexcusable. Who knows how long a user will leave their workstation or be on a call? Combined with programs that obviously don't check for existing locks before opening a file, this is a recipe for repeated failures. I've been fixing sloppy code like this for 20 years (and I'm sure I've written some in a hurry myself).

Assuming this is a database file, the system default is for a job to wait for TWO MINUTES (forever in computer time) before throwing an error.  You can configure this on a file-by-file basis (DSPFD command will show you the settings for a given file), and also override it on a case-by-case basis in any program (OVRDBF command).
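
A minimal sketch of those commands, assuming the file from the CPF4128 message and an illustrative wait of 300 seconds (the value is a placeholder, not a recommendation):

```cl
/* Show the file attributes, including the maximum file wait time */
DSPFD FILE(ORCFP/FCTHFDL1) TYPE(*ATR)

/* Per-job override: wait up to 300 seconds for the file allocation. */
/* Must run in the same job, before the program opens the file.      */
OVRDBF FILE(FCTHFDL1) WAITFILE(300)

/* Or change the attribute on the logical file itself */
CHGLF FILE(ORCFP/FCTHFDL1) WAITFILE(300)
```

A longer wait only helps when the holding job eventually releases the file; it does not identify or remove the cause of the lock. With a packaged application, the override has to be issued by whatever CL or job stream calls the vendor program.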

Also, this looks like a database logical file, from the naming pattern. Remember that logical files are associated with one or more physical files, and an exclusive lock on the physical can create problems for programs attempting to lock the logical.
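
One way to follow that trail, sketched under the assumption that the member description for a logical file lists its based-on physical file (MYPF below is a placeholder for whatever name DSPFD reports):

```cl
/* Display member information for the logical file and look for */
/* the based-on physical file in the output                     */
DSPFD FILE(ORCFP/FCTHFDL1) TYPE(*MBR)

/* Then check locks on that physical file (MYPF is hypothetical) */
WRKOBJLCK OBJ(ORCFP/MYPF) OBJTYPE(*FILE) MBR(*ALL)
```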

Have fun!

- Gary Patterson

12/14/00 12:46:01                 Seize/Lock Wait Statistics by Time of Day                 Page   1
                                                                                Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

12/14/00 12:46:01                 Seize/Lock Wait Statistics by Requesting Job              Page   2
                                                                                Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

12/14/00 12:46:01                 Seize/Lock Wait Statistics by Holding Job                 Page   3
                                                                                Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

12/14/00 12:46:01                 Seize/Lock Wait Statistics by Object                      Page   4
                                                                                Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

 

Author Closing Comment

by:keykjexpert
ID: 31598961
THANKS GARY!!