iSeries object lock monitor and report

Is there a way to set up a collection service and/or report that will provide object lock information for objects that generate contention errors? We have a program that runs unattended and periodically encounters an allocation issue on a variety of files, typically due to an end user's exclusive access. By the time we see the error in the joblog, the contention no longer exists and we have no way of following up with the party that caused the problem. We use WRKOBJLCK to investigate issues that come up during our attended monitoring, but there is no way to go back in time to see what caused the contention for an event that happened during our unattended hours.
keykjexpert Asked:

tliotta Commented:
keykjexpert:

There are many, many possibilities...

Is this "program that runs unattended" one which your site has full control over? I.e., did your company create it, or is it part of a 3rd-party package or otherwise out of your hands?

What do you see in the joblog when the contention causes an error? It might be necessary to show one or more messages before the error message.

And, are you a programmer or are you looking for some operations automation?

Tom
Gary Patterson (VP Technology / Senior Consultant) Commented:
Do you own Performance Tools?  (GO LICPGM, option 10)

If you own the Performance Tools licensed program, and you capture performance data while the problems occur, you can run the Lock Report and see where your contention is.  If you don't have Performance Tools, you can still capture performance data, but you will have to query the data yourself, or send it to a service provider or IBM for analysis.

Might be an even easier way, though:  

What specific CPF message are you getting? If you check the job log of the job that gets the error, you will often see a diagnostic or escape message that includes the name of the locking job. All you have to do is look at the second-level help text (put the cursor on the message and press F1). If you are looking at a printed job log, change the job description, the SBMJOB command, or the job schedule entry that starts the job to LOG(4 0 *SECLVL) to force second-level message text to be printed to the spooled job log.

- Gary Patterson
keykjexpert (Author) Commented:
Gary,
The programs in question are out of our control, part of a packaged solution.
I am in the operations group, looking to help the programming team diagnose the root cause of the lock issue.   The problem is trying to figure out what happened after the fact.

Thanks,

Ken
The contention issue appears in the joblog as "Not able to allocate objects needed for file FCTHFDL1 in library ORCFP member or program device FCTHDL1",
followed by RPG message ID RPG1216:
                         Additional Message Information                        
 Message ID . . . . . . :   RPG1216                                            
 Date sent  . . . . . . :   07/01/09      Time sent  . . . . . . :   06:05:46  
                                                                               
 Message . . . . :   Error message CPF4128 appeared during OPEN (C S D F).      
                                                                               
 Cause . . . . . :   The RPG program FCT002 in library ORCOP received the      
   message CPF4128 while doing an implicit OPEN to file FCTHFDL1. See the job  
   log for a complete description of message CPF4128. If the file has a device  
   type of SPECIAL, there may be no message in the job log.                    
 Recovery  . . . :   Enter C to cancel, S to obtain a printout of system        
   storage, D to obtain an RPG formatted printout of system storage, or F to    
   obtain a full formatted printout of system storage.                          
 Possible choices for replying to message . . . . . . . . . . . . . . . :      
   D -- Obtain RPG formatted printout of system storage.                        
   S -- Obtain printout of system storage.                                      
   F -- Obtain full formatted printout of system storage.                      
                                                     
                                                                               
keykjexpert (Author) Commented:
Gary and tliotta: the object lock error is CPF4128. We do have Performance Tools, and I do recall a Lock Report, but I don't remember how to set up the collection or run the query.
Gary Patterson (VP Technology / Senior Consultant) Commented:
Performance Tools Lock Report will get you what you need, and is particularly good if this is happening widely across multiple programs and jobs.  

See example Lock Report below from Performance Tools manual.

Complete instructions on installing, configuring, and using Performance Tools are at the link below. Consult the Information Center for your OS version, but the process is similar in recent versions:

http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=/rzahx/rzahxperftoolsdesc.htm

You will need to capture trace data.

Capturing performance data, especially trace data, can be performance and disk-intensive.  Start small, and monitor system performance until you understand the impact it has on your system.  Or get a performance pro in to help you - I've seen operators create real disasters with improperly configured Performance Tools data collection.

Of course, if you catch it while the lock is still in place (the job just went into MSGW and the other job is still holding the lock), you can use the WRKOBJLCK command with the object name and type from the CPF4128 message in the job log and see who is holding the lock.

Bottom line: these errors are the result of sloppy programming, the result of programs that hold long locks on files, like locking a file and then waiting on input from an interactive user. Inexcusable. Who knows how long a user will leave their workstation or be on a call? Combined with programs that obviously don't check for existing locks before opening a file, this is a recipe for repeated failures. I've been fixing sloppy code like this for 20 years (and I'm sure I've written some in a hurry myself).

Assuming this is a database file, the system default is for a job to wait for TWO MINUTES (forever in computer time) before throwing an error. You can configure this on a file-by-file basis (the DSPFD command will show you the settings for a given file), and also override it on a case-by-case basis in any program (OVRDBF command, WAITFILE parameter).

Also, this looks like a database logical file, from the naming pattern. Remember that logical files are associated with one or more physical files, and an exclusive lock on the physical file can create problems for programs attempting to lock the logical file.

Have fun!

- Gary Patterson

12/14/00 12:46:01          Seize/Lock Wait Statistics by Time of Day          Page   1
                                                        Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

12/14/00 12:46:01          Seize/Lock Wait Statistics by Requesting Job          Page   2
                                                        Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

12/14/00 12:46:01          Seize/Lock Wait Statistics by Holding Job          Page   3
                                                        Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait

12/14/00 12:46:01          Seize/Lock Wait Statistics by Object          Page   4
                                                        Report type *ALL
 TOD of  Length                                                              Object                                      Record
  Wait   of Wait L Requestor's Job Name         Holder's Job Name             Type    Object Name                        Number
-------- ------- - ---------------------------- ---------------------------- ------ -------------------------------- ----------
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
Member LCKTRC1        Library TRACESVT       Period from 00.00.00 through 23.59.59        500 ms minimum wait
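
If the report is long, it may help to copy the spooled output to a PC as plain text and total the waits per holding job there. The following is only a minimal Python sketch of that idea, not part of Performance Tools. It assumes the detail lines look like the sample above: time of day, wait length, an L/S flag, then job name, user, and number for the requestor and for the holder, then object type and object name. The function names (`parse_lock_report`, `waits_by_holder`) are made up for illustration, and the wait length is kept exactly as printed.

```python
import re

# Four detail lines copied from the "by Time of Day" page of the sample report.
SAMPLE = """\
12.05.39    4264 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.41    6866 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
12.05.55    7858 L QPADEV0006 SUSTAITA   012538 QPADEV000R SUSTAITA   012535 PGM    QAVCPP     QPFR
12.05.57    8988 L QPADEV000S SUSTAITA   012537 QPADEV0006 SUSTAITA   012538 PGM    QAVCPP     QPFR
"""

# Assumption: detail lines start with a time of day such as 12.05.39;
# page headers, column headings, and footers do not.
DETAIL_LINE = re.compile(r"^\d{2}\.\d{2}\.\d{2}\s")


def qualified_job(name, user, number):
    """Build the usual number/user/name form of a job identifier."""
    return f"{number}/{user}/{name}"


def parse_lock_report(text):
    """Return one dict per detail line; every other line is skipped."""
    rows = []
    for line in text.splitlines():
        if not DETAIL_LINE.match(line):
            continue
        fields = line.split()
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # not shaped like the sample detail lines
        rows.append({
            "tod": fields[0],
            "wait": int(fields[1]),        # length of wait, as printed
            "kind": fields[2],             # the L/S column
            "requestor": qualified_job(*fields[3:6]),
            "holder": qualified_job(*fields[6:9]),
            "obj_type": fields[9],
            # Everything after the type column (object name plus whatever
            # follows it on the line) is kept together, uninterpreted.
            "object": " ".join(fields[10:]),
        })
    return rows


def waits_by_holder(rows):
    """Count the waits and sum the wait lengths caused by each holding job."""
    totals = {}
    for row in rows:
        entry = totals.setdefault(row["holder"], {"count": 0, "total_wait": 0})
        entry["count"] += 1
        entry["total_wait"] += row["wait"]
    return totals


if __name__ == "__main__":
    summary = waits_by_holder(parse_lock_report(SAMPLE))
    for holder, stats in sorted(summary.items()):
        print(holder, stats["count"], stats["total_wait"])
```

With report type *ALL the same waits are printed once on each of the four pages, so feed the script a single page (for example, the "by Holding Job" page) to avoid counting each wait four times.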

keykjexpert (Author) Commented:
THANKS GARY!!