Sessions hang up on AS/400

wildchoi
Hi all,

I've got a problem on our company AS/400.
We're using a 570 with OS version V5R3.

Users get into the system through the iSeries emulator (Telnet) and some web interface products
(ABL, Webface, etc.).

Over the last two weeks, a number of users have occasionally had their sessions hang. Some users' sessions hang while signing on to the AS/400 with the emulator.
Checking the jobs in QINTER, some of them show lock wait (LCKW) status.
There are also messages in the QSYSOPR message queue such as "Unable to obtain lock on device XXXXXX" and "All jobs at work station XXXXX ended", where XXXXX is the session job name.
Normally it lasts about 1 minute. Our inactive job time-out (QINACTITV) is 30s.
The network engineer says there has been no interruption on the AS/400 network, and we have no idea what is happening.

Has anyone had a similar experience, or any advice on the problem?
Theo Kouwenhoven, Application Consultant

Commented:
Hi wildchoi,

The jobs showing LCKW status are the best place to start. Look at the job log and find out what kind of lock the job is waiting for: is the object in use? By whom? Why?

The "Unable to obtain lock on device XXXXX" message is the step after the LCKW status.
You can look at the job log to get more details.
Can you tell us what kind of device XXXXX is?

Regards,
Murph

Author

Commented:
Hi,

I've tried looking at the job log, but there is nothing to see.
The problem only lasts a short time, so there isn't enough time to find out which object is locked.
Theo Kouwenhoven, Application Consultant

Commented:
Hi,

The job log will certainly show you the object or device the job is trying to lock, and as long as you can see the
LCKW status, you can check what the job is trying to lock.
Without this info we can't tell you anything.
Please post a job log here so that we can help you check it.

Regards,
Murph

Author

Commented:
Below is one of the job logs. I'm not sure whether the message logging level is too low, but nothing useful shows in it.


5722SS1 V5R3M0 040528                           Job Log                             CVRSYSA  12/13/08 12:24:24          Page    1
  Job name . . . . . . . . . . :   QPADEV007Q      User  . . . . . . :   LAUPA2       Number . . . . . . . . . . . :   159494
  Job description  . . . . . . :   SYSOPR          Library . . . . . :   QGPL
MSGID      TYPE                    SEV  DATE      TIME             FROM PGM     LIBRARY     INST     TO PGM      LIBRARY     INST
CPF1124    Information             00   12/13/08  11:32:20.152960  QWTPIIPP     QSYS        061C     *EXT                    *N
                                     Message . . . . :   Job 159494/LAUPA2/QPADEV007Q started on 12/13/08 at
                                       11:32:20 in subsystem QINTER in QSYS. Job entered system on 12/13/08 at
                                       11:32:20.
CPF1164    Completion              00   12/13/08  12:24:24.504328  QWTMCEOJ     QSYS        00C9     *EXT                    *N
                                     Message . . . . :   Job 159494/LAUPA2/QPADEV007Q ended on 12/13/08 at
                                       12:24:24; 1 seconds used; end code 50 .
                                     Cause . . . . . :   Job 159494/LAUPA2/QPADEV007Q completed on 12/13/08 at
                                       12:24:24 after it used 1 seconds processing unit time.  The job had ending
                                       code 50. The job ended after 1 routing steps with a secondary ending code of
                                       0.  The job ending codes and their meanings are as follows:  0 - The job
                                       completed normally. 10 - The job completed normally during controlled ending
                                       or controlled subsystem ending. 20 - The job exceeded end severity (ENDSEV
                                       job attribute). 30 - The job ended abnormally. 40 - The job ended before
                                       becoming active. 50 - The job ended while the job was active. 60 - The
                                       subsystem ended abnormally while the job was active. 70 - The system ended
                                       abnormally while the job was active. 80 - The job ended (ENDJOBABN command).
                                       90 - The job was forced to end after the time limit ended (ENDJOBABN
                                       command). Recovery  . . . :   For more information, see the Work Management
                                       topic in the Information Center,
                                       http://www.ibm.com/eserver/iseries/infocenter.
Gary Patterson, VP Technology / Senior Consultant

Commented:
BTW: QINACTITV is in minutes, not seconds.

This sounds similar:

http://www-01.ibm.com/support/docview.wss?rs=0&dc=DB550&dc=D100&q1=telnet+lckw&uid=nas26b62775f4a7336c38625710f004226ed&loc=en_US&cs=UTF-8&lang=all

Do you have PTF SI23849 installed?  http://www-933.ibm.com/eserver/support/fixes/fixcentral/fixdetails?fixid=SI24176

If not, I suggest you install it and let us know if that resolves the issue.  If it doesn't, let's do some additional diagnosis:

Assuming that the lock issue in the job is really on the workstation DEVD, then the question here is "What is locking my QPADEV devices, and why?"  A good secondary question is "And why is Telnet trying to hand out a locked device?".

Find a job that is hung in LCKW.  Work with that job, and select "Work with Job Locks".  Identify the object and object type that is causing the Lock wait on the job.  If these only last a minute or so you will need to work quickly.  Use the WRKOBJLCK command to determine what job is holding the lock, and post that information back here.  (QSYSARB, another user job?)
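
Since the window is short, it may help to have the commands ready to paste before the next hang. The following is only a rough Python sketch that builds the CL command strings for a given qualified job name; it does not connect to the system. It assumes the workstation device description has the same name as the interactive job (as with the QPADEV jobs in this thread) and that the contested object is a *DEVD. `lock_commands` is a hypothetical helper name, not part of any IBM tooling.

```python
from typing import List


def lock_commands(qualified_job: str) -> List[str]:
    """Build CL command strings for inspecting locks around one interactive job.

    qualified_job uses the number/user/name form, e.g. "159494/LAUPA2/QPADEV007Q".
    Assumption: the device description is named after the job.
    """
    parts = qualified_job.strip().upper().split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected number/user/jobname")
    number, user, name = parts
    return [
        # Locks held or requested by the job itself
        f"DSPJOB JOB({number}/{user}/{name}) OPTION(*JOBLCK)",
        # Jobs holding or waiting for locks on the device description
        f"WRKOBJLCK OBJ({name}) OBJTYPE(*DEVD)",
    ]


if __name__ == "__main__":
    for cmd in lock_commands("159494/LAUPA2/QPADEV007Q"):
        print(cmd)
```

The job name here is taken from the job log posted earlier; substitute whichever job is actually sitting in LCKW.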

Once we know what is holding the exclusive lock, we can probably be of more assistance.

Also, please post the following values:

  • QSYS/QINTER Subsystem job log error messages that seem to be relevant
  • Relevant error messages from all QTVTELNET and QTVDEVICE job logs
  • QAUTOVRT and QCRTAUT system values
  • CHGTCPA TCP Keepalive value
  • WRKREGINF and list any exit programs associated with Telnet Exit Points QIBM_QTG_DEVINIT or QIBM_QTG_DEVTERM
  • Telnet server keepalive value: iSeries Navigator, select your system > Network > Servers > TCP/IP. In the right pane, right-click Telnet and select Properties. On the Telnet Properties page, click the Time-Out tab.

- Gary Patterson

Author

Commented:

I found something in the system log showing that QTVDEVICE is making several changes to the device descriptions within a short time:
CPD2639  40  DIAGNOSTIC   Device QPADEV0019 currently on-line.
                      QTVDEVICE  QTCP       209540                   12/18/08 15:40:19.726928 QTCP
CPC2606  00  COMPLETION   Vary off completed for device QPADEV000S.
                      QTVDEVICE  QTCP       209540                   12/18/08 15:40:19.729720 QTCP
CPC2602  00  COMPLETION   Description for device QPADEV0085 deleted.
                      QTVDEVICE  QTCP       209542                   12/18/08 15:40:19.809216 QTCP
CPC2622  00  COMPLETION   Description for device QPADEV0085 created.
                      QTVDEVICE  QTCP       209542                   12/18/08 15:40:19.848440 QTCP
CPC2602  00  COMPLETION   Description for device QPADEV000S deleted.
                      QTVDEVICE  QTCP       209540                   12/18/08 15:40:19.861744 QTCP
CPC2605  00  COMPLETION   Vary on completed for device QPADEV0085.
                      QTVDEVICE  QTCP       209542                   12/18/08 15:40:19.875312 QTCP
CPC2622  00  COMPLETION   Description for device QPADEV000S created.
                      QTVDEVICE  QTCP       209540                   12/18/08 15:40:19.880608 QTCP
CPC2613  00  COMPLETION   Description for device QPADEV0085 changed.
                      QTVDEVICE  QTCP       209542                   12/18/08 15:40:19.889624 QTCP
CPC2605  00  COMPLETION   Vary on completed for device QPADEV000S.
                      QTVDEVICE  QTCP       209540                   12/18/08 15:40:19.891120 QTCP
CPC2613  00  COMPLETION   Description for device QPADEV000S changed.
                      QTVDEVICE  QTCP       209540                   12/18/08 15:40:19.909808 QTCP
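
To make the churn easier to see on a longer excerpt, the log text could be grouped per device. This is a minimal Python sketch, assuming the log is pasted as plain text in the same two-line layout as above (message line, then a job/timestamp line). The function names are hypothetical, and the meaning of CPC2602 (deleted) and CPC2622 (created) is taken only from the message text in the excerpt.

```python
import re
from collections import defaultdict

# Message line: ID, severity, type, then the message text
MSG_LINE = re.compile(r"^\s*(CP[A-Z][0-9A-F]{4})\s+\d{2}\s+\w+\s+(.*)$")
# Device name as it appears in the message text ("... device QPADEV0085 ...")
DEVICE = re.compile(r"\bdevice\s+(\w+)", re.IGNORECASE)


def events_by_device(log_text):
    """Return {device: [message IDs in log order]} for a pasted log excerpt."""
    events = defaultdict(list)
    for line in log_text.splitlines():
        match = MSG_LINE.match(line)
        if not match:
            continue  # job/timestamp line or blank line
        msgid, text = match.groups()
        dev = DEVICE.search(text)
        if dev:
            events[dev.group(1)].append(msgid)
    return dict(events)


def recreated_devices(events):
    """Devices whose description was deleted (CPC2602) and later created (CPC2622)."""
    found = []
    for dev, ids in events.items():
        if "CPC2602" in ids and "CPC2622" in ids:
            if ids.index("CPC2602") < ids.index("CPC2622"):
                found.append(dev)
    return sorted(found)


if __name__ == "__main__":
    sample = """\
CPD2639  40  DIAGNOSTIC   Device QPADEV0019 currently on-line.
CPC2602  00  COMPLETION   Description for device QPADEV0085 deleted.
CPC2622  00  COMPLETION   Description for device QPADEV0085 created.
"""
    grouped = events_by_device(sample)
    print(grouped)
    print(recreated_devices(grouped))
```

Devices that show up in the second list are the ones being deleted and recreated rather than simply reused, which is where I would look first for the lock contention.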

QAUTOVRT       - *NOMAX
QCRTAUT          - *USE

TCP Keepalive value      - 120 (Mins)

Exit program for
QIBM_QTG_DEVINIT:  
 
Exit point . . . . . . . . . . . . . . :   QIBM_QTG_DEVINIT
 Exit point format  . . . . . . . . . . :   INIT0100
 Exit point registered  . . . . . . . . :   *YES
 Allow deregister . . . . . . . . . . . :   *YES
 Maximum number of exit programs  . . . :   1
 Current number of exit programs  . . . :   0
 Preprocessing for add  . . . . . . . . :   QTVADDP
   Library  . . . . . . . . . . . . . . :     QSYS
   Format . . . . . . . . . . . . . . . :     ADDP0100
 Preprocessing for remove . . . . . . . :   QTVRMVP
   Library  . . . . . . . . . . . . . . :     QSYS
   Format . . . . . . . . . . . . . . . :     RMVP0100

QIBM_QTG_DEVTERM

Exit point . . . . . . . . . . . . . . :   QIBM_QTG_DEVTERM
 Exit point format  . . . . . . . . . . :   TERM0100
 Exit point registered  . . . . . . . . :   *YES
 Allow deregister . . . . . . . . . . . :   *YES
 Maximum number of exit programs  . . . :   1
 Current number of exit programs  . . . :   0
 Preprocessing for add  . . . . . . . . :   QTVADDP
   Library  . . . . . . . . . . . . . . :     QSYS
   Format . . . . . . . . . . . . . . . :     ADDP0100
 Preprocessing for remove . . . . . . . :   QTVRMVP
   Library  . . . . . . . . . . . . . . :     QSYS
   Format . . . . . . . . . . . . . . . :     RMVP0100
 Preprocessing for retrieve . . . . . . :   *NONE
   Library  . . . . . . . . . . . . . . :
   Format . . . . . . . . . . . . . . . :
 Allow change . . . . . . . . . . . . . :   *NO
Theo Kouwenhoven, Application Consultant

Commented:
Hi wildchoi,

QPADEV* devices are auto-created by the system. This happens when a device connects to the AS/400. The lock wait looks to me like two jobs waiting for creation of the DEVD, or trying to use the same device at the same time.
If one of the users gets this error, check the status of that device and see who is using the object with WRKOBJLCK.

Regards,
Murph
Gary Patterson, VP Technology / Senior Consultant

Commented:
Wildchoi,

OK, QINACTITV, QCRTAUT, QAUTOVRT and TCP Keepalive timers all look OK.

I strongly suggest that before we spend time with further troubleshooting, you please verify that you have PTF SI23849 installed:

  • DSPPTF SELECT(SI23849)

 If you don't have it installed, order it and install it and see if it resolves the issue.
 
Additional information needed:
  • On WRKREGINF, you need to use option 8 to display exit programs, not option 5.
  • Did you find a LCKW job and determine what object and job were causing the LCKW?
  • What is the Telnet time-out setting?
- Gary Patterson

Author

Commented:
Checked that the PTF is not installed

And there is no exit program assigned to the QIBM_QTG_DEVINIT and QIBM_QTG_DEVTERM

Confirmed with the administrator: there is no Telnet time-out value set, so it is using the interactive session time-out value (30 mins).

Thank you for your help
Gary Patterson, VP Technology / Senior Consultant

Commented:
Recommendations:

If not already installed, obtain and apply the most current Cumulative PTF package for V5R3: C8267530
Obtain and apply any V5R3 HIPER PTFs.
Obtain and install PTF SI23849 if not already included in the above.

- Gary Patterson

Author

Commented:
We've applied it and are monitoring.
However, if this doesn't solve it, what would you expect the problem to be?
Gary Patterson, VP Technology / Senior Consultant

Commented:
We will have to continue with troubleshooting if getting current on PTFs did not resolve the issue.  This is not normal behavior, so hopefully applying PTFs will resolve the issue.

Did you just apply that one PTF, or did you apply the most recent cume tape, and relevant HIPER PTF's too?

This problem occurs in part because you are using shared default terminal names generated by the system.  Problems like this are less likely to happen if you use named devices (configured in your TN5250 software).  When you use named devices, the system doesn't have to do as much deleting, recreating, and changing device descriptions, and sessions can be disconnected and reconnected instead of terminated and reallocated to another process.  One way to eliminate this sort of problem is to name your session devices in Client Access (or whatever you are using for terminal software).

If you apply PTF's and it still does not resolve the issue, you will probably need to open a ticket with IBM - it doesn't appear this is a configuration issue or an Exit point program failure, so it could be a symptom of an OS defect.  Again, hopefully getting caught up on PTFs will resolve the issue.

Please post back and let us know what your results are.

- Gary Patterson

Author

Commented:
Hi Gary,
We've applied the PTF, but it still happens.
However, I also agree with the point about not using named devices.
We are planning to move all of the company's user sessions to named devices.
Thank you very much
