DavidASolis

AS400 Communication Problems

Hi Everyone

I have a huge problem which started recently.

I have a 170 running V4R5. The problem I'm having is that I can connect to the system using Client Access, but about 10 minutes later the sessions die out. I can ping fine, but the actual telnet sessions do not connect. No communications can connect at all.

To get the system back up I have to end TCP and start it again; it also comes back when I do a STRTCPSVR *ALL.

When I look at the netstat screen all previously open sessions are still sitting there as if they were still active.

I do not see any error messages in qsysopr nor do I see any joblogs in the qezjoblog.

There have been no changes to the system, and the system has been working fine for years.

This is a very important matter, as this is our target system for mirroring, and there is a hurricane in our sites....so any any any help is appreciated...I will be checking this thread constantly to answer any questions!

dedy_djajapermana

Hi,

Check the joblog of the QTVTELNET job; there may be more than one. QTVTELNET is the job handling telnet connections.
WRKJOB QTVTELNET
I think the problem is with the TELNET job. Try restarting the TELNET server only, instead of restarting all TCP services, when the problem arises (to prove that the problem is with the TELNET server).
ENDTCPSVR *TELNET
STRTCPSVR *TELNET
Are the telnet sessions being disconnected at all? I would think if a job has been disconnected, it might still be active when you check out netstat.

Check out QINACTITV to see if possibly people are being disconnected after whatever the system value is set at.

Probably not related, but I figured I'd throw it out there.

In addition to the above steps in the previous posts, I would think you would need to start up the Client Access Host Servers too - STRHOSTSVR *ALL

DavidASolis

ASKER

It looks like my problem is related to the Telnet server. I ended and started that and was able to connect. Problem is there are no error messages at that job when I end it.   Any ideas??
The host servers all have been started. I tried FTPing too while this was happening and that does not work either, but when I end the telnet server and restart it...it all works again.
Also the QINACTITV is set to *NONE
checked the joblog of QTVTELNET
WRKJOB QTVTELNET option 10 (if the job is still active)
WRKJOB QTVTELNET option 4 - spooled files (if the job is not active)
the only message that shows up is the following.... SSL socket operation received return value error -93.
The odd thing is that when telnet stops working from outside the machine....I try telnetting from the console to the machine's loopback address and that works!

do you use SSL for your client access?
does telnet from PC work?
windows: START, RUN, type TELNET xxx.xxx.xxx.xxx <enter>
No to both questions....telnet from windows just hangs
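A successful ping only shows that ICMP reaches the address; it does not show that the Telnet listener is accepting TCP connections. A minimal Python sketch of that distinction, assuming the standard Telnet port 23 (the function name `tcp_port_open` is illustrative and not from this thread):

```python
import socket


def tcp_port_open(host, port=23, timeout=5.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection performs a full TCP handshake, unlike ping (ICMP).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `tcp_port_open("xxx.xxx.xxx.xxx")` with the system's address while sessions are hanging would return False if the handshake never completes, even though ping still answers.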
did you try to restart telnet server ONLY instead of restart TCP?
if it works after restarting TELNET server only, then we can concentrate on TELNET server.
Also check QTVDEVICE joblog (Besides QTVTELNET)
Yes, I've been ending and starting only the TELNET server, so that's what I've been concentrating on...I'll look at the QTVDEVICE log now...
I do not get any error messages under the QTVDEVICE jobs....just acknowledges program start.

Under the QZSOSMAPD job I get the following messages...
Job 532212/QUSER/QZSOSMAPD submitted.                                  
Object changed.                                                        
Object changed.                                                        
The protocol required to support the specified address family is not  
  available at this time.                                              
Host server communications error occurred on socket() - spx family.    
Anyone with other ideas??
My problem seems to be exactly what this person was encountering..

https://www.experts-exchange.com/questions/21046608/Major-problem-with-my-AS400-client-access.html

I have tried what was written in that thread but none of the accepted answers have worked
The only thing I have not tried is, IPLing with a restart *no .... I do not see the reasoning behind why that would make a difference. I have however performed a full IPL and no luck.
ASKER CERTIFIED SOLUTION
dedy_djajapermana

when you do a DSPLOG for around the time your clients hang, can you find any jobs ending or sending an error?
The only errors I see in the QTVDEVICE job are messages like....
 Device QPADEV000I associated with client 172.28.46.150 port 4051 has been
   recovered.                                                            


172.28.46.150 is my IP
nothing seems to be close so far...
does the problem happen on multiple PCs, or only on one PC? (I was thinking of the possibility of a duplicate PC IP address)
When you mention you can telnet to local loopback address, was it 127.0.0.1 or the network IP address?
It was 127.0.0.1

It happens from any PC I try, even on the same network segment (I thought maybe a firewall or router was causing this)
I suspect there's some corruption in the licensed program, that's all I can think of...
do you want to try to delete and re-install tcp/ip services licensed program?
What about CUM Tape level?  Latest and greatest out there for your version and release?
or maybe wait for other experts opinion...
yeah, I assumed you have the same CUM PTF level on both systems... have you checked?
CUM Levels are the same on both systems...I checked to make sure all PTFs related to Telnet were applied as well.
How would I go about deleting and re-installing the tcp/ip services?
first, make sure you have the OS installation CD, when you're ready with it:
- Go to terminal, ENDTCP *IMMED
- type GO LICPGM
- take option 12 Delete licensed program
- put option 4 on  5769TC1    *BASE   TCP/IP Connectivity Utilities for AS/400
licensed program is now deleted, you shouldn't be able to STRTCP now

install:
- option 11 from LICPGM menu
- insert the OS CD
- put option 1 on 5769TC1    *BASE   TCP/IP Connectivity Utilities for AS/400
- key in your optical drive when prompted

now you have to re-install PTF related to 5769TC1 (you can also reinstall the latest CUM PTFs)
you may want to test it before reinstalling the PTFs

hmmm re-install sounds like a windows fix... This should not be necessary, but yes, it will rule out some possible causes for you... re-installing TCP/IP shouldn't take long anyway...
Also take a PC and place it as close to the AS/400 as possible: on the same switch. This will rule out all other network hardware currently in between the AS/400 and your PC...
a switch problem can be ruled out, I think; telnet from the AS/400 itself to its network adapter IP address failed..
ah, sorry I must have missed that...
Hey guys, would like to let you all know that I found out late last night what the problem was....

This machine is on a network in another state at our corporate headquarters...I was looking into DNS and saw that when I did an nslookup on the name of the machine, DNS would return the correct address....when I reversed the nslookup to the IP address, a different machine name would show up with the same IP.
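The forward-then-reverse lookup described here can be sketched in Python roughly as follows. This is an illustration only: the function names are made up for the example, the resolver functions are passed in so the check can be tried without a live DNS server, and comparing only the short host label is an assumption about what counts as "the same machine".

```python
import socket


def _short_name(name):
    """Lower-cased first label of a host name, e.g. 'AS400.example.com' -> 'as400'."""
    return name.split(".")[0].lower()


def check_forward_reverse(hostname, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, matches) for a forward-then-reverse lookup."""
    ip = resolve(hostname)
    # gethostbyaddr returns (primary_name, alias_list, address_list).
    reverse_name = reverse(ip)[0]
    matches = _short_name(hostname) == _short_name(reverse_name)
    return ip, reverse_name, matches
```

If `matches` comes back False, the address resolves back to a different name than the one you started with, which is the symptom found above.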

Looks like someone threw a machine onto the network with the same IP address...and either the 2 machines were conflicting with each other (which would be why I could still ping the address even though I couldn't connect) or the DNS entries were conflicting with each other when either the AS/400 refreshed its DNS cache or when the client was refreshing they got the wrong name....I'm going with option 1 though.
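One way to confirm option 1 is to note which hardware (MAC) address answers for the IP at different times; if the same IP shows up with more than one MAC, two machines are probably claiming it. A rough Python sketch, assuming the (ip, mac) pairs have already been collected by hand, for example from repeated ARP table listings on a PC in the same segment. The function name and the sample data format are hypothetical.

```python
from collections import defaultdict


def find_duplicate_ips(observations):
    """Map each IP seen with more than one MAC to the sorted list of MACs."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case and separator so 00-11-22 and 00:11:22 compare equal.
        macs_by_ip[ip].add(mac.lower().replace("-", ":"))
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}
```

An empty result does not prove there is no conflict; it only means the observations collected so far never caught a second machine answering.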

Thanks for all your help! It was greatly appreciated!! This site is a valuable forum for people!!
you're welcome