Question

Telnet connection over the internet drops after idle

Asked by: George46227

11/5/06
8:10pm

I am having a lot of trouble with Telnet sessions over the internet. After about 20 minutes of idle time the connection is often dropped. This is a major problem because of lost data-entry productivity for many users. The users are at several branch locations, each with a PIX VPN to corporate, and all of them run the VPN over DSL. Previously this happened occasionally; now it happens very frequently. Users at 3 of the 4 branch locations are dropping several times a day, after 20 minutes or so of idle time.

I have tried doing some diagnostics, testing, research, etc.
Testing results:
Telnet connections are very reliable on a LAN or a decent point-to-point WAN (such as fractional T1). Idle time is not a problem there; connections will stay up for 2+ hours at least. The internet connections are unpredictable: anywhere from 20 minutes to 2.5 hours of idle time will cause a disconnect. It is variable; sometimes the connection will drop 3 times in a row after 30 minutes idle, then it will stay connected 3 times in a row with 30 minutes idle. The testing was done over the internet with and without VPN involvement, with no noticeable difference: frequent and unpredictable drops after 30+ minutes of idle time, with variable patterns. A ping test with 10,000 cycles showed no relation to the drops; pings were 99% successful with occasional slow times but a decent average (times varied from 40ms to 1400ms with an average of about 70ms). Packet tracing did not show anything significant (not to me anyway) except:
After a certain amount of idle time, a keystroke from the client (usually the letter "c", but anything causes the same result) causes a RST from the server. Then the client is disconnected from the session. netstat shows the session is gone on the client, but netstat on the server shows the connection as "established".

I have duplicated these results consistently on several LANs using different DSL providers for both client and server, with and without VPNs, using different Telnet client software and different server software.

Any suggestions would be appreciated.

George

This question has been solved and verified by the asker.


Asked on: 2006-11-05 at 17:29:31
ID: 22049863
Tags: telnet, connection
Topics: Miscellaneous Networking, Application Protocols
Participating experts: 3
Points: 500
Comments: 36

Answers

 

by: bmedward | Posted on 2006-11-05 at 20:07:59 | ID: 17878780

Do you see similar telnet session drops with clients when some kind of 'keep-alive' is being used?  Also, are all of the clients on wired networks, or are the remote LANs wireless?

As I am sure that you can attest to, telnet sessions are inherently different from pings - each ping only lasts as long as is needed to reach the destination and return back.  The telnet session has to have good connectivity from when the session is originated until it is closed.  Not all telnet clients are equal in terms of being able to handle errors - Wavelink (www.wavelink.com) has a good enterprise level client (demo available) for most platforms.  For some situations, Wavelink (and others) offer proxy session controllers or gateway applications that are much more tolerant of network errors.  

On the network traces, were you able to determine if the packet was being re-sent (from either the client or server) prior to the RST?  Keep in mind that the reset will usually be generated when the connection tries to re-establish even though the failure happened much earlier.  Also, have you compared traces of the same telnet session from both the client's side and the server's?  The re-try should be easily identifiable through ethereal (ethereal.com) or other network trace utilities.  There will be a sequence of (usually around 6) re-transmits of the same packet, and the time between re-transmits will double with each one.  
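
The doubling pattern described above is easy to tabulate, which helps when looking for it in a trace. This is a minimal sketch only: the 3-second initial timeout is an assumption for illustration (real TCP stacks derive it from measured round-trip time and usually cap the interval), and the count of 6 is taken from the "usually around 6" estimate above.

```python
def retransmit_schedule(initial_timeout_s=3.0, retries=6):
    """Return (attempt, wait_s, elapsed_s) tuples for a doubling retransmit timer.

    initial_timeout_s and retries are illustrative assumptions, not values
    read from any particular operating system.
    """
    schedule = []
    elapsed = 0.0
    wait = initial_timeout_s
    for attempt in range(1, retries + 1):
        elapsed += wait                      # time at which this retransmit is sent
        schedule.append((attempt, wait, elapsed))
        wait *= 2                            # timeout doubles after every attempt
    return schedule


if __name__ == "__main__":
    for attempt, wait, elapsed in retransmit_schedule():
        print(f"retransmit {attempt}: sent {wait:.0f}s after the previous try "
              f"(t = {elapsed:.0f}s)")
```

In a capture, the same segment (identical sequence number) repeating at gaps that roughly double is the signature to look for before the RST.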

Is there any chance of getting the application to operate through a web interface or other transaction oriented client?  If it is determined that you cannot keep a reliable session it may be easier to change how the data entry is being transmitted.

 

by: prashsax | Posted on 2006-11-06 at 13:52:15 | ID: 17885022

The PIX has a time limit on NAT sessions. So if you are using dynamic NAT, it is possible that the PIX clears the NAT session after that idle time limit.

Are you using natting in between?

If yes, you can try and do a static nat for just one client and see what happens.

 

by: George46227 | Posted on 2006-11-06 at 14:20:37 | ID: 17885299

bmedward:

Tell me more about the "keep-alive". None of the telnet clients I use offer a keep alive option (at least not that I can find). I tried setting up keep alive settings in the registry (win98 SE test pc) but it didn't help - in fact when I sniff the session I see no evidence of any keep-alive packets.

I do have one network that is wireless DSL, the others are hard-wired telecom (POTS) line DSL. Maybe the wireless DSL is a little worse than the others but they are all dropping a lot.

Can you explain - why does the telnet session need to have good connectivity from the beginning to the end? I don't see any background (polling or whatever) activity during the idle time either on LAN or internet.

I don't see any evidence of either failed packets or re-transmission. Re-transmission would presume something failed, but I can't see where anything failed. The session works perfectly until it sits idle for a while, then a keystroke from the client causes the RST disconnection. I am not a sniffer expert, so maybe I am missing something in the logs or just misinterpreting what I see.

No - there is no chance whatsoever of changing data platforms. A ton of money was spent on the applications which are only accessible through telnet sessions. We have to live with whatever happens (or maybe go crazy trying to get any work done!).

prashsax:

One of the networks I am having trouble with is using PIX for VPN between 5 remote branches; 3 branches are complaining of drops, the other 2 are not. The VPN is all hard-wired internet access (DSL) at corp and branches.

I am also having the same problem with another network which is not using pix or any vpn. Some of the remote users use outbound NAT (client) to "inbound NAT" port forward to port 23(server). Other users are outbound thru a proxy server (client) to "inbound NAT" with port FW (server). All seem to be having the same problem.

I am suspicious of the NAT because all systems have some type of NAT involved - inbound or outbound or both. I wonder if the DSL modem or the NAT routers are "timing-out" or discarding the connections, maybe clearing the NAT port-mapping tables but I have not found any evidence of this (again maybe my sniffer skills are not too good).

Maybe I should try testing public ip-to-public ip telnet with no NAT or proxy involved, that would be more like the LAN sessions - LAN sessions never go down even with very long idle times.

Thanks
George
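
The suspicion above, that a NAT device discards idle port mappings, can be modeled in a few lines. This is a simplified sketch and not the behavior of any particular PIX or DSL router: the `NatTable` class, the 600-second timeout, and the choice to answer an unknown flow with a reset (some devices silently drop instead) are all assumptions for illustration.

```python
class NatTable:
    """Toy model of a NAT/firewall that forgets flows after an idle timeout."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self._last_seen = {}          # flow tuple -> time of last forwarded packet

    def packet(self, flow, now, opens_flow=False):
        """Return 'forward' or 'reset' for a packet on `flow` at time `now`."""
        last = self._last_seen.get(flow)
        if last is not None and now - last > self.idle_timeout_s:
            del self._last_seen[flow]  # mapping expired while the session idled
            last = None
        if last is None and not opens_flow:
            return "reset"             # mid-session packet for a forgotten flow
        self._last_seen[flow] = now    # any forwarded packet refreshes the timer
        return "forward"


if __name__ == "__main__":
    flow = ("client", 1104, "server", 23)
    nat = NatTable(idle_timeout_s=600)                # assumed timeout
    print(nat.packet(flow, now=0, opens_flow=True))   # connection setup
    print(nat.packet(flow, now=300))                  # keystroke after 5 min idle
    print(nat.packet(flow, now=1000))                 # keystroke after ~11.7 min idle
```

In this model neither endpoint does anything wrong: the session is healthy until the first packet after the mapping has expired, which matches the "first keystroke after idle kills it" symptom. It would also explain why a LAN session survives an unplugged cable, since nothing between the two hosts is keeping per-connection state.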




 

by: bmedward | Posted on 2006-11-06 at 14:45:16 | ID: 17885489

A keep-alive signal could be any type of activity that makes sure that there is some TCP/IP communication going on.  For experimental purposes, set up the server to ping a client (or set of clients) every 4 minutes or so.  This communication will reset the timeout counters for most devices en route between the client and server.  Note that if the NAT or PIX has a maximum session time limit, it would not be reset by keep-alive data.
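
Where the client or a test harness is under your control, a keep-alive can also be requested on the TCP connection itself. Devices that track idle time per connection generally refresh their timer only when traffic flows on that same connection, so this may be a more direct test than a separate ping. The sketch below uses Python's standard `socket` module; the function name and the 240-second idle value (echoing the "every 4 minutes" suggestion) are assumptions, and the fine-grained timing options are only applied where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=240, interval_s=30, probes=4):
    """Ask the OS to send TCP keep-alive probes on an otherwise idle socket.

    idle_s, interval_s and probes are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timing knobs are platform specific, so guard each one.
    if hasattr(socket, "TCP_KEEPIDLE"):       # e.g. Linux: idle seconds before first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):      # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):        # unanswered probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keep-alive on:", bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

Without the per-socket options, the operating system's default keep-alive idle time applies, which is commonly on the order of two hours and therefore too long to help with a timeout of minutes.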

As for the duration of a connection, a Ping transaction completes very quickly.  Usually it is only a matter of milliseconds from the start to end of a Ping cycle.   Telnet sessions, on the other hand, can easily span several days.  Often times, if the lower layers of the network stack drop – even momentarily – the session will be forced to reset by the client or another device (router, access point, etc) along the route.

Another method to test whether this is a maximum time-limit issue: have some end users manually stop and re-establish their connections (re-booting the PC / DSL modem if necessary).  Track how long each failure occurs after the end user's start or re-connect time.  This may be a better metric for this type of issue, as their idle time until disconnection will be variable.
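
One way to read the measurements suggested above is to compare, across several drops, how much the session age at failure varies versus how much the idle time at failure varies. Whichever figure clusters more tightly hints at which timer is in play. This is a rough sketch: the `drop_metrics` helper is hypothetical, and the three sample sessions are made-up placeholder numbers, not data from this thread.

```python
from statistics import pstdev


def drop_metrics(sessions):
    """sessions: list of (connect_time, last_activity_time, drop_time) in seconds."""
    ages = [drop - start for start, _last, drop in sessions]
    idles = [drop - last for _start, last, drop in sessions]
    return {
        "ages": ages,
        "idles": idles,
        "age_spread": pstdev(ages),    # small spread suggests a max-session limit
        "idle_spread": pstdev(idles),  # small spread suggests an idle timeout
    }


if __name__ == "__main__":
    # Hypothetical log entries purely for illustration.
    sample = [(0, 1200, 1810), (0, 300, 905), (0, 4000, 4602)]
    m = drop_metrics(sample)
    likely = "idle timeout" if m["idle_spread"] < m["age_spread"] else "session limit"
    print(m["ages"], m["idles"], "->", likely)
```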

 

by: George46227 | Posted on 2006-11-06 at 19:15:20 | ID: 17886674

11/6/06
9:40pm

bmedward:

I appreciate your advice but I am past the point of doing general diagnostics, intruding into the user's environment, etc. No matter how much data I collect about the problem it won't matter unless it leads to a solution or at least a definitive cause. I have spent a lot of time collecting data, testing, talking to users, etc. I need to tell the management at this point either:
1. it's the nature of the beast - telnet over internet is just flaky, or
2. I have a very good reason to believe what the cause is, what it will take to fix it, downtime, cost, etc.
I have used up all my "maybe it's this or maybe it's that, let's try this idea and see what happens".

I am leaning toward option #1: this is what it is, and nothing significant can be done short of ripping out the networks and completely re-doing the system. Maybe go to T1 internet instead of DSL? They won't like it and probably won't do it - "too expensive, how do we know it will even fix the problem, etc."

I would really like to at least have an intelligent understanding of the technical side of the problem, at least I could give an explanation. Like - why does a LAN connection never go down? why does the server send an immediate RST as soon as I press a key after some idle time? Obviously the server and client are communicating at that point because I get the RST!

Interesting note - if we state that the session goes down during idle time because of a "bad" internet connection, or network equipment, switch, router or whatever-
I can unplug the cat5 cable from my pc during a live telnet session on the LAN, have a long period of idle time, plug the cable back in and resume my session without a RST disconnection! As I said earlier the packet sniff shows no activity during idle time (whether plugged in or un-plugged) - so how does the server even know the network has been down (because the cat5 was un-plugged) for a long time? it resumes the session with no problem!

I do feel like something is actively terminating the session (not just an unstable or slow/congested internet connection), I just can't find it - I don't think it is the client or the server. It would be something which is part of the internet connections and is common to several different locations, different ISPs, different DSL modems and routers, etc. My guess is it will turn out to be NAT-related and possibly not even solvable. Some locations are using PIX for NAT and VPN, others are using low-end NAT DSL routers. I don't see any time-out values on the DSL NAT devices. The PIX are managed by a 3rd party telecom/voice/data provider which I have no access to - the guy says he has checked out the PIX and it's set up good without any time-outs.

Thanks for the advice and support
George

 

by: bmedward | Posted on 2006-11-07 at 07:48:37 | ID: 17890287

When troubleshooting wireless systems, I had similar problems with telnet sessions.  In my case, a mobile computer would suspend or roam out of range for a period of time.  One problem arose from the ARP cache timeout in wireless access points - devices would be flushed after 7 (?) minutes of idle time.  When the mobile device re-connected, the AP had to refresh its ARP cache, re-discover a path to the server, and in most configurations would force a reset of any telnet sessions.  If this were a wired environment and some infrastructure device were behaving the same way as these access points, an occasional ping would reset the ARP timeout counter.  However, it could just as likely be any of a thousand other things.  

Wavelink's TermProxy may work in this setting. (Note, I do not work for Wavelink)
http://www.wavelink.com/wavelink/emulators/wavelink_termproxy.aspx
This software works by establishing the telnet sessions on an intermediary computer and communicating with the clients over a more robust protocol.  

In my experience, they have been good about providing demo software.

 

by: George46227 | Posted on 2006-11-07 at 12:16:56 | ID: 17892550

11/7/06
3:05pm

bmedward:

thanks for the advice, I may look into it - if management decides the solution involves spending money on software/hardware, technical service time (me) to install/config, etc.

Today I am testing:
client: win98 se using built-in telnet with a public IP thru T1
server: w2k3 srv built-in telnet service with a public ip thru DSL
           - I am not sure if the server goes straight out thru the DSL device or passes thru the pix first; I do know it is not part of a vpn and it is not using NAT

I really thought this might work well but it works just as badly, if not worse!!!
Sometimes I get "Connection... lost" after only 10 minutes idle without even touching the keyboard!!!
the server will just send a RST out of the blue for no apparent reason, the client is not doing anything just idling - then boom here comes a RST
On another occasion I seemed to stay connected for over an hour (no "Connection...lost") but when I tried to send a keystroke nothing whatsoever happened!!! The client netstat said "Established" but the server netstat showed no connection at all (checked it after the keystroke didn't respond). This time I could see in the trace the client re-sending 10 times with no response from the server - maybe because I have MaxDataRetries in the reg set to = 10. Never got a RST or anything from the server, the client eventually dis-connected itself.

I am afraid the ISP may be doing something which interferes with the session but I'm not sure what that could be.

I am thinking of posting part of my sniffer log to see if I am missing something. I hope I don't get yelled at by EE Admins! (like the time I posted a HJT log!). I will try to make it very small.

Thanks
George



 

by: bmedward | Posted on 2006-11-07 at 12:37:10 | ID: 17892738

If you do want to include sniffer data, posting the 4 to 10 packets from the telnet session up to, and including the RST, should be sufficient.  Be sure to identify the client and the server, and filter out any sensitive data.  The telnet data is likely to be less important than the packet overhead data - sequence number, fragmenting, retries, window size.

For fun, you could also try NetCat as a telnet client - you can get a windows version here http://www.vulnwatch.org/netcat/ .

The format to run this from a client computer would be 'nc -t hostname 23'.  

Also note that you can enable some level of logging with Windows default telnet clients - newer ones at least.

Good luck.

 

by: George46227 | Posted on 2006-11-07 at 18:04:00 | ID: 17894993

11/7/06
9:00pm

Would netcat be of any diagnostic value? I have used it before as a telnet server. Win98 telnet logging doesn't do much - is 2k or xp any better?

I believe someone with better packet sniffer skills and packet-level knowledge of tcp might be able to see an error which causes the RST, it's too subtle for me to see - I don't know and understand the header fields that well, how the tcp error-control process really works, etc.

I will try to post part of a log tomorrow.

Thanks
George

 

by: George46227 | Posted on 2006-11-08 at 06:26:52 | ID: 17898246

11/8/06
9:20am

Here is a log. The ip's have been changed, I have included the headers and minimal data:

---------------------------------------------------------------------------------
#68       Receive time: 8826.923 (delta = 1.813)  packet length: 55    received length: 55  
Ethernet:   (00a024f0746c -> 00a0c81bde63)  type: IP(0x800)
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 41  id: 0xfed7  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1641
TCP: 1104 -> telnet(23)  seq: 00e35fdd  ack: acbfc076  win: 8102  hl: 5   xsum: 0x7411  urg: 0  flags: <ACK><PUSH>
data (1/1): d
---------------------------------------------------------------------------------
#69       Receive time: 8827.051 (delta = 0.128)  packet length: 60    received length: 60  
Ethernet:   (00a0c81bde63 -> 00a024f0746c)  type: IP(0x800)
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 41  id: 0xf91f  fragoff: 0  flags: 0x2  ttl: 122  prot: TCP(6)  xsum: 0x21f9
TCP: telnet(23) -> 1104  seq: acbfc076  ack: 00e35fde  win: 65473  hl: 5   xsum: 0x93f4  urg: 0  flags: <ACK><PUSH>
data (1/1): d
---------------------------------------------------------------------------------
#70       Receive time: 8827.053 (delta = 0.002)  packet length: 55    received length: 55  
Ethernet:   (00a024f0746c -> 00a0c81bde63)  type: IP(0x800)
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 41  id: 0xffd7  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1541
TCP: 1104 -> telnet(23)  seq: 00e35fde  ack: acbfc077  win: 8101  hl: 5   xsum: 0x6f10  urg: 0  flags: <ACK><PUSH>
data (1/1): i
---------------------------------------------------------------------------------
#71       Receive time: 8827.190 (delta = 0.137)  packet length: 60    received length: 60  
Ethernet:   (00a0c81bde63 -> 00a024f0746c)  type: IP(0x800)
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 41  id: 0xf920  fragoff: 0  flags: 0x2  ttl: 122  prot: TCP(6)  xsum: 0x21f8
TCP: telnet(23) -> 1104  seq: acbfc077  ack: 00e35fdf  win: 65472  hl: 5   xsum: 0x8ef3  urg: 0  flags: <ACK><PUSH>
data (1/1): i
---------------------------------------------------------------------------------
#72       Receive time: 8827.193 (delta = 0.003)  packet length: 55    received length: 55  
Ethernet:   (00a024f0746c -> 00a0c81bde63)  type: IP(0x800)
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 41  id: 0xd8  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1441
TCP: 1104 -> telnet(23)  seq: 00e35fdf  ack: acbfc078  win: 8100  hl: 5   xsum: 0x660f  urg: 0  flags: <ACK><PUSH>
data (1/1): r
---------------------------------------------------------------------------------
#73       Receive time: 8827.345 (delta = 0.152)  packet length: 60    received length: 60  
Ethernet:   (00a0c81bde63 -> 00a024f0746c)  type: IP(0x800)
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 41  id: 0xf921  fragoff: 0  flags: 0x2  ttl: 122  prot: TCP(6)  xsum: 0x21f7
TCP: telnet(23) -> 1104  seq: acbfc078  ack: 00e35fe0  win: 65471  hl: 5   xsum: 0x85f2  urg: 0  flags: <ACK><PUSH>
data (1/1): r
---------------------------------------------------------------------------------
#74       Receive time: 8827.540 (delta = 0.195)  packet length: 54    received length: 54  
Ethernet:   (00a024f0746c -> 00a0c81bde63)  type: IP(0x800)
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 40  id: 0x1d8  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1342
TCP: 1104 -> telnet(23)  seq: 00e35fe0  ack: acbfc079  win: 8099  hl: 5   xsum: 0xd817  urg: 0  flags: <ACK>
---------------------------------------------------------------------------------
#75       Receive time: 8827.711 (delta = 0.171)  packet length: 56    received length: 56  
Ethernet:   (00a024f0746c -> 00a0c81bde63)  type: IP(0x800)
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 42  id: 0x2d8  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1240
TCP: 1104 -> telnet(23)  seq: 00e35fe0  ack: acbfc079  win: 8099  hl: 5   xsum: 0xcb03  urg: 0  flags: <ACK><PUSH>
data (2/2): ..
---------------------------------------------------------------------------------
#76       Receive time: 8827.860 (delta = 0.149)  packet length: 694   received length: 694
Ethernet:   (00a0c81bde63 -> 00a024f0746c)  type: IP(0x800)
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 680  id: 0xf923  fragoff: 0  flags: 0x2  ttl: 122  prot: TCP(6)  xsum: 0x1f76
TCP: telnet(23) -> 1104  seq: acbfc079  ack: 00e35fe2  win: 65469  hl: 5   xsum: 0x293d  urg: 0  flags: <ACK><PUSH>
data (60/640): .[5;2HVolume in drive C has no label..[6;2HVolume Serial Num
---------------------------------------------------------------------------------
#77       Receive time: 8828.040 (delta = 0.180)  packet length: 54    received length: 54  
Ethernet:   (00a024f0746c -> 00a0c81bde63)  type: IP(0x800)
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 40  id: 0x3d8  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1142
TCP: 1104 -> telnet(23)  seq: 00e35fe2  ack: acbfc2f9  win: 7459  hl: 5   xsum: 0xd815  urg: 0  flags: <ACK>
---------------------------------------------------------------------------------
#78       Receive time: 9429.445 (delta = 601.405)  packet length: 60    received length: 60  
Ethernet:   (00a0c81bde63 -> 00a024f0746c)  type: IP(0x800)
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 40  id: 0x9b57  fragoff: 0  flags: 00  ttl: 254  prot: TCP(6)  xsum: 0x3bc2
TCP: telnet(23) -> 1104  seq: acbfc2f9  ack: ----  win: 0  hl: 5   xsum: 0x560a  urg: 0  flags: <RST>


Thanks
George
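
One detail in the log above is worth checking mechanically: every data packet from 209.254.1.1 arrives with ttl 122, IP flags 0x2 and IP ids in the 0xf9xx range, while the RST in packet #78 arrives with ttl 254, IP flags 00 and id 0x9b57. The sketch below parses the last three packets of the log (lines copied from the post, Ethernet lines omitted) and flags any RST whose TTL differs from earlier packets from the same source address. The regular expression and helper names are assumptions written for this log format only, and a TTL mismatch is a hint rather than proof that a device other than the server generated the reset.

```python
import re

TRACE = """\
#76       Receive time: 8827.860 (delta = 0.149)  packet length: 694   received length: 694
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 680  id: 0xf923  fragoff: 0  flags: 0x2  ttl: 122  prot: TCP(6)  xsum: 0x1f76
TCP: telnet(23) -> 1104  seq: acbfc079  ack: 00e35fe2  win: 65469  hl: 5   xsum: 0x293d  urg: 0  flags: <ACK><PUSH>
#77       Receive time: 8828.040 (delta = 0.180)  packet length: 54    received length: 54
Internet:   64.199.1.1 -> 209.254.1.1   hl: 5  ver: 4  tos: 00  len: 40  id: 0x3d8  fragoff: 0  flags: 0x2  ttl: 128  prot: TCP(6)  xsum: 0x1142
TCP: 1104 -> telnet(23)  seq: 00e35fe2  ack: acbfc2f9  win: 7459  hl: 5   xsum: 0xd815  urg: 0  flags: <ACK>
#78       Receive time: 9429.445 (delta = 601.405)  packet length: 60    received length: 60
Internet:  209.254.1.1 -> 64.199.1.1    hl: 5  ver: 4  tos: 00  len: 40  id: 0x9b57  fragoff: 0  flags: 00  ttl: 254  prot: TCP(6)  xsum: 0x3bc2
TCP: telnet(23) -> 1104  seq: acbfc2f9  ack: ----  win: 0  hl: 5   xsum: 0x560a  urg: 0  flags: <RST>
"""

# One match per packet: number, delta, IP source, IP TTL, TCP flags.
PACKET = re.compile(
    r"#(?P<num>\d+)\s+Receive time: [\d.]+ \(delta = (?P<delta>[\d.]+)\).*?"
    r"Internet:\s+(?P<src>[\d.]+) -> [\d.]+.*?ttl: (?P<ttl>\d+).*?"
    r"TCP:.*?flags: (?P<flags>(?:<\w+>)+)",
    re.S,
)


def parse(trace):
    """Turn the sniffer text into a list of small dicts, one per packet."""
    return [
        {"num": int(m["num"]), "delta": float(m["delta"]), "src": m["src"],
         "ttl": int(m["ttl"]), "flags": m["flags"]}
        for m in PACKET.finditer(trace)
    ]


def suspicious_resets(packets):
    """Return RSTs whose TTL was never seen on earlier packets from that source."""
    seen_ttl = {}
    odd = []
    for p in packets:
        if "<RST>" in p["flags"]:
            usual = seen_ttl.get(p["src"])
            if usual is not None and p["ttl"] not in usual:
                odd.append((p["num"], p["ttl"], sorted(usual), p["delta"]))
        else:
            seen_ttl.setdefault(p["src"], set()).add(p["ttl"])
    return odd


if __name__ == "__main__":
    for num, ttl, usual, delta in suspicious_resets(parse(TRACE)):
        print(f"packet #{num}: RST with ttl {ttl} (data packets used {usual}), "
              f"after {delta:.0f}s idle")
```

If the sender started from a TTL of 255, a received value of 254 would mean the RST was generated one hop away from the capturing client, whereas the server's own packets arrive at 122 (consistent with a start of 128 and six hops). Comparing against a capture taken on the server side would confirm whether the server ever sent that packet.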

 

by: George46227 | Posted on 2006-11-09 at 08:09:11 | ID: 17907091

11/9/06
11:00am

possible new relevant info:

I now have sniffers running on the client and the server-
client win98 se telnet with public IP thru T1 (no nat, no proxy)
server w2k telnet server behind NAT/port forwarded

in this test the session is dropping after 10-15 min idle without any keystroke entered (after the idle time) - spontaneous "Connection to host lost" on the client

Both sniff logs show RST in the last packet - the client sends RST to the server and the server sends RST to the client! How can each machine RST the other? RST terminates the session immediately - right? Makes me think something in the "middle" is sending the RST to both machines??!!

George

 

by: bmedward | Posted on 2006-11-09 at 09:42:34 | ID: 17907877

I wouldn't be alarmed by the mutual RST's - I would have expected the client to send a FIN, but I think that responding to a RST with an RST is a form of ACK'ing the RST.  Interesting behavior with your test - does it reliably behave this same way? Also, are you logged in? - if your client is just sitting at a login prompt, the server will usually reset the connection after a login timeout period.  

This could be a re-try issue - the time difference of almost exactly 600 seconds (5 min) seems like it would originate from a software controlled parameter, not just random chance.  

Do you have any other telnet servers to test a WAN client connection to?  It would be good to know if the same client to a different server (Solaris, AIX, AS/400, Linux, Cisco Router, or other) acts differently.

Here is some decent info from MS on Win2k network parameters.  Much of this should be portable to Win98.   Check out the section labeled "Transmission Control Protocol (TCP)" and the registry configurable parameters in Appendix A.  Adjusting the client's TcpWindowSize to a fixed value could be a solution - the window size seemed pretty erratic in the trace segment.  
http://www.microsoft.com/technet/itsolutions/network/deploy/depovg/tcpip2k.mspx

Have you tried Putty's telnet client (free) - it looks like it has a built in option for keep-alives.
They also have some FAQ info for dropped telnet sessions. http://www.chiark.greenend.org.uk/~sgtatham/putty/faq.html#faq-idleout

This could get ugly quickly, especially if you have to use production clients for testing and cannot easily reproduce the fault.  Researching all of the factors that could be contributing to this failure would be a very time consuming (expensive) process.  Furthermore, you may still find out that the corrective action is beyond your control.   Make sure that the powers-that-be are kept in the loop and have some grasp of what you are up against.  As a spare-time troubleshooting activity, this could span a few (more) months.

 

by: bmedward | Posted on 2006-11-09 at 09:59:03 | ID: 17907984

Couple more links on the PIX angle - no real answers that I saw from these.

http://www.velocityreviews.com/forums/t30867-pix-vpn-telnet-problem.html
http://www.velocityreviews.com/forums/t34032-intermittent-dropped-telnet-connection-through-vpn.html

Cisco example configuring the telnet session timeout - I would expect the dropped session pattern to be much more identifiable if this were the issue.

http://www.cisco.com/en/US/products/hw/vpndevc/ps2030/products_configuration_example09186a0080624e19.shtml

 

by: George46227 Posted on 2006-11-09 at 13:31:53 ID: 17909698

11/09/06
4:15pm

Thanks for the ideas.

The log I posted above is from the client side. Just today I was able to sniff both sides (see my recent post above about the RST's on both sides). Sounds like you think the RST's on both sides are normal?

The log was-
Win98se telnet client thru Adtran T1 box public ip no nat/no proxy
W2k telnet server thru Netopia DSL modem public ip no nat/no proxy (although I am not sure how the boxes are wired - it could be that the server passes thru the pix first before it gets to the DSL modem or it could go from the switch to the DSL modem)

I have several servers I can test - w2k, w2k3, IBM AS400
clients will be windows telnet and AS400 emulation (mostly IBM Client Access, I could also use Netmanage ViewNow)
I have a variety of networks on client and server sides (I have remote control of desktops inside the LAN also) - T1, DSL, wireless DSL; proxy servers, NAT, pix, Netgear vpn routers

I can't say anything yet about the RST-to-RST being consistent on other systems.

I connect to the server, login with no problem, then run a command - I use "dir"
then I just leave it alone for a while
sometimes it spontaneously drops - the "Connection to host lost" window pops up on the client without any keystroke entered
other times nothing happens until I enter a keystroke - I attempt a "cls", soon as I hit the "c" I get the pop up "Connection to host lost", I don't have to hit the enter key - just press "c"

the time period of 600 seconds I believe is 10 minutes - not 5

I have done some windows client to AS400 testing with similar results (unfortunately I have no access to the 400 for sniffing on the server side, checking the logs, changing the ip config, etc.)

LAN and point-to-point WAN (not internet) telnet is very stable, never goes down. I can approximate the problem by pulling the cat5 cable out (client) and hitting a keystroke - the session drops after a short while. I'm not sure what the sniff would look like in that situation (I assume the client tries to send a RST then disconnects the session, the server can't do a RST or anything since the client cat5 cable is un-plugged, it doesn't know anything is going on).

Thanks
George



 

by: George46227 Posted on 2006-11-12 at 17:31:07 ID: 17927466

11/12/06
7:55pm

Update:

after some further testing I have some interesting results:

client w98 se telnet behind DLink DSL router over cable (NAT)
server w2k srv telnet public IP DSL (no NAT/no proxy)

I see something like this:
client log
192.168.1.1:1025 > 209.1.1.1:23
209.1.1.1:23 > 192.168.1.1:1025
server log
70.1.1.1:60100 > 209.1.1.1:23
209.1.1.1:23 > 70.1.1.1:60100

note the server sees the public IP and port of the client's NAT router which is different than what the actual win98 client is using

note what happens after approximately 10 - 12 minutes of idle time:
server log
70.1.1.1:60101 > 209.1.1.1:23
209.1.1.1:23 > 70.1.1.1:60101 RST
client log
192.168.1.1:1025 > 209.1.1.1:23
209.1.1.1:23 > 192.168.1.1:1025 RST

the client's NAT router has changed the port from 60100 to 60101!! When the server sees a connection from port 60101 it does not recognize it as an established connection - it's looking for port 60100. So it sends RST to port 60101 which is now mapped to the client port 1025. The client gets the RST and the connection is ended.
netstat on the server shows 70.1.1.1:60100 ESTABLISHED - still connected and listening, no connection is shown for 70.1.1.1:60101
netstat on the client shows no connection to 209.1.1.1:23

Since most of the users are behind NAT I think this may be what is causing the dropped connections.
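The sequence above can be reproduced with a toy model. This is a minimal Python sketch, not a description of how any particular router works; the class names and the 600-second idle timeout are assumptions (the timeout is picked to match the roughly 10 - 12 minutes observed), while the addresses and ports are the ones from the logs.

```python
class PatRouter:
    """Toy port-address translator with an idle timeout (hypothetical model)."""

    def __init__(self, public_ip, idle_timeout_s, first_port=60100):
        self.public_ip = public_ip
        self.idle_timeout_s = idle_timeout_s
        self.next_port = first_port
        self.table = {}  # (client_ip, client_port) -> [public_port, last_seen]

    def translate(self, client, now):
        entry = self.table.get(client)
        if entry is None or now - entry[1] > self.idle_timeout_s:
            # Mapping missing or expired: allocate a fresh public port.
            entry = [self.next_port, now]
            self.next_port += 1
            self.table[client] = entry
        entry[1] = now
        return (self.public_ip, entry[0])


class Server:
    """Tracks established connections by the source address it sees."""

    def __init__(self):
        self.established = set()

    def syn(self, src):
        self.established.add(src)

    def data(self, src):
        # A segment from an unknown source address/port is answered with RST.
        return "ACK" if src in self.established else "RST"


router = PatRouter("70.1.1.1", idle_timeout_s=600)
server = Server()
client = ("192.168.1.1", 1025)

server.syn(router.translate(client, now=0))              # opens via port 60100
busy = server.data(router.translate(client, now=300))    # still inside the timeout
idle = server.data(router.translate(client, now=1000))   # 700 s idle: port 60101
stale = sorted(server.established)
```

In the model, `busy` is "ACK", `idle` is "RST", and the server is left holding the stale 70.1.1.1:60100 entry, which matches the netstat output on both sides.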

note: the logs I posted earlier on 11/8 do not show this behavior,  I will have to go back and re-check the logs and maybe re-do the test, the earlier tests were using client logs only, I only recently got a couple of servers setup with logging. The server logging is the only place the problem shows up (the change of the client's NAT port ).

George

 

by: bmedward Posted on 2006-11-12 at 19:43:18 ID: 17927782

The client port number stepping up by one is an indication that the client (or NAT/PAT) has reset one session and is trying to establish a new session - unless the telnet client is terribly messed up, it would not try to change ports mid-session.  

It looks like this is going through port translation, not just NAT.  When remote clients connect, are they always connecting through VPN tunnel to corporate network, and issued an internal IP address (if so, are they being NAT/PAT -ed at this point)?  Or, can public PC's talk to the telnet server directly?  

 

by: George46227 Posted on 2006-11-13 at 10:34:49 ID: 17932207

11/13/06
1:35pm

Yes I understand. Maybe I didn't explain it well.

It is not the telnet client that is changing the port - client port on the pc stays the same. It is the telnet client's NAT router that is re-setting the source port and changing the source port to a different number. Yes I think the NAT router is basically starting a "new" session ( a new tcp session not a new telnet session).

I am using the term NAT in the generic sense as everyone seems to do, although technically you are correct - it is really PAT, there is only 1 public IP, the NAT/PAT DSL router maps ports for each connection.

This is my test environment, I do not have direct access to the production environments. One of the main telnet servers having this idle/disconnection problem is inside a pix vpn tunnel, the clients at the remote branches also inside pix vpn, all connected by some type of DSL internet. Each pix device at the different locations provide internet access for web, email, etc. using NAT/PAT and also provides the vpn tunnel.

It is a pix router-to-router vpn over the internet, the clients do not get issued an internal ip. Each location has its own ip subnet like 192.168.100, 192.168.101, etc. It's all handled by the pix vpn setup - the telnet server points to the inside address of the local pix as the default gateway, the branch clients point to the inside address of their local pix as the default gateway.

We don't use any pc's with public ip's.

Thanks
George

 

by: prashsax Posted on 2006-11-13 at 14:00:41 ID: 17933988

Yes, you are correct, DSL routers generally do PAT.

Now, can you find a connection timeout setting in the DSL router configuration?

What DSL router do you have right now. What make, model??

 

by: George46227 Posted on 2006-11-13 at 17:13:10 ID: 17935104

11/13/06
7:30pm

No - I don't see anything in the router's setup about a time-out, but I will check again. I am familiar with several makes and models and have set them up myself, and I have never seen any type of relevant time-out config. I fear it is hard-coded.

One of the routers is a DI-604 Ethernet Broadband Router (according to the admin page). It does not have any relevant time-out setting, I checked all the admin pages.
Also there is an Airlink101 4-Port Internet Broadband Router (according to the admin page). It also does not have any relevant time-out setting on the admin pages.

Both above routers exhibit the time-out client source-port change behavior. It does not seem to occur on the server side even when the same router is used on the server (reverse the telnet client and server - the problem is when the telnet client is behind a NAT/PAT router; a telnet server behind NAT/PAT does not seem to be a problem).

I do not typically see the behavior when the client has a public IP (no NAT, no Proxy) - but there is one exception which is similar but not exactly the same. My Adtran T1 box will kill idle telnet sessions predictably after 10 minutes. It does not change the NAT/PAT port - it does not do any NAT, the client port stays the same. The logs indicate that the client and the server will always receive a RST from each other after 10 min. idle - but the logs do not show either client or server sending any RST!! And the disconnect is spontaneous - it does not require a keystroke from the client. My guess is the Adtran is sending RST to the client and the server - but impersonating the client and server IP's and ports!!

Tomorrow I am going to do some more tests involving Windows ICS and also pix NAT.

Thanks
George

 

by: bmedward Posted on 2006-11-14 at 07:13:56 ID: 17938640

evil computers

 

by: George46227 Posted on 2006-12-10 at 18:48:01 ID: 18112658

12/10/06
9:50pm

I have not abandoned the question, I would like to post some test results
for the benefit of others to see. I hope to do this in the next few days,
sometime this week.

George

 

by: George46227 Posted on 2006-12-26 at 08:26:05 ID: 18198526

12/26/06
11:25am

I would like to present some testing results which may be useful to others who have a similar problem. I will post more details when I have time to organize the data:

Telnet testing of idle sessions summary:

1. Telnet internal LAN connections (client and server on same LAN with no NAT no Proxy between the client and server) are very stable
2. Telnet external Internet connections (client and server on the public internet with no NAT no Proxy between the client and server) are usually very stable with at least one exception
-telnet client with a public ip T1 line Adtran Total Access 912 is consistently dis-connected after 10 minutes idle; the client and server both receive a RST but neither sends a RST, presumption is the Adtran is sending the RST to both ends
3. Telnet behind a Proxy Server (telnet client local ip behind Proxy Server public ip, telnet server with public ip) is very stable
4. Telnet behind NAT (telnet client local ip behind NAT router public ip, telnet server with public ip) is often un-stable
-after some period of idle time the NAT router causes dis-connection; the NAT router connects to the server on some source port to establish the session (ex. 1024); after some idle time the port (ex. 1024) is deleted by the NAT router; when the client tries to resume the session the NAT router uses a new source port (ex. 1025); the server does not recognize port 1025 - no session has been established to port 1025, it has a session with port 1024 not 1025, the server sends RST to the client which dis-connects the session
-this seems to only be a problem when the client is behind NAT router; server behind NAT router does not seem to be a problem
-the DSL modem does not seem to be the problem, telnet clients with a public ip going thru the same DSL modem do not have the problem, only clients using a NAT router have the problem (the NAT router is going thru the same DSL modem)

George

 

by: bmedward Posted on 2006-12-26 at 08:39:01 ID: 18198565

These are some good observations - sounds like you've been keeping busy!  I always find it frustrating when I have to engineer around the undocumented features of segments that I have no control over.  Evil computers.

Have you settled on a solution that fits your technical and financial needs?

 

by: bimmerman Posted on 2007-03-01 at 10:56:08 ID: 18634449

I am currently testing a new setting with Client Access Express to AS400. Telnet connections over a firewall and a router were dropped without any message on client side - just a black screen.
If you use Client Access Express as telnet client, here is what I am trying as of now on some of our workstations: in the .ws file (which in fact is an .ini file for the telnet session) there is a section called [Telnet5250] under which I added the following line:

KeepAlive=Y
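If the same change has to go onto many workstations, the edit could be scripted. This is a minimal Python sketch, assuming the .ws file really does parse as a plain .ini file as described above; the function name, the HostName key and its value are made-up sample data, not taken from a real session file. Python's configparser drops comments when it rewrites a file, so it would be safer to try this on a copy first.

```python
import configparser
import io

def add_keepalive(ws_text):
    """Return ws_text with KeepAlive=Y set in the [Telnet5250] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written (default lowercases)
    parser.read_string(ws_text)
    if not parser.has_section("Telnet5250"):
        parser.add_section("Telnet5250")
    parser.set("Telnet5250", "KeepAlive", "Y")
    out = io.StringIO()
    # Write "key=value" without spaces, matching the line shown above.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()

# Made-up sample contents; a real .ws file will have other keys.
sample = "[Telnet5250]\nHostName=as400.example.com\n"
updated = add_keepalive(sample)
```

Existing keys in the section are left alone and the new line is appended after them.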

 

by: George46227 Posted on 2007-03-05 at 11:40:16 ID: 18656730

3/5/07
2:30pm

A few last comments:

see the above post from me for previous detail (12/26/06 11:26am EST)

the problem seems to be:
NAT/PAT which re-sets the client source port after a period of inactivity, this causes the server to fail to recognize the session as an established session, server sends RST which dis-connects the client

in general order of stability from high (best) to low (worst):
the proxy server seems to be the most stable - proxy tested was Fortech Proxy Plus on NT4 and Win98 SE, both machines had a public ip with DSL, server usually did not cause disconnects after long periods of inactivity, did see one case of a FIN apparently sent by the proxy after a long period of inactivity (this was on a Win98 also running Internet Connection Sharing-NAT, so hard to say for sure whether the ICS-NAT may have caused it instead of the proxy), proxy was especially stable on NT4
PIX NAT-PAT - stability good, RST seen after long inactivity time (60 to 90 min. range), I don't have access to the configuration so maybe this is configurable?
Win98 ICS-NAT - fairly stable, RST seen after a long inactivity time
LinkSys NAT-PAT DSL router - no RST seen but the connection just fails/hangs after a period of inactivity
Airlink NAT-PAT DSL router - RST seen after period of inactivity
Dlink NAT-PAT DSL router - RST seen after inactivity, sometimes only 15 minutes
Adtran T1 box - worst stability, RST sent to both the client and the server ALWAYS every 10 minutes on schedule!
LAN connections last forever with no problem
WAN connections using DSL without NAT-PAT last forever more or less
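One practical use of these numbers is to size a keep-alive interval so that it fires before the least patient device in the path gives up. A minimal Python sketch of that arithmetic follows; the function name and the 0.5 safety factor are assumptions, and only the devices with a stated time in the list above are included.

```python
# Shortest idle time before a drop, in minutes, as reported in the list above.
# Devices described only as "long" or without a figure are left out.
observed_idle_min = {
    "PIX NAT-PAT": 60,
    "Dlink NAT-PAT DSL router": 15,
    "Adtran T1 box": 10,
}

def keepalive_interval_s(timeouts_min, safety_factor=0.5):
    """Pick an interval well under the shortest observed idle timeout."""
    shortest_s = min(timeouts_min.values()) * 60
    return int(shortest_s * safety_factor)

interval = keepalive_interval_s(observed_idle_min)  # 300 seconds
```

With the Adtran's 10 minutes as the floor, a keep-alive every 5 minutes or so would stay inside every timeout listed here.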

George

 

by: George46227 Posted on 2007-03-05 at 11:43:17 ID: 18656759

3/5/07
2:40pm

bimmerman

how did your test come out? I am also using IBM Client Access to 400. The "KeepAlive=Y"??

George

 

by: George46227 Posted on 2007-03-05 at 12:03:58 ID: 18656919

3/5/07
3:00pm

I am going to close
Although no specific solution was presented - I am awarding points based on response and effort

Thanks to everyone
George

 

by: bimmerman Posted on 2007-03-05 at 12:29:28 ID: 18657103

Three days as of now and still works fine. I am talking about wkstations using the connection for a work day long 8am to 4pm.  

 

by: George46227 Posted on 2007-03-05 at 13:45:53 ID: 18657664

3/5/07
4:50pm

You are using IBM Client Access to connect to IBM AS400? What ver - I am using mostly V4R5, I think the 400 is V5R1 or 2

What type of problem were you having, what symptom or error?

Has this config solved your problem? I tried to search the IBM help files and web site docs but couldn't find anything useful. There was a tool called Comm Power Tool or something but it required V5R1, even using V5R1 for testing it didn't solve the problem (OS was w98 se, maybe it needed w2k or xp?)

let me know how it is working, I have a better understanding of the cause of my problem but still no solution (the guys in charge of the network equipment - DSL routers, PIX's, etc. insist the problem is not with any of the network LAN/internet/vpn equipment)

George

 

by: bimmerman Posted on 2007-03-06 at 08:58:44 ID: 18663283

I am using Client Access Express to connect to V5R1.

My problem was that connections were dropped without any message (Windows or CA) on the wkstations and no traces in the logs on the AS400. An important thing that led me to think it was a timed-out connection was that the wkstations were disconnected at random and not all at the same time.

And yes, the setting works for me.

Here's the document I have found:

http://207.181.121.77:999/George46227/as400.pdf

 

by: George46227 Posted on 2007-03-06 at 10:22:58 ID: 18663991

3/6/07
1:15pm

bimmerman:

Please keep in touch, post updates if things are working or not, if this appears to be a long-term permanent solution.

I have been working on this problem almost 6 months with no real effective solution, just crummy work-arounds that haven't really done much good. I have searched everywhere, Google, IBM web site, etc. with no answer.

I will try to implement this config tomorrow, I will let you know how it works out.

George


 

by: bimmerman Posted on 2007-03-06 at 10:46:21 ID: 18664200

Save the pdf I posted above cause I will remove it after you confirm you have it saved.

 

by: George46227 Posted on 2007-03-07 at 09:11:12 ID: 18671793

3/7/07
12:10pm

bimmerman:

I have saved the posted document from your link

thanks
George

 

by: George46227 Posted on 2007-03-08 at 07:59:06 ID: 18679739

3/8/07
10:45am
bimmerman

do you have any other info on the KeepAlive? is it configurable? How often is it sent? is it just an ACK? Did you have to re-config your MS TCP - registry settings for KeepAlive, etc.?

George

 

by: bimmerman Posted on 2007-03-08 at 08:11:24 ID: 18679867

All I did was to add KeepAlive=Y in [Telnet5250] section of the .ws config file for Client Access Express.

A note of importance maybe: In my case, when pinging the AS400, the replies are in the low 60s for TTL as compared to other OS servers which are around 125.
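A TTL difference like that usually reflects different starting values on the two operating systems rather than a different path. The following is a minimal Python sketch of the common rule of thumb; the list of starting values is a widely used heuristic, not something taken from IBM or Microsoft documentation, and 61 is an assumed stand-in for "low 60s".

```python
# Commonly used initial TTL values (a rule of thumb, not a specification).
COMMON_INITIAL_TTLS = (32, 64, 128, 255)

def estimate_hops(observed_ttl):
    """Guess (initial_ttl, hops) from the TTL seen in a ping reply."""
    initial = next(t for t in COMMON_INITIAL_TTLS if t >= observed_ttl)
    return initial, initial - observed_ttl

as400 = estimate_hops(61)    # assumed value for a reply in the "low 60s"
other = estimate_hops(125)
```

Under those assumptions both hosts come out the same number of hops away, one starting from 64 and the other from 128, so the TTL figures by themselves would not explain dropped sessions.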
