Melange

Network timeouts over WAN

We have a dedicated 56K line to an ISP using a Cisco router and a Motorola CSU/DSU. We seem to be having a lot of timeout problems with traffic crossing the WAN. It seems to occur only with large blocks of data (files, mail messages, etc.) of at least 100K. It may be that any size is susceptible, but that it shows up more easily with larger transfers.

For example, I set up an FTP server on our site and telnetted into a machine at the ISP site. I tried uploading a file to our server (about a 1MB file). There is a 100% failure rate with this test. The connection is always established with no trouble (as far as I know anyway), then proceeds along happily until the transmission suddenly stops for good and eventually hits the TCP timeout.

Note, this is NOT isolated to FTP. Our mail server has trouble receiving large messages as well. The server is running NT Server 4.0.

I thought originally that there may be a problem with that particular machine, so I set up a slightly faster machine with a better NIC and got the same results. Other machines running Internet Explorer "seem" to have no trouble with web pages. However, though I am able to download files, there does seem to be an unusually high timeout rate here as well.

Traffic on the LAN seems to have no problem whatsoever.

What do you think could be the problem?
wayneb

The link I have included lists the registry entries for TCP on NT; the only catch is that you will have to add some registry parameters yourself. You will have to set the MTU (maximum transmission unit) of Windows NT and maybe the TCP receive window. If you follow the FAQ below and set the two parameters to something less than the default, it will work. Or at least it should.

http://support.microsoft.com/support/kb/articles/q140/5/52.asp
Start out with an MTU of hex 3e9 (1001 decimal) and see what happens from there.
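
As a quick check of the arithmetic behind that suggestion, the sketch below converts the hex value and works out how much TCP payload fits in one packet at that MTU. It assumes plain 20-byte IPv4 and 20-byte TCP headers with no options; the constant and function names are illustrative only and are not taken from the linked article.

```python
IP_HEADER_BYTES = 20   # assumption: IPv4 header without options
TCP_HEADER_BYTES = 20  # assumption: TCP header without options

def mss_for_mtu(mtu: int) -> int:
    """Return the largest TCP payload that fits in one packet of size `mtu`."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES

suggested_mtu = int("3e9", 16)      # the hex value suggested above
print(suggested_mtu)                # 1001
print(mss_for_mtu(suggested_mtu))   # 961
print(mss_for_mtu(1500))            # 1460, for the usual Ethernet MTU
```

A smaller MTU means more, smaller packets for the same file, which is why it is worth trying on a slow or lossy link before changing anything else.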
agolan

I think that correctly reconfiguring the router would help much more than changing NT parameters, which will affect the LAN users at the same time, but wayneb's answer should definitely help as well; maybe you need a combination of the two. Let me know if you need help with this.
Melange (ASKER)

OK, sorry it's taken so long to reply. I tried setting the TcpWindowSize value, but there was no effect at all. What is the registry setting for the MTU? It wasn't listed in the doc you mentioned.

Anyway, I don't think it'll make any difference either. It does seem to be outside the NT machine.

Agolan, you have a good point here. Can you elaborate?

Melange (ASKER)

For reference, we're using a Cisco 2501 router with a Motorola 56k CSU/DSU.

agolan

Cool,
The only elaboration I can offer currently is to tell you that fine-tuning a router depends very much on your connectivity.
We can proceed in two ways:
1) Email the result of "show tech-support" on your Cisco to noc@golan.net. (You should be in enabled mode on the Cisco, i.e. use the "enable" command to get there.) The output is VERY long, so I guess it doesn't make sense to post it here. (And double check that the passwords were <removed> by the Cisco before emailing it.)
2) Step by step: I'll ask questions and you'll answer them, then I'll ask again, etc.
The first set of questions is:
1) Who is your ISP?
2) Is the 56K link a frame-relay link?
3) What is the software version of your router? (Use "show version" and give me the output.)

I really do prefer the first method: it's faster and more secure, and it will answer most of the questions. If you go for the first, don't bother with the questions in the second.

In any case, we will continue the discussion and suggested solutions here.

Note: You are right, it's an HDLC 56K clear line.

1) Are you aware that your router was rebooted by power-on (that is, either a manual shutdown or a bad electrical connection) one hour before the log you sent me?

2) Your router is running a software version from flash that is older than the version in ROM?!? That may have been because the version in flash (10.3(16)) was more stable than the ROM's 11.0(10c), but a lot of water has gone under the bridge since then... and a lot of bug fixes ;-)

3) Something is a little weird with your serial interface setup: the network used has a netmask of 255.255.255.224, on a network that is supposed to be point-to-point. I would have expected a netmask of 255.255.255.252, especially since real address space is used for this net. (Can you please double check with your ISP?)

4) Along the same lines, you are sending default traffic to Serial0
- ip route 0.0.0.0 0.0.0.0 Serial0 -
This makes sense with a point-to-point network, and that doesn't seem to be the case here (at least with the current network mask settings). (Please check with your ISP which IP address you should send your default traffic to.)

5) By default your Cisco sends keepalive packets on the serial link every 10 seconds... I wonder who it is sending them to.
I wouldn't care too much about it otherwise, but in our case we see that there was 1 interface reset in 1 hour. That's way too much; usually once a day is acceptable, though it might just be the reset at boot time. However, since you have a small link, the keepalive or the keepalive response might be dropped when the line is heavily loaded, so setting "no keepalive" on serial0 might make sense: the interface will still be overloaded on heavy transfers but won't turn itself off.
Keepalive is VERY important when you have multiple paths out, which is not the case here.

6) One thing even more amazing and ALARMING is the:
"156 carrier transitions" on serial0
You definitely have a problem with either your CSU/DSU, the cable between the CSU/DSU and the Cisco, or the power outlet, or there is something wrong with the link.
I suggest that you:
a) Open a case with your local Telco to check the link. (It's usually a slow process, so let's get it started.)
b) Ask your ISP if they can provide you with another CSU/DSU + Cisco cable for testing.
c) It seems to me that something might be really wrong with your power. Unless you know for sure why the Cisco was switched off, and you are sure that the CSU/DSU is not affected by the mains power, I would suggest that you connect one of those old digital clocks that reset themselves after a power failure to the same power outlet, and see if it keeps the time. It is not a 100% reliable test, because some of those clocks can tolerate more voltage variation than the Cisco or CSU/DSU, but it has already pointed out problems several times.
If the system is on a UPS... DOUBLE CHECK the UPS!!!! A malfunctioning UPS could cause such troubles.
If the 56k line termination near the CSU/DSU is not the Telco termination but an extension, tell me about it, and I'll try to guide you on what to check.

It was wise to send me the conf; we have advanced really fast.
I suggest that you ask your ISP if you are eligible for a software version upgrade for your router, to at least some 11.2 version. (If they ask, you have 8 MB of flash.)

I also have another idea that we could test. Do you feel comfortable about changing the router configuration?
If so, we could move the ISP connection to serial 1, just in case something is wrong with serial0.
(It's not likely, from my 25xx experience, but if everything else fails we should test this as well.)
Let me know how you feel about it.


Well, that's it for now.

Keep me tuned.

P.S. I recorded the "156 carrier transitions" in "1 hour, 8 minutes" in my book of records ;-) It's over two transitions per minute!!
Melange (ASKER)

1. I MANUALLY reset the router. So that's not a problem. I'm sorry I forgot to mention this. I was talking to my ISP about this (they're still a little baffled, by the way), and they had recommended shutting down the router and CSU/DSU. I didn't think it would solve anything. It didn't.

2. I suspected that the software might be old. The router is ours and not from the ISP, so what would be the best way to upgrade this? Talk to Cisco? Someone else?

3. I configured everything on the ethernet side, but the ISP set up the serial side. So I don't know why on this. But I will ask.

4. Yeah, I'll check this too. I've looked at the configuration in the past and couldn't tell how it knew to send the packets across, but it does (otherwise you wouldn't be seeing this).

5. The interface reset? Do you mean the whole system shutdown/restart in this case? Or something different? If something different can you elaborate?

6. OK. What do the carrier transitions mean? Is this like a quick line reset? Now what I have noticed is that the serial line does go down for 1 second at a time every so often - approx. 2-5 times every day. I don't have any other experience in this; so do you know if this is normal or not? Perhaps these carrier transitions are the crux of the whole problem. Who knows? Anyway for further reference after being up for 22 hours, 6 minutes the router shows 2,258 carrier transitions.
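
To put those counters in perspective, the two figures quoted in this thread can be turned into a rate. This is only a rough sketch: it assumes each counter started from zero at the uptime quoted alongside it, and the function name is made up for illustration.

```python
def transitions_per_minute(transitions: int, hours: int, minutes: int) -> float:
    """Average carrier transitions per minute over the given uptime."""
    total_minutes = hours * 60 + minutes
    return transitions / total_minutes

# Figures quoted in the thread: 156 in 1h08m, then 2,258 in 22h06m.
first = transitions_per_minute(156, 1, 8)
later = transitions_per_minute(2258, 22, 6)

print(f"{first:.2f} per minute")   # 2.29 per minute
print(f"{later:.2f} per minute")   # 1.70 per minute
```

Both samples come out well above one transition per minute, which suggests the count keeps climbing steadily rather than coming from a single burst at startup.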

Also for reference, these are the full serial line drops/resets that have occurred since I restarted the router:
    Jun 25 18:43:46 (1 second)
    Jun 25 19:33:38 (1 second)
    Jun 26 00:25:56 (2 seconds)

I believe that the 56k line is terminated at a building phone closet and then extended to here. Although I'm not 100% sure about this. I'll try to find out. What can I check about this line?

I'm fully comfortable with configuring the router.

agolan (ASKER CERTIFIED SOLUTION)

Melange (ASKER)

3) Yep, it is supposed to be 255.255.255.252.

4) 0.0.0.0 is correct. I think it is treated as a point to point. Probably the difference is that we're not getting a dynamic IP address for the serial interface, which I guess makes sense with a dedicated connection.

6) Still looking into the carrier transitions

agolan


3) So go ahead and change it if you haven't already ;-)
4) 0.0.0.0 0.0.0.0 IS correct, and the Serial0 after it is correct as well, but I would prefer to see:
IP ROUTE 0.0.0.0 0.0.0.0 s.s.s.201 4
instead, where the s's before 201 are the same as in your serial interface. The "4" gives a weight to this route (its administrative distance), that is, something that tells how valuable it is; 4 is a good value.
6) Have you opened a case with your local Telco already?
If they view the demarcation as the end inside your place and not at the building phone closet, they'll check it all the way; it depends a lot on the Telco and on the goodwill of the technicians.
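
For anyone following along, the difference between the two netmasks discussed in point 3 can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch: 192.0.2.x is a documentation prefix standing in for the real "s.s.s" octets, which are not shown in this thread, and `usable_hosts` is a made-up helper name.

```python
import ipaddress

# Assumption: 192.0.2 stands in for the real "s.s.s" octets of the serial link.
wide = ipaddress.ip_interface("192.0.2.201/255.255.255.224").network
narrow = ipaddress.ip_interface("192.0.2.201/255.255.255.252").network

def usable_hosts(net: ipaddress.IPv4Network) -> list[str]:
    """List the usable host addresses (network and broadcast excluded)."""
    return [str(host) for host in net.hosts()]

print(wide, len(usable_hosts(wide)))   # 192.0.2.192/27 30
print(narrow, usable_hosts(narrow))    # 192.0.2.200/30 ['192.0.2.201', '192.0.2.202']
```

With the .252 mask the subnet holds exactly two usable addresses, which is what one would expect on a point-to-point serial link: if .201 is the next hop as in the suggested route, the only other host address left is .202.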