webworkhouse

Packet Loss and changing MTU settings

While trying to find a solution to a packet loss problem on our internet connection, our ISP asked us to change the MTU on our router from the default value of 1500 to 2000.

However, when I try to change the MTU on the Ethernet interface (Cisco 2811 router), I get the following error:

webworkhouse(config-if)#mtu 2000
% Interface FastEthernet0/1 does not support user settable mtu.

So my question is: am I doing something wrong, or is it just not possible to change the MTU on this interface?

Some more details about the connection:
Cisco 2811 router with 2 Ethernet interface ports: one connected directly to our LAN, the other connected to our ISP's device, which takes a standard Ethernet connection.
The actual internet connection is Ethernet over fiber, which is all handled by the ISP's hardware.

As a side note, I'm not sure how increasing the MTU could fix this problem. The suggestion came after I started noticing packet loss when I ping with large datagram sizes (above the 1500-byte size of a single packet).
The ISP suggests that the problem is caused by the reassembly of the fragmented ping packets taking too long, so a larger MTU should help, but it seems to me that this would simply hide the problem rather than fix it.
Any insight into this would be appreciated, as this problem needs to be resolved and I'm not sure the ISP is on the right track.





Reid Palmeira

When you ping with the larger datagram sizes, some devices may not even bother to reassemble them at all. The standard Ethernet MTU size is 1500. DSL connections might require it to be lowered, but I'm surprised that an ISP would tell you to increase the MTU.

Before you go around trying to reconfigure interfaces, run a path MTU discovery to some of the servers you're trying to ping. My guess would be that the larger ICMP packets are just getting dropped because they're too big; ICMP isn't really meant as a data transmission protocol, so they're just getting discarded.

In some circumstances, changing the MTU size helps with packet loss, specifically where packets have the DF flag set. This is most often experienced in VPN environments, where the VPN encryption adds overhead to the packet and the underlying application sets the do-not-fragment flag. There are ways to overcome this, however.

dannlh

Ummm.... Black Hole Router somewhere? If you're Windows based there is a way to find the Black Hole and a couple ways to fix it.

http://support.microsoft.com/kb/314825

dh

ASKER


A quick update on this..

Since I'm only pinging my next-hop router, there isn't much in the way that could be causing trouble with the MTU. My ISP supports an MTU of up to 2000 between my router and theirs, and an MTU of 10000 beyond that, within their own network (which doesn't really make much sense to me: why run a higher MTU between a customer and the internet?).

rpalmeira22: You suggest that the packets are just being dropped? Would I not see an all-or-nothing response in that case? That is, the router at the other end would either drop all large ICMP packets or return them all correctly, depending on how it was configured.

Can you suggest a way to test with more real-world data rather than pings? After all, that's the traffic I want to travel correctly; I am just using pings to highlight some problems. I have tried some VoIP quality testing sites on the net, but the results they give are inconsistent and also not very informative.

naughton: I'll keep that in mind if I need to set up a VPN, but in this case it's just a normal connection with no encryption or tunneling, and the do-not-fragment flag is not set.

dannlh: Unfortunately this solution is also ruled out, because the "don't fragment" flag is not set and I can be 100% sure the router I am sending the ping requests to can handle the MTU setting.


The only other relevant info I've found is that I have seen some improvement in the packet loss while the connection is not in heavy use. This morning I actually got 100% success with 1000 pings of 1900-byte packets (fragmented into two packets per ping). But as soon as there is some load on the line, I begin to see loss.
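
As a rough check on the "two packets per ping" figure, the fragment count can be worked out from the header sizes. This is only a minimal Python sketch and not something from the thread: it assumes a 20-byte IP header with no options and an 8-byte ICMP echo header, and `fragment_count` is a hypothetical helper name. It treats 1900 as the ICMP payload size; if the ping tool counts 1900 as the whole datagram, the result for this case is the same.

```python
import math

IP_HEADER = 20    # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def fragment_count(icmp_payload: int, mtu: int = 1500) -> int:
    """Number of IP fragments needed for one ICMP echo with this payload."""
    ip_payload = icmp_payload + ICMP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)


print(fragment_count(1900))        # with the router's fixed 1500-byte MTU
print(fragment_count(1900, 2000))  # with the 2000-byte MTU the ISP suggested
```

Under these assumptions a 1900-byte ping needs two fragments at an MTU of 1500 and would fit in a single packet at 2000, which is presumably why the ISP expects the larger MTU to change the ping results.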

Thanks for the suggestions so far. Everything I can knock off the list is a tiny step in the right direction.







If it's dropping packets under load, then it could simply be a hardware issue.

Do you have a spare, or are they able to provide a replacement router?

There are applications that you can use for stress testing. You can also play with the ping settings, using larger packets and forcing them not to fragment:
ping -t -f -l 1472 ExternalIP will set a persistent ping with the do-not-fragment flag set. Note that -l sets the ICMP payload size, not the total packet size, so 1472 bytes of payload plus the 20-byte IP header and 8-byte ICMP header gives a 1500-byte packet. You can play with the packet size to see the effective MTU, and also to see whether larger packets are causing the packet loss.
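
When choosing the size for a do-not-fragment ping, it helps to remember that the Windows `-l` option sets the ICMP payload, so the headers have to be subtracted from the MTU being tested. A small Python sketch of that arithmetic follows; the helper names are hypothetical, `ExternalIP` is just the placeholder used in the post, and the header sizes assume plain IPv4 with no options.

```python
IP_HEADER = 20    # bytes, IPv4 without options (assumption)
ICMP_HEADER = 8   # bytes, ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def windows_df_ping(host: str, mtu: int) -> str:
    """Build the Windows ping command line that probes the given MTU."""
    return f"ping -t -f -l {max_ping_payload(mtu)} {host}"


for mtu in (1492, 1500, 2000):
    print(mtu, max_ping_payload(mtu))

print(windows_df_ping("ExternalIP", 1500))
```

If the ping with the computed payload succeeds and the next size up fails, the path MTU is at least the value being probed.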




I have attached a few graphs of pings at 10-second intervals from my server to the outside interface of my router and to the next-hop router, along with the bandwidth being used at the time.

You can see that at 10:35 the bandwidth usage spikes as I started a few downloads to stress the line; however, it doesn't seem to have a huge effect on the ping time.

(One thing to note about the bandwidth graph: the monitor only seems to read and reset the counter on every second attempt, which causes the reading to be double what it should be on the first reading and 0 on the next. The true bandwidth usage seems to be the average over the two readings, so each spike is really only half what it shows on the graph.)
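
The correction described in the note above can be sketched in Python. This is only an illustration: the sample values are made up, `smooth_pairs` is a hypothetical helper, and it assumes the doubled reading always comes first in each pair.

```python
def smooth_pairs(readings):
    """Replace each (doubled, zero) pair of samples with their average."""
    smoothed = []
    for i in range(0, len(readings) - 1, 2):
        mean = (readings[i] + readings[i + 1]) / 2
        smoothed += [mean, mean]
    if len(readings) % 2:
        # An odd trailing sample has no partner yet, so keep it unchanged.
        smoothed.append(readings[-1])
    return smoothed


# Made-up readings in Mbit/s showing the double/zero pattern.
raw = [8.0, 0.0, 6.0, 0.0, 9.0, 0.0]
print(smooth_pairs(raw))
```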

Dannlh: I'm not 100% sure what the transport protocol is. I know it's Ethernet over something else, running on fiber. I'll try to find that out and post the details.

We do not have any bursting ability set up with our provider. It is a 10 Megabit dedicated connection and we should not be allowed to go above that at all. This should mean that all our packets are under the CIR, in the case that it's frame relay?

As for the pings I've been doing: most have been from my router's external interface, and I have been pinging my next-hop router, as the problem seems to be limited to the link between these two devices.

The issue of reassembling the packet stream was really what I was trying to test with the larger packets. I assumed it was not a lot of work for my router (Cisco 2811) to reassemble 2 fragments for each ping.

Increasing the MTU to allow the larger pings to be sent unfragmented might solve the problem of the bad pings, but it will still leave the problem of multi-packet data being discarded because a single packet fails, which is what seems to be happening with the pings once I fragment them.
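
The effect of a single lost fragment on a multi-packet datagram can be put into numbers. The Python sketch below assumes each packet is lost independently with the same probability, which may not hold on a loaded link; the loss rates are illustrative rather than measured, and `datagram_loss` is a hypothetical helper.

```python
def datagram_loss(per_packet_loss: float, fragments: int) -> float:
    """Chance that at least one of the fragments is lost."""
    return 1.0 - (1.0 - per_packet_loss) ** fragments


# A fragmented ping needs 2 fragments out and 2 back, so 4 packets in total.
for p in (0.001, 0.01, 0.05):
    print(f"per-packet {p:.3f}: "
          f"1 packet {datagram_loss(p, 1):.4f}, "
          f"2 fragments {datagram_loss(p, 2):.4f}, "
          f"4 fragments {datagram_loss(p, 4):.4f}")
```

Under that assumption, the same underlying per-packet loss shows up roughly two to four times more often in fragmented pings, which fits large pings revealing the problem more readily without being its cause.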

This brings up the question of how similar a multi-packet ping is to any other multi-packet chunk of data, and how similarly the router treats it.

Naughton: Yes, I have a spare, but the line is in constant use (24/7 business). It looks like I might have to spend a night in here soon to do some testing, but I cannot swap in the spare right now.

Your suggestion with the pings is how I noticed the problem in the first place. Setting a larger packet size shows up the problem more readily, but setting the packet size to anything over 1500 with the 'do not fragment' bit fails, as my router is fixed at an MTU of 1500.


tests.JPG


The transport layer seems to be IP over SDH.

I don't have a lot of knowledge of SDH, so I'm not 100% sure this is correct.

1. Determine what actual MTU you need on the network:
http://www.dslreports.com/faq/695
http://www.dslreports.com/faq/7801
2. Set up all devices on the network with the same determined MTU:
http://www.dslreports.com/faq/8873
There is a lot of other advice out there as well; the main idea is that the MTU needs to be set to the lowest value along the path to prevent packet drops.
And of course, after each setting change it is a good idea to check the speed to a remote PC using
kperf_setup.exe - http://dast.nlanr.net/Projects/Iperf/
to see exactly where the problem is.
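
Step 1 above amounts to searching for the largest packet size that still gets through unfragmented. A minimal Python sketch of that search follows. The `probe` callable is a stand-in (an assumption, not a real API) for sending one do-not-fragment ping of the given size and reporting success, and the search assumes that if a size passes, every smaller size passes too.

```python
from typing import Callable


def find_path_mtu(probe: Callable[[int], bool],
                  low: int = 576, high: int = 1500) -> int:
    """Largest size in [low, high] for which probe(size) succeeds.

    Assumes probe(low) succeeds and that success is monotonic in size.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid        # mid fits, so the answer is mid or larger
        else:
            high = mid - 1   # mid is too big
    return low


def fake_probe(size: int) -> bool:
    """Stand-in for a real DF ping: pretend the path MTU is 1492."""
    return size <= 1492


print(find_path_mtu(fake_probe))
```

In practice the probe would wrap a real do-not-fragment ping, and each size would be tried a few times so that ordinary packet loss is not mistaken for an MTU limit.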

Well, ultimately it is Ethernet, so it could just be dropping packets. The SDH should be transporting everything it sees. Because of the synchronous nature of SDH, everything that gets on it gets transported. The next question is, how are you getting on the SDH network? You said you are Ethernet out of your router? What kind of equipment does the carrier have to convert you to SDH for signaling on the fiber? And now this is going to sound like a stupid question, but you do have a -good- patch cord between your router and the carrier's equipment, right? :-) Have you asked the carrier to check error rates on the Ethernet port that converts from Ethernet to SDH?

dh

Dannlh,

Yeah, the connection is Ethernet out of the router to the carrier's hardware onsite, which does all the SDH work. That equipment is a Tellabs 6310 edge node. Conveniently, we can be pretty sure that isn't the cause of the problem, as our original 6310 had a fan failure about a week ago and was replaced.

:) Not a stupid question at all. It was one of the first things I tried. The patch cable has been swapped twice with cables that are happy to run gigabit speeds without a problem.

Before the 6310 we were having some problems, but between getting the hardware replaced and forcing my router to run in full duplex (we were getting late collisions because of a mismatched duplex setting), the carrier now reports that there is a clean connection from our site to the NOC where we patch to our ISP.



Is it possible the ISP has their router overloaded at the NOC? Have they checked the stats on their router? Are you on a dedicated hookup to the ISP with your own set of IPs, or are you on a shared subnet with other customers? Does this happen only when you load down the link from your site? Day/night differences? Can you steal the connection for a while and hook a PC up to it directly and stress-test the link that way? If you can eliminate your router as the problem, then it's a step forward.

Sorry, I know it's a million questions, but sometimes they get us to the correct answer.

dh
I've scheduled some time Monday when I can take down the interface and hook up a PC, so I'll run some tests and post an update then.

The millions of questions have been great, dannlh. Every little bit helps.

This problem seemed to go away on its own after a while.

Thanks for all the good suggestions. I'm not sure if any of them had any effect, but they helped me at least monitor the issue.

I'll split the points over the main contributors.
Thanks for the help. I hope the points are OK.