Solved

Persistent Packet Loss on T1 Connection

Posted on 2010-09-20
Last Modified: 2013-11-29
Experts,

We have a bonded-T1 connection at our office that we use for our internet access and phones.  It has been in use just over a year.  For the first 11 to 12 months it served us fine.  However, over the past couple of months we have been seeing a persistent 2%-4% packet loss.  We see this by running constant pings both to our T1 gateway and to sites like MSN, Yahoo, Google, etc.

Our T1 provider insists the lines are clean and error free plus they've replaced all our on-site equipment.  The phone company providing the local loop insists the lines are clean.  They've pointed the finger at our firewall.

Our network config is not complicated.  The T1 connection runs straight into our Firewall which is an Astaro Security Gateway running version 7.507.  From there we have a NetGear gigabit switch serving around 20 Windows (2000 through 7) and Mac machines plus a few iPhones, Droids, BlackBerries that connect to a WiFi (airport) we have set up.  We use a Windows 2003 SBS as a domain server.

We disconnected the T1 from our network and plugged it directly into a laptop and pinged our gateway same as we had been doing before.  Much better.  There were only 2 timeouts after running for 20 minutes.  So, we thought, maybe it WAS our firewall.  To test, we dusted off an old Astaro V6 firewall that was no longer in use and hooked it up.  Same result!  Right about 4% loss!  This leads me to believe it could be something on our internal network causing the trouble?

Any ideas what could be causing this or suggestions of tools for tracking this down?  I've downloaded pingplotter and wireshark but I'm a software developer by trade so I'm not sure how to interpret much of their networking mumbo-jumbo.

Our internet access is still usable and on a good day our throughput measures around 2600 kbps up and down.  But the frequent packet loss causes delays or timeouts in page loads, disconnections of FTP and webinars.  The majority of use is simple web browsing.  The timeouts and disconnections continue over night even with little activity.

Our phone lines are not used heavily.  Rarely is more than one person on the phone at a time so that should not be a problem as far as our bandwidth over the T1s is concerned.

Thanks!
Question by:pmascari
12 Comments
 
LVL 8

Expert Comment

by:moonie42
ID: 33719780
Have you tried the most simple fix yet?  Swap out the patch cable between the SmartJack and the router?  If not, that's the first thing I'd try.
 
LVL 8

Expert Comment

by:moonie42
ID: 33719882
Note...when swapping out the cable, make sure you replace in kind (pass through for pass through, or crossover for crossover).
 
LVL 8

Author Comment

by:pmascari
ID: 33719952
Yes.  Tried replacing cables.

Also, tried running from the T1 through a small switch to the FW, rather than directly into it.

Also tried changing our FW interface card from Auto-duplex to 10BaseFull at the suggestion of our T1 provider.

No change.
 
LVL 8

Author Comment

by:pmascari
ID: 33719960
And yes, we made sure we were using the correct type of cable (passthrough vs. crossover) at each configuration.
 
LVL 24

Expert Comment

by:rfc1180
ID: 33720286
If this is a Cisco router, can you please post the output of show int serialx/x?
Also, show controllers serialx/x

this will be for both T1s

Billy
 
LVL 3

Expert Comment

by:JDavis1
ID: 33720560
First of all, I have to say that it is probably not a very good practice to try and judge local network performance by pinging internet sites.  You introduce unknown paths and unknown variables into the mix when you do that, so keep your testing limited to the devices and paths you want to test.  When you refer to your "T1 gateway", what do you mean exactly?  Is that an on-site router provided by the ISP?  If so then this may be as far as you need to go to test your firewall.

Another good tool to use to test for packet loss is "pathping" which is included with XP.  This tool sends multiple pings on a per-hop basis and reports on the packet loss between the workstation running pathping and the hop that it is pinging.  Try running pathping from a workstation behind the firewall to the address of a device that is local but on the other side of the firewall.
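
As a rough illustration of the segment-by-segment comparison described above, here is a minimal Python sketch. It is not from the original thread; the function names (`ping_once`, `loss_percent`) and the address in the usage note are hypothetical. It shells out to the system `ping` and treats a zero exit code as a reply, which is only an approximation (Windows `ping`, for example, can exit with 0 on a "Destination host unreachable" response).

```python
import platform
import subprocess


def ping_once(host: str, timeout_s: float = 2.0) -> bool:
    """Send one echo request with the system ping; True if it exited with 0."""
    # Windows uses -n for the request count, Unix-like systems use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # give up on this probe after timeout_s seconds
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def loss_percent(replies: list[bool]) -> float:
    """Percentage of probes that got no reply."""
    if not replies:
        raise ValueError("no probes recorded")
    return 100.0 * replies.count(False) / len(replies)
```

Running `loss_percent([ping_once("192.0.2.1") for _ in range(200)])` once against the on-site ISP router and once against a host beyond it gives two numbers to compare (`192.0.2.1` is a placeholder address). If the loss already shows up on the near target, the far side of that device is unlikely to be the cause.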
 
LVL 1

Assisted Solution

by:FrankWalters
FrankWalters earned 166 total points
ID: 33724950
First, if you are using multiple T1s bonded, I would try a single T1 and see if you can replicate the problem. When replicating the problem, be sure to try to replicate it with similar load demands...

That brings me to my second suggestion/note.  A laptop pinging creates far less load.  When testing a circuit that fails only under load, a laptop with pings may not trigger the problem.  We've run into a lot of strange issues with circuits, especially bonding.

Is there a way to test with 1 T1, not bonded?  (Sometimes simply disconnecting the T1 cable can do the trick, although I'd check with your provider.)  If so, test with each T1 separately, in case it's just one.

Is there a way you can load up the line when testing outside the firewall?  This would provide more data and help isolate the problem further.

As a note regarding software for troubleshooting, a Wireshark capture taken both outside and inside the firewall might provide additional info; you're looking for something that differs between the two.

Additionally, tools that can load the circuit would be helpful, something more like hping or iperf.
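
To make the "test under load" idea concrete, here is a small Python sketch of a paced UDP sender with sequence numbers. It is an illustration only, not a replacement for hping or iperf. All names (`send_interval`, `send_load`, `lost_sequences`) and the 1000-byte default payload are assumptions, and a listener on the far side that records the sequence numbers it receives is assumed but not shown.

```python
import socket
import struct
import time


def send_interval(rate_bps: float, payload_bytes: int) -> float:
    """Seconds between datagrams to offer roughly rate_bps of payload."""
    return payload_bytes * 8 / rate_bps


def send_load(dest: tuple[str, int], rate_bps: float, seconds: float,
              payload_bytes: int = 1000) -> int:
    """Send sequence-numbered UDP datagrams at a paced rate; return the count sent."""
    if payload_bytes < 4:
        raise ValueError("payload must hold a 4-byte sequence number")
    interval = send_interval(rate_bps, payload_bytes)
    padding = bytes(payload_bytes - 4)  # zero bytes after the sequence number
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            # First 4 bytes: sequence number in network byte order.
            sock.sendto(struct.pack("!I", sent) + padding, dest)
            sent += 1
            time.sleep(interval)  # simple pacing; the real rate ends up slightly lower
    return sent


def lost_sequences(received: set[int], sent: int) -> list[int]:
    """Sequence numbers in 0..sent-1 that the receiver never saw."""
    return [seq for seq in range(sent) if seq not in received]
```

Running the sender at a rate close to the circuit's measured throughput while the constant pings continue shows whether the loss rate changes with load, first through the firewall and then with the test machine plugged in outside it.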
 
LVL 8

Author Comment

by:pmascari
ID: 33726350
Thanks for the replies.  

I'm not really judging my local network performance here.  I'm judging my connectivity to the internet.  We don't have any internal performance issues.  We've tried pinging not only internet sites but our T1 gateway...the gateway being the network of our provider outside the firewall.

We have remote machines that we're able to use to ping our FW from the outside.  Same issue: timeouts.

I'm going to call our provider (again), then try pathping and test an unbonded T1, and will keep you updated.
 
LVL 3

Assisted Solution

by:JDavis1
JDavis1 earned 167 total points
ID: 33728880
OK, well, you did state that you believed the packet loss might be caused by your firewall.  My point was that if that is what you are focusing on as the culprit, you should eliminate as many other variables as you can when testing it.  Thus, if you can recreate the issue when pinging from an internal host to the on-site ISP router, then you can rule out anything on the other side of the ISP router, i.e. your T1 circuits and the internet.
 
LVL 8

Accepted Solution

by:russell124
russell124 earned 167 total points
ID: 33738849
A couple of things to check on the Astaro: review the IPS logs.  They really shouldn't filter ICMP traffic, but you never know.  Also make sure you don't have any sort of ICMP flood control enabled that might be causing issues.

Another thing: verify the MTU settings on your WAN interface with your ISP.  1500 is the default.  I had one firewall that had the default MTU changed, and this caused all sorts of random traffic issues, like certain workstations being able to browse certain websites while others couldn't.
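
One common way to check the usable MTU on a path is to ping with the don't-fragment bit set and vary the payload size. The largest ICMP payload that fits in a given MTU is the MTU minus 28 bytes (20 for an IPv4 header without options, 8 for the ICMP echo header). The Python sketch below shows that arithmetic and a binary search over candidate MTUs. The names and the 576 lower bound are assumptions, and the `passes` probe is supplied by the caller (for example, a wrapper around `ping -f -l <size>` on Windows or `ping -M do -s <size>` on Linux).

```python
from typing import Callable

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def largest_unfragmented_mtu(passes: Callable[[int], bool],
                             low: int = 576, high: int = 1500) -> int:
    """Binary-search the largest MTU in [low, high] whose payload gets through.

    passes(payload) should return True when a don't-fragment ping with that
    payload size is answered. Assumes the payload for `low` gets through.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the search always advances
        if passes(payload_for_mtu(mid)):
            low = mid
        else:
            high = mid - 1
    return low
```

If the result is lower than the MTU configured on the firewall's WAN interface, that mismatch is worth raising with the ISP.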
 
LVL 8

Author Comment

by:pmascari
ID: 33853650
Sorry for the long delay.  This issue has been infuriating.  Turns out to be a communication problem between our T1 provider and our firewall.  Something must have changed on their end because we ran fine up until this past July.  We tried two different Astaro boxes (running different versions) and also tried an off-the-shelf router.  Each exhibited the same timeout issues.  We've been on the phone for hours with our provider changing duplex settings, they've been out to replace our equipment twice, and even had AT&T out here testing the local loop.  All to no avail.

Finally, grasping at straws, we hooked up an old WinXP machine and enabled Internet Connection Sharing.  For whatever reason, this seems to be holding...no more timeouts.  WTF?  Obviously this is not the ideal setup so we're starting to build a new Linux box we hope to use as our FW.  Hopefully it will be able to talk to the T1.  Fingers crossed.
 
LVL 8

Author Closing Comment

by:pmascari
ID: 33853659
Thanks for your responses.
