Solved

Ping timeouts to one destination from behind the firewall but not at the firewall.

Posted on 2015-01-29
Last Modified: 2015-02-18
I am troubleshooting ping timeouts at a remote location.  A VPN exists between the remote and home locations, terminated on 2 Cisco ASA 5505s.  If I remote into any machine on the remote network (from a tertiary location) and ping either the public IP or the private IP of the home network, I get about 20-30% timeouts when running ping -t.  If I ping any other address, no such timeouts occur.

If I log into the remote ASA and issue a ping command to the home public IP, I consistently get 100% success, with one instance of 80% which I was unable to reproduce.  The ISP has also performed this test on their router with similar results.

When pinging the remote site public IP from home base I have few timeouts (2-3%).  When pinging the remote private IP I get many timeouts (20-30%).

We are getting AD DFSR errors between the sites as well as scan-to-email failures across the VPN connection.  Replication does happen and users can scan, but not reliably.

We have VPN connections between the remote location and several other sites as well, and none of those connections exhibit this behavior.  The same is true of home base's other VPN connections.

Any help isolating the cause of this behavior would be much appreciated.

Tracert Remote>Home  
  1    <1 ms    <1 ms    <1 ms  Remote FW
  2     1 ms     1 ms     1 ms  Remote Router
  3     2 ms     5 ms     2 ms  te-0-2-0-3.atlngamah66cr03.paetec.net [169.130.83.241]
  4     1 ms     1 ms     2 ms  gi-5-0-0-12.atlngamah66cr01.paetec.net [169.130.96.198]
  5     2 ms     2 ms     2 ms  64.80.13.90
  6     2 ms     2 ms     1 ms  h202.58.132.40.static.ip.windstream.net [40.132.58.202]
  7     2 ms     2 ms     2 ms  atl-bb1-link.telia.net [80.239.194.9]
  8    14 ms    14 ms    14 ms  ash-bb3-link.telia.net [80.91.252.213]
  9    14 ms    14 ms    14 ms  ash-b1-link.telia.net [80.91.248.157]
 10    17 ms    16 ms    16 ms  213.200.66.97
 11    34 ms    33 ms    33 ms  xe-4-2-0.dal20.ip4.gtt.net [89.149.128.70]
 12     *        *        *     Request timed out.
 13     *        *        *     Request timed out.
 14     *        *        *     Request timed out.
 15    42 ms    47 ms    46 ms  beyondtekit.utl.dfw.unsi.net 
 16     *       40 ms    47 ms  Home

Trace complete.


Trace Home>Remote
  1     2 ms     3 ms     6 ms  Home
  2    14 ms     *        8 ms  Home Router
  3    16 ms    19 ms     *     199.116.146.146
  4     8 ms     9 ms    17 ms  100.123.213.116
  5     *       13 ms    14 ms  ip4.gtt.net [173.205.59.137]
  6     8 ms     8 ms    13 ms  xe-5-0-2.dal33.ip4.gtt.net [141.136.111.150]
  7    14 ms    18 ms    16 ms  as2828.dal33.ip4.gtt.net [199.168.63.250]
  8    35 ms    40 ms    43 ms  207.88.14.242.ptr.us.xo.net [207.88.14.242]
  9    39 ms    35 ms    30 ms  te-4-0-0.rar3.atlanta-ga.us.xo.net [207.88.12.1]
 10    41 ms    39 ms    39 ms  ae0d0.cir2.atlanta6-ga.us.xo.net [207.88.13.9]
 11    42 ms    47 ms    46 ms  67.106.215.82.ptr.us.xo.net [67.106.215.82]
 12    51 ms    45 ms    49 ms  h203.58.132.40.static.ip.windstream.net [40.132.58.203]
 13    50 ms    51 ms    43 ms  gi-5-0-0-80.atlngamah66cr01.paetec.net [64.80.13.89]
 14    45 ms    52 ms    46 ms  te-0-2-0-0-13.atlngamah66cr03.paetec.net [169.130.96.216]
 15    39 ms    47 ms    46 ms  te10-1.atlngamah66pe02.paetec.net [169.130.83.240]
 16    49 ms    45 ms    41 ms  209.60.167.194
 17    42 ms    44 ms    41 ms  Remote

Trace complete.

Question by:YMartin
17 Comments

LVL 57

Expert Comment

by:giltjr
First, ping and traceroute cannot be used to determine whether a network connection is reliable.

Some devices are configured to ignore ICMP requests and drop them no matter what.
Some devices are configured to respond to only so many ICMP requests from the same source in a specific amount of time.
On most devices, the task that responds to ICMP packets is given a "lower" priority, so if the device is really busy it may end up not responding.

Ping and traceroute are very basic tools for checking whether there is network connectivity between two points, or what the path between two points is, but again, they are not reliable.

Now for your problem.

Have you looked at the link utilization for all links?  Could you be saturating one or more of the links?
You have more than one VPN connection to a central site.  Do they all terminate in the same box at the central site?  If so, are you exceeding that box's capabilities?

LVL 1

Author Comment

by:YMartin
Thank you for the reply.

The other VPN connections are to other remote sites.  Only one VPN connection goes back to home base.  Full mesh topology, if you will.  We are only having problems with the one VPN connection.  The ISP is saying that because pings are fine it must be our equipment.  I am aware of the ICMP limitations, which is why I tested across the VPN as well.  Unfortunately, the performance across the VPN is the one which is poor.  The confusing part is that a ping from the firewall comes out OK whereas a ping from the local network does not.  We have a 20 Mbps up/down connection at both sites, which is by no means saturated.  We typically see sub-10% utilization on the ASA Traffic Usage graph at the remote site.

You bring up a good point with saturation on the other side.  We have the same connection on the other side; saturation there is a bit higher, but not above 20% average with spikes up to 50%.  I have pinged with both ASDM graphs up.  The main problem is that I can try to ping across the VPN at random times and each time it will show the same timeouts, even when almost no traffic is going out the FW.  But when pinging the target ASA's public IP from the source ASA, pings are fine, with no timeouts.

I would suspect the VPN; however, if I ping the public IP of the VPN target from the same machine at the remote site I also get the timeout issue, but not when I do the same ping from the remote site's ASA.  Perhaps it is some sort of QoS at the NOC or something, but I am struggling with what would cause that.  Perhaps I need to put another ticket in with the NOC, but I wanted to check for some suggestions first.

Here is a choice snippet across the VPN:
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=59ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=48ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=46ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=46ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=44ms TTL=128
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=50ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
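
Editor's note: a raw `ping -t` log like the one above is easier to compare between tests once it is reduced to a loss percentage and a longest run of consecutive timeouts. The following is a minimal sketch, assuming Windows-style ping output in the exact wording shown above; the function name `summarize_ping` and the `SAMPLE` list are hypothetical and only for illustration.

```python
import re

# Matches Windows ping reply lines such as:
# "Reply from 192.168.3.3: bytes=32 time=45ms TTL=128"
REPLY = re.compile(r"^Reply from \S+: bytes=\d+ time[=<](\d+)ms TTL=\d+")
TIMEOUT = "Request timed out."


def summarize_ping(lines):
    """Summarize Windows `ping -t` output lines (hypothetical helper)."""
    sent = lost = run = longest_run = 0
    rtts = []
    for raw in lines:
        line = raw.strip()
        match = REPLY.match(line)
        if match:
            sent += 1
            run = 0  # a reply ends any run of consecutive timeouts
            rtts.append(int(match.group(1)))
        elif line == TIMEOUT:
            sent += 1
            lost += 1
            run += 1
            longest_run = max(longest_run, run)
        # any other line (headers, blanks) is ignored
    return {
        "sent": sent,
        "lost": lost,
        "loss_pct": 100.0 * lost / sent if sent else 0.0,
        "longest_loss_run": longest_run,
        "avg_rtt_ms": sum(rtts) / len(rtts) if rtts else None,
    }


# First ten lines of the excerpt above, used as sample input.
SAMPLE = [
    "Reply from 192.168.3.3: bytes=32 time=45ms TTL=128",
    "Reply from 192.168.3.3: bytes=32 time=45ms TTL=128",
    "Request timed out.",
    "Reply from 192.168.3.3: bytes=32 time=45ms TTL=128",
    "Request timed out.",
    "Request timed out.",
    "Reply from 192.168.3.3: bytes=32 time=47ms TTL=128",
    "Reply from 192.168.3.3: bytes=32 time=45ms TTL=128",
    "Reply from 192.168.3.3: bytes=32 time=45ms TTL=128",
    "Request timed out.",
]

if __name__ == "__main__":
    print(summarize_ping(SAMPLE))
```

Running the same summary on logs captured from the Windows host and from other vantage points at the same time would give numbers that can be compared directly, rather than eyeballing the scrollback.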

showrun.txt

LVL 45

Expert Comment

by:Craig Beck
I'd agree with giltjr... line congestion is what I'd look at first.

When you say congestion is "not above 20% average with spikes up to 50%", how are you measuring that?

LVL 57

Expert Comment

by:giltjr
I have another question about the 10%, 20% and 50% utilization.  Is that the percentage of the ASA's interface or is that the percentage of the 20Mbps?

What I am getting at: if your ASA is connecting at 100 Mbps to whatever is physically next to it (switch, router, whatever) and the ASA is reporting 10% link utilization, that means it is using 10 Mbps, which is 50% of your WAN link.
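
Editor's note: the arithmetic in the comment above can be written out as a small worked example. This is only a sketch; the function name `wan_utilization` is hypothetical, and the 100 Mbps interface speed is the assumption made in the comment, not a confirmed value from the poster's setup.

```python
def wan_utilization(interface_mbps, interface_util_pct, wan_mbps):
    """Convert a utilization percentage reported against the physical
    interface speed into a percentage of the (slower) WAN circuit."""
    used_mbps = interface_mbps * interface_util_pct / 100
    wan_pct = 100 * used_mbps / wan_mbps
    return used_mbps, wan_pct


if __name__ == "__main__":
    # 10% of a 100 Mbps interface, carried over a 20 Mbps WAN link
    used, pct = wan_utilization(100, 10, 20)
    print(f"{used} Mbps in use, {pct}% of the WAN link")
```

The point of the calculation is that a graph scaled to the interface speed can look nearly idle while the WAN circuit behind it is half full.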

LVL 45

Expert Comment

by:Craig Beck
^^^ That's exactly what I was getting to @giltjr :-)

LVL 1

Author Comment

by:YMartin
So I am pulling up the ASDM Outside interface traffic usage graph in Kbps.  20% means the graph is mostly around 4,000 Kbps (4 Mbps of the 20 Mbps link).  50% means the graph spiked up to 10,000 Kbps.

For example, the graph was right around 0 just now.  I connected via RDP and ran a ping -t.  At least every 10th ping timed out.  During the entire time the outside interface was never above 1600 Kbps, and that was a high spike.  I realize that the graph must be averaging every few seconds; however, I have checked this at least 50 times and it has never pinged more than 20 times without a timeout.  It could be that there are very short spikes or something like that which saturate the connection briefly, but I would think that would affect other IP addresses as well.

Keep in mind that pings to other locations do not time out.  Only to the home base IP/subnet.

Hope this makes more sense.

Thanks.

(snapshot was a few mins later for clarity.)
graph.png

LVL 45

Expert Comment

by:Craig Beck
"Keep in mind that pings to other locations do not time out.  Only to the home base IP/subnet."

That's not what you said in the OP...
"When pinging the remote site public IP from home base I have few timeouts 2-3%.  When pinging the remote private IP I get many timeouts 20-30%"

I'm not trying to catch you out here, but what you just said is the exact opposite of the description in the OP, so which is it?  Or is it both ways?  I'm just trying to understand what I'm looking at :-)

LVL 57

Expert Comment

by:giltjr
I would suggest that you get something that monitors the link utilization all the time.  MRTG is free and really good, but runs best under Linux.  PRTG is supposed to be MRTG-like and runs under Windows, though I have never used it.

This gives you a better idea of what is going on with your link, plus a history.

Is the chart from the home site, or your site?

LVL 1

Author Comment

by:YMartin
Thanks for the responses.  

The chart is from the ASA off of the remote site.  In looking back I realize I never explicitly stated the location.  I will try to be more clear in my posting.  Here's hoping to clarify other points:

"Keep in mind that pings to other locations do not time out.  Only to the home base IP/subnet."
refers to outbound pings from the remote location.  So I can ping Google from the remote location and have no timeouts.  However, when I ping home base, either across the VPN or to the public IP, I get timeouts (see screenshot).

"When pinging the remote site public IP from home base I have few timeouts 2-3%.  When pinging the remote private IP I get many timeouts 20-30%"
specifies the percentage of timeouts when pinging the remote site from home base, so the opposite direction.  Today I am not getting any timeouts to the public IP, where often I will get 2-3% (see screenshot).

I do not have a system with dual NICs to use for monitoring saturation.  It would seem that saturation would affect each destination equally.  It could be saturation on that route with the ISP.  However, the same ping from the firewall does not have any timeouts; only pings from one of the Windows machines do.  The network is simple: Cisco SG-300s uplink to the ASA.  There is a single VLAN, and no QoS or anything of that nature has been configured on the switches.  They are pretty much at factory settings.

Just to be doubly clear: I ping the public IP of the home base from a server at the remote site and I get the results listed in the attachment.  If I log onto the ASA at the remote site and ping the same public IP (even simultaneously), I get 100% success rates at the ASA.  Even if I click the "ping" button on the ASA ASDM ping tool as fast as possible, I get 100% every time (except for the one isolated instance mentioned earlier).  This is why the ISP believes it is a problem on the local network.

It is puzzling to me how this would be possible, and it seems to me that the Windows system is pinging differently than the ASA.  The ASA does 5 pings in quick succession:

Sending 5, 100-byte ICMP Echos to home base IP, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/40/40 ms

whereas Windows sends them one at a time at a slower pace.  The payload size also differs (32 bytes from Windows versus 100 bytes from the ASA), and the TTL is not reported by the ASA.

These are all just symptoms.  The real problem is that email clients are not reliably connecting.  A 5 MB attachment can take 10 minutes to upload while Outlook is stuck on "sending".  We are also seeing replication errors for AD.

Thanks.
pings-rem-home.png
pings-home-rem.png

LVL 5

Expert Comment

by:Feroz Ahmed
Hi,

One thing you can try: since packets are being dropped from a system on the home network, add a hosts entry at the OS level with the same IP address and hostname, then check again.  This should work, since you mention that the ASA configuration is fine.  The file to change is:

C:\Windows\System32\drivers\etc\hosts

Make an entry with the IP address and hostname, then ping the same IP address continuously.  There should be no packet drops; if packets are still dropping, then there is a problem with the ASA configuration.

LVL 45

Expert Comment

by:Craig Beck
@sm_feroz - what are you talking about??  Creating a hosts entry will not test anything here.  We're pinging IP addresses, not DNS hostnames.

@YMartin - Windows and ASAs will provide slightly different results when pinging.  They both do it differently.  As you've noticed, Windows will send a ping at longer intervals than the ASA.  That's ok - it's just a troubleshooting tool after all.

Can you show a ping from a Windows client at the remote site to the home base public IP, and in the same shot show a ping from the same remote client to the home base LAN server or PC at the same time?

Can you also re-do the pings-home-rem.png image please so we can see the  remote IP's ping response time (blank the IP of your ASA though please)?

LVL 1

Author Comment

by:YMartin
Thanks.  Also the packet size is different on Windows vs. IOS/linux.

One strange thing was that the ASDM window lost connection and the graphs all stopped.  I tried a ping and it came back 100%, but I tried a tracert and it said it couldn't reach the ASA.  It seems like the ping tool in ASDM may not be accurately reporting results.

I am working with the ISPs again to see if they come up with anything, but so far nothing.
pings-home-rem2.png
pings-rem-home2.png

LVL 45

Assisted Solution

by:Craig Beck
Craig Beck earned 500 total points
So from home to remote public IP it's fine but to a host on the remote LAN through the VPN it's broken.  At the same time pings from the remote site are broken to both the home public IP and the home LAN IP.

I'd say that there's something going on with the home site ASA or internet circuit, for sure.  Have you tested with just a plain ASA config at each end (no VPN) and just pinged the public IP from each side?

LVL 1

Author Comment

by:YMartin
I have not.  Taking down the home site is not something which can be accomplished easily.  The next step is going to be to bypass the ASA at the remote site taking the internet down for a few minutes and test the same ping with a laptop connected directly to the router.  If that works we are going to look at replacing the ASA at the remote site with a spare.  

Failing that we will look at the ASA at the home site.

I am also considering taking the VPN down and routing traffic to home base through another site (Remote2) to see if that resolves issues.

I have not done any routing of this kind but I believe I need to set a static route on the Remote ASA for the Home subnet with the next hop as the LAN IP of Remote2.  The same would also be done on the home ASA.  This should route traffic through Remote2.

Your assertions are correct CraigBeck.

LVL 1

Author Comment

by:YMartin
We had a tech visit the remote site today.  He pinged the home base public IP from:
1. The cable modem
2. The Cisco ASA Firewall
3. The Cisco Catalyst Switch

The timeout issue only showed up on number 3 above.  We took everything off of one of the switches.  Rebooted both it and the ASA and plugged one laptop into the switch and we were still getting timeouts.  These timeouts were only present for that one IP address (the home base public IP).  Despite the problem being at the switch I am still thinking we have a firewall issue.  My plan is to send out another ASA with the same config.

LVL 1

Accepted Solution

by:
YMartin earned 0 total points
We finally resolved the issue.  Cisco TAC reviewed the switch and FW closely and was unable to find the problem.  However, we did collect a lot of packet captures, and when I reviewed them closely afterwards I found that ICMP packets from the LAN had one small difference from those originating from the FW:

Cause: the switch was setting the p-bit (the 802.1p CoS value) on all LAN packets to 7 (highest priority).  This was the default configuration of the switch and is the same configuration that 2 other sites use (without issue).

Packets sent out on this particular route (remote site to home base) with a p-bit of 7 experience higher packet loss.  One of the hops is seeing this value and penalizing those packets.  Possibly this is because that value is usually used for UDP/RTP traffic and here it was applied to TCP packets?

Needless to say, we removed that bit and normal traffic flow is restored.  There is certainly more for me to learn about the CoS tag, and I will be looking into this further.  I wanted to update the ticket for posterity in the hope that it helps someone else.

The procedure used to change the CoS parameter was to map a different QoS policy (which left the value null) to each port on the switch.
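
Editor's note: for anyone reviewing their own captures for the same symptom, the CoS value sits in the top 3 bits of the 16-bit tag control field of an 802.1Q VLAN tag (TPID 0x8100), followed by a 1-bit DEI flag and a 12-bit VLAN ID. The following is a minimal sketch of pulling that value out of a raw Ethernet frame. It assumes the captured frames actually carry a single 802.1Q tag, which the thread does not show; the function name `parse_vlan_tag` and the sample frames are hypothetical.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def parse_vlan_tag(frame: bytes):
    """Return the 802.1Q fields of a raw Ethernet frame as a dict,
    or None if the frame is untagged or too short (hypothetical helper)."""
    # 6 bytes dst MAC + 6 bytes src MAC + 4-byte tag + 2-byte EtherType
    if len(frame) < 18:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return {
        "pcp": tci >> 13,         # priority code point (the CoS / p-bit value), 0-7
        "dei": (tci >> 12) & 1,   # drop eligible indicator
        "vid": tci & 0x0FFF,      # VLAN ID
    }


def build_tagged_header(pcp: int, vid: int) -> bytes:
    """Build a dummy tagged header (zeroed MACs, IPv4 EtherType) for testing."""
    tci = (pcp << 13) | vid
    return bytes(12) + struct.pack("!HHH", TPID_8021Q, tci, 0x0800)


if __name__ == "__main__":
    print(parse_vlan_tag(build_tagged_header(pcp=7, vid=1)))
```

Comparing this field between frames sourced from the LAN and frames sourced from the firewall is one way to spot the kind of difference described above, provided the capture point sees tagged frames.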

Thanks.

LVL 1

Author Closing Comment

by:YMartin
The solution was found outside of EE.  It has been posted for others to benefit.