YMartin

Ping Timeouts to one destination before but not at the firewall.

I am troubleshooting ping timeouts at a remote location.  A VPN exists between the remote and home locations by way of two Cisco ASA 5505s.  If I remote into any machine on the remote network (from a tertiary location) and ping either the public IP or the private IP of the home network, I get about 20-30% timeouts when issuing ping -t.  If I ping any other address, no such timeouts occur.

If I log into the remote ASA and issue a ping command to the home public IP, I consistently get 100% success, with one instance of 80% which I was unable to reproduce.  The ISP has also performed this test on their router with similar results.

When pinging the remote site public IP from home base I have few timeouts, 2-3%.  When pinging the remote private IP I get many timeouts, 20-30%.

We are getting AD DFSR errors between the sites as well as scan-to-email failures across the VPN connection.  Replication does happen and users can scan, but not reliably.

We have VPN connections between the remote location and several other sites as well, and none of those connections exhibit this behavior.  The same is true of home base's other connections.

Any help isolating the cause of this behavior would be much appreciated.

Tracert Remote>Home  
  1    <1 ms    <1 ms    <1 ms  Remote FW
  2     1 ms     1 ms     1 ms  Remote Router
  3     2 ms     5 ms     2 ms  te-0-2-0-3.atlngamah66cr03.paetec.net [169.130.83.241]
  4     1 ms     1 ms     2 ms  gi-5-0-0-12.atlngamah66cr01.paetec.net [169.130.96.198]
  5     2 ms     2 ms     2 ms  64.80.13.90
  6     2 ms     2 ms     1 ms  h202.58.132.40.static.ip.windstream.net [40.132.58.202]
  7     2 ms     2 ms     2 ms  atl-bb1-link.telia.net [80.239.194.9]
  8    14 ms    14 ms    14 ms  ash-bb3-link.telia.net [80.91.252.213]
  9    14 ms    14 ms    14 ms  ash-b1-link.telia.net [80.91.248.157]
 10    17 ms    16 ms    16 ms  213.200.66.97
 11    34 ms    33 ms    33 ms  xe-4-2-0.dal20.ip4.gtt.net [89.149.128.70]
 12     *        *        *     Request timed out.
 13     *        *        *     Request timed out.
 14     *        *        *     Request timed out.
 15    42 ms    47 ms    46 ms  beyondtekit.utl.dfw.unsi.net 
 16     *       40 ms    47 ms  Home

Trace complete.


Trace Home>Remote
  1     2 ms     3 ms     6 ms  Home
  2    14 ms     *        8 ms  Home Router
  3    16 ms    19 ms     *     199.116.146.146
  4     8 ms     9 ms    17 ms  100.123.213.116
  5     *       13 ms    14 ms  ip4.gtt.net [173.205.59.137]
  6     8 ms     8 ms    13 ms  xe-5-0-2.dal33.ip4.gtt.net [141.136.111.150]
  7    14 ms    18 ms    16 ms  as2828.dal33.ip4.gtt.net [199.168.63.250]
  8    35 ms    40 ms    43 ms  207.88.14.242.ptr.us.xo.net [207.88.14.242]
  9    39 ms    35 ms    30 ms  te-4-0-0.rar3.atlanta-ga.us.xo.net [207.88.12.1]
 10    41 ms    39 ms    39 ms  ae0d0.cir2.atlanta6-ga.us.xo.net [207.88.13.9]
 11    42 ms    47 ms    46 ms  67.106.215.82.ptr.us.xo.net [67.106.215.82]
 12    51 ms    45 ms    49 ms  h203.58.132.40.static.ip.windstream.net [40.132.58.203]
 13    50 ms    51 ms    43 ms  gi-5-0-0-80.atlngamah66cr01.paetec.net [64.80.13.89]
 14    45 ms    52 ms    46 ms  te-0-2-0-0-13.atlngamah66cr03.paetec.net [169.130.96.216]
 15    39 ms    47 ms    46 ms  te10-1.atlngamah66pe02.paetec.net [169.130.83.240]
 16    49 ms    45 ms    41 ms  209.60.167.194
 17    42 ms    44 ms    41 ms  Remote

Trace complete.

giltjr
First, ping and traceroute cannot be used to determine whether a network connection is reliable.

Some devices are configured to ignore ICMP requests and drop them no matter what.
Some devices are configured to respond to only so many ICMP requests from the same source in a specific amount of time.
On most devices, the task that responds to ICMP packets is given a "lower" priority, and if the device is really busy it will end up not responding.

Ping and traceroute are very basic tools for determining whether there is network connectivity between two points, or what the path between two points is, but again, they are not reliable.

Now for your problem.

Have you looked at the link utilization for all links?  Could you be saturating one or more of the links?
You have more than one VPN connection to a central site.  Do they all terminate in the same box at the central site?  If so, are you exceeding that box's capabilities?
YMartin

ASKER

Thank you for the reply.

The other VPN connections are to other remote sites.  Only one VPN connection goes back to home base.  Full mesh topology, if you will.  We are only having problems with the one VPN connection.  The ISP is saying that because pings are fine it must be our equipment.  I am aware of the ICMP limitations, which is why I tested across the VPN as well.  Unfortunately the performance across the VPN is the one which is poor.  The confusing part is that a ping from the firewall comes out OK whereas a ping from the local network does not.  We have a 20 Mbps up/down connection at both sites which is by no means saturated.  We typically see sub-10% utilization on the ASA Traffic Usage graph at the remote site.

You bring up a good point with saturation on the other side.  We have the same connection on the other side; saturation there is a bit higher, but not above 20% average with spikes up to 50%.  I have pinged with both ASDM graphs up.  The main problem is that I can try to ping across the VPN at random times and each time it will show the same timeouts, even when almost no traffic is going out the firewall.  But when pinging the target ASA's public IP from the source ASA, pings are fine with no timeouts.

I would suspect the VPN; however, if I ping the public IP of the VPN target from the same machine at the remote site I also get the timeouts, but not when I do the same ping from the remote site's ASA.  Perhaps there is some sort of QoS at the NOC, but I am struggling with what would cause that.  Perhaps I need to put in another ticket with the NOC, but I wanted to check for suggestions first.

Here is a choice snippet across the VPN:
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=59ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=48ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=46ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=46ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=44ms TTL=128
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
Reply from 192.168.3.3: bytes=32 time=50ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
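
A capture like the one above can be turned into a loss figure by counting replies and timeouts. The following is only a minimal Python sketch and was not part of the original troubleshooting; the function name `summarize_ping` and the `SAMPLE` text are made up for illustration, and it assumes English-language Windows ping output.

```python
import re

# Hypothetical sample: a few lines in the same format as the capture above.
SAMPLE = """\
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Reply from 192.168.3.3: bytes=32 time=45ms TTL=128
Request timed out.
Request timed out.
Reply from 192.168.3.3: bytes=32 time=47ms TTL=128
"""


def summarize_ping(output):
    """Count replies and timeouts in Windows `ping -t` style output."""
    replies = 0
    timeouts = 0
    longest_gap = 0   # longest run of consecutive timeouts
    current_gap = 0
    rtts = []
    for raw in output.splitlines():
        line = raw.strip()
        # Only count real echo replies, not "Destination host unreachable".
        if line.startswith("Reply from") and "bytes=" in line:
            replies += 1
            current_gap = 0
            match = re.search(r"time[=<](\d+)ms", line)
            if match:
                rtts.append(int(match.group(1)))
        elif line.startswith("Request timed out"):
            timeouts += 1
            current_gap += 1
            longest_gap = max(longest_gap, current_gap)
    total = replies + timeouts
    return {
        "sent": total,
        "lost": timeouts,
        "loss_percent": 100.0 * timeouts / total if total else 0.0,
        "longest_gap": longest_gap,
        "avg_rtt_ms": sum(rtts) / len(rtts) if rtts else None,
    }


if __name__ == "__main__":
    print(summarize_ping(SAMPLE))
```

Running the same function over captures taken from different points (a Windows host, a laptop plugged straight into the router) would give comparable numbers, with the caveat already raised in this thread that ICMP loss alone does not prove a link is unreliable.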

showrun.txt
I'd agree with giltjr... line congestion is what I'd look at first.

When you say congestion is "not above 20% average with spikes up to 50%", how are you measuring that?
I have another question about the 10%, 20% and 50% utilization.  Is that a percentage of the ASA's interface speed or a percentage of the 20 Mbps?

What I am getting at: if your ASA connects at 100 Mbps to whatever is physically next to it (switch, router, whatever) and the ASA is reporting 10% link utilization, that means it is using 10 Mbps, which is 50% of your WAN link.
^^^ That's exactly what I was getting to @giltjr :-)
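
The arithmetic behind that point, as a small Python sketch. The function name is hypothetical, and the 100 Mbps interface speed is the example figure from the post above, not a confirmed value for these ASAs.

```python
def wan_utilization_percent(interface_percent, interface_mbps, wan_mbps):
    """Convert a utilization figure reported against the physical
    interface speed into a percentage of the (slower) WAN link."""
    used_mbps = interface_percent * interface_mbps / 100.0
    return 100.0 * used_mbps / wan_mbps


if __name__ == "__main__":
    # 10% of a 100 Mbps interface is 10 Mbps, i.e. half of a 20 Mbps WAN link.
    print(wan_utilization_percent(10, 100, 20))
```

If the graph is already expressed in Kbps rather than as a percentage of the interface, this conversion is not needed and the reading can be compared with the 20 Mbps figure directly.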
YMartin

ASKER

So I am pulling up the ASDM Outside interface traffic usage graph in Kbps.  20% means the graph is mostly around 4kbps.  50% means the graph spiked up to 10kbps.

For example, the graph was right around 0 just now.  I connected via RDP and ran a ping -t.  At least every 10th ping timed out.  During the entire time the outside interface was never above 1600 Kbps, and that was a high spike.  I realize that the graph must be averaging every few seconds; however, I have checked this at least 50 times and never has it pinged more than 20 times without a timeout.  It could be that there are very short spikes or something like that which saturate the connection briefly; however, I would think that would affect other IP addresses as well.

Keep in mind that pings to other locations do not time out.  Only to the home base IP/subnet.

Hope this makes more sense.

Thanks.

(snapshot was a few mins later for clarity.)
graph.png
"Keep in mind that pings to other locations do not time out.  Only to the home base IP/subnet."

That's not what you said in the OP...
"When pinging the remote site public IP from home base I have few timeouts, 2-3%.  When pinging the remote private IP I get many timeouts, 20-30%."

I'm not trying to catch you out here, but what you just said is the exact opposite of the description in the OP, so which is it?  Or is it both ways?  I'm just trying to understand what I'm looking at :-)
I would suggest that you get something that monitors the link utilization all the time.  MRTG is free and really good, but runs best under Linux.  PRTG is supposed to be MRTG-like and runs under Windows, but I have never used it.

This gives you a better idea of what is going on with your link and a history.

Is the chart from the home site, or your site?
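
Monitoring tools of this kind generally work by sampling an interface's byte counter at a fixed interval and turning the difference into a rate. The sketch below shows that calculation in Python under stated assumptions: the counter values are hypothetical, the counter is assumed to be 32 bits wide and to wrap at most once between samples, and how the readings are collected (SNMP, CLI, or otherwise) is left out.

```python
def link_utilization(octets_then, octets_now, seconds, link_bps, counter_bits=32):
    """Return (bits_per_second, percent_of_link) from two byte-counter samples.

    The modulo handles a single counter wrap between the two samples.
    """
    if seconds <= 0:
        raise ValueError("sample interval must be positive")
    modulus = 2 ** counter_bits
    delta_octets = (octets_now - octets_then) % modulus
    bps = delta_octets * 8 / seconds
    return bps, 100.0 * bps / link_bps


if __name__ == "__main__":
    # Hypothetical readings taken 60 seconds apart on a 20 Mbps link.
    bps, pct = link_utilization(1_000_000, 1_750_000, 60, 20_000_000)
    print(f"{bps:.0f} bps, {pct:.2f}% of link")
```

The sample interval matters: a long interval averages away short bursts, so a link can look lightly loaded on a graph while still being briefly saturated.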
YMartin

ASKER

Thanks for the responses.  

The chart is from the ASA off of the remote site.  In looking back I realize I never explicitly stated the location.  I will try to be more clear in my posting.  Here's hoping to clarify other points:

"Keep in mind that pings to other locations do not time out.  Only to the home base IP/subnet."
refers to outbound pings from the remote location.  So I can ping Google from the remote location and have no timeouts.  However, when I ping home base, either across the VPN or to the public IP, I get timeouts (see screenshot).

"When pinging the remote site public IP from home base I have few timeouts, 2-3%.  When pinging the remote private IP I get many timeouts, 20-30%."
specifies the percentage of timeouts when pinging the remote site from home base, so the opposite direction.  Today I am not getting any timeouts to the public IP, where often I will get 2-3% (see screenshot).

I do not have a system with dual NICs to use for monitoring saturation.  It would seem that saturation would affect each destination equally.  It could be saturation on that route with the ISP.  However, the same ping from the firewall does not have any timeouts; only pings from one of the Windows machines do.  The network is simple: Cisco SG-300s uplink to the ASA.  There is a single VLAN, and no QoS or anything of that nature has been configured on the switches.  They are pretty much at factory settings.

Just to be doubly clear.  I ping the public IP of the home base from a server at the remote site and I get the results listed in the attachment.  If I log onto the ASA at the remote site and ping the same public IP (even simultaneously) I get 100% success rates at the ASA.  Even if I click the "ping" button on the ASA ASDM ping tool as fast as possible I get 100% every time (except for one isolated instance mentioned earlier).  This is why the ISP believes it is a problem on the local network.

It is puzzling to me how this would be possible, and it seems to me that the Windows system is pinging differently than the ASA.  The ASA does 5 pings in quick succession:

Sending 5, 100-byte ICMP Echos to home base IP, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/40/40 ms

whereas Windows does one every few seconds.  Also, the byte counts are different and the TTL is not reported by the ASA.
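
To make the two tests more comparable, the Windows ping can be given a count, size and timeout similar to those in the ASA output above. This is a hedged Python sketch that only builds the command line; the function name and the target address are hypothetical. It uses the standard Windows ping options -n (count), -l (payload size in bytes) and -w (timeout in milliseconds). Whether the ASA's "100-byte" figure includes protocol headers is not established in this thread, so the size is an approximation.

```python
def windows_ping_command(target, count=5, size_bytes=100, timeout_s=2):
    """Build a Windows ping command roughly matching the ASA test:
    5 echoes, 100 bytes, 2 second timeout (defaults taken from the
    ASA output quoted above)."""
    return [
        "ping",
        "-n", str(count),                    # number of echo requests
        "-l", str(size_bytes),               # payload size in bytes
        "-w", str(int(timeout_s * 1000)),    # per-reply timeout in ms
        target,
    ]


if __name__ == "__main__":
    # 203.0.113.10 is a documentation address used as a placeholder.
    print(" ".join(windows_ping_command("203.0.113.10")))
```

On a Windows machine the returned list could be passed to `subprocess.run` to execute the test; matching the parameters removes one variable, but it does not change the fact that the two devices sit at different points in the path.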

These are all just symptoms.  The real problem is that email clients are not reliably connecting.  A 5 MB attachment can take 10 minutes to upload while Outlook is stuck on "sending".  We are also seeing replication errors for AD.

Thanks.
pings-rem-home.png
pings-home-rem.png
Hi,

One thing you can do, since packets are being dropped from the home network system: make a hosts entry at the OS level with the same IP address and hostname, then check again.  This should work, as you mention the ASA configuration is fine.  Make the change at the OS level as below:

C:\Windows\System32\drivers\etc\hosts (add an entry with the IP address and hostname, then ping the same IP address continuously.  There should be no packet drops; if packets are still dropping, then there is a problem with the ASA configuration.)
@sm_feroz - what are you talking about??  Creating a hosts entry will not test anything here.  We're pinging IP addresses, not DNS hostnames.

@YMartin - Windows and ASAs will provide slightly different results when pinging.  They both do it differently.  As you've noticed, Windows will send a ping at longer intervals than the ASA.  That's ok - it's just a troubleshooting tool after all.

Can you show a ping from a Windows client at the remote site to the home base public IP, and in the same shot show a ping from the same remote client to the home base LAN server or PC at the same time?

Can you also re-do the pings-home-rem.png image please so we can see the  remote IP's ping response time (blank the IP of your ASA though please)?
YMartin

ASKER

Thanks.  Also the packet size is different on Windows vs. IOS/linux.

One strange thing was that ASDM window lost connection and the graphs were all stopped.  I tried a ping and it came back 100% but I tried a tracert and it said it couldn't reach the ASA.  Seems like the ping tool on the ASDM may not be accurately reporting results.

I am working with the ISPs again to see if they come up with anything, but so far nothing.
pings-home-rem2.png
pings-rem-home2.png
SOLUTION
Craig Beck

This solution is only available to members.
YMartin

ASKER

I have not.  Taking down the home site is not something which can be accomplished easily.  The next step is going to be to bypass the ASA at the remote site taking the internet down for a few minutes and test the same ping with a laptop connected directly to the router.  If that works we are going to look at replacing the ASA at the remote site with a spare.  

Failing that we will look at the ASA at the home site.

I am also considering taking the VPN down and routing traffic to home base through another site (Remote2) to see if that resolves issues.

I have not done any routing of this kind but I believe I need to set a static route on the Remote ASA for the Home subnet with the next hop as the LAN IP of Remote2.  The same would also be done on the home ASA.  This should route traffic through Remote2.

Your assertions are correct CraigBeck.
YMartin

ASKER

We had a tech visit the remote site today.  He pinged the home base public IP from:
1. The cable modem
2. The Cisco ASA Firewall
3. The Cisco Catalyst Switch

The timeout issue only showed up on number 3 above.  We took everything off of one of the switches.  Rebooted both it and the ASA and plugged one laptop into the switch and we were still getting timeouts.  These timeouts were only present for that one IP address (the home base public IP).  Despite the problem being at the switch I am still thinking we have a firewall issue.  My plan is to send out another ASA with the same config.
ASKER CERTIFIED SOLUTION
This solution is only available to members.
YMartin

ASKER

The solution was found outside of EE.  It has been posted for others to benefit.