Solved

Stumped troubleshooting ipsec vpn

Posted on 2014-01-27
Medium Priority
1,125 Views
Last Modified: 2014-01-28
I'm stumped troubleshooting this vpn connection.

The scope of this question is to get ping replies working end to end across a site-to-site ipsec vpn tunnel. Currently the tunnel connects and i can send a ping through the tunnel from building1 to the datacenter. The ping is received at the datacenter and the datacenter host replies, but the reply never gets back to the building1 side that initiated the ping.

the setup:

building1:
cisco rvs4000 vpn router
lan: 10.100.1.0/24
wan static ip, we'll call it a.b.c.d
upstream gateway a.b.c.z  (cox cable)

vpn setup:
ipsec tunnel
local address a.b.c.d
local group subnet 10.100.1.0 / 24
remote group ip z.x.c.v  (ie: public ip of datacenter)
remote group subnet: 172.16.1.0 / 24
keying mode: ike with preshared key
phase 1 encryption: 3des
auth: md5
group: 1024bit (ie group2)
key life 28800 sec

phase 2 encryption 3des
auth md5
PFS disable
preshared key: password (or whatever, it matches the remote endpoint)
group: 1024bit
key lifetime 28800

datacenter setup:
paloalto pan-4050 router
wan:  vpn endpoint z.x.c.v (public ip)
lan: 172.16.1.0/24
upstream gateway:  z.x.c.z (datacenter core switch, expedient colo)
vpn setup identical to above
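
Because the datacenter side is only described as "identical to above", one way to sanity-check the two ends is to write both configs down and compare them mechanically. The following is a minimal sketch in Python, not anything either device exposes: the `TunnelConfig` type, its field names, and the `mismatches` helper are hypothetical, the peer addresses are the same placeholders used in this post, and `10.200.1.0/24` is a made-up example of a wrong subnet.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class TunnelConfig:
    # Hypothetical container; field names are illustrative only.
    local_peer: str
    remote_peer: str
    local_subnet: str
    remote_subnet: str
    phase1: tuple  # (encryption, auth, dh_group, lifetime_sec)
    phase2: tuple  # (encryption, auth, pfs, lifetime_sec)


def mismatches(a, b):
    """Return a list of differences between the two ends of one tunnel."""
    problems = []
    if a.local_peer != b.remote_peer or a.remote_peer != b.local_peer:
        problems.append("peer addresses are not mirrored")
    if ip_network(a.local_subnet) != ip_network(b.remote_subnet):
        problems.append(f"A local {a.local_subnet} != B remote {b.remote_subnet}")
    if ip_network(a.remote_subnet) != ip_network(b.local_subnet):
        problems.append(f"A remote {a.remote_subnet} != B local {b.local_subnet}")
    if a.phase1 != b.phase1:
        problems.append("phase 1 proposals differ")
    if a.phase2 != b.phase2:
        problems.append("phase 2 proposals differ")
    return problems


P1 = ("3des", "md5", "group2", 28800)
P2 = ("3des", "md5", "no-pfs", 28800)

building1 = TunnelConfig("a.b.c.d", "z.x.c.v", "10.100.1.0/24", "172.16.1.0/24", P1, P2)
datacenter_ok = TunnelConfig("z.x.c.v", "a.b.c.d", "172.16.1.0/24", "10.100.1.0/24", P1, P2)
# Made-up example of a wrong remote subnet on the datacenter side.
datacenter_bad = TunnelConfig("z.x.c.v", "a.b.c.d", "172.16.1.0/24", "10.200.1.0/24", P1, P2)

print(mismatches(building1, datacenter_ok))
print(mismatches(building1, datacenter_bad))
```

The point of comparing subnets as parsed networks rather than strings is that the local subnet on one end must equal the remote subnet on the other end exactly, including the prefix length.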

the vpn tunnel DOES connect, shows "up"

if i initiate a ping from 10.100.1.5 to 172.16.1.70... with wireshark running on both machines:
10.100.1.5 sends the packet to 10.100.1.254 (the cisco)
z.x.c.v (datacenter wan) receives the encapsulated packet and decrypts it, routes it to 172.16.1.70
on 172.16.1.70, wireshark sees the ping from 10.100.1.5
172.16.1.70 replies to 10.100.1.5
172.16.1.65 (paloalto inside interface) receives it and encapsulates it for a.b.c.d (building1 wan)
z.x.c.v does forward it to z.x.c.z (upstream device) as seen by port-mirroring the wan uplink
it never arrives at a.b.c.d (building1 wan, cisco rvs router)

if i initiate a ping from 172.16.1.70 destined for 10.100.1.5:
the inside interface of the paloalto receives it, packs it up and forwards it to the upstream gateway (as seen on the wire, port mirroring the wan uplink).
no traffic is received at building1

troubleshooting:  

i've had paloalto support in their device for a week, they've proved beyond all doubt that the traffic is being handled properly and being passed upstream correctly

i've replaced the router at building1 (changed from a netgear vpn router, to a cisco vpn router).  both the netgear, and the cisco, have the identical symptoms.  tunnel connects, traffic gets from building1 to the datacenter, but not back.

interesting point:
when i traceroute from building1 (10.100.1.5) to google (8.8.8.8) my first hop is as expected my internal gateway (10.100.1.254, the cisco).  but, the very next hop is 10.16.72.1 (14ms, assume not my cable modem).  the next hop after that is NOT a.b.c.z (upstream public gateway), it is something completely different (but still on cox network)

i've tried asking cox what the heck is 10.16.72.1 and to check my cable modem routing table to make sure it's correct... but the best they could do for me is tell me to reboot my cable modem and router.

the physical wan port of the cisco at building1, is directly connected to the one and only ethernet port on the cable modem.  nothing is in between.

so, i need ideas as to why the return traffic can't get back.
Question by:FocIS
10 Comments
 
LVL 71

Expert Comment

by:Qlemo
ID: 39813150
Compare the traceroutes from each side to the other's public IP. And how do you know a.b.c.d does not get traffic back? Did you use a WAN link mirror with WireShark here, too?

And make sure the paloalto device does not try to create another tunnel because of some parameter mismatch (local and remote subnets in VPN, for example).
 
LVL 2

Author Comment

by:FocIS
ID: 39813405
Thanks for the reply!
In the paloalto, the debug logs are very explicit, i'm certain it sends it out the correct tunnel.  The port mirror on the paloalto side does show esp packets going to the building1 wan

I can't port-mirror (yet) at the building1 side though i should be able to tomorrow.  
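
For the building1 mirror, the thing to look for on the WAN side is not ICMP (the ping is encapsulated there) but IP protocol 50 (ESP), plus UDP 500 for IKE and UDP 4500 if NAT traversal is in use. In a capture tool this is roughly a filter such as `ip proto 50`. As a rough sketch of the same classification in Python, assuming raw IPv4 packets that start at the IP header (no Ethernet framing); `classify_ipv4` and `fake_packet` are hypothetical helper names and the addresses in the fake header are documentation addresses:

```python
import struct

ESP_PROTO = 50  # IP protocol number for ESP
UDP_PROTO = 17  # IP protocol number for UDP


def classify_ipv4(packet: bytes) -> str:
    """Label a raw IPv4 packet as 'esp', 'ike', 'nat-t' or 'other'."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return "other"
    header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    proto = packet[9]
    if proto == ESP_PROTO:
        return "esp"
    if proto == UDP_PROTO and len(packet) >= header_len + 4:
        sport, dport = struct.unpack("!HH", packet[header_len:header_len + 4])
        if 500 in (sport, dport):
            return "ike"
        if 4500 in (sport, dport):
            return "nat-t"
    return "other"


def fake_packet(proto: int, payload: bytes = b"") -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left at 0) plus payload."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload), 0, 0, 64, proto, 0,
        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]),
    )
    return header + payload


print(classify_ipv4(fake_packet(50)))
print(classify_ipv4(fake_packet(17, struct.pack("!HHHH", 500, 500, 8, 0))))
```

If ESP from z.x.c.v shows up on the building1 WAN mirror, the path is fine and the reply is being dropped by the router after decryption; if it never shows up, the loss is upstream.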

The symptoms are, as seen from 172.16.1.70, the pings are sent and sent and sent with no replies.

Similarly, as seen from 10.100.1.5, those packets are sent and sent and sent and never received replies, BUT i see matching pings on the destination of 172.16.1.70.   so i see the ping hit the destination, the destination replies, but the replies never hit back to 10.100.1.5

the traceroutes are similar but different:

from building1 to datacenter:
Tracing route to z.x.c.v over a maximum of 30 hops

  1     3 ms     1 ms     1 ms  10.100.1.254
  2    17 ms    11 ms    13 ms  10.16.72.1  <-- not sure what this is
  3    10 ms     9 ms     9 ms  ip98-173-132-214.cl.ri.cox.net [98.173.132.214] <-- not our ip or upstream gw
  4    12 ms    10 ms    11 ms  ip98-173-132-222.cl.ri.cox.net [98.173.132.222]
  5    65 ms    38 ms    49 ms  68.1.4.246
  6   109 ms   207 ms   222 ms  te6-3.ar3.DCA3.gblx.net [67.17.134.45]
  7    65 ms   107 ms    75 ms  CONTINENTAL-BROADBAND.Te6-2.ar5.CHI1.gblx.net [207.138.128.70]
  8    71 ms    67 ms    68 ms  te1-2.4006.cr2.350ec.chcgil.e-xpedient.com [216.130.11.134]
  9    78 ms    72 ms    73 ms  te1-4.4005.cr1.strlng.clevoh.e-xpedient.com [216.130.11.130]
 10    74 ms    69 ms    76 ms  te1-2.4002.cr2.strlng.clevoh.e-xpedient.com [216.130.12.134]
 11    67 ms    74 ms    74 ms  te2-7-1.4007.151-core.expedient.com [216.130.12.202]
 12    72 ms    67 ms    70 ms  z.x.c.v
Trace complete.


from datacenter:
Tracing route to a.b.c.d [68.99.x.x]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  paloalto [172.16.1.65]
  2     1 ms    <1 ms    <1 ms  z.x.c.z  (upstream gateway)
  3    11 ms    11 ms    11 ms  te1-3.4007.cr2.strlng.clevoh.e-xpedient.com [216.130.12.201]
  4    11 ms    11 ms    11 ms  te1-2.4002.cr1.strlng.clevoh.e-xpedient.com [216.130.12.133]
  5    11 ms    11 ms    11 ms  te1-4.4005.cr2.350ec.chcgil.e-xpedient.com [216.130.11.129]
  6    11 ms    11 ms    11 ms  te1-2.4006.cr1.350ec.chcgil.e-xpedient.com [216.130.11.133]
  7    11 ms    11 ms    11 ms  te6-2.ar5.chi1.gblx.net [207.138.128.69]
  8    31 ms    31 ms    31 ms  cox-com.ethernet15-2.ar6.dal2.gblx.net [64.215.187.2]
  9    57 ms    76 ms   114 ms  clvdhdrj01-xe000.0.rd.cl.cox.net [68.1.1.94]
 10    57 ms    57 ms    57 ms  ip98-173-132-221.cl.ri.cox.net [98.173.132.221]
 11    57 ms    57 ms    56 ms  ip98-173-132-217.cl.ri.cox.net [98.173.132.217]
 12    70 ms    68 ms    73 ms  a.b.c.d [our static ip 68.99.x.x]
Trace complete.
 
LVL 71

Expert Comment

by:Qlemo
ID: 39813761
Ok, that tells us the packets should flow between both routers. ESP packets might get filtered on their way back, though. Really difficult to tell. Also, there could be an MTU mismatch leading to excess fragmentation, or to fragmentation being required but not performed. Depending on firmware bugs this might be an issue even if the correct MTU is only a few bytes different (we often see issues when 1500 bytes is used instead of 1492).
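
To put rough numbers on the MTU point, here is a small worked calculation of ESP tunnel-mode overhead for the 3des/md5 proposal in this thread. It is a sketch under stated assumptions: 3DES-CBC with an 8-byte block and 8-byte IV, HMAC-MD5-96 with a 12-byte ICV, a 20-byte outer IPv4 header without options, and no NAT-T UDP encapsulation. The function names are made up for this example.

```python
def esp_tunnel_size(inner_len, block=8, iv=8, icv=12, outer_ip=20, esp_hdr=8):
    """On-the-wire size of an ESP tunnel-mode packet carrying inner_len bytes.

    inner_len is the full inner IP packet (its own IP header included).
    """
    trailer_min = 2  # pad-length byte + next-header byte
    # Round inner packet plus trailer up to a whole number of cipher blocks.
    padded = -(-(inner_len + trailer_min) // block) * block
    return outer_ip + esp_hdr + iv + padded + icv


def max_inner(mtu, **kw):
    """Largest inner packet that still fits in one outer packet of size mtu."""
    n = mtu
    while n > 0 and esp_tunnel_size(n, **kw) > mtu:
        n -= 1
    return n


print(esp_tunnel_size(1500))  # a full-size inner packet no longer fits in 1500
print(max_inner(1500), max_inner(1492))
print(esp_tunnel_size(20 + 8 + 32))  # IP + ICMP header + 32 bytes of ping data
```

Under these assumptions a 1500-byte inner packet becomes 1552 bytes, and the largest inner packet that fits is 1446 bytes at MTU 1500 or 1438 bytes at MTU 1492. A ping with 32 bytes of data comes out at only 112 bytes, so MTU trouble would be expected to hurt large transfers rather than small pings.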
 
LVL 2

Author Comment

by:FocIS
ID: 39813974
Good catch qlemo, i was in a hurry and missed that ip address :)

The cisco at the building1 side was already 1492 mtu, but the datacenter side was 1500, so i just changed that to 1492.

having saved the changes, the tunnel remains fully connected but the esp packets of the "ping reply" don't appear to reach back to building1.

when we had a netgear vpn router at building1 last week, it had a built-in packet capture with download to wireshark for review - we could see the ping request leaving, but never saw the ping reply come back.  

i'll hook up a port mirror device at building1 on tuesday and see what can be seen (still have one in place at the datacenter)

happy to try any other ideas at all, and to provide more info if it helps
 
LVL 2

Author Comment

by:FocIS
ID: 39813978
i wanted to mention some more aspects:

when i initiate a ping from 172.16.1.70 directly to 10.100.1.5, the route is there, the datacenter router passes esp packets upstream (that's as far as we can track it) but the "ping request" never makes it to 10.100.1.5 (as viewed on the nic of 10.100.1.5)

also, we have an identical cisco rvs4000 router on another cox cable modem in the building (different account, different static ip) with identical vpn settings (tailored to the different static cox ip address), and THAT tunnel connects, and pings pass in both directions

i've requested that the datacenter isp capture our packets on our upstream gateway but their answer was (rightfully so) "no way, that's a core switch" - all i can assume is since the packets left our rack in good health, they should be leaving the building too.

what do you think about the "odd" 2nd hop leaving building1?  is that some sort of internal vpn between cleveland and rhode island (cl.ri.cox.net looks like it goes from cleveland to RI for the cox pop)?  i wonder if it is a cox tunnel, and if that's double/triple wrapping the packets and killing the crypto (yet it works on the way out from building1 to the datacenter)
 
LVL 71

Accepted Solution

by:
Qlemo earned 2000 total points
ID: 39814416
I don't think the 2nd hop is wrapping, just routing. If it did anything, the tunnel would not come up.
Regarding the Netgear packet capture - did you see both unencrypted and encrypted traffic, or only the unencrypted one? Because I still think something with the VPN settings is not correct. Maybe the other tunnel is getting the reply traffic in error?
 
LVL 2

Author Comment

by:FocIS
ID: 39815197
Good thought with the two tunnels - i've just deleted the second tunnel (which worked, but with the "wrong" network).

I'll post some sanitized screenshots of the settings here
building1.png
datacenter.png
 
LVL 2

Author Closing Comment

by:FocIS
ID: 39815220
oh wow, i think that actually did it - there was some confusion between "tunnel.2" and "tunnel.3" - when i went to delete tunnel.3 at your suggestion, i noticed the wrong private ip block in tunnel.2

pings in both directions are finally replying all the way thru the tunnel now!
 
LVL 71

Expert Comment

by:Qlemo
ID: 39815364
Great! The private networks exchanged in IPSec are often used to map traffic to a tunnel. That should not be necessary for reply traffic, though - a stateful firewall should bypass any rules, as long as the corresponding session exists.
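
The idea of mapping traffic to a tunnel by its private networks can be illustrated with a toy lookup. This is only a model of selector matching, not how PAN-OS or the RVS4000 is implemented; `pick_tunnel` is a hypothetical helper, and `10.200.1.0/24` stands in for the "wrong private ip block", which the thread does not spell out.

```python
from ipaddress import ip_address, ip_network


def pick_tunnel(tunnels, src, dst):
    """Return the first tunnel whose (local, remote) subnets cover src -> dst."""
    s, d = ip_address(src), ip_address(dst)
    for name, (local, remote) in tunnels.items():
        if s in ip_network(local) and d in ip_network(remote):
            return name
    return None


# Datacenter-side view: local is 172.16.1.0/24, remote should be building1's LAN.
broken = {"tunnel.2": ("172.16.1.0/24", "10.200.1.0/24")}  # made-up wrong block
fixed = {"tunnel.2": ("172.16.1.0/24", "10.100.1.0/24")}

# Ping reply from the datacenter host back to building1.
print(pick_tunnel(broken, "172.16.1.70", "10.100.1.5"))
print(pick_tunnel(fixed, "172.16.1.70", "10.100.1.5"))
```

With the wrong block the reply from 172.16.1.70 to 10.100.1.5 matches no selector pair in this model, and with the corrected block it maps to tunnel.2. In the real setup the Palo Alto was still emitting ESP, so the actual failure was presumably in which selectors the packet was sent under rather than a complete lack of a match.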
Question has a verified solution.