Flaky Site-To-Site VPN tunnel (PIX to Astaro)

At our main office we have a 3MEIA internet service connected to an Astaro UTM. At our remote site we have a PIX 506E firewall that uses at&t U-Verse internet. We have an IPsec VPN tunnel between the two sites using 3DES encryption that had been working fine for a couple of weeks. Recently we noticed we were unable to direct more than one outside IP address from our U-Verse service to the PIX firewall to host different websites. This is because the 2-Wire modem will not allow a single MAC address to hold more than one IP address.

Long story short, we called at&t and they told us we had to "trick" our 2-Wire modem into thinking there are multiple devices connected to it by setting up virtual MAC addresses. We used a Cisco 2801 router to set these IP addresses up as standby routes, and we made up MAC addresses for all of the standby IP addresses in the range we were allotted. Now we can make use of all the IP addresses we paid for just by opening a port on the PIX with the corresponding IP address.

Recently we have noticed that our VPN connection has been "flaky": there are short intervals in which the VPN is down. I'm not sure whether this has anything to do with the router we now have between the 2-Wire and the PIX, but it is causing issues because we constantly send large amounts of data over the tunnel, and when it's down, everyone notices.

I have attached the IPsec log from our Astaro. Does anything look out of the ordinary that could be causing our intermittent VPN bottlenecks?
jpwallen (Author) commented:
The problem here was definitely with the router. We ended up getting the latest firmware image from our sister company, which has a Cisco service contract. The difference in performance from the old 2005 version was night and day! IP NAT Ager no longer goes above 1%, and there are no more VPN timeouts or forgotten configs. It's funny that this only started happening after the router was introduced to the 2WIRE. Oh well.
jpwallen (Author) commented:
This is definitely looking like a problem with the Cisco router, because for some reason I can't even do a "show run": it returns blank. After I reload the router, however, it works for about an hour and then starts failing again.
This could be a problem with a memory leak or alignment errors.  You can check them using "show align" and "show mem".

I would also make sure to remove any NBAR usage on the Cisco router.  Inspecting traffic works great on firewalls (such as PIX and ASA), but it seems to crash routers.
jpwallen (Author) commented:
Yes, it's definitely a memory-related issue. When the router starts up, the highest usage we see from the IP NAT Ager process is about 10%. After a while, however, IP NAT Ager takes up 99% of system resources and an ever-increasing amount of memory. In the syslog we can see that the router is pretty much out of memory and can't find anywhere to put new chunks:

10-19-2012      08:24:24      Local7.Critical      17803: -Traceback= 0x608CC104 0x6020D564
10-19-2012      08:24:24      Local7.Critical      17802: -Process= "Chunk Manager", ipl= 3, pid= 1
10-19-2012      08:24:24      Local7.Critical      17801: *Oct 19 08:43:49: %SYS-2-CHUNKEXPANDFAIL: Could not expand chunk pool for ipnat node. No memory available
10-19-2012      08:24:22      Local7.Critical      17800: -Traceback= 0x608CC104 0x601F0BA0 0x601F5C3C 0x6020E59C 0x6020D73C 0x6020D520
10-19-2012      08:24:22      Local7.Critical      17799: -Process= "Chunk Manager", ipl= 3, pid= 1
10-19-2012      08:24:22      Local7.Critical      17798:
10-19-2012      08:24:22      Local7.Critical      17797: Alternate Pool: None  Free: 0  Cause: No Alternate pool
10-19-2012      08:24:22      Local7.Critical      17796: Pool: Processor  Free: 561948  Cause: Memory fragmentation
10-19-2012      08:24:22      Local7.Critical      17795: *Oct 19 08:43:47: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x6020E594, alignment 8
10-19-2012      08:24:13      Local7.Critical      17794: -Traceback= 0x608CC104 0x6020D564
10-19-2012      08:24:13      Local7.Critical      17793: -Process= "Chunk Manager", ipl= 3, pid= 1

Searching online, it looks like the only option is to upgrade the firmware, which Cisco won't let us do because we don't have one of their million-dollar "support service contracts". It's weird, because we have been using this router at two other locations and never experienced this issue until we connected it to the 2WIRE modem. I have attached some of the other stuff we are seeing on the router.
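
For anyone wanting to keep an eye on this kind of failure, here is a minimal Python sketch that tallies the IOS message tags and pool reports in a syslog export like the one above. The regular expressions are assumptions based only on the lines pasted in this thread, and the names (`summarize`, `SAMPLE`) are made up for illustration; other IOS versions or syslog collectors may format these lines differently.

```python
import re
from collections import Counter

# IOS message tag, e.g. %SYS-2-MALLOCFAIL (facility-severity-mnemonic).
TAG_RE = re.compile(r"%([A-Z0-9_]+)-(\d)-([A-Z0-9_]+)")
# Pool report, e.g. "Pool: Processor  Free: 561948  Cause: Memory fragmentation".
# The optional prefix distinguishes the "Alternate Pool:" line from the primary one.
POOL_RE = re.compile(r"(Alternate )?Pool:\s+(\w+)\s+Free:\s+(\d+)\s+Cause:\s+(.+)$")


def summarize(lines):
    """Return (tag counts, pool reports) found in an iterable of syslog lines."""
    counts = Counter()
    pools = []
    for line in lines:
        tag = TAG_RE.search(line)
        if tag:
            counts["-".join(tag.groups())] += 1
        pool = POOL_RE.search(line)
        if pool:
            kind = "alternate" if pool.group(1) else "primary"
            pools.append((kind, pool.group(2), int(pool.group(3)), pool.group(4).strip()))
    return counts, pools


# Four lines copied from the excerpt in this thread (hypothetical variable name).
SAMPLE = [
    '10-19-2012      08:24:24      Local7.Critical      17801: *Oct 19 08:43:49: %SYS-2-CHUNKEXPANDFAIL: Could not expand chunk pool for ipnat node. No memory available',
    '10-19-2012      08:24:22      Local7.Critical      17797: Alternate Pool: None  Free: 0  Cause: No Alternate pool',
    '10-19-2012      08:24:22      Local7.Critical      17796: Pool: Processor  Free: 561948  Cause: Memory fragmentation',
    '10-19-2012      08:24:22      Local7.Critical      17795: *Oct 19 08:43:47: %SYS-2-MALLOCFAIL: Memory allocation of 65536 bytes failed from 0x6020E594, alignment 8',
]

counts, pools = summarize(SAMPLE)
for tag, n in sorted(counts.items()):
    print(tag, n)
for kind, name, free, cause in pools:
    print(kind, name, free, cause)
```

Note that in the excerpt the Processor pool still reports 561948 bytes free while a 65536-byte allocation fails with the cause "Memory fragmentation", so tracking the pool lines alongside the MALLOCFAIL count may say more than free memory alone.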
jpwallen (Author) commented:
I am going to close the question. It seems like the problem started forever ago.
Question has a verified solution.
