Cisco DMVPN - Spoke ISAKMP SA not deleted after reload of remote hub

Hi,

I am currently setting up a dual-hub, single-cloud DMVPN topology. Things go well until I reload one of my hub routers. When the hub reboots it loses its SPIs. However, the spokes don't recognise this for around an hour, and during this time I just get invalid SPI errors as the spoke tries to use the old SPI. A "show crypto isakmp sa" on the spoke after the hub reload shows the following:

Connections to the hubs are apparently still active:

IPv4 Crypto ISAKMP SA
dst             src             state          conn-id slot status
**hub1**    **spoke**   QM_IDLE           2037    0 ACTIVE
**hub2**    **spoke**   QM_IDLE           2036    0 ACTIVE

The same show command on the hub after the reload:

Connections to the spokes are gone, leaving only the hub-to-hub connection active.

IPv4 Crypto ISAKMP SA
dst             src             state          conn-id slot status
**hub1**    **hub2**    QM_IDLE           4001    0 ACTIVE

The following invalid SPI error is received:

*Mar 25 15:55:54.559: %CRYPTO-4-RECVD_PKT_INV_SPI: decaps: rec'd IPSEC packet has invalid spi for destaddr=**hub1**, prot=50, spi=0xD8A2CF8C(3634548620), srcaddr=**spoke**

If I shut down the tunnel interfaces on the spoke routers and bring them up again, this refreshes the SPI, and a "show crypto isakmp sa" on the hub router shows the following (back to normal):

IPv4 Crypto ISAKMP SA
dst             src             state          conn-id slot status
**hub1**    **spoke1**  QM_IDLE           4003    0 ACTIVE
**hub1**    **spoke2**  QM_IDLE           4002    0 ACTIVE
**hub2**      **hub1**    QM_IDLE           4001    0 ACTIVE


I could set "set security-association lifetime seconds" under the IPsec profile to something lower than 86400, but I don't want too much processing on the spokes.

Is there a way for the spoke to detect when the hub has been down for a period and delete its keys? Also, why does the hub-to-hub tunnel not have this problem?

Thanks in advance,

Mike.
IT_Dept Asked:

Vito_Corleone Commented:
I've seen this issue quite a bit and I've never been able to fix it. It doesn't seem to affect all IOS versions; the newer ones don't appear to have the issue.
IT_Dept (Author) Commented:
In the end I configured dead peer detection (DPD) on the spokes with "crypto isakmp keepalive 15 2" and all is working well, although I have found that this global keepalive method does not seem to work if you also have ISAKMP profiles configured on your spokes.
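
For reference, a minimal sketch of the global DPD setting described above, entered on each spoke. The first value is the keepalive interval in seconds and the second is the retry interval in seconds once a keepalive goes unanswered. The commented-out "periodic" variant is an assumption that does not appear in this thread; without that keyword IOS defaults to on-demand DPD.

! Spoke router, global configuration mode
crypto isakmp keepalive 15 2
! Optional variant (assumption, not from this thread): send DPD at fixed intervals
! crypto isakmp keepalive 15 2 periodic
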
ampdog Commented:
I seem to have this same problem!

When I have a service failure and restoration between hub and spokes, the tunnels keep trying to reconnect, but without success because of the invalid SPI error.

I have configured "crypto isakmp invalid-spi-recovery" on the hub and spokes, but it didn't seem to have any effect. “clear crypto sa” and “clear crypto isakmp” didn't do it either.

Doing a shut / no shut on the tunnel interface of the spokes clears it nicely, but some of my sites are unattended and single-threaded, so if my tunnel is down I will have to have a tech physically go out and console in to do this. Is there a way to have this done automatically?

If I change my "set security-association lifetime seconds 86400" to 900 seconds, will that make sure that the SPI is reset in 15 minutes instead of 24 hours, limiting an outage to 15 minutes at most?

I haven't tried "crypto isakmp keepalive 15 2" because I have ISAKMP profiles configured on the spokes and it was mentioned that it won't work that way.

Is it possible to run a TCL script on the spoke/hub routers that will check for the "%CRYPTO-4-RECVD_PKT_INV_SPI:" error and then shut/no shut the tunnel interface?

Thanks in advance!!!
IT_Dept (Author) Commented:
Hi ampdog,

I've attached a slide from a presentation by the Cisco TAC guys at Cisco Networkers 2009, which shows that the keepalive method is recommended. At the time of writing my previous post I was unaware that you could configure the keepalive per profile, and my issue was that the global command was having no effect (whether that was a bug or not, I don't know). The correct way is to configure it under each profile.
[Attached slide: DMVPN Best Practice]
(Also note that keepalives should only be configured on the spokes; there is no need to have them on the hub.)
If you are using ISAKMP profiles, then instead of the global keepalive command you need to configure the keepalive under the profile like this (this example uses PKI for authentication and is taken from a live router):

crypto isakmp profile DMVPN_ISAKMP_PROFILE
   ca trust-point CA2-SASUBCA
   ca trust-point CA2-SAROOTCA
   match certificate DMVPN_CERT_MAP
   keepalive 15 retry 2
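
A minimal sketch of how such a profile would typically be tied to the tunnel, so that the per-profile keepalive applies to the DMVPN sessions. The IPsec profile name, transform-set name, and tunnel number here are hypothetical placeholders, not taken from the live router, and the transform set is assumed to be defined elsewhere:

! Hypothetical names: DMVPN_IPSEC_PROFILE, DMVPN_TS, Tunnel0
crypto ipsec profile DMVPN_IPSEC_PROFILE
   set transform-set DMVPN_TS
   set isakmp-profile DMVPN_ISAKMP_PROFILE
!
interface Tunnel0
   tunnel protection ipsec profile DMVPN_IPSEC_PROFILE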

As for the security-association lifetime: lowering it would shorten the recovery period but also increase CPU consumption, as the router would have to re-key every 15 minutes (including a fresh Diffie-Hellman exchange if PFS is enabled). That is probably not a problem with one tunnel if you aren't running anything major on your router, but with multiple spoke-to-spoke tunnels active at once you could get a very busy CPU. Best to leave it to re-key every 24 hours, as branch routers aren't always good at "busy".
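
For comparison, a sketch of where the shorter lifetime would be set if you did decide to trade CPU for faster recovery. The profile name is a hypothetical placeholder, and 900 is simply the 15-minute value from the question above:

! Hypothetical profile name; 900 seconds = 15 minutes
crypto ipsec profile DMVPN_IPSEC_PROFILE
   set security-association lifetime seconds 900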

As for the TCL script: if you need to do that, you can use an EEM applet instead of going into full TCL. Below is an example that shuts/no-shuts the interface automatically and fires a syslog message when the INV_SPI error is logged. It should give you a basic idea of what you need to do to get something like this running.

! Reset Tunnel0 whenever the invalid-SPI syslog message is logged
event manager applet TEST
 event syslog pattern "%CRYPTO-4-RECVD_PKT_INV_SPI"
 action 1.0 cli command "enable"
 action 2.0 cli command "configure terminal"
 action 3.0 cli command "interface Tunnel0"
 action 4.0 cli command "shutdown"
 action 5.0 cli command "no shutdown"
 action 6.0 cli command "end"
 action 7.0 syslog msg "INV_SPI Error caught and tun0 reset"

Best of luck,

Mike.

Also,
If you want a copy of the networkers presentation on DMVPN in full then drop me a mail and I will get it to you.    it <antispam> at <antispam> coreassets.com
(replace the "at" with "@" and the "<antispam>" with ".")
IT_Dept (Author) Commented:
*Edit: don't replace <antispam> with anything, just remove it.