Solved

Cisco ASA failover - Primary unit reboots on activation

Posted on 2009-05-16
Last Modified: 2013-11-16
Hi guys...

We have an Active/Standby ASA 5540 failover cluster. A few months back, the secondary unit became active on its own while the primary went into 'Standby Ready' state. Now, whenever we try to make the primary active, either by entering "failover active" on the primary or "no failover active" on the secondary, the primary immediately reboots. The log on the secondary unit shows the following messages:

May 13 2009 09:24:27: %ASA-1-104002: (Secondary) Switching to STNDBY - Other unit want me Standby
May 13 2009 09:24:27: %ASA-6-210022: LU missed 8485 updates
May 13 2009 09:24:32: %ASA-1-105003: (Secondary) Monitoring on interface outside waiting
May 13 2009 09:24:32: %ASA-1-105003: (Secondary) Monitoring on interface inside waiting
May 13 2009 09:24:32: %ASA-1-105003: (Secondary) Monitoring on interface DMZ-NHS waiting
May 13 2009 09:24:42: %ASA-1-105008: (Secondary) Testing Interface outside
May 13 2009 09:24:42: %ASA-1-105008: (Secondary) Testing Interface inside
May 13 2009 09:24:42: %ASA-1-105008: (Secondary) Testing Interface DMZ-NHS
May 13 2009 09:24:42: %ASA-1-105009: (Secondary) Testing on interface outside Passed
May 13 2009 09:24:42: %ASA-1-105009: (Secondary) Testing on interface DMZ-NHS Passed
May 13 2009 09:24:42: %ASA-1-105009: (Secondary) Testing on interface inside Passed
May 13 2009 09:24:45: %ASA-1-103001: (Secondary) No response from other firewall (reason code = 1).
May 13 2009 09:24:45: %ASA-1-104001: (Secondary) Switching to ACTIVE - HELLO not heard from mate.
May 13 2009 09:28:55: %ASA-1-709003: (Secondary) Beginning configuration replication: Send to mate.
May 13 2009 09:29:07: %ASA-1-709004: (Secondary) End Configuration Replication (ACT)


GigabitEthernet0/3 is used for LAN failover on both firewalls. The failover interfaces are connected through a switch. For troubleshooting, we also connected the two interfaces directly with a crossover cable, but that didn't help and we hit the same issue again. The LAN failover configuration on the two units is as follows:

Primary Unit:

failover
failover lan unit primary
failover lan interface Statefull-Failover GigabitEthernet0/3
failover key *****
failover replication http
failover link Statefull-Failover GigabitEthernet0/3
failover interface ip Statefull-Failover 10.200.200.1 255.255.255.252 standby 10.200.200.2


Secondary Unit:

failover
failover lan unit secondary
failover lan interface Statefull-Failover GigabitEthernet0/3
failover key *****
failover replication http
failover link Statefull-Failover GigabitEthernet0/3
failover interface ip Statefull-Failover 10.200.200.1 255.255.255.252 standby 10.200.200.2


The following is the "show failover" output on the currently active unit:

Failover On
Failover unit Secondary
Failover LAN Interface: Statefull-Failover GigabitEthernet0/3 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 3 of 250 maximum
failover replication http
Version: Ours 7.2(1), Mate 7.2(1)
Last Failover at: 09:24:45 KSA May 13 2009
        This host: Secondary - Active
                Active time: 10195375 (sec)
                slot 0: ASA5540 hw/sw rev (1.0/7.2(1)) status (Up Sys)
                  Interface outside (X.X.X.1): Normal
                  Interface inside (Y.Y.Y.1): Normal
                  Interface DMZ-NHS (Z.Z.Z.1): Normal
                  Interface management (0.0.0.0): Link Down (Not-Monitored)
                slot 1: ASA-SSM-20 hw/sw rev (1.0/6.0(3)E1) status (Up/Up)
                  IPS, 6.0(3)E1, Up
        Other host: Primary - Standby Ready
                Active time: 0 (sec)
                slot 0: ASA5540 hw/sw rev (1.0/7.2(1)) status (Up Sys)
                  Interface outside (X.X.X.2): Normal
                  Interface inside (Y.Y.Y.2): Normal
                  Interface DMZ-NHS (Z.Z.Z.2): Normal
                  Interface management (0.0.0.0): Normal (Not-Monitored)
                slot 1: ASA-SSM-20 hw/sw rev (1.0/6.0(3)E1) status (Up/Up)
                  IPS, 6.0(3)E1, Up

Stateful Failover Logical Update Statistics
        Link : Statefull-Failover GigabitEthernet0/3 (up)
        Stateful Obj    xmit       xerr       rcv        rerr
        General         3509761538 0          3760724    2
        sys cmd         1360203    0          1360196    0
        up time         0          0          0          0
        RPC services    0          0          0          0
        TCP conn        3160458482 0          1931284    0
        UDP conn        284734101  0          423080     0
        ARP tbl         63178436   0          46150      2
        Xlate_Timeout   0          0          0          0
        VPN IKE upd     18605      0          6          0
        VPN IPSEC upd   11698      0          8          0
        VPN CTCP upd    18         0          0          0
        VPN SDI upd     0          0          0          0
        VPN DHCP upd    0          0          0          0

        Logical Update Queue Information
                        Cur     Max     Total
        Recv Q:         0       25      3774426
        Xmit Q:         0       7       3534578209


Any idea what's happening here, or what to look for?
Question by:ccsenet
LVL 10

Expert Comment

by:lanboyo
ID: 24403193
Well, I would open a TAC case ASAP. If you have an HTTP state inspection for failover, you might want to replace it with a generic TCP inspection; this helped "a guy on the internet".


And is the management interface unused?
LVL 19

Expert Comment

by:nodisco
ID: 24404977
I'd also agree with opening a TAC case. The failover output shows some receive errors on the stateful link. Do you see any errors in "show interface" for Gi0/3 on either unit?

If your management0/0 is not in use, realistically you should have it shut down, but it's unlikely to cause this problem. Do you have any logs from your primary unit that might indicate why it's rebooting?
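
The checks suggested above could be run roughly as follows on each unit. This is only a sketch: these are standard ASA show commands, but their output varies by software version, and whether the primary actually saved any crash data is an assumption.

```
! Error counters on the failover/stateful link (run on both units)
show interface GigabitEthernet0/3
! Failover state transitions and the reason recorded for each
show failover history
! Crash data saved in flash, if the reboot was caused by a crash
show crashinfo
```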
LVL 10

Accepted Solution

by:
lanboyo earned 500 total points
ID: 24408329
Sorry, I was unclear. You have:

failover replication http

Someone with similar symptoms was able to resolve it by replacing this with:

failover replication tcp

Which screams BUG to me, but you might be more interested in stability than code purity.
LVL 1

Author Comment

by:ccsenet
ID: 24409585

Lanboyo and nodisco

 Thanks for the prompt responses.

Is the management interface unused?
Yes, the management interface is not being used.

failover replication tcp? This isn't a command on the Cisco ASA. We can use either "failover replication http" or "no failover replication http". The only difference between the two is that with the latter, the HTTP connection table on the active unit is not transferred to the standby unit.

We found the following link on best practices for configuring the Cisco ASA:

http://www.checkthenetwork.com/networksecurity%20Cisco%20ASA%20Firewall%20Best%20Practices%20for%20Firewall%20Deployment%201.asp#_Toc218778849

The above link recommends disabling HTTP replication for performance reasons. Anyhow, we will disable it and then try the failover again. Let's see...



LVL 1

Author Comment

by:ccsenet
ID: 24409628

The following are the log entries on the primary unit (standby):

May 13 2009 16:10:46: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface outside
May 13 2009 16:10:46: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface inside
May 13 2009 16:10:46: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface DMZ-NHS
May 13 2009 16:10:46: %ASA-1-105008: (Primary) Testing Interface outside
May 13 2009 16:10:46: %ASA-1-105008: (Primary) Testing Interface inside
May 13 2009 16:10:46: %ASA-1-105008: (Primary) Testing Interface DMZ-NHS
May 13 2009 16:10:46: %ASA-1-105009: (Primary) Testing on interface outside Passed
May 13 2009 16:10:47: %ASA-1-105009: (Primary) Testing on interface inside Passed
May 13 2009 16:10:47: %ASA-1-105009: (Primary) Testing on interface DMZ-NHS Passed
May 16 2009 09:28:00: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface outside
May 16 2009 09:28:00: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface inside
May 16 2009 09:28:00: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface DMZ-NHS
May 16 2009 09:28:00: %ASA-1-105008: (Primary) Testing Interface outside
May 16 2009 09:28:00: %ASA-1-105008: (Primary) Testing Interface inside
May 16 2009 09:28:00: %ASA-1-105008: (Primary) Testing Interface DMZ-NHS
May 16 2009 09:28:00: %ASA-1-105009: (Primary) Testing on interface outside Passed
May 16 2009 09:28:01: %ASA-1-105009: (Primary) Testing on interface DMZ-NHS Passed
May 16 2009 09:28:01: %ASA-1-105009: (Primary) Testing on interface inside Passed
May 18 2009 07:52:13: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface outside
May 18 2009 07:52:13: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface inside
May 18 2009 07:52:13: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface DMZ-NHS
May 18 2009 07:52:13: %ASA-1-105008: (Primary) Testing Interface outside
May 18 2009 07:52:13: %ASA-1-105008: (Primary) Testing Interface inside
May 18 2009 07:52:13: %ASA-1-105008: (Primary) Testing Interface DMZ-NHS
May 18 2009 07:52:13: %ASA-1-105009: (Primary) Testing on interface outside Passed
May 18 2009 07:52:13: %ASA-1-105009: (Primary) Testing on interface inside Passed
May 18 2009 07:52:13: %ASA-1-105009: (Primary) Testing on interface DMZ-NHS Passed
May 18 2009 10:25:57: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface outside
May 18 2009 10:25:57: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface inside
May 18 2009 10:25:57: %ASA-1-105005: (Primary) Lost Failover communications with mate on interface DMZ-NHS
May 18 2009 10:25:57: %ASA-1-105008: (Primary) Testing Interface outside
May 18 2009 10:25:57: %ASA-1-105008: (Primary) Testing Interface inside
May 18 2009 10:25:57: %ASA-1-105008: (Primary) Testing Interface DMZ-NHS
May 18 2009 10:25:57: %ASA-1-105009: (Primary) Testing on interface outside Passed
May 18 2009 10:25:57: %ASA-1-105009: (Primary) Testing on interface inside Passed
May 18 2009 10:25:57: %ASA-1-105009: (Primary) Testing on interface DMZ-NHS Passed


Are the active and standby units supposed to communicate over interfaces other than GigabitEthernet0/3 (the LAN failover interface)?

Any clues??
LVL 1

Author Comment

by:ccsenet
ID: 24411072

Cool... The problem is solved.

"no failover replication http" did it for us. The primary unit is now active. It seems to have been a problem with the poll and hold timers: the short timers probably do not leave enough time for HTTP replication during failover. We believe that adjusting the timers might also have helped.
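
For anyone wanting to try the same change, here is a minimal sketch of the sequence described above. It assumes the change is entered on the currently active unit (the secondary, in this case) so that it replicates to the standby; the save step is an assumption and is not mentioned in the thread.

```
! On the active unit (the secondary here)
configure terminal
no failover replication http
end
! Assumption: save the change so it survives a reload
write memory
! Then, on the primary, take over the active role
failover active
```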

For the second problem posted above ("Lost Failover communications with mate on interface outside/inside/DMZ-NHS"), we found a good explanation in "Configuring Failover via Cisco ASDM" by Bob Eckhoff. It can be downloaded from:

https://cisco.hosted.jivesoftware.com/servlet/JiveServlet/download/3390-1-2874/Configuring%20Failover%20via%20ASDM_Posted_10-30-08.pdf;jsessionid=C38AF4535FE4F4FA666CDB19A6EEDDAA

Again, adjusting the poll and hold timers for the monitored interfaces seems a possible solution there.
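
If you would rather experiment with the timers, the sketch below shows the form the commands take. The values are purely illustrative assumptions, not tested recommendations; for comparison, the "show failover" output earlier in the thread reports 1/15 seconds for the unit and 5/25 seconds for the interfaces.

```
! Unit hellos over the failover link: poll every 3 s, hold 30 s (illustrative values)
failover polltime unit 3 holdtime 30
! Hellos on monitored data interfaces: poll every 10 s, hold 50 s (illustrative values)
failover polltime interface 10 holdtime 50
```

Longer hold times also make the pair slower to detect a genuine failure, so any change here is a trade-off.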


LVL 1

Author Closing Comment

by:ccsenet
ID: 31582199
"failover replication tcp" didn't work, since no such command is supported on the Cisco ASA 5540. "no failover replication tcp" solved the problem.
LVL 1

Author Comment

by:ccsenet
ID: 24418593
I made a mistake in the closing comment on lanboyo's solution. The correct statement is: "no failover replication http" solved the problem.