dcs-user
asked on
Juniper Netscreen SSG550
I have 2 firewalls in NSRP HA active/passive mode.
When FW-A is active, everything works, but when FW-A becomes passive and FW-B becomes active, I lose 4 networks that are all on the same module.
The module has 4 Ethernet ports. I replaced it with a new module, same problem.
At least once a day I have to switch back to FW-A. I don't know why it keeps switching from FW-A to FW-B, and B is not working right. I have reloaded the firmware on FW-B, and it only works for approx 45 minutes before I lose all 4 networks on that module. Any ideas or help would be appreciated.
ASKER
The two units are identical, hardware and software. It used to work with no issue; it has only started acting up lately. All the interfaces are monitored. When it fails over, the log only shows FW-A becoming primary backup and FW-B becoming master. I can't find what triggers the failover, but it only seems to fail over from A to B, not from B to A.
Please give me more details on how to find and compare track-ip settings.
A "get nsrp", as above, from both nodes will give a good indication of what is configured on the cluster and from there we can then look a bit deeper.
ASKER
Thanks for the info here.
The nsrp info tells me that:
* Both units have same priority, so if a failover occurs from A to B, when A recovers, it stays as the backup firewall
* Interfaces being monitored are:
ethernet0/0
ethernet0/1
ethernet6/0
ethernet6/
ethernet6/3
ethernet0/2
ethernet6/2
ethernet2/0
* track-ip is disabled.
* RTO sync is on
From this it would seem that an interface is failing on A, which causes the initial failover from A to B.
B seems to be in some way out of sync with A, in that when it's the master, the networks do not all come up. There may be some config out of sync between the 2 members.
I would try the following:
1. Run on the backup (FW-B):
exec nsrp sync global-config check-sum
This will compare the config on B to that on A and tell you if it matches.
If it does not match, run this on FW-B:
exec nsrp sync global-config save
Then reset FW-B, but do not save the config when prompted.
If this does not correct the issue, it may be prudent to rebuild the cluster.
Given that A seems to be fine in all of this, we can use this as a base.
1. Take a copy of the config from A, either copy off via TFTP or use the web UI to save a copy.
2. Open the file in notepad, look for device specific entries.
The main ones include:
* hostname
* NSRP priority (if needed)
* NSRP preempt (if needed)
* manage-ip settings
* physical interface info (e.g. speed, duplex)
Edit these to reflect firewall B and save as a new config.
Apply this config to B and reset.
This ensures that both firewalls are completely in sync with each other and should correct the failover issues.
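Before applying the edited config to B, it can help to check that the two files differ only in the device-specific entries. Below is a minimal Python sketch of that check, run offline against the saved config files. The filtered prefixes and the sample config lines are illustrative assumptions, not output from these firewalls; adjust them to match what your own configs contain (per-port speed/duplex lines, for example, are not filtered here).

```python
import difflib

# Assumed prefixes/keywords for device-specific lines (hypothetical list;
# extend it to cover whatever legitimately differs between your two units).
DEVICE_SPECIFIC_PREFIXES = (
    "set hostname",
    "set nsrp vsd-group id 0 priority",
    "set nsrp vsd-group id 0 preempt",
)
DEVICE_SPECIFIC_KEYWORDS = ("manage-ip",)


def shared_lines(config_text):
    """Return the config lines that should be identical on both members."""
    kept = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(DEVICE_SPECIFIC_PREFIXES):
            continue  # expected to differ per device
        if any(word in line for word in DEVICE_SPECIFIC_KEYWORDS):
            continue  # expected to differ per device
        kept.append(line)
    return kept


def config_diff(config_a, config_b):
    """Unified diff (list of lines) of the shared parts of two configs."""
    return list(difflib.unified_diff(
        shared_lines(config_a), shared_lines(config_b),
        fromfile="FW-A", tofile="FW-B", lineterm="",
    ))


if __name__ == "__main__":
    # Illustrative sample text only; in practice read the two saved files.
    sample_a = (
        "set hostname FW-A\n"
        "set interface ethernet6/0 zone Trust\n"
        "set interface ethernet6/0 manage-ip 10.0.0.2\n"
    )
    sample_b = (
        "set hostname FW-B\n"
        "set interface ethernet6/0 manage-ip 10.0.0.3\n"
    )
    for line in config_diff(sample_a, sample_b):
        print(line)
```

An empty result means the two configs agree everywhere outside the filtered entries; any `+` or `-` line is a setting that exists on only one unit and is worth a closer look.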
If you want FW-A to be the master whenever it is capable of doing so, we can set its NSRP priority to a lower number than B's (the lower number is preferred) and enable preempt on it.
One point to note, though: if you have a busy firewall with lots of sessions to sync, consider using a timer delay before A takes over mastership, to give the firewalls time to sync connections. Something like 60-90 secs is normally fine.
ASKER
When I run the command below on FW-B:
exec nsrp sync global-config check-sum
it does not do anything. Any ideas?
Thanks
ASKER
When I run exec nsrp sync global-config and hit enter, it does nothing.
It only gives me the command prompt again. I believe the syntax is correct.
The version I am running is 6.1.0r4.0.
It is weird: a few days ago I reloaded the same version of ScreenOS on FW-B, and the
networks on that module started to work, but only for approx 45 minutes.
Everything started when I upgraded to the new firmware and later had to downgrade back to 6.1.0r4.0. After that, FW-A has been working with no issue, but FW-B has not.
Thanks for the help.
Are both units running the same version of ScreenOS too?
What may help is to get the following from each firewall:
get nsrp
This will show us a bit more on the config of the cluster and also what is being monitored.
Also, when it fails over, what log entries do you have for both units? I.e. when FW-A fails, are there any indications as to what caused the failover? And for FW-B, what does it say when the unit takes over and then subsequently fails?
I would also compare any track-ip settings you may have on the nodes, as these settings are NOT synced across the cluster; they are device-dependent.
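One way to compare the track-ip settings is to save the config from each node and pull out every line that mentions track-ip. The Python sketch below does that offline. It assumes only that the relevant lines contain the string "track-ip"; the sample lines are illustrative placeholders, not settings taken from these firewalls.

```python
def track_ip_lines(config_text):
    """Collect the distinct config lines that mention track-ip."""
    return {
        line.strip()
        for line in config_text.splitlines()
        if "track-ip" in line
    }


def compare_track_ip(config_a, config_b):
    """Return (lines only on A, lines only on B), each sorted."""
    a = track_ip_lines(config_a)
    b = track_ip_lines(config_b)
    return sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    # Illustrative sample text only; in practice read the two saved files.
    sample_a = (
        "set hostname FW-A\n"
        "set nsrp monitor track-ip ip 10.0.0.1\n"
    )
    sample_b = "set hostname FW-B\n"
    only_a, only_b = compare_track_ip(sample_a, sample_b)
    print("only on FW-A:", only_a)
    print("only on FW-B:", only_b)
```

If both lists come back empty, the two nodes carry the same track-ip lines (or none at all), and track-ip can be ruled out as the difference between them.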