dcs-user


Juniper Netscreen SSG550

I have two firewalls in NSRP HA active/passive mode.
When FW-A is active, everything works, but when FW-A becomes passive and FW-B becomes active, I lose four networks that are on the same module.
The module has four Ethernet ports. I replaced it with a new module and the problem is the same.
At least once a day I have to switch back to FW-A. I don't know why it keeps switching from FW-A to FW-B, and B is not working right. I have reloaded the firmware on FW-B, and it only works for approximately 45 minutes before I lose all four networks on that module. Any ideas or help would be appreciated.
deimark

Are both units exactly the same hardware, i.e. the same PIMs etc.?

Are both units running the same version of ScreenOS too?

What may help is get the following from each firewall:

get nsrp

This will show us a bit more on the config of the cluster and also what is being monitored.

Also, when it fails over, what log entries do you have on both units? I.e. when FW-A fails, are there any indications as to what caused the failover? And for FW-B, what does it say when the unit takes over and then subsequently fails?
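If the event logs are long, a small script can help pull out the entries around a failover. This is only a rough sketch in Python: it assumes you have saved each unit's event log to a text file, and the keyword list is my own guess at what is worth searching for, not something taken from your logs.

```python
import re

# Assumed keywords, not taken from any real log; adjust them to what your
# event log actually prints around a failover.
KEYWORDS = ("nsrp", "link", "down", "master", "backup")


def failover_related(log_lines, keywords=KEYWORDS):
    """Return the log lines that mention any keyword (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]
```

For example, `failover_related(open("fw-a-log.txt"))` (a hypothetical file name) returns only the matching lines from FW-A's saved log, which can then be lined up by timestamp against the same output for FW-B.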

I would also compare any track-ip settings you may have on the nodes, as these settings are NOT synced across the cluster; they are device dependent.
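One way to compare the track-ip settings is to save the config from each node to a text file and compare only the lines that mention track-ip. A minimal Python sketch, assuming the saved configs are plain text with one command per line; the function names are made up for illustration:

```python
def track_ip_lines(config_text):
    """Collect config lines that mention track-ip, with whitespace normalised."""
    return {
        " ".join(line.split())
        for line in config_text.splitlines()
        if "track-ip" in line
    }


def compare_track_ip(config_a, config_b):
    """Return (lines only on A, lines only on B), each sorted."""
    a = track_ip_lines(config_a)
    b = track_ip_lines(config_b)
    return sorted(a - b), sorted(b - a)
```

If both returned lists are empty, the two nodes have the same track-ip lines. Anything listed is present on one node only and is worth a closer look.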
dcs-user

ASKER

The two units are identical in hardware and software. It used to work with no issues; it has only been acting up lately. All the interfaces are monitored. When it fails over, the log only shows FW-A becoming primary backup and FW-B becoming master. I could not find what triggers the failover, but it only seems to fail over from A to B, not from B to A.

Please give me more details on how to find and compare track-ip settings.
A "get nsrp", as above, from both nodes will give a good indication of what is configured on the cluster and from there we can then look a bit deeper.
Please see attached.
FW-A is 251 and FW-B is 252.
get-nsrp-251.TXT
get-nsrp-252.TXT
Thanks for the info here.

The nsrp info tells me that:
*  Both units have same priority, so if a failover occurs from A to B, when A recovers, it stays as the backup firewall
*  Interfaces being monitored are:
ethernet0/0
ethernet0/1
ethernet6/0
ethernet6/
ethernet6/3
ethernet0/2
ethernet6/2
ethernet2/0
*  track-ip is disabled.
*  RTO sync is on

From this it would seem that an interface is failing on A to cause the initial failover from A to B.

B seems to be out of sync with A in some way, in that when it is the master, not all of the networks come up. There may be some config out of sync between the two members.

I would try the following:

1.  Run on the backup (FW-B):
exec nsrp sync global-config check-sum

This will compare the config on B to that on A and tell you if it matches.

If it does not match, run this on FW-B:

exec nsrp sync global-config save

Then reset FW-B, but do not save the config.

If this does not correct the issue, it may be prudent to rebuild the cluster.

Given that A seems to be fine in all of this, we can use this as a base.

1.  Take a copy of the config from A, either copy off via TFTP or use the web UI to save a copy.

2.  Open the file in Notepad and look for device-specific entries.

The main ones include:
*  hostname
*  NSRP priority (if needed)
*  NSRP Pre empt (if needed)
*  manage-ip settings
*  physical interface info (ie speed, duplex etc)

Edit these to reflect firewall B and save as a new config.
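Before rebuilding, it can also help to see exactly where the two configs differ once the device-specific entries are ignored. Below is a rough Python sketch that does this on two saved config files. The regular expressions are my assumptions about how those entries typically appear in a ScreenOS config, so check them against your own file; the function names are made up for illustration.

```python
import difflib
import re

# Assumed patterns for the device-specific entries listed above
# (hostname, NSRP priority/preempt, manage-ip, physical interface settings).
# Verify these against your own config before trusting the result.
DEVICE_SPECIFIC = [
    re.compile(r"^set hostname\b"),
    re.compile(r"^set nsrp vsd-group id \d+ (priority|preempt)\b"),
    re.compile(r"\bmanage-ip\b"),
    re.compile(r"^set interface \S+ phy\b"),
]


def shared_lines(config_text):
    """Return config lines, whitespace-normalised, minus device-specific ones."""
    lines = []
    for raw in config_text.splitlines():
        line = " ".join(raw.split())
        if line and not any(p.search(line) for p in DEVICE_SPECIFIC):
            lines.append(line)
    return lines


def config_drift(config_a, config_b):
    """Unified diff of the two configs, ignoring device-specific lines."""
    return list(
        difflib.unified_diff(
            shared_lines(config_a),
            shared_lines(config_b),
            fromfile="FW-A",
            tofile="FW-B",
            lineterm="",
        )
    )
```

An empty list means the two configs match apart from the device-specific lines. Lines starting with `-` exist only on FW-A and lines starting with `+` only on FW-B.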

Apply this config to B and reset.

This ensures that both firewalls are completely in sync with each other and should correct the failover issues.

If you want FW-A to be the master whenever it is capable of doing so, we can set its NSRP priority to a lower value than B's (in NSRP the lower number wins) and enable preempt on A.

A point to note though: if you have a busy firewall with lots of sessions to sync, consider using a timer delay before the unit takes over mastership, to give the firewalls time to sync connections. Something like 60-90 seconds is normally fine.
When I run the command below on FW-B
exec nsrp sync global-config check-sum
it does not do anything. Any ideas?


Thanks
ASKER CERTIFIED SOLUTION
deimark
When I run exec nsrp sync global-config and hit Enter, it does nothing.
It only gives me the command prompt again. I believe the syntax is correct.
The version I am running is 6.1.0r4.0.
It is weird: a few days ago I reloaded the same version of ScreenOS on FW-B and the
networks on that module started to work, but only for approximately 45 minutes.
Everything started when I upgraded to the new firmware and later had to downgrade back to 6.1.0r4.0. After that, FW-A works with no issues but FW-B does not.
Thanks for the help.