• Status: Solved

Cluster manager redundancy

Hello,
My company has two servers running Windows 2003 and SQL 2005. These boxes are hosted at a local hosting company. Currently, the servers are set up as a cluster. Each box has three NICs.
The first card is the front end (public IP), the second is the back end (private IP), and the third is the heartbeat. We have tested this redundancy many times by powering down the first box, and like clockwork it switches to the second box.
We had a situation yesterday that made the redundancy fail. Our front-end card became disabled (for lack of a better word) and the box was taken offline. The redundancy did not work because the heartbeat was still working?
So my question is: is there something wrong with the way our hosting company set up Cluster Manager on these servers? Is there a better way?
Thanks for your time
Asked by: hexvader
2 Solutions
 
oBdA Commented:
What exactly did you mean by "[the] card became disabled"? If you have an IP resource in the SQL group, and this IP address fails because it can't bind to the (disabled) NIC anymore, then you should have had a failover.
 
hexvader (Author) Commented:

What happened is this: one of our IT guys remote-controlled into the server. He needed to transfer a large database file from the server down to our network. Normally we would create a VPN session to our hosted facility, but we have been having problems transferring large files for the past few days. So he decided to create a VPN session from the server to our network and pull it down that way. As soon as he launched the VPN, he lost contact via remote control and the server was in limbo. I believe that once he created the VPN connection, the server used the default gateway / DNS settings of our internal network and just went offline. During this time frame (about 8 minutes) the heartbeat was still showing the server as up when in fact it was down. We got in contact with the hosted facility and had them reboot the box. During the reboot, the redundancy kicked in and our site and data came back.
I am quite positive we won't repeat this process again, but the question remains: was the box technically down, and should the rollover have worked? What would happen if the NIC died? The heartbeat would still be working; would the redundancy kick in?
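
If the guess about the default gateway is right, the effect can be sketched in a few lines of Python. This is only an illustration of generic route selection (longest prefix first, then lowest metric), not a dump of the real server's routing table. The addresses, interface names, and metrics below are all made up for the example.

```python
import ipaddress
from typing import NamedTuple


class Route(NamedTuple):
    network: ipaddress.IPv4Network
    interface: str   # hypothetical interface label
    metric: int      # hypothetical metric; lower is preferred


def pick_route(routes, destination):
    """Pick the route for a destination: longest prefix wins, then lowest metric."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in r.network]
    return min(candidates, key=lambda r: (-r.network.prefixlen, r.metric))


# Before the VPN: default route goes out the public NIC,
# and the heartbeat subnet has its own directly connected route.
before = [
    Route(ipaddress.ip_network("0.0.0.0/0"), "public NIC", 20),
    Route(ipaddress.ip_network("10.0.0.0/24"), "heartbeat NIC", 10),
]

# After the VPN connects: assume it installs a default route with a lower metric.
after = before + [Route(ipaddress.ip_network("0.0.0.0/0"), "VPN", 1)]

client = "203.0.113.7"        # some outside client (documentation address)
other_node = "10.0.0.2"       # the other cluster node on the heartbeat subnet

client_before = pick_route(before, client).interface
client_after = pick_route(after, client).interface
heartbeat_after = pick_route(after, other_node).interface
```

In this sketch, replies to outside clients leave through the VPN once it is connected, while heartbeat traffic still uses the heartbeat NIC. That would match what we saw: the site unreachable, the heartbeat still reporting the node as up.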
 
Ted Bouskill (Senior Software Developer) Commented:
Ah, how far to take redundancy can get very complex. For example: does your cluster have one network switch, or are the switches redundant? Do you have redundant firewalls in front of the server? The list can go on and on.

Would the risk of the switch dying be higher than that of the NIC?

I'm wondering if your cluster dependencies are misconfigured. I'm pretty sure that if one of the NICs on my cluster were disabled, it would fail over automatically.
 
oBdA Commented:
As I said: as long as the IP address is bound correctly to the clustered NIC, it's not considered a failure. If the NIC dies completely, the TCP/IP stack of this NIC will go down, so will the IP address resource, and finally the group will fail over.
In your case, the NIC didn't fail completely, so there was no reason for the cluster to fail the resource over.
MSCS is more complex than just listening for the heartbeat of the other node.
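
To make that chain of reasoning concrete, here is a minimal Python sketch of the decision as described above. It is a toy model only, not the MSCS API. The `Resource` class, the resource names, and the dependency chain (SQL Server depends on Network Name, which depends on IP Address) are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """A toy cluster resource; not the real MSCS object model."""
    name: str
    online: bool = True
    depends_on: list[Resource] = field(default_factory=list)

    def is_healthy(self) -> bool:
        # Healthy only if this resource and everything it depends on is online.
        return self.online and all(dep.is_healthy() for dep in self.depends_on)


def should_fail_over(group: list[Resource]) -> bool:
    """The group fails over if any resource in it is unhealthy."""
    return any(not res.is_healthy() for res in group)


def build_group() -> tuple[Resource, list[Resource]]:
    # Assumed dependency chain: SQL Server -> Network Name -> IP Address.
    ip = Resource("IP Address")
    network_name = Resource("Network Name", depends_on=[ip])
    sql = Resource("SQL Server", depends_on=[network_name])
    return ip, [ip, network_name, sql]


# Case 1: the NIC dies, so the IP address can no longer stay bound
# and the IP Address resource goes offline.
ip, group = build_group()
ip.online = False
nic_dead = should_fail_over(group)

# Case 2: only the routing changed (the VPN case). The IP address is still
# bound to the NIC, so in this model nothing is marked offline.
ip, group = build_group()
routing_broken = should_fail_over(group)
```

Under these assumptions `nic_dead` is `True` and `routing_broken` is `False`: the model only reacts to a resource going offline, not to clients being unable to reach the node.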
 
Ted Bouskill (Senior Software Developer) Commented:
Hmm, the virtual IP should belong to the same subnet as the static IPs for the front-end NIC. If one front-end NIC had its subnet changed, I would think that would trigger a failover condition.
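
The subnet relationship itself is easy to check. A small Python sketch using the standard `ipaddress` module is below. The addresses are made-up documentation addresses, and the check says nothing about whether the cluster service would actually treat a mismatch as a failure.

```python
import ipaddress


def same_subnet(virtual_ip: str, nic_interface: str) -> bool:
    """Return True if virtual_ip lies in the subnet of the NIC's static address.

    nic_interface is the NIC's address with prefix length, e.g. "192.0.2.10/24".
    """
    nic = ipaddress.ip_interface(nic_interface)
    return ipaddress.ip_address(virtual_ip) in nic.network


# Hypothetical values: cluster virtual IP and the front-end NIC's static IP.
matching = same_subnet("192.0.2.50", "192.0.2.10/24")
mismatched = same_subnet("192.0.2.50", "198.51.100.10/24")
```

Here `matching` is `True` and `mismatched` is `False`.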
 
hexvader (Author) Commented:
Thanks for your help, guys.
Question has a verified solution.