We had an issue with our Active/Passive SQL Server 2005 cluster and had to reboot the nodes. The issue was related to our disks on the SAN. When both nodes came back online and the disks were showing as online on Node 1, we were able to start the cluster service and bring the resources online on Node 1, but Node 2 failed to restart the service. We thought of evicting the node and trying to rejoin it. It would let us do that, and it keeps giving this error message even though the service is not started.
00000ba0.00000b78::2011/07/12-20:29:11.360 INFO [CS] Cluster Service started - Cluster Node Version 4.3790
00000ba0.00000b78::2011/07/12-20:29:11.360 INFO OS Version 5.2.3790 - Service Pack 2 (ADS 03000112L)
00000ba0.00000b78::2011/07/12-20:29:11.360 INFO Local Time is 2011/07/12-15:29:11.360
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [CS] Service Starting...
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [INIT] ClusterInitialize called to start cluster.
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [EP] Initialization...
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] Initialization
00000ba0.00000db0::2011/07/12-20:29:11.360 ERR [DM] DmInitialize: The hive was loaded- rollback, unload and reload again
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] DmpRestartFlusher: Entry
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] DmpUnloadHive: unloading the hive
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [Qfs] QfsSetFileAttributes C:\WINDOWS\Cluster\CLUSDB.BKP$ 80, status 2
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [Qfs] QfsDeleteFile C:\WINDOWS\Cluster\CLUSDB.BKP$, status 2
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] Loading cluster database from C:\WINDOWS\Cluster\CLUSDB
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] DmpStartFlusher: Entry
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] DmpStartFlusher: thread created
00000ba0.00000db0::2011/07/12-20:29:11.360 ERR [DM] Failed to open key Resources, status 2
00000ba0.00000db0::2011/07/12-20:29:11.376 ERR Cluster service suffered an unexpected fatal error at line 1386 of source module d:\nt\base\cluster\service\dm\dminit.c. The error code was 2.
Does anyone have any idea what could be causing this?
Run "cluster node /force" on the node you'd like to clean. Be aware that this will destroy the cluster information on that node, so do not run it on a "live" node ;)
http://technet.microsoft.com/en-us/library/cc739895(WS.10).aspx
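Before cleaning the node, it can help to pull only the ERR entries out of cluster.log to confirm which step fails. Below is a minimal Python sketch, assuming the line layout seen in the excerpt in the question (process.thread IDs, "::", timestamp, level, optional [component], message). The names LogEntry, parse_line and find_errors are made up for this example and are not part of any cluster tooling.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Assumed layout, based only on the excerpt in the question:
#   <pid>.<tid>::<yyyy/mm/dd-hh:mm:ss.mmm> <LEVEL> [<COMPONENT>] <message>
# The [COMPONENT] tag is optional (some lines in the excerpt have none).
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<timestamp>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?:\[(?P<component>[^\]]+)\]\s*)?"
    r"(?P<message>.*)$"
)


class LogEntry(NamedTuple):
    """One parsed log line (hypothetical helper type for this sketch)."""
    timestamp: str
    level: str
    component: Optional[str]
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not match the assumed layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return LogEntry(
        timestamp=match.group("timestamp"),
        level=match.group("level"),
        component=match.group("component"),
        message=match.group("message"),
    )


def find_errors(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only the entries whose level is ERR."""
    entries = (parse_line(line) for line in lines)
    return [entry for entry in entries if entry is not None and entry.level == "ERR"]


# Three lines copied from the log in the question.
SAMPLE = r"""
00000ba0.00000db0::2011/07/12-20:29:11.360 INFO [DM] Initialization
00000ba0.00000db0::2011/07/12-20:29:11.360 ERR [DM] Failed to open key Resources, status 2
00000ba0.00000db0::2011/07/12-20:29:11.376 ERR Cluster service suffered an unexpected fatal error at line 1386 of source module d:\nt\base\cluster\service\dm\dminit.c. The error code was 2.
"""

if __name__ == "__main__":
    for entry in find_errors(SAMPLE.splitlines()):
        print(entry.timestamp, entry.component or "-", entry.message)
```

To run it against the real file, pass the open log to find_errors instead of SAMPLE. For reference, status 2 is the Win32 code ERROR_FILE_NOT_FOUND, which fits the "Failed to open key Resources" line: the key is missing from the loaded CLUSDB hive on that node.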