Solved

Windows Failover Cluster witness disk failing repeatedly

Posted on 2010-08-30
Last Modified: 2012-08-13
I have a Windows failover cluster running two 2008 R2 servers on VMware with backend storage on a Compellent SAN.  Starting a few days ago, the shared witness disk has been in a continual failure loop: it goes from online to offline to pending in the space of 1 to 2 minutes, and it does so continuously.  I have run the validation tests with the disks offline and got no errors.  The VMware hosts are running other servers, and none of those are having problems.  The SAN shows no errors, the network shows no errors, and the only events I see indicate that the witness disk failed but don't give any other information.

The only thing that changed recently is that the Windows updates released on 8/24/10 were installed.  I've done some searches but have not found any information.  Nothing has been done yet because I don't know what to try.

The VMware servers are HP DL380s with fiber connections via QLogic to the SAN, which is a Compellent 20 Series running version 4.5.3.  Again, no other systems are showing errors.  Is it possible that the witness disk could be corrupt?
Question by:AANKyle
 
LVL 2

Accepted Solution

by:
jesse_7271 earned 2000 total points
ID: 33559972
So there is nothing in Application, System, or Cluster Logs besides the fail?
 
LVL 2

Assisted Solution

by:jesse_7271
jesse_7271 earned 2000 total points
ID: 33559990
%systemroot%\Cluster\cluster.log
 

Author Comment

by:AANKyle
ID: 33560059
I do not have a cluster.log file.  The only log files in there are .log1 and .log2 and both are unreadable.

In the system log, I do have disk errors that I did not see before - Event ID is 15.  Description is "The device, \Device\Harddisk1\DR1, is not ready for access yet."  I also see another one that says I need to run chkdsk on the Q volume, which is the witness disk.  Gonna try that now.
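
For anyone rechecking the event logs the same way, here is a minimal Python sketch for picking the disk errors out of a large System log. It assumes the log has been exported to CSV first, and that the export has columns named `Event ID`, `Source`, `Date and Time`, and `Message`; those column names, the function name `find_disk_events`, and the sample rows (including their timestamps) are illustrative assumptions, not taken from this cluster. Only the Event ID 15 message text comes from the post above.

```python
import csv
import io

# Event ID 15 is the "not ready for access yet" disk error quoted above.
# The column names below are assumptions about the CSV export; adjust them
# to match the header row of the actual file.
DISK_EVENT_IDS = frozenset({"15"})

# Made-up sample export used only to demonstrate the filter.
SAMPLE_EXPORT = (
    "Level,Date and Time,Source,Event ID,Message\n"
    'Error,8/30/2010 9:01:12 AM,Disk,15,'
    '"The device, \\Device\\Harddisk1\\DR1, is not ready for access yet."\n'
    "Information,8/30/2010 9:02:00 AM,Service Control Manager,7036,"
    "A service entered the running state.\n"
)


def find_disk_events(csv_text, event_ids=DISK_EVENT_IDS):
    """Return the rows whose 'Event ID' column is one of event_ids."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if (row.get("Event ID") or "").strip() in event_ids
    ]


if __name__ == "__main__":
    for row in find_disk_events(SAMPLE_EXPORT):
        print(row["Date and Time"], row["Source"], row["Event ID"], row["Message"])
```

To use it on a real export, read the file's contents into a string and pass that to `find_disk_events`, adding any other event IDs of interest to `event_ids`.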

 

Author Comment

by:AANKyle
ID: 33560146
Check disk ran, but did not find any errors.  I did get this event in the log though:

Driver Management concluded the process to install driver FileRepository\volsnap.inf_amd64_neutral_7499a4fac85b39fc\volsnap.inf for Device Instance ID STORAGE\VOLUMESNAPSHOT\HARDDISKVOLUMESNAPSHOT1 with the following status: 0x0.
 
LVL 2

Assisted Solution

by:jesse_7271
jesse_7271 earned 2000 total points
ID: 33560346
Have you tried pausing one of the nodes?

There are too many factors to troubleshoot without narrowing things down more.  I would make sure you are getting a high level of logging:

http://blogs.msdn.com/b/clustering/archive/2008/09/24/8962934.aspx

 

Author Comment

by:AANKyle
ID: 33560918
After running chkdsk on Q everything seems to have stabilized.  No more errors or disk messages.  Node 2 was paused so I am going to bring it back online and see what happens.

Thanks for the suggestion to recheck the event logs.  That helped me find the problem.


Question has a verified solution.
