Windows 2003 Cluster problem

I have a Windows 2003 cluster with two nodes (active/active). Occasionally, one of the nodes hangs and I have to power it off and on manually. Fail-over does not occur until I reboot.
I've reviewed the Windows event logs and found the following:
Cluster File Share resource 'Share Name' has failed a status check. The error code is 64.

What can cause this?

Also, I found in the Application event log that MSDTC is not starting. After some searching, I read that the MSDTC service needs to be clustered. Does it have to be clustered?

The node that fails has two File Share resources, an IP Address, a Network Name, a Generic Service (nothing fancy, just a service), a Physical Disk, and two Generic Application resources (ABE resources).

Thanks.
Asked by: all_experts
all_experts (Author) commented:
I got a solution for this problem from Microsoft third-tier support, in case anybody wants to know it.
We had a few users (2-3) who were storing PST files on their home drives, which reside on the file server cluster. When these PST files are open in Outlook, they generate a lot of I/O and can make a cluster resource fail like this.
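
To check whether the same cause might apply elsewhere, a minimal sketch like the one below could list PST files stored under the home-drive share. This is only an illustration: the function name `find_pst_files` and the root path passed on the command line are hypothetical, and it is not something Microsoft support supplied.

```python
import os
import sys
from typing import Iterator, Tuple


def find_pst_files(root: str) -> Iterator[Tuple[str, int]]:
    """Yield (path, size_in_bytes) for every .pst file below root.

    The size is reported as -1 when the file cannot be read
    (for example, because of missing permissions).
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Match the extension case-insensitively (.pst, .PST, ...).
            if name.lower().endswith(".pst"):
                path = os.path.join(dirpath, name)
                try:
                    size = os.path.getsize(path)
                except OSError:
                    size = -1
                yield path, size


if __name__ == "__main__":
    # Hypothetical usage: pass the root of the home-drive share as the
    # first argument; defaults to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size in find_pst_files(root):
        print(f"{size:>14}  {path}")
```

Running it against the share root prints one line per PST file with its size in bytes, which is enough to see which users to contact.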
 
65td (Retired) commented:
Review the cluster log around the time error 64 was generated. Be aware that the cluster log timestamps are in Greenwich Mean Time, so adjust for your local time offset.
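
As a rough sketch of that time adjustment, the snippet below converts a local event-log time to GMT and builds a small search window around it. The function names, the timestamp format, and the example date and UTC offset are all assumptions for illustration; they are not taken from this cluster's logs.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def local_to_gmt(local_time: str, utc_offset_hours: float,
                 fmt: str = "%Y-%m-%d %H:%M:%S") -> datetime:
    """Convert a local timestamp string to an aware GMT/UTC datetime.

    utc_offset_hours is the local zone's offset from GMT,
    e.g. -5 for a zone five hours behind GMT.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    aware = datetime.strptime(local_time, fmt).replace(tzinfo=local_tz)
    return aware.astimezone(timezone.utc)


def search_window(gmt_time: datetime,
                  minutes: int = 5) -> Tuple[datetime, datetime]:
    """Return (start, end) GMT times to look for in the cluster log."""
    delta = timedelta(minutes=minutes)
    return gmt_time - delta, gmt_time + delta


if __name__ == "__main__":
    # Hypothetical example: an event logged at 09:30 local time, UTC-5.
    gmt = local_to_gmt("2009-03-14 09:30:00", -5)
    start, end = search_window(gmt)
    print("Look in the cluster log between", start, "and", end)
```

The window width is arbitrary; widen it if the clocks on the two nodes are not closely synchronized.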

Do the server or its applications need MSDTC? Can it be disabled?
Otherwise, if it runs on a cluster, it should be set up as a cluster resource.
 
all_experts (Author) commented:
The only thing in the log around error 64 is that this share resource failed to fail over.

I don't know if I need MSDTC. That's what I am trying to find out.
Do I need it for a File Share resource?
 
65td (Retired) commented:
MSDTC required for a file share? Not that I'm aware of.

Does the log show any loss of the IP address or network name?

Have the dependencies been reviewed for the file share resources?
 
all_experts (Author) commented:
I don't see any errors about loss of IP. If it had lost an IP address or a Network Name, it should have failed over. For the file share, the only dependency is the physical drive.

Also, after I restarted the server I got these errors:
"Cluster service is requesting a bus reset for device \Device\ClusDisk0."
"The driver for device \Device\RaidPort0 performed a bus reset upon request."

Also, after the restart, the Cluster service didn't start. I had to start it manually.
 
65td (Retired) commented:
Is this a new cluster?
Are the quorum and data drives physical and separate?
 
all_experts (Author) commented:
This cluster is not new. It was configured about 5 months ago. It crashes like this roughly once a month.
The quorum and data drives are located on a SAN.
 
RobertAdrian commented:
Ensure the cluster service account has full-access NTFS permissions on the cluster disks and the shared folders.
 
all_experts (Author) commented:
The Cluster service account has full-access NTFS permissions. It is also a member of the local Administrators group.
 
65td (Retired) commented:
But are the quorum and data disks separate physical disks?
 
all_experts (Author) commented:
It is on a SAN, which has one huge RAID array for storage.
They are two different logical drives, but physically everything is on the same RAID array.
Question has a verified solution.
