I have a 2-node cluster running in Node and Disk Majority mode. We have a witness disk, an MSDTC disk, and 3 disks for the SQL cluster. I can successfully simulate failures, and all services and applications fail over. I can also manually move them from one node to the other without problems.
When we run validation on one node, we get zero errors. When we run it on the other node, it fails with these errors:
Failed to arbitrate for cluster disk 3 from node xxxxxxx.xxxxxxxx.com, failure reason: The RPC server is unavailable.
Failed to arbitrate for cluster disk 2 from node xxxxxxx.xxxxxxx.com, failure reason: The RPC server is unavailable.
The validation doesn't always fail, though. Sometimes we get warnings such as these:
Validating cluster resource SQL Server.
This resource is marked with a state of 'Offline'. The functionality that this resource provides is not available while it is in the offline state. The resource may be put in this state by an administrator or program. It may also be a newly created resource which has not been put in the online state or the resource may be dependent on a resource that is not online. Resources can be brought online by choosing the 'Bring this resource online' action in Failover Cluster Manager.
Validating cluster resource SQL Server Agent.
This resource is marked with a state of 'Offline'. The functionality that this resource provides is not available while it is in the offline state. The resource may be put in this state by an administrator or program. It may also be a newly created resource which has not been put in the online state or the resource may be dependent on a resource that is not online. Resources can be brought online by choosing the 'Bring this resource online' action in Failover Cluster Manager.
Any ideas as to what could be causing this? Thanks for any help.