Solved

2008 R2 Dual Node Failover Cluster Storage Access Problem

Posted on 2013-06-05
Last Modified: 2013-06-12
We have 2 x DL380 G7 servers connected to an HP P2000 in an MS Hyper-V Failover Cluster configuration for highly available VMs.

Setup as follows:
MS iSCSI Initiator on each Node
MS Failover Cluster manager on each Node
SCVMM 2008 R2 on each node
Q:\ Quorum is 500MB in size
CSV is 2.3TB in size
HP P2000 has 2 controllers with 4 ports each. We are using ports A1, A3, B1 and B3 for the SAN traffic, with dedicated NICs on each server
Dedicated Cluster Heartbeat using a crossover cable
Dedicated NICs for the Physical & Hyper-V traffic

This had been operational for 6 months until a severe power cut upset the complete set-up. After power was restored, the CSV on the SAN was offline to both nodes until we ran a PowerShell command on Node2:

"Clear-ClusterDiskReservation -Disk 3"

This brought it online, but neither node could access the CSV; only the Quorum was accessible.

After 4 hours of reboots and "fettling" we destroyed the cluster and rebuilt it.

The result now is that Node 1 has the Quorum and can access the CSV at c:\clusterstorage\volume1 with no problem. VMs are back online and running on Node1.
Node 2 CANNOT access the CSV and when you browse to c:\clusterstorage\volume1 it is empty.

Main error within cluster manager is:

Cluster Shared Volume "Volume1" (Cluster Disk 2) is no longer accessible from this cluster node because of error "ERROR_CANT_ACCESS_DOMAIN_INFO (1351)".

Research suggested changing the Cluster service to Automatic (Delayed Start); we tried this and it had no effect.

Lots of research and attempts to fix the issue have resulted in nought. Problem is that the servers are live and in use by one of our clients so before they have any further downtime we must be certain of a fix, gulp!

Attached are 2 grabs showing the status of Disk Management on each node.
Cluster logs are available.

Anyone got any ideas on where to start?
Node1.JPG
Node2.JPG
Question by:robtheplod
4 Comments
 
LVL 39

Accepted Solution

by:
Philip Elder earned 500 total points
ID: 39224615
In my experience, disks that are shared between cluster nodes MUST have the same disk number on every node.

Disk 0 = OS
Disk 1 = Quorum
Disk 2 = VHD/VHDX Storage

What the chicken is that QuikStor doing there? Get rid of it. Run your Cluster Validation Wizard again.

You will probably have to reboot both nodes to get the disks to line up. But, line up they must.
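As a rough illustration of the "line up" check, here is a minimal Python sketch that compares disk numbering between nodes. It assumes you have already noted each node's disk numbers and what sits on them (for example from Disk Management). The function name `find_disk_mismatches` and the sample inventory are hypothetical and are not taken from the attached screenshots.

```python
from typing import Dict, List


def find_disk_mismatches(inventory: Dict[str, Dict[int, str]]) -> List[str]:
    """Return one line per disk number whose label differs between nodes.

    inventory maps a node name to {disk number: label}. The labels are
    whatever you wrote down for each disk; they are illustrative only.
    """
    problems = []
    # Collect every disk number seen on any node.
    all_numbers = sorted({n for disks in inventory.values() for n in disks})
    for number in all_numbers:
        # What does each node report for this disk number?
        seen = {
            node: disks.get(number, "<missing>")
            for node, disks in inventory.items()
        }
        if len(set(seen.values())) > 1:
            detail = ", ".join(
                f"{node}={label}" for node, label in sorted(seen.items())
            )
            problems.append(f"Disk {number}: {detail}")
    return problems


# Hypothetical example: an extra device on Node2 shifts the shared disks.
inventory = {
    "Node1": {0: "OS", 1: "Quorum", 2: "CSV"},
    "Node2": {0: "OS", 1: "QuikStor", 2: "Quorum", 3: "CSV"},
}
for line in find_disk_mismatches(inventory):
    print(line)
```

An empty result means the numbering matches on every node. Any line printed points at a disk number to fix before running the Cluster Validation Wizard again.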

Philip
 

Author Comment

by:robtheplod
ID: 39226030
Thank you for the advice Philip.
We will make the changes and will let you know the result.
 

Author Comment

by:robtheplod
ID: 39240272
Thanks Philip, you were right on the money. The cluster shared volume is now available to both nodes after removing the Quickstor and rebooting both nodes.
 
LVL 39

Expert Comment

by:Philip Elder
ID: 39241434
Excellent. :)

Philip
