agcsupport

Add node to Hyper-V Server failover clustering environment

I have a 2-node Hyper-V environment with failover clustering, and I am in the process of adding an additional node to it. I have added both the Hyper-V and clustering roles to the new node. I use 4 CSVs for the vhdx files, plus the quorum/witness drive. These CSVs sit on a NetApp filer. My next step was to present the CSV LUNs to the new node. I added the new node as an initiator to one of the LUNs, then did a refresh in the MPIO app. Both the app and Disk Management became unresponsive. I noticed the LUN came up online, and I am guessing this is why the server was struggling with it. I restarted the server and removed it from the list of initiators, and everything came up fine. I have read that the OS sees a disk signature because this is an existing operational LUN and, because of that, wants to bring the LUN online by default when it is added.

Can someone tell me the process or steps I should follow when presenting the LUNs to the new Hyper-V/clustering server, so that this does not happen?

Thanks
Cliff Galiher

Run the cluster validation wizard on the new node. It'll do all the necessary checks, and if all pass, give you the opportunity to add the node to the cluster. Then the cluster manager will bring the storage into the fold gracefully. By trying to do it manually, you are breaking the storage layer.
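For anyone who prefers scripting to the wizard, the same validate-then-add order can be sketched as below. This is only an illustration that builds the PowerShell command strings; it does not run them. It assumes the `Test-Cluster` and `Add-ClusterNode` cmdlets from the FailoverClusters module, and the cluster and node names (`HVCLUSTER`, `HV01`, `HV02`, `HV03`) are hypothetical placeholders, not names from this thread.

```python
def add_node_commands(cluster, existing_nodes, new_node):
    """Build the PowerShell commands in the recommended order: validate, then add.

    All names passed in are placeholders chosen by the caller.
    """
    # Validate the existing nodes together with the new one.
    all_nodes = ",".join(f'"{name}"' for name in [*existing_nodes, new_node])
    validate = f"Test-Cluster -Node {all_nodes}"

    # Only after validation passes: join the new node to the cluster.
    # The cluster, not the admin, then introduces the node to the storage.
    add = f'Add-ClusterNode -Cluster "{cluster}" -Name "{new_node}"'
    return [validate, add]


if __name__ == "__main__":
    for command in add_node_commands("HVCLUSTER", ["HV01", "HV02"], "HV03"):
        print(command)
```

Review the validation report before running the second command; the point of the ordering is that nothing is done to the CSV LUNs by hand.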
agcsupport (ASKER)

I did find that article and was using it along with another MS how-to. The last step in the article's "pre" list is "Ensure SAN disks are visible in Disk Management of the new server. Just check, Don’t try to make it online".

That is the challenge I am having: when I make the LUN/SAN disk available, the server brings it online.
Create a new LUN (for testing purposes) and attach it to the new Hyper-V host. Check whether it also comes online on that host.
I created a new LUN and presented it only to the new server. After a rescan, it is there, offline and unallocated.

The issue seems to be specific to existing CSV disks in use by clustering on the existing servers.
There really is no issue. This is expected behavior. NTFS is, at its core, still a single-access filesystem. Shared access is simply not supported. Microsoft creatively got around the issue by introducing CSVs. But all CSVs really do is allow nodes to write data directly to a disk without interacting with the filesystem at all. For that to work, each node must know about the files in the filesystem in advance, and any changes to the filesystem need to be run through a single node. This is known as the CSV coordinator node. So a Hyper-V host can write to an existing VHD because it knows where the VHD blocks are, what the ACLs are, and so on. But any event that changes the VHD, such as a resize, still goes through the coordinator node.

As such, you can't simply add an existing LUN to a new machine before making that machine part of the cluster. That'll break because you now have two machines trying to manage the NTFS tables. And you get the freezing you describe. If you go through the wizard and validate your cluster then add the node, it'll gracefully introduce the node to the existing storage and the node will be aware of the CSV coordinator, thus it won't try to handle the NTFS stuff.
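To make the split between data writes and filesystem changes concrete, here is a toy model in Python. It is a conceptual sketch of the description above, not how Windows implements CSVs: the `ToyCsv` class, its methods, and the node names are all invented for illustration.

```python
class ToyCsv:
    """Toy model: data writes go direct, metadata changes go via the coordinator."""

    def __init__(self, coordinator, nodes):
        self.coordinator = coordinator  # the single node that handles filesystem changes
        self.nodes = set(nodes)         # cluster members that know about the volume
        self.file_sizes = {}            # metadata: file name -> size in blocks
        self.blocks = {}                # data: (file name, block index) -> bytes
        self.log = []                   # (node that did the work, kind, file name)

    def _require_member(self, node):
        # A machine outside the cluster does not know about the coordinator,
        # so in this model it is not allowed to touch the disk at all.
        if node not in self.nodes:
            raise PermissionError(f"{node} is not a cluster member")

    def create_or_resize(self, requesting_node, name, size):
        # Filesystem change: whoever asks, the coordinator performs it.
        self._require_member(requesting_node)
        self.file_sizes[name] = size
        self.log.append((self.coordinator, "metadata", name))

    def write_block(self, node, name, index, data):
        # Data write: the node writes directly, because it already knows the layout.
        self._require_member(node)
        if index >= self.file_sizes.get(name, 0):
            raise IndexError("block is outside the file; resize it first")
        self.blocks[(name, index)] = data
        self.log.append((node, "data", name))


if __name__ == "__main__":
    csv = ToyCsv(coordinator="HV01", nodes=["HV01", "HV02"])
    csv.create_or_resize("HV02", "vm.vhdx", 4)    # performed by HV01
    csv.write_block("HV02", "vm.vhdx", 0, b"x")   # performed by HV02 itself
    print(csv.log)
```

In this model, a node that was never added to `nodes` gets a `PermissionError`. That is the tidy version of what happens when a LUN is presented to a machine that is not yet in the cluster.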

So as I said above, simply run the validation wizard and add the node. Any guidance that suggests doing otherwise is simply bad advice. CSVs are not true shared-everything storage and trying to treat them as such will cause problems.
I guess I'm getting clustering resources confused with CSVs. I was thinking a CSV was the same as a clustered resource and therefore needed to be presented to any node that would be accessing it. So are you saying only one node needs the LUN/disk presented to it, and that node then becomes the coordinator node for that CSV? And the other nodes know about it and access it once it is added as a CSV?

Sorry for all the questions; I just want to fully understand what is happening.
When you add the node, you'll be presented with storage already in the cluster that you can add. The wizard takes care of properly informing the cluster of the change and getting the new node access. The same goes for adding a new CSV. Yes, the cluster will select one node to act as coordinator, though not necessarily the node that originally had the LUN access. That depends on availability, workloads, etc.
So I present one node with the LUN/disk and then create a CSV from that. Are all of the other nodes really accessing the CSV through the one node that has the LUN presented to it, or, I should say, through the coordinator node? If I take down the node that the CSV was created from, does that make the CSV inaccessible to the other nodes, seeing as the LUN/disk is no longer available?
Can you tell me whether the CSVs will be taken offline to run validation against them?
Validation is usually non-intrusive.