Add a node to a Hyper-V failover clustering environment

I have a 2-node Hyper-V environment with failover clustering, and I am in the process of adding an additional node. I have added both the Hyper-V and clustering roles to the new node. I have 4 CSVs that I use for the VHDX files, as well as a quorum/witness drive. These CSVs sit on a NetApp filer. My next step was to present the CSV LUNs to the new node. I added the new node as an initiator to one of the LUNs and then did a refresh in the MPIO app. The app, as well as Disk Management, became unresponsive. I noticed the LUN came up online, and I am guessing this is why the server was struggling with it. I restarted the server, removed it from the list of initiators, and everything came up fine. I have read that the OS sees a disk signature (because this is an existing, operational LUN) and therefore wants to bring the LUN online by default when it is added.

Can someone tell me the process and/or steps I should follow when adding the LUNs to the new Hyper-V/clustering server, so that this doesn't happen again?

Thanks
agcsupport Asked:

Cliff Galiher Commented:
Run the cluster validation wizard on the new node. It'll do all the necessary checks, and if all pass, give you the opportunity to add the node to the cluster. Then the cluster manager will bring the storage into the fold gracefully. By trying to do it manually, you are breaking the storage layer.
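
For reference, the PowerShell equivalent of that flow looks roughly like this (node and cluster names are placeholders; run it from an existing cluster node or any machine with the failover clustering management tools installed):

    # Load the clustering cmdlets.
    Import-Module FailoverClusters

    # Validate the existing nodes together with the new one.
    Test-Cluster -Node "HV-NODE1","HV-NODE2","HV-NODE3"

    # If validation passes, join the new node; the cluster then
    # introduces the existing CSVs to it gracefully.
    Add-ClusterNode -Cluster "HV-CLUSTER" -Name "HV-NODE3"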
Manojkumar Rane Commented:
agcsupport (Author) Commented:
I did find that article and was using it along with another MS how-to. The last step in the article's "pre" list is: "Ensure SAN disks are visible in Disk Management of the new server. Just check, don't try to make it online."

That is the challenge I am having: when I do make the LUN/SAN disk available, the server brings it online.
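
As an aside, that automatic-online behavior is governed by the OS SAN policy. A quick way to inspect it, assuming the Storage module's Get-StorageSetting/Set-StorageSetting cmdlets are available (Server 2012 R2 and later; on 2012 the DISKPART "SAN" command reports the same policy):

    # Show the policy applied to newly discovered disks.
    Get-StorageSetting | Select-Object NewDiskPolicy

    # "OfflineShared" keeps disks on shared (SAN) buses offline until
    # an administrator, or the cluster, brings them online.
    Set-StorageSetting -NewDiskPolicy OfflineShared

That said, the real fix here is the cluster-join process the experts describe, not the policy.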

Manojkumar Rane Commented:
Create a new LUN (for testing purposes) and attach it to the new Hyper-V host. Check whether it also comes online on that host.
agcsupport (Author) Commented:
I created a new LUN and presented it only to the new server. After a rescan it shows up offline and unallocated.

The issue seems to be specific to existing CSV disks that are in use by the cluster on the existing servers.
Cliff Galiher Commented:
There really is no issue. This is expected behavior.  NTFS is, at its core, still a single-access filesystem. Shared access is simply not supported. Microsoft creatively got around the issue by introducing CSVs.  But all CSVs really do is allow nodes to write data directly to a disk without interacting with the filesystem at all.  But for that to work, each node must know about the files in the filesystem in advance, and any changes to the filesystem need to be run through a single node. This is known as the CSV Coordinator node.  So a Hyper-V host can write to an existing VHD because it knows where the VHD blocks are, what the ACLs are, and similar. But any event that changes the VHD, such as a resize, still goes through the coordinator node.  

As such, you can't simply add an existing LUN to a new machine before making that machine part of the cluster.  That'll break because you now have two machines trying to manage the NTFS tables. And you get the freezing you describe.  If you go through the wizard and validate your cluster then add the node, it'll gracefully introduce the node to the existing storage and the node will be aware of the CSV coordinator, thus it won't try to handle the NTFS stuff.

So as I said above, simply run the validation wizard and add the node.  Any guidance that suggests doing otherwise is simply bad advice.  CSVs are not true shared-everything storage and trying to treat them as such will cause problems.
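
Once the node has been added properly, you can see which node currently coordinates each CSV and whether each node's I/O is direct or redirected. A quick sketch (FailoverClusters module):

    # "OwnerNode" is the current coordinator for each CSV.
    Get-ClusterSharedVolume | Format-Table Name, OwnerNode, State

    # Per-node I/O mode: "Direct" means the node writes blocks straight
    # to the LUN; redirected modes route I/O through the coordinator.
    Get-ClusterSharedVolumeState | Format-Table Name, Node, StateInfo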
agcsupport (Author) Commented:
I guess I'm getting clustering resources confused with CSVs. I was thinking a CSV was the same as a clustered resource and therefore needed to be presented to any node that would be accessing it. So are you saying only one node needs the LUN/disk presented to it, and that node then becomes the coordinator node for that CSV? And the other nodes know about it and access it once it is added as a CSV?

Sorry for all the questions; I just want to fully understand what is happening.
Cliff Galiher Commented:
When you add the node, you'll be presented with storage already in the cluster that you can add. The wizard takes care of properly informing the cluster of the change and getting the new node access. Same goes for adding a new CSV. Yes, the cluster will select one node to act as coordinator. Not necessarily the node that originally had the LUN access. That depends on availability and workloads, etc.
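
The PowerShell equivalent is roughly as follows (the disk name is a placeholder; use whatever Get-ClusterAvailableDisk reports once the LUN has been presented to the nodes):

    # Hand the newly presented disk to the cluster rather than
    # onlining it manually in Disk Management.
    Get-ClusterAvailableDisk | Add-ClusterDisk

    # Convert the new cluster disk into a CSV; the cluster picks
    # and manages the coordinator node from here on.
    Add-ClusterSharedVolume -Name "Cluster Disk 5"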
agcsupport (Author) Commented:
So I present one node with the LUN/disk and then create a CSV from it. Are all of the other nodes really accessing the CSV through the one node that has the LUN presented to it, i.e. the coordinator node? If I take down the node that the CSV was created from, does that make the CSV inaccessible to the other nodes, seeing as the LUN/disk is no longer available?
Cliff Galiher Commented:
Yes. And No. All nodes have direct access to the data on a CSV.  They can read and write directly. They will be made aware of the CSV  (and as long as you've configured things properly, will have direct access to the LUN) when you add it as a resource in cluster manager (just as a new node becomes aware of existing resources.)  The cluster configuration and resources are stored in a small database on every node (and on the witness disk) and changes are replicated, so all nodes are aware of CSVs.

Anything that does *not* involve data, but involves metadata (such as changes in file size, changing timestamps, etc) must go through the coordinator node because it has a lock on the NTFS stuff that still powers the filesystem under the CSV. And as I said earlier, NTFS is not shareable directly.

If you try to add a LUN manually, you are breaking the rules because NTFS is unshared and since you haven't actually joined the node to the cluster yet, it doesn't know another machine already has responsibility for the NTFS side of things.  And you break it.  When you add the node properly, you will still get direct access to the LUN, but the node will *NOT* try to grab access to the NTFS stuff because it is aware of the other nodes.

So yes, some writes and reads go through the coordinator. And no, not all do. Many can be done directly. Otherwise you'd have a huge bottleneck...the coordinator node! Your question shows that you are thinking of it as an all-or-nothing access. And it is neither. Or it is both.  But regardless, you can't access it manually. THAT is breaking the rules!

As for the final bit, the coordinator node is a cluster resource like any other.  If you shut down a node, it gets moved to another node and, since all changes are replicated to all nodes, the other nodes are made aware of the new coordinator and will send any metadata read and write requests to the new coordinator node.
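
You can watch, or trigger, that handoff yourself. A small sketch, assuming the cluster is built and using placeholder names:

    # The coordinator role is ordinary cluster ownership of the CSV
    # resource; it moves between nodes without taking the volume down.
    Move-ClusterSharedVolume -Name "Cluster Disk 1" -Node "HV-NODE2"

    # Confirm the new coordinator.
    Get-ClusterSharedVolume -Name "Cluster Disk 1" |
        Select-Object Name, OwnerNode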

And similarly, if the coordinator node fails unexpectedly, the other nodes will act like they would with any failure. There'd be a vote, and if quorum is reached, a new node gets the coordinator-node responsibilities and all nodes get that change.  When the failed node comes back, the first thing it does is look at the witness for changes and gets itself in sync, and requests to be part of the cluster. If quorum is reached that it is up and up-to-date, it is allowed back in and...since it knows of the change of coordinator node status from its sync, it too will know where the new coordinator node is.
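
If you want to verify that recovery from the outside, a quick health check after the node rejoins (sketch):

    # Node membership and state (Up / Down / Paused).
    Get-ClusterNode | Format-Table Name, State

    # Current quorum model and witness resource.
    Get-ClusterQuorum | Format-List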

Don't overthink it.  Really. The validation wizard was designed to take into account *all* of this. Adding a node is as simple as running the wizard. Even all the other "advice" out there (checking patch levels, etc) is done by the wizard. It'll let you know of problems.

agcsupport (Author) Commented:
Can you tell me whether the CSVs will be taken offline to run validation against them?
Cliff Galiher Commented:
Validation is usually non-intrusive. By default, the storage tests skip disks that are online and in use by the cluster, so your CSVs are not taken offline.
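
If you want to be explicit about it, you can skip the storage test category entirely, which is the only potentially disruptive one, when validating a live cluster:

    # Validate everything except the storage tests; disks that are
    # online and in use are left untouched.
    Test-Cluster -Node "HV-NODE1","HV-NODE2","HV-NODE3" -Ignore "Storage"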