Failover Cluster - Live Migration Network Question
We have a large Failover Cluster and we have 4 Networks configured:
- 2 iSCSI Networks going to separate switches
- 1 PROD (10GbE) network; each server has a NIC team for this that connects to 2 separate switches. This network is set up for Cluster and Client within Failover Clustering
- 1 LIVE-MIGRATION (1GbE) network isolated to its own switch. It's set up for Cluster Only and carries Live Migration.
My question: I need to do maintenance on the Live Migration switch, so will my cluster have issues if this switch goes down?
I was considering adding my PROD network as a second Live Migration network, but only as 2nd in the priority list. If I do that, will the cluster only use PROD when my 1st-priority LIVE-MIGRATION network is down, or how/when does it use the 2nd-priority Live Migration network?
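For reference, this is roughly how I was planning to check the current Live Migration network configuration before changing anything (I'm assuming the FailoverClusters PowerShell module on a cluster node, and that the LM network selection is stored as parameters on the "Virtual Machine" resource type):

# List the cluster networks with their roles (0 = none, 1 = cluster only, 3 = cluster and client)
Get-ClusterNetwork | Format-Table Name, Role, Address, Metric

# Dump the parameters on the "Virtual Machine" resource type; the Live Migration
# network selection/ordering parameters (e.g. MigrationExcludeNetworks) show up here
Get-ClusterResourceType -Name "Virtual Machine" | Get-ClusterParameter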
ASKER
Circling back to this post - the switch upgrade appeared to be successful during maintenance. We did see cluster errors like the one below, though. Is that normal?
Cluster Shared Volume 'Volume2' ('VOLUMENAME') has entered a paused state because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.
Yeah. 5120 is the Event ID if I remember right?
So long as the VMs remain online, or come back online, you're good to go.
Now, if there are two paths to storage and MPIO is enabled, then this is a problem: it would mean that when one of the I/O switches was offline, the nodes were not able to reach storage via the other path. Is that the case here?
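If you want to double-check, something like this on each node will show whether MPIO actually has both paths (a sketch using the in-box tooling; adjust for your array's DSM):

# Show each MPIO-claimed disk and how many paths it currently has
mpclaim.exe -s -d

# Confirm this node still has an active session to each iSCSI target
Get-IscsiSession | Format-Table TargetNodeAddress, IsConnected, NumberOfConnections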
ASKER
Yes, Event ID 5120 is right.
I saw other alerts for the Live Migration network being unavailable but I knew that was expected.
The odd part about the 5120 is that my Live Migration network and switch have no ties to my iSCSI or storage. I have 2 iSCSI switches that are isolated and run their own IPs, and I have MPIO set up. My VMs appeared to be fine, but I thought it was strange to see references to storage.
My guess is that the VM owner and the CSV Storage owner were two different nodes.
EDIT: This causes I/O to be redirected between the nodes. It's always good to have the VM and CSV owners aligned, especially in higher-I/O situations and for backups.
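You can see it for yourself with something like this (a minimal sketch, FailoverClusters module, run from any node): compare which node owns each VM role against which node owns each CSV.

# Which node owns each clustered VM role
Get-ClusterGroup | Where-Object GroupType -eq 'VirtualMachine' |
    Format-Table Name, OwnerNode, State

# Which node owns (coordinates) each Cluster Shared Volume
Get-ClusterSharedVolume | Format-Table Name, OwnerNode, State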
ASKER
Thanks Phillip, I'm not quite following what you mean by the VM owner and CSV storage owner being 2 different nodes and needing to be aligned.
In my environment I have a 20-Node Cluster running 120+ VMs with 14 CSVs. Wouldn't the VM and CSV owners be different in most cases?
We always aim to have the VM and the CSV owner on the same node.
We also aim to have 2x CSVs to 1x Node as a rule to keep the I/O relatively distributed.
iSCSI may be a limiting factor here though. 1GbE or 10GbE?
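On the 2x-CSVs-per-node point, a quick way to see how the CSVs are currently spread (sketch, FailoverClusters module):

# How many CSVs each node currently owns; we aim for roughly two per node
Get-ClusterSharedVolume |
    Group-Object -Property OwnerNode |
    Sort-Object Count -Descending |
    Format-Table Name, Count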
ASKER
That's interesting. How do you guys handle that when VMs move between Nodes? Do you migrate their disk to another CSV hosted on the new Node?
We use 2 10GbE networks for iSCSI.
We would run a script beforehand that aligns the CSV with its VM (a minimal sketch of the idea is below).
Are the paths and MPIO set up correctly?
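The gist of that alignment script is something like this. It's a minimal sketch, not a production script, and it assumes the VM files live directly under C:\ClusterStorage\<VolumeFolder>\:

# For every clustered VM, work out which CSV its files live on and, if that CSV's
# coordinator is a different node than the VM's owner, move the CSV to the VM's node.
Import-Module FailoverClusters, Hyper-V

foreach ($group in Get-ClusterGroup | Where-Object GroupType -eq 'VirtualMachine') {
    $vm = Get-VM -Name $group.Name -ComputerName $group.OwnerNode.Name
    # Assumes the VM's configuration path looks like C:\ClusterStorage\Volume1\...
    if ($vm.Path -match 'ClusterStorage\\([^\\]+)') {
        $volumeFolder = $Matches[1]
        $csv = Get-ClusterSharedVolume | Where-Object {
            $_.SharedVolumeInfo.FriendlyVolumeName -like "*\$volumeFolder"
        }
        if ($csv -and $csv.OwnerNode.Name -ne $group.OwnerNode.Name) {
            Move-ClusterSharedVolume -Name $csv.Name -Node $group.OwnerNode.Name
        }
    }
}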
We don't "move" CSVs either and I'm at a bit of a loss as to why this should be necessary. CSVs are a clustered resource and should be available to all hosts. I mean, if the CSV were supposed to move with its VM, why wouldn't that happen automatically? Do your VMs simply fail if the "CSV host" goes down unexpectedly?
Because VM Owner = CSV Owner = no redirected I/O.
This article explains it well. Here's another from fellow Cluster MVP Darryl van der Peijl. And here's a script by Didier Van Hoye to do just that.
Backups are usually what bring out the I/O difficulties when things are redirected. They produce a substantial burst of I/O when they start off.
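You can also see the redirection directly. This shows whether each CSV is in Direct or Redirected mode from each node's point of view (sketch):

# StateInfo per node: Direct, FileSystemRedirected, or BlockRedirected
Get-ClusterSharedVolumeState | Format-Table Name, Node, StateInfo, VolumeFriendlyName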
Ah, those are for Server 2008, ReFS, and backups, respectively. Thank you for the links! They don't apply to my environment, but that's good information to know.
That is for anything on Storage Spaces, Cluster Shared Volumes, and Hyper-V, respectively, since Server 2008.
From your link "Just to be clear, redirected I/O is no longer required to back up a Cluster Shared Volume since WS2012."
We back up the LUNs (OS and data) and use Windows Backup within each VM (overkill?), and we don't use ReFS, which may explain why this hasn't been an issue for us.
We have plenty of in-production S2D (Storage Spaces Direct) and converged (Hyper-V/Storage Spaces nodes + shared SAS JBODs) clusters where I/O is constrained when the backups run.
We use Veeam at the host level. Windows Server Backup is useless IMNSHO.
Having two levels of backup that use VSS can cause VSS collisions, especially if in-guest Volume Shadow Copies (Previous Versions) is in use, which can and will cause data corruption. Please make sure there's a sufficient gap between each layer.
This is a baseline performance run with Windows Server 2019 in a Storage Spaces Direct (S2D) 2-node direct connect cluster:
That 70/30 run produced a respectable 288K IOPS and 3.5GB/Second of throughput.
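For context, a 70/30 read/write profile like that can be driven with DISKSPD; the exact parameters of our run aren't shown here, so treat these as illustrative only:

# 4K blocks, 30% writes (70% reads), random I/O, 8 threads, 32 outstanding I/Os per thread,
# caching disabled, latency stats, against a 50 GiB test file on one of the CSVs
.\diskspd.exe -b4K -w30 -r -t8 -o32 -d120 -Sh -L -c50G C:\ClusterStorage\Volume1\diskspd.dat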
The following images show two very distinct patterns in disk output as we moved VMs off the CSV owner to see what kind of impact we'd get:
The lower bars are VMs with redirected I/O.
Now, this network is a RoCE (RDMA over Converged Ethernet) 40GbE dual-port setup with four ports total per node, so the network is not a constraint.
In most cases, we over-provision our storage I/O network, so when the VMs are not sitting on their CSV owner node it's not noticed.
For clusters where I/O is constrained, though, it causes problems. That's where we keep them aligned on purpose for backups.
ASKER
This is very interesting and good to know information, thank you very much Phil!
So, after reading that redirected I/O link a few times, it sounds like that traffic traverses the "Cluster" networks, which would explain why I saw those messages during my Live Migration switch upgrade: that network also acts as 1 of my 2 "Cluster Communication" networks. Is that correct?
Bingo.
As soon as the CSV owner and VM owner are different we need a path between the two.
EDIT: And the cluster will choose the fastest available network when possible.
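"Fastest" here is driven by the cluster network metrics: roughly speaking, the available network with the lowest metric gets the cluster/CSV traffic. You can check the metrics like this (sketch):

# AutoMetric = True means the cluster assigned the metric itself; lowest metric is preferred
Get-ClusterNetwork | Sort-Object Metric | Format-Table Name, Role, AutoMetric, Metric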
ASKER
Well, interestingly, I don't think my CSV owner and VM owner changed while my Live Migration switch rebooted. The VMs and CSVs remained in their current places, but the owners were most likely already different before and after.
So when this occurs, outside of optimizing the existing setup, are those messages safe to ignore when I reboot my Live Migration switch? I did have another cluster network configured that it could have used for Cluster Communication, and I also had another Live Migration network configured in the event something needed to move between hosts.
Note: My iSCSI networks are not tied to my Live Migration network/switch.
They are safe to ignore. While we can define things, the Cluster Service may take liberties with what's there.
It's why we set up Live Migration Quality of Service (QoS) to make sure LM doesn't kill other networks.
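As one example of that, when Live Migration uses SMB as its transport it can be capped with an SMB bandwidth limit. A sketch, with an arbitrary cap value, run on each node:

# SMB bandwidth limits require this feature on each node
Install-WindowsFeature FS-SMBBW

# Use SMB as the Live Migration transport, then cap it (the 750MB/s value here is illustrative)
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB
Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 750MB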
ASKER
I don't believe we have set up any Live Migration QoS. Our Live Migration network is just a separate network with its own NIC Team on each host.
Is there something else you also setup?
We set up QoS for Live Migration as a rule for setups where there is a SET vSwitch configured with vNICs (virtual network ports) as the SET vSwitch is shared bandwidth. We also do so for HCI (Hyper-Converged Infrastructure) using S2D (Storage Spaces Direct) to protect the node to node communications. Otherwise LM would saturate the storage networks as they are always the fastest ones available.
Has Live Migration SMB QoS been enabled?
To confirm: has Live Migration been limited to the 1GbE network? That is, the other networks are not checked in the Live Migration settings?
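One way to check that from PowerShell: the networks that are not ticked for Live Migration end up in an exclude list on the "Virtual Machine" resource type (parameter name as I recall it, so verify it in your environment).

# Map cluster network names to their IDs
Get-ClusterNetwork | Format-Table Name, Id, Role

# Networks that are NOT allowed for Live Migration are listed here by ID
Get-ClusterResourceType -Name "Virtual Machine" |
    Get-ClusterParameter -Name MigrationExcludeNetworks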