Networking

Joe Lowe🇺🇸

Network Maintenance on Failover Cluster

We have a 14-node Microsoft Failover Cluster that has 4 networks configured.

2 iSCSI 10GbE networks for our SAN and CSVs - Cluster Use: None

1 Live Migration network - Cluster Use: Cluster Only

1 Production Network - Cluster Use: Cluster and Client



We have planned maintenance on the stacked switches that carry the Production network, which will take them down for about 2 minutes at most. Ideally we do not want to shut down all VMs and the Cluster for this, since the network will only be down while the switches reboot.


Outside of there being a small network blip for VMs, will this negatively impact my Cluster? In other words, can I expect that the Cluster will not totally fail, shut down, etc.?

Philip Elder🇨🇦

Can the switches be selectively rebooted to allow for the cluster and the switching fabric to reroute during each reboot?

Taking them down point-blank would not be a happy place to be for the cluster.

What kind of network teaming and virtual switch setup is there on each node?

Joe Lowe🇺🇸

ASKER

Unfortunately they cannot be. The stack reboots together as a whole.

The NIC team is on the Production network, across the 2 stacked switches.
The virtual switch on each node is set up to point to this same team.

There is also a NIC team on the Live Migration network. This network is not getting rebooted.


Philip Elder🇨🇦

We always force the switches out of Stack Mode when that happens so that we can reboot each one individually to avoid any issues.

Setting the storage fabric to Cluster Only would be one way to let the cluster nodes keep communicating and thus not lose touch.

You can disable Live Migration for that fabric as well as a precaution.


Joe Lowe🇺🇸

ASKER

Right now we have Cluster Only on the Live Migration network already. Since that network is not being worked on and will remain online, could that still help keep the Cluster in a good state while the Production network (Cluster and Client) goes offline briefly? 

Philip Elder🇨🇦

Are the Live Migration and the Production Network fabrics on the same switches or different ones?

If different, then yes that should work out just fine.
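
A minimal sketch of that reasoning, assuming the four networks described in the question. The role numbers follow the cluster network Role property (0 = None, 1 = Cluster Only, 3 = Cluster and Client); the `switch_group` labels and the helper name `surviving_cluster_networks` are hypothetical and only serve the illustration. It checks which networks can still carry cluster traffic while one set of switches is down.

```python
from dataclasses import dataclass

# Role values mirror the failover cluster network Role property:
# 0 = None, 1 = Cluster Only, 3 = Cluster and Client.
ROLE_NONE = 0
ROLE_CLUSTER_ONLY = 1
ROLE_CLUSTER_AND_CLIENT = 3


@dataclass(frozen=True)
class ClusterNetwork:
    name: str
    role: int
    switch_group: str  # hypothetical label for the physical switches carrying it


def surviving_cluster_networks(networks, switch_groups_down):
    """Return the networks that can still carry cluster traffic during the outage."""
    return [
        n for n in networks
        if n.role in (ROLE_CLUSTER_ONLY, ROLE_CLUSTER_AND_CLIENT)
        and n.switch_group not in switch_groups_down
    ]


# Layout taken from the question; switch group names are made up.
networks = [
    ClusterNetwork("iSCSI-A", ROLE_NONE, "san"),
    ClusterNetwork("iSCSI-B", ROLE_NONE, "san"),
    ClusterNetwork("Live Migration", ROLE_CLUSTER_ONLY, "lm-switch"),
    ClusterNetwork("Production", ROLE_CLUSTER_AND_CLIENT, "meraki-stack"),
]

survivors = surviving_cluster_networks(networks, {"meraki-stack"})
print([n.name for n in survivors])  # prints ['Live Migration']
```

If the Live Migration network shared switches with Production, the list would come back empty, which is the case to avoid.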

Joe Lowe🇺🇸

ASKER

Yes, the Live Migration network runs on a separate network and separate switch from the Production network.


Joe Lowe🇺🇸

ASKER

Since the Live Migration and Production networks are totally separate and run on separate switches from one another, would you say it's safe to perform the maintenance on Production, and that the only thing we should see is a brief disruption in network traffic on the VMs?

A bit of info on the stacked switches that are going to undergo maintenance: these are Cisco Meraki switches, so I'm not sure removing them from stack membership and re-adding them is seamless. When I called support about options to reboot one at a time, they didn't suggest that as an option.


Philip Elder🇨🇦

The VMs would be disconnected/unreachable for as long as the switches are offline/rebooting.

We've seen some go much longer than two minutes for reboots. Is the time it takes a known quantity?

As far as rebooting them one at a time, I'm not sure now that I think about it. Maybe they could be, but in my experience the single management pane means you click Reboot and they all go.

Joe Lowe🇺🇸

ASKER

Are you referring to the Meraki switches?

Meraki has released their next firmware upgrade. The switches stay online while the upgrade downloads, then reboot themselves when it's time to apply it. Support confirmed that the model we have (MS225) reboots rather quickly and should take no more than 2 minutes. We've previously had this exact same thing happen unexpectedly, and the Cluster appeared okay afterwards except for alerting that it lost connection to the virtual switch, like you mentioned. Because it was so unexpected last time, we were scrambling to make sure everything was okay, but we did confirm the switches were down for about 2 minutes during the reboot.

This week the MS125 switches at some of our offices went through the upgrade, and they were offline for 3 minutes.


Philip Elder🇨🇦

Yes. Is the reboot time a known quantity, in that the switch reboot time has actually been observed?

Sounds to me like you're pretty much good to go though especially since an outage was experienced and things behaved themselves.

Joe Lowe🇺🇸

ASKER

Yes, it appears to be.

Awesome, thank you Philip for your help. Now that we've had a chance to properly plan and schedule this, we wanted to be 110% sure before going ahead.