Techie solution
asked on
Windows Cluster 2016 error, Node 1 unavailable after a restart
Windows Cluster 2016 error: Node 1 is unavailable after a restart (after installing Patch 1 for the SQL cluster). I pinged the IP and it responds, and I am able to RDP to the server. All disks are up and running; the only issue is that the NODE1 network is showing as unavailable.
Please Advise.
ASKER
It was running prior to the reboot. There is an error with event ID 5398.
In an elevated PowerShell session, please run the following and post the results in a CODE snippet:
Get-NetAdapter
Get-NetIPAddress
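If the full output is too long to post, a condensed view may be easier to read. This is only a minimal sketch: the column selection is my own choice, not something the cluster requires, and it assumes the standard NetAdapter and NetTCPIP modules that ship with Windows Server 2016.

```powershell
# Adapters with the fields that matter most for cluster networking.
Get-NetAdapter |
    Sort-Object ifIndex |
    Format-Table Name, ifIndex, Status, LinkSpeed, MacAddress -AutoSize

# IPv4 addresses only. An AddressState other than 'Preferred'
# (for example 'Tentative') is worth a second look.
Get-NetIPAddress -AddressFamily IPv4 |
    Sort-Object InterfaceIndex |
    Format-Table InterfaceAlias, InterfaceIndex, IPAddress, PrefixLength, AddressState -AutoSize
```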
ASKER
PS C:\WINDOWS\system32> Get-NetAdapter
Name       InterfaceDescription                    ifIndex Status MacAddress        LinkSpeed
----       --------------------                    ------- ------ ----------        ---------
Cluster    Intel(R) 82574L Gigabit Network Co...#2       3 Up     00-50-xx.xx.xx-D0    1 Gbps
Production Intel(R) 82574L Gigabit Network Conn...       4 Up     00-50-xx.xx.xx-99    1 Gbps
PS C:\WINDOWS\system32> Get-NetIPAddress
IPAddress : fe80::b1f4:d8a3:a2d5:efe0%3
InterfaceIndex : 3
InterfaceAlias : Cluster
AddressFamily : IPv6
Type : Unicast
PrefixLength : 64
PrefixOrigin : WellKnown
SuffixOrigin : Link
AddressState : Preferred
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : fe80::878:6c1:f868:6c66%2
InterfaceIndex : 2
InterfaceAlias : Local Area Connection* 2
AddressFamily : IPv6
Type : Unicast
PrefixLength : 64
PrefixOrigin : WellKnown
SuffixOrigin : Link
AddressState : Deprecated
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : fe80::f18a:98:3a1c:ff4e%4
InterfaceIndex : 4
InterfaceAlias : Production
AddressFamily : IPv6
Type : Unicast
PrefixLength : 64
PrefixOrigin : WellKnown
SuffixOrigin : Link
AddressState : Preferred
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : ::1
InterfaceIndex : 1
InterfaceAlias : Loopback Pseudo-Interface 1
AddressFamily : IPv6
Type : Unicast
PrefixLength : 128
PrefixOrigin : WellKnown
SuffixOrigin : WellKnown
AddressState : Preferred
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : 192.xx.xx.2
InterfaceIndex : 3
InterfaceAlias : Cluster
AddressFamily : IPv4
Type : Unicast
PrefixLength : 24
PrefixOrigin : Manual
SuffixOrigin : Manual
AddressState : Preferred
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : 169.xx.xx.102
InterfaceIndex : 2
InterfaceAlias : Local Area Connection* 2
AddressFamily : IPv4
Type : Unicast
PrefixLength : 16
PrefixOrigin : WellKnown
SuffixOrigin : Link
AddressState : Tentative
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : 169.xx.xx.228
InterfaceIndex : 2
InterfaceAlias : Local Area Connection* 2
AddressFamily : IPv4
Type : Unicast
PrefixLength : 16
PrefixOrigin : Manual
SuffixOrigin : Manual
AddressState : Tentative
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : 10.xx.xx.161
InterfaceIndex : 4
InterfaceAlias : Production
AddressFamily : IPv4
Type : Unicast
PrefixLength : 24
PrefixOrigin : Manual
SuffixOrigin : Manual
AddressState : Preferred
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
IPAddress : 127.0.0.1
InterfaceIndex : 1
InterfaceAlias : Loopback Pseudo-Interface 1
AddressFamily : IPv4
Type : Unicast
PrefixLength : 8
PrefixOrigin : WellKnown
SuffixOrigin : WellKnown
AddressState : Preferred
ValidLifetime : Infinite ([TimeSpan]::MaxValue)
PreferredLifetime : Infinite ([TimeSpan]::MaxValue)
SkipAsSource : False
PolicyStore : ActiveStore
I don't see a NIC team (NetLbfoTeam) in that output, at least?
It's preferable to set up a NIC team and bind the virtual switch to that team. In this case, with only two ports, allow the host OS to share the team so that it has an IP on the production network.
In a cluster setting, only the IP of the management adapter should register a DNS A record for production.
To do so:
Set-DNSClient -InterfaceAlias Cl* -RegisterThisConnectionsAddress $False
A SET switch (Switch Embedded Teaming) would allow for the creation of two virtual NICs (vNICs) to run on the two subnets needed.
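A rough sketch of what the SET approach could look like is below. It assumes the Hyper-V role is installed (SET only exists as part of a Hyper-V virtual switch) and that both physical ports can carry both subnets. The names 'SETvSwitch', 'vProduction' and 'vCluster' are made up for illustration; 'Production' and 'Cluster' are the adapter names from the output above. Run it from a console session rather than RDP, since the physical adapters lose their IP configuration when they are bound to the switch.

```powershell
# Assumption: Hyper-V role is present. Switch and vNIC names are hypothetical.
New-VMSwitch -Name 'SETvSwitch' `
    -NetAdapterName 'Production', 'Cluster' `
    -EnableEmbeddedTeaming $true `
    -AllowManagementOS $false

# Two host vNICs, one per subnet. IP addresses must be assigned to these
# afterwards, because they no longer live on the physical adapters.
Add-VMNetworkAdapter -ManagementOS -SwitchName 'SETvSwitch' -Name 'vProduction'
Add-VMNetworkAdapter -ManagementOS -SwitchName 'SETvSwitch' -Name 'vCluster'

# Host vNICs show up as 'vEthernet (<name>)'. Keep the cluster-only
# vNIC from registering itself in DNS.
Set-DnsClient -InterfaceAlias 'vEthernet (vCluster)' -RegisterThisConnectionsAddress $false
```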
ASKER
This configuration was not needed initially. It was working fine.
Did you look at the Failover Cluster Manager messages and try running a "Validate Cluster..." report?
ASKER
Any guidance please.
There are obvious errors that a systems/networking engineer will have to look at and fix before you run cluster validation again and bring it back online as a "cluster". Hopefully everything is still running well on node2, right? Just make sure automatic failover is disabled on it until node1 is brought back into the cluster. Alternatively, as this is a VM, do you by any chance have a snapshot of node1 taken just before the patch was applied? If yes, you could try reverting node1 to that snapshot.
ASKER
No, we don't have any snapshots. Also, can you guide me through how to troubleshoot it?
The log settings we use to troubleshoot are in this blog post: A Microsoft Cluster Troubleshooting Guide
Both FCM and PowerShell are useful for finding the problem.
Message Analyzer can also be tuned to help in a 2012 RTM/R2 and 2016 setting though we've not used it in a long time.
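On the PowerShell side, pulling the cluster log is usually a reasonable first step. A minimal sketch, assuming the FailoverClusters module is available on the node you run it from; the destination folder is a hypothetical path and the 60-minute window is an arbitrary choice:

```powershell
Import-Module FailoverClusters

# Hypothetical output folder; any local path with free space will do.
$destination = 'C:\Temp\ClusterLogs'
New-Item -ItemType Directory -Path $destination -Force | Out-Null

# Generate cluster.log for the last 60 minutes and copy it to $destination.
# -UseLocalTime writes timestamps in local time instead of UTC, which makes
# it easier to line entries up with the System event log.
Get-ClusterLog -Destination $destination -TimeSpan 60 -UseLocalTime
```

Widen -TimeSpan so that it covers the reboot after the patch was installed.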
ASKER
I evicted node 1, and when I try to rejoin it, it throws the error below.
H--My-Pictures-C1.PNG
ASKER
Should we destroy the cluster and recreate it with the same nodes? If yes, what is the best practice for doing this?
I have two very thorough EE articles on all things Hyper-V:
Some Hyper-V Hardware and Software Best Practices
Practical Hyper-V Performance Expectations
This is a sample PowerShell script for setting up a cluster node.
The goal in a cluster setting is to remove as many single points of failure (SPFs) as possible. NIC teaming is one such method of doing so.
+ BIOS and firmware up to date on all nodes prior
+ Install OS
+ Install drivers
+ Set up NetLbfoTeam
+ Install Hyper-V and Cluster Roles
+ Set up clustered storage and CSVs
+ Set up Hyper-V to work with C:\ClusterStorage out of the box for new VMs
+ Set up & bind the virtual switch/SET Switch (prefer not shared with host OS but port count must be >2)
+ Import VMs (assuming there's already workloads on the current cluster)
The above should be a good start.
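A few of the steps above can be sketched in PowerShell as follows. This is an outline under assumptions, not the linked script: the adapter names 'NIC1' and 'NIC2', the team name 'Team1', the switch name 'vSwitch' and the 'Volume1' folder are all hypothetical, and it assumes a Windows Server 2016 host with the CSV already mounted.

```powershell
# Set up the NetLbfoTeam (adapter and team names are placeholders).
New-NetLbfoTeam -Name 'Team1' -TeamMembers 'NIC1', 'NIC2' `
    -TeamingMode SwitchIndependent -LoadBalancingAlgorithm Dynamic

# Install the Hyper-V and Failover Clustering roles. -Restart reboots the
# node, so the remaining commands have to be run in a new session.
Install-WindowsFeature -Name Hyper-V, Failover-Clustering -IncludeManagementTools -Restart

# After the reboot, and once the CSV is mounted: point Hyper-V at
# C:\ClusterStorage for new VMs.
Set-VMHost -VirtualMachinePath 'C:\ClusterStorage\Volume1' `
    -VirtualHardDiskPath 'C:\ClusterStorage\Volume1'

# Bind the virtual switch to the team. With only two ports the host OS
# has to share the switch, hence -AllowManagementOS $true.
New-VMSwitch -Name 'vSwitch' -NetAdapterName 'Team1' -AllowManagementOS $true
```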
What exactly was "Patch 1 for SQL cluster"? Was that SQL SP1, or something else?
Was the cluster running ALL resources on Node2 prior to the patching/restart of Node1?
If you RDP into node2, start Failover Cluster Manager on it, and connect to your cluster, what errors, if any, are there from the last 24 hours?
You should right-click your cluster name there and run a "Validate Cluster..." report, which will show any potential issues in detail.
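The same checks can be done from an elevated PowerShell session on node2. A minimal sketch, assuming the FailoverClusters module is installed there. Validation is limited to the network and system configuration tests here, which is my own choice to avoid running the storage tests against a cluster that is carrying production load.

```powershell
Import-Module FailoverClusters

# Node, network and per-node interface state as the cluster sees them.
Get-ClusterNode             | Format-Table Name, State, Id -AutoSize
Get-ClusterNetwork          | Format-Table Name, State, Role, Address -AutoSize
Get-ClusterNetworkInterface | Format-Table Node, Network, Adapter, State -AutoSize

# Failover Clustering events from the last 24 hours in the System log.
# SilentlyContinue avoids an error when nothing matches the filter.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-FailoverClustering'
    StartTime    = (Get-Date).AddHours(-24)
} -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated |
    Format-List TimeCreated, Id, LevelDisplayName, Message

# Equivalent of "Validate Cluster...", restricted to two test categories.
Test-Cluster -Include 'Network', 'System Configuration'
```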