MGA2
2-Node NLB Cluster - Convergence/Duplicate Cluster Subnets Issues
We are running an IIS 7 NLB on Server 2008, and our NLB cluster continuously generates System Events (Source: NLB) saying that it is converging and/or that a node is leaving the cluster. We've had timeout issues, which I am almost positive relate to the fact that one of the hosts is leaving the cluster while users are connected to the website.
I have researched this but have found nothing.
This MS site http://technet.microsoft.com/en-us/library/cc726431(WS.10).aspx
shows some of the Event IDs that I am receiving, primarily IDs 18, 28 & 69, but it says 28 & 69 are normal conditions. We also get an Event stating that "NLB detected duplicate subnets". Looking at my configuration, I do not have duplicate subnets.
We have another NLB running Server 2003 with no issues.
Can anyone give some insight?
NLB Display output is below to review configuration.
NLB Cluster Control Utility V2.5 (c) 1997-2007 Microsoft Corporation.
Cluster 172.16.16.210
=== Configuration: ===
Current time = 7/9/2009 10:09:19 AM
ParametersVersion = 5
CurrentVersion = V2.5
EffectiveVersion = 00000204
InstallDate = 0x49AE7B83
HostPriority = 2
ClusterName = webhosts.mgx2.com
ClusterIPAddress = 172.16.16.210
ClusterNetworkMask = 255.255.240.0
DedicatedIPAddresses/ = 172.16.16.205/255.255.240.0
DedicatedNetworkMasks
McastIPAddress = 0.0.0.0
ClusterNetworkAddress = 02-bf-ac-10-10-d2
IPToMACEnable = ENABLED
MulticastSupportEnable = DISABLED
IGMPSupport = DISABLED
MulticastARPEnable = ENABLED
MaskSourceMAC = ENABLED
AliveMsgPeriod = 1000
AliveMsgTolerance = 5
MaxConnectionDescriptors = 262144
FilterICMP = DISABLED
ClusterModeOnStart = STARTED
PersistedStates = NONE
NBTSupportEnable = ENABLED
UnicastInterHostCommSupport = ENABLED
BDATeaming = NO
TeamID =
Master = NO
ReverseHash = NO
IdentityHeartbeatPeriod = 10000
NumberOfRules (7):
VIP Start End Prot Mode Pri Load Affinity
--------------- ----- ----- ---- -------- --- ---- --------
172.16.16.24 0 65535 Both Multiple Eql Single
172.16.16.25 0 65535 Both Multiple Eql Single
172.16.16.26 0 65535 Both Multiple Eql Single
172.16.16.30 0 65535 Both Multiple Eql Single
172.16.16.40 0 65535 Both Multiple Eql None
172.16.16.41 0 65535 Both Multiple Eql Single
172.16.16.42 0 65535 Both Multiple Eql Single
=== Event messages: ===
#24246 ID: 0x0000001c Type: 4 Category: 0 Time: 7/9/2009 10:07:50 AM
NLB cluster [172.16.16.210]: Host 2 converged with host(s): 1,2. It is now an active member of the NLB cluster and will start load balancing traffic.
#24245 ID: 0x00000045 Type: 4 Category: 0 Time: 7/9/2009 10:07:43 AM
NLB cluster [172.16.16.210]: NLB is initiating convergence on host 2 because host 1 is leaving the cluster.
#24239 ID: 0x0000001c Type: 4 Category: 0 Time: 7/9/2009 9:54:35 AM
NLB cluster [172.16.16.210]: Host 2 converged with host(s): 1,2. It is now an active member of the NLB cluster and will start load balancing traffic.
#24238 ID: 0x00000045 Type: 4 Category: 0 Time: 7/9/2009 9:54:28 AM
NLB cluster [172.16.16.210]: NLB is initiating convergence on host 2 because host 1 is leaving the cluster.
#24228 ID: 0x0000001c Type: 4 Category: 0 Time: 7/9/2009 8:28:56 AM
NLB cluster [172.16.16.210]: Host 2 converged with host(s): 1,2. It is now an active member of the NLB cluster and will start load balancing traffic.
#24227 ID: 0x00000045 Type: 4 Category: 0 Time: 7/9/2009 8:28:50 AM
NLB cluster [172.16.16.210]: NLB is initiating convergence on host 2 because host 1 is leaving the cluster.
#24226 ID: 0x0000001c Type: 4 Category: 0 Time: 7/9/2009 8:27:35 AM
NLB cluster [172.16.16.210]: Host 2 converged with host(s): 1,2. It is now an active member of the NLB cluster and will start load balancing traffic.
#24225 ID: 0x00000045 Type: 4 Category: 0 Time: 7/9/2009 8:27:29 AM
NLB cluster [172.16.16.210]: NLB is initiating convergence on host 2 because host 1 is leaving the cluster.
#24224 ID: 0x0000001c Type: 4 Category: 0 Time: 7/9/2009 8:11:56 AM
NLB cluster [172.16.16.210]: Host 2 converged with host(s): 1,2. It is now an active member of the NLB cluster and will start load balancing traffic.
#24223 ID: 0x00000045 Type: 4 Category: 0 Time: 7/9/2009 8:11:50 AM
NLB cluster [172.16.16.210]: NLB is initiating convergence on host 2 because host 1 is leaving the cluster.
=== IP configuration: ===
Windows IP Configuration
Host Name . . . . . . . . . . . . : webhost2
Primary Dns Suffix . . . . . . . : MGADM
Node Type . . . . . . . . . . . . : Hybrid
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : MGADM
Ethernet adapter Private Backend:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : HP NC373i Multifunction Gigabit Server Adapter #2
Physical Address. . . . . . . . . : 00-23-7D-A1-31-3C
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
IPv4 Address. . . . . . . . . . . : 172.16.16.35(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
Default Gateway . . . . . . . . . : 172.16.17.1
DNS Servers . . . . . . . . . . . : 10.0.0.200
10.0.0.9
Primary WINS Server . . . . . . . : 10.0.0.200
Secondary WINS Server . . . . . . : 10.0.0.9
NetBIOS over Tcpip. . . . . . . . : Enabled
Ethernet adapter NLB Public Frontend:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : HP NC373i Multifunction Gigabit Server Adapter
Physical Address. . . . . . . . . : 02-BF-AC-10-10-D2
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
IPv4 Address. . . . . . . . . . . : 172.16.16.205(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.24(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.25(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.26(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.30(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.40(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.41(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.42(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
IPv4 Address. . . . . . . . . . . : 172.16.16.210(Preferred)
Subnet Mask . . . . . . . . . . . : 255.255.240.0
Default Gateway . . . . . . . . . : 172.16.17.1
DNS Servers . . . . . . . . . . . : 10.0.0.200
10.0.0.9
Primary WINS Server . . . . . . . : 10.0.0.200
Secondary WINS Server . . . . . . : 10.0.0.9
NetBIOS over Tcpip. . . . . . . . : Enabled
Tunnel adapter Local Area Connection* 8:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : isatap.{B2CD1FE5-E0CD-4B7C-9785-28CF130540D7}
Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Tunnel adapter Local Area Connection* 9:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : isatap.{9E204A3D-D29A-4D4B-BC34-E52AC2C55323}
Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
Tunnel adapter Local Area Connection* 11:
Media State . . . . . . . . . . . : Media disconnected
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Teredo Tunneling Pseudo-Interface
Physical Address. . . . . . . . . : 02-00-54-55-4E-01
DHCP Enabled. . . . . . . . . . . : No
Autoconfiguration Enabled . . . . : Yes
=== Current state: ===
Host 2 has entered a converging state 8 time(s) since joining the cluster
and the last convergence completed at approximately: 7/9/2009 10:07:52 AM
Host 2 converged with the following host(s) as part of the cluster:
1, 2
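Since the "duplicate subnets" event is part of the question, here is a minimal Python sketch that computes which network each adapter sits on, using only the addresses and masks from the ipconfig output above. The adapter labels are copied from that output. Whether a shared subnet is what actually triggers the NLB event is an assumption here, not something the post confirms.

```python
import ipaddress

# Addresses and masks copied from the ipconfig output in the post.
adapters = {
    "Private Backend": ("172.16.16.35", "255.255.240.0"),
    "NLB Public Frontend": ("172.16.16.205", "255.255.240.0"),
}

def network_of(ip, mask):
    """Return the IPv4 network that contains ip under the given mask."""
    return ipaddress.ip_interface(f"{ip}/{mask}").network

networks = {name: network_of(ip, mask) for name, (ip, mask) in adapters.items()}
for name, net in networks.items():
    print(f"{name}: {net}")

# True when every adapter resolves to the same network.
same_subnet = len(set(networks.values())) == 1
print("Both adapters on the same subnet:", same_subnet)
```

Both adapters resolve to 172.16.16.0/20, and both list the same default gateway (172.16.17.1), so the "Private Backend" NIC and the NLB NIC share one subnet. That may be worth ruling out before looking elsewhere.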
Check the network connections and equipment that carry your heartbeat traffic.
This has the feel of a heartbeat disconnect.
The cluster does not track the other machine through the regular network traffic (well, it kind of does, but it does not monitor it). It knows about the other machines in the cluster through the heartbeat connection. So if a machine is leaving the cluster and then coming back, that makes me think it is losing "sight" of the other machine and trying to run the cluster on its own. Once it sees the other machine is available again, it rejoins the cluster as it should.
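To put a rough number on how long host 1 has to be out of "sight" before host 2 reacts, here is a small Python sketch using the AliveMsgPeriod and AliveMsgTolerance values from the posted configuration. It assumes AliveMsgPeriod is in milliseconds and that convergence starts once AliveMsgTolerance consecutive heartbeats have been missed; the function name is made up for illustration.

```python
def heartbeat_loss_window_seconds(alive_msg_period_ms, alive_msg_tolerance):
    """Approximate silence (in seconds) before a peer is treated as gone.

    Assumption: the period is in milliseconds and convergence begins after
    `alive_msg_tolerance` consecutive heartbeats are missed.
    """
    return alive_msg_period_ms * alive_msg_tolerance / 1000.0

# Values copied from the "nlb display" output in the question.
window = heartbeat_loss_window_seconds(alive_msg_period_ms=1000,
                                       alive_msg_tolerance=5)
print(f"Peer treated as missing after roughly {window:.0f} s without heartbeats")
```

Under those assumptions, a gap of about five seconds in heartbeat delivery would be enough to start a convergence, so even short interruptions on the switch or NIC could produce the pattern in the event log.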