epolli
Cluster network is Partitioned, network connections are Unreachable
I have an Exchange 2007 CCR cluster that runs in a Windows 2008 cluster.
Two days ago, the active node had performance problems (reading from / writing to the DB files took an abnormally long time). Since a manual failover to the other node did not work, I rebooted the active node. Exchange started without problems on the other node.
Since that reboot, the Cluster Manager shows the Public Network as "Partitioned" and both connections in that network as "Unreachable".
When I stop the cluster service on the passive node, the network status changes to "Up", and the connection status to "Up" (active node) and "Down" (passive node).
Both servers are in the same VLAN, they can ping each other, I can ping them from another machine.
The "Validate a Configuration" Wizard does not find any problems.
In the event log, FailoverClustering events 1126 and 1129 are logged ("Cluster network interface...is unreachable by at least one other cluster node...", "Cluster network 'LAN' is partitioned").
Does anyone have an idea how the network got partitioned, since there don't seem to be network problems, and what I can do to fix this problem?
Did you reboot the other node as well ??
ASKER
No, I couldn't, since a failover to the passive node does not work in the current condition.
You mean we have ServerA and ServerB?
ServerA (active) had performance issues, so there was a failover to ServerB and a reboot of ServerA.
Now the NIC shows as Partitioned while ServerA is up, and as Up while it is down. ServerB is now active, and failover to ServerA is not happening.
Are you trying the failover through Failover Clustering?
Did you try the command prompt?
ASKER
I tried the failover in the Exchange Management Console. I did not try the command prompt (you mean running "Move-ClusteredMailboxServer" in the Exchange Management Shell?).
So you think that rebooting ServerB could fix the problem?
Open a normal command prompt and run these commands
Cluster res
Cluster group
Cluster group "Group Name" /move
ASKER
thanks - I'll try that tonight (because I cannot disconnect users now) and will let you know the results tomorrow.
Cool, I'll wait for your update :)
ASKER
The failover did not work since the IP Address resource did not start.
I had to stop the cluster service on the second node so that the IP Address (and then all other services) could start.
After restarting the cluster service, the network is still Partitioned.
Are either of the networks (heartbeat and public) teamed?
Also see MS article below:
http://support.microsoft.com/kb/296799
Windows 2000 article on networking:
http://support.microsoft.com/kb/242600
ASKER
No, none of the networks is teamed.
Looks like the LAN interfaces can communicate with external hosts, but not with each other - right? I wonder why, since the "Validate a Cluster" wizard does not find any problems!?
Can the nodes ping and RDP each other?
ASKER
The problem is solved:
I have updated drivers and firmware for NICs, RAID Controller, etc.
I also connected the LAN interface of server1 to another switch port, since it could only connect at 100 Mbit/s (instead of 1 Gbit/s) on the original switch port.
Rancy:
Can you check whether any updates are required for the NICs?
ASKER
was part of the solution
Hi,
I can confirm that reinstalling the NIC drivers helped to solve the issue with partitioned cluster networks.
In our case, too, the networks were configured correctly and worked properly everywhere except in the cluster.
So ping, Telnet and RDP were working, but cluster connectivity was not.
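One possible reason ping, Telnet and RDP can all succeed while the cluster still reports the network as partitioned: those checks exercise ICMP and TCP, whereas the cluster heartbeat travels over UDP port 3343. Below is a minimal Python sketch of a UDP round-trip check. It assumes Python is available on the nodes; the function names are illustrative and not part of any cluster tooling, and on a real node the listener could only bind port 3343 while the cluster service is stopped there.

```python
import socket
import threading

CLUSTER_PORT = 3343  # UDP port used by failover cluster heartbeats


def serve_echo(sock, count=1):
    """Echo `count` datagrams received on an already-bound UDP socket."""
    for _ in range(count):
        data, peer = sock.recvfrom(2048)
        sock.sendto(data, peer)


def udp_round_trip(host, port, payload=b"probe", timeout=2.0):
    """Return True if `payload` comes back from host:port within `timeout`."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(payload, (host, port))
        try:
            data, _ = s.recvfrom(2048)
        except OSError:
            # Covers timeouts and "connection reset" errors for closed ports.
            return False
        return data == payload


def loopback_demo():
    """Self-test on 127.0.0.1 using an OS-assigned port instead of 3343."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_echo, args=(listener,), daemon=True)
    worker.start()
    ok = udp_round_trip("127.0.0.1", port)
    worker.join(timeout=2.0)
    listener.close()
    return ok


if __name__ == "__main__":
    print("loopback UDP round trip ok:", loopback_demo())
```

To use it between two nodes, one side would bind a socket to its own address on CLUSTER_PORT and call serve_echo, and the other side would call udp_round_trip against that address. A failed round trip only shows that UDP on that port is not getting through; it does not say whether the driver, the switch or a firewall is responsible.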
Hi All,
I faced the same issue on Windows Server 2008 R2 with a SQL Server 2008 cluster.
There are two 10 Gb network cards for the public network (teamed) and one 1 Gb network card for the heartbeat.
Initially both were set to "Auto negotiate" on the servers. After checking with the network team, I learned that the switch side was fixed at 10 Gb for the public network and 100 Mb for the heartbeat.
I changed the NIC settings on both nodes to match, and the issue disappeared.
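The comparison described here (NIC setting versus switch port setting, per network) can be sketched in a few lines of Python. This is only an illustration: the dictionaries are typed in by hand from the values mentioned in this post, the names are made up, and nothing here queries a real NIC or switch.

```python
def find_mismatches(nic_side, switch_side):
    """List (link, nic_setting, switch_setting) where the two sides differ."""
    mismatched = []
    for link, switch_setting in switch_side.items():
        nic_setting = nic_side.get(link)
        if nic_setting != switch_setting:
            mismatched.append((link, nic_setting, switch_setting))
    return mismatched


# Speeds in Mbit/s, or "auto" for auto-negotiation (hand-entered values).
switch_side = {"public": 10000, "heartbeat": 100}
nic_before = {"public": "auto", "heartbeat": "auto"}
nic_after = {"public": 10000, "heartbeat": 100}

if __name__ == "__main__":
    print("before:", find_mismatches(nic_before, switch_side))
    print("after: ", find_mismatches(nic_after, switch_side))
```

The point is simply that a fixed speed on the switch with auto-negotiation on the server counts as a mismatch, even though the link may still come up and pass ordinary traffic.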
Just for completeness: there is a known issue that causes communication problems inside a cluster when the local OS firewall is enabled. This is reported here. The working solution for us was to run the following command in PowerShell on each cluster node and then perform a full reboot:
netsh advfirewall firewall set rule name="Failover Clusters (UDP-In)" new enable=no