Cluster traffic in Microsoft High Availability clusters: routing mix-up

Hi,

I'm managing a 3-node Exchange DAG cluster (not my own design) that shows exceptional network traffic that I don't understand. I hope someone can tell me what is going on. Each node has 3 network adapters:

Server x has the following IPs assigned to its network adapters:
NIC 1: 10.10.10.x/24, Server network - Default Gateway, DNS Servers and adapter registers in the DNS
NIC 2: 10.20.20.x/24, Replication network - No additional configuration,
NIC 3: 10.30.30.x/24, Backup network  - No additional IP configuration, all Microsoft services unbound in the adapter settings.

NIC 1 also has the default gateway. The other networks are dedicated, unrouted, closed VLANs, so they are not attached to the firewall, nor should they ever be. When I look at the firewall logs of the box that serves as the default gateway, I see the following traffic being dropped:

10.10.10.1:3343 udp 10.30.30.2 :3343
10.10.10.1:3343 udp 10.30.30.3 :3343
10.10.10.2:3343 udp 10.30.30.1 :3343
10.10.10.2:3343 udp 10.30.30.3 :3343
10.10.10.3:3343 udp 10.30.30.1 :3343
10.10.10.3:3343 udp 10.30.30.2 :3343

10.30.30.x is a local network on each of the Exchange servers, therefore the Windows routing table shows that the 10.30.30.0/24 network should route over the 10.30.30.x interface.
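
To make that observation concrete, here is a minimal Python sketch that parses firewall drop lines in the format shown above and flags every flow whose destination sits on one of the node's own directly connected networks but whose source sits on a different one. Such packets should never have reached the gateway. The names LOCAL_NETWORKS, parse_drop, network_name and misrouted are hypothetical helpers for illustration, and the log format is assumed to match the excerpt exactly.

```python
import ipaddress

# Directly connected networks of each node, taken from the NIC list above.
LOCAL_NETWORKS = {
    "server": ipaddress.ip_network("10.10.10.0/24"),
    "replication": ipaddress.ip_network("10.20.20.0/24"),
    "backup": ipaddress.ip_network("10.30.30.0/24"),
}


def parse_drop(line):
    """Parse 'SRC:PORT proto DST :PORT' into (src_ip, proto, dst_ip, dst_port)."""
    src, proto, rest = line.split(None, 2)
    src_ip = src.rsplit(":", 1)[0]
    dst_ip, dst_port = (part.strip() for part in rest.rsplit(":", 1))
    return (ipaddress.ip_address(src_ip), proto,
            ipaddress.ip_address(dst_ip), int(dst_port))


def network_name(address):
    """Return the name of the local network containing address, or None."""
    for name, net in LOCAL_NETWORKS.items():
        if address in net:
            return name
    return None


def misrouted(lines):
    """Return flows whose destination is a local network other than the source's."""
    result = []
    for line in lines:
        src, _proto, dst, port = parse_drop(line)
        src_net, dst_net = network_name(src), network_name(dst)
        if dst_net is not None and src_net != dst_net:
            result.append((str(src), src_net, str(dst), dst_net, port))
    return result


# Two of the dropped flows from the firewall log above.
DROPS = [
    "10.10.10.1:3343 udp 10.30.30.2 :3343",
    "10.10.10.2:3343 udp 10.30.30.1 :3343",
]

for flow in misrouted(DROPS):
    print(flow)
```

Every line in the excerpt is flagged this way: the source is on the server network and the destination on the backup network, both of which are on-link for every node.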

The big question now is: Why are the NICs attached to the server VLAN trying to reach the adapters in the backup VLAN?

- I have checked the DNS zone files and the local DNS cache. On each server, neither the DNS zone, the cache, nor the local hosts file contains any reference to the other two servers' IP addresses on the backup VLAN.
- I have checked the IP routing table and all expected networks are properly assigned "on-link" with the matching IP address:

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0     10.10.10.254       10.10.10.1    261
       10.10.10.0    255.255.255.0         On-link        10.10.10.1    261
       10.10.10.1  255.255.255.255         On-link        10.10.10.1    261
     10.10.10.255  255.255.255.255         On-link        10.10.10.1    261
       10.20.20.0    255.255.255.0         On-link        10.20.20.1    261
       10.20.20.1  255.255.255.255         On-link        10.20.20.1    261
     10.20.20.255  255.255.255.255         On-link        10.20.20.1    261
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    306
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    306
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    306
      169.254.0.0      255.255.0.0         On-link     169.254.1.105    261
    169.254.1.105  255.255.255.255         On-link     169.254.1.105    261
  169.254.255.255  255.255.255.255         On-link     169.254.1.105    261
       10.30.30.0      255.255.0.0         On-link        10.30.30.1    261
       10.30.30.1  255.255.255.255         On-link        10.30.30.1    261
   172.29.255.255  255.255.255.255         On-link        10.30.30.1    261
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    306
        224.0.0.0        240.0.0.0         On-link        10.10.10.1    261
        224.0.0.0        240.0.0.0         On-link        10.20.20.1    261
        224.0.0.0        240.0.0.0         On-link        10.30.30.1    261
        224.0.0.0        240.0.0.0         On-link     169.254.1.105    261
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    306
  255.255.255.255  255.255.255.255         On-link        10.10.10.1    261
  255.255.255.255  255.255.255.255         On-link        10.20.20.1    261
  255.255.255.255  255.255.255.255         On-link        10.30.30.1    261
  255.255.255.255  255.255.255.255         On-link     169.254.1.105    261
===========================================================================
Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
          0.0.0.0          0.0.0.0     10.10.10.254  Default

- I have tried to disable the adapters in Cluster Manager as interfaces for cluster communication, but the setting does not stick: when I disable the interface and then recheck the setting, it is enabled again. The traffic still keeps arriving at my firewall.

I have been seeing similar traffic from a Hyper-V cluster, where packets destined for the heartbeat and live migration VLANs are sent from the server VLAN as well, so I think this is a generic Cluster Service issue rather than an Exchange-related one.
Since I am experiencing a lot of network performance issues, I suspect that packets sent from the wrong interfaces play a part in that too, but this example keeps me puzzled. Can anyone explain to me why this is happening?
Asked by: Vuurvos

Amit (IT Architect) commented:
You need to add a persistent route on each server so that this traffic uses that specific NIC. Why?
Read this:
strong host model: https://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx

On each server run this command:
Route add -p 10.38.16.0  MASK 255.255.255.0 <GatewayIP> METRIC 1 IF 12

12 is the interface number from the interface list. If you are still in doubt, send me the route print output for all servers, along with the NIC configuration details. Run this: ipconfig /all > ip.txt
Vuurvos (Author) commented:
Hi Amit, thank you so much for pointing out the strong/weak host sends/receives to me. It's something I wasn't yet aware of in these terms. I've added the output you asked for, and some more, in the attached xlsx. This is a test setup, so none of the information included is security-sensitive.

When I try to add the rule you suggest, I get the following response:

C:\Windows\system32>route add 10.30.30.0 mask 255.255.255.0 10.30.30.6 METRIC 1 IF 18
The route addition failed: The object already exists.

C:\Windows\system32>route add -r 10.30.30.0 mask 255.255.255.0 10.30.30.6 METRIC 1 IF 18
 OK!
I fail to see what the benefit would be of adding this permanently to the static routes table, given that it is already added dynamically anyway. Could you please elaborate on that?

Having said that, I've been reading the article, and if I read it correctly, it means that I need to have weak host sends/receives disabled. As you can see, this is disabled on the interface. I've checked the other interfaces as well, and they are set the same:

Weak Host Sends                    : disabled
Weak Host Receives                 : disabled
Amit (IT Architect) commented:
I don't see any attachment.
Vuurvos (Author) commented:
First time adding files on EE... apparently I forgot to click the upload button...

Experts-Exchange.xlsx
Amit (IT Architect) commented:
Can you check the DAG replication settings in the EMC? You should only enable replication on the replication NIC and disable it on the MAPI/backup NICs.
Vuurvos (Author) commented:
By default, all networks are configured for replication, I see. I'll check with the network team tomorrow whether this traffic has indeed stopped. That might explain where the packets are coming from in this case, but it doesn't explain why they are sent in the first place.

The way I understand it, the Cluster Service (given that I have seen this same behavior on Hyper-V clusters and file server clusters as well) deliberately uses the IP address of the server LAN as the source address of its cluster packets. Since the host is acting as a strong host sender on all interfaces, it will only look at the routes explicitly bound to that interface. Finding the default gateway as a usable route on this interface, it will thus send the cluster packet out through the server LAN adapter. If so, would that mean I could also solve this by explicitly enabling weak host sends on the server LAN adapter? And if so, what would the security ramifications of such a change be?
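
The reasoning above can be sketched as a deliberately simplified Python model of route selection. It is not the Windows implementation: it only assumes that, under strong host sends, a packet whose source address is already fixed is looked up against routes on the interface owning that address, while under weak host sends the lookup is unconstrained. The Route type and select_route function are hypothetical names, and the table is a four-line subset of the route print output from the question, using the /24 mask from the NIC list.

```python
import ipaddress
from typing import NamedTuple


class Route(NamedTuple):
    destination: ipaddress.IPv4Network
    gateway: str      # "On-link" or the next-hop IP address
    interface: str    # local IP address of the interface owning the route
    metric: int


# Subset of node 1's route table from the question.
ROUTES = [
    Route(ipaddress.ip_network("0.0.0.0/0"), "10.10.10.254", "10.10.10.1", 261),
    Route(ipaddress.ip_network("10.10.10.0/24"), "On-link", "10.10.10.1", 261),
    Route(ipaddress.ip_network("10.20.20.0/24"), "On-link", "10.20.20.1", 261),
    Route(ipaddress.ip_network("10.30.30.0/24"), "On-link", "10.30.30.1", 261),
]


def select_route(dst, src=None, strong_host_send=True):
    """Pick the longest-prefix, lowest-metric route for dst.

    If src is given and strong host sends apply, only routes on the
    interface that owns src are considered (simplified model).
    """
    target = ipaddress.ip_address(dst)
    candidates = [r for r in ROUTES if target in r.destination]
    if src is not None and strong_host_send:
        candidates = [r for r in candidates if r.interface == src]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (-r.destination.prefixlen, r.metric))


# Source pinned to the server LAN address, strong host sends:
print(select_route("10.30.30.2", src="10.10.10.1"))
# Same packet with weak host sends allowed:
print(select_route("10.30.30.2", src="10.10.10.1", strong_host_send=False))
```

In this model the first lookup can only match the default route via 10.10.10.254, which is how a packet for 10.30.30.2 would end up at the firewall, while the second lookup picks the on-link route on 10.30.30.1.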
Amit (IT Architect) commented:
I don't see any issue with enabling it. However, I would suggest you open a case with MS to get more clarity on this issue.
nashim khan (Exchange Administrator) commented:
Vuurvos (Author) commented:
For cluster servers, the following article seems to be the most logical explanation for the issue:

http://blogs.technet.com/b/askcore/archive/2014/02/20/configuring-windows-failover-cluster-networks.aspx

Configuring full mesh heartbeat

The Cluster Virtual Network Driver (NetFT.SYS) builds routes between the nodes based on the Cluster property PlumbAllCrossSubnetRoutes.


Value Description

0     Do not attempt to find cross subnet routes if local routes are found
1     Always attempt to find routes that cross subnets
2     Disable the cluster service from attempting to discover cross subnet routes after node successfully joins.

To make a change to this property, you can use the command:

(Get-Cluster).PlumbAllCrossSubnetRoutes = 1

So basically, setting this to 0 should keep the cross-subnet route attempts out of the way. Whether by default, during Exchange setup, or through manual reconfiguration by my predecessor, the property seems to have been set to 1.

Amit (IT Architect) commented:
Thanks for sharing.
Vuurvos (Author) commented:
During a discussion in the Microsoft Partner Channel, I came across the PlumbAllCrossSubnetRoutes property and how it affects the path discovery of the cluster services. With this post I'd like to give other readers some feedback on what I found there.
Windows Server 2012
