Vuurvos

Cluster traffic in Microsoft High Availability clusters: routing mess-up

Hi,

I'm managing a 3-node Exchange DAG cluster (not my own design) that shows unusual network traffic I don't understand. I hope someone can tell me what is going on. Each node has 3 network adapters.

Server x has the following IPs assigned to its network adapters:
NIC 1: 10.10.10.x/24, Server network - default gateway, DNS servers, and the adapter registers in DNS
NIC 2: 10.20.20.x/24, Replication network - no additional configuration
NIC 3: 10.30.30.x/24, Backup network - no additional IP configuration, all Microsoft services unbound in the adapter settings

Only NIC 1 has a default gateway. The other networks are dedicated, unrouted, closed VLANs, so they are not attached to the firewall, nor should they ever be. When I look at the logs of the firewall that acts as the default gateway, I see the following traffic being dropped:

10.10.10.1:3343 udp 10.30.30.2:3343
10.10.10.1:3343 udp 10.30.30.3:3343
10.10.10.2:3343 udp 10.30.30.1:3343
10.10.10.2:3343 udp 10.30.30.3:3343
10.10.10.3:3343 udp 10.30.30.1:3343
10.10.10.3:3343 udp 10.30.30.2:3343

10.30.30.x is a local network on each of the Exchange servers; therefore, as the Windows routing table shows, traffic for the 10.30.30.0/24 network should route over the 10.30.30.x interface.

The big question now is: why are the NICs attached to the server VLAN trying to reach the adapters in the backup VLAN?

- I have checked the DNS zone files and the local DNS cache: on each server there are no entries in DNS, the cache, or the local hosts file that point to the other two servers' IP addresses on the backup VLAN.
- I have checked the IP routing table and all expected networks are properly assigned "on-link" with the matching IP address:

IPv4 Route Table
===========================================================================
Active Routes:
Network Destination        Netmask          Gateway       Interface  Metric
          0.0.0.0          0.0.0.0     10.10.10.254       10.10.10.1    261
       10.10.10.0    255.255.255.0         On-link        10.10.10.1    261
       10.10.10.1  255.255.255.255         On-link        10.10.10.1    261
     10.10.10.255  255.255.255.255         On-link        10.10.10.1    261
       10.20.20.0    255.255.255.0         On-link        10.20.20.1    261
       10.20.20.1  255.255.255.255         On-link        10.20.20.1    261
     10.20.20.255  255.255.255.255         On-link        10.20.20.1    261
        127.0.0.0        255.0.0.0         On-link         127.0.0.1    306
        127.0.0.1  255.255.255.255         On-link         127.0.0.1    306
  127.255.255.255  255.255.255.255         On-link         127.0.0.1    306
      169.254.0.0      255.255.0.0         On-link     169.254.1.105    261
    169.254.1.105  255.255.255.255         On-link     169.254.1.105    261
  169.254.255.255  255.255.255.255         On-link     169.254.1.105    261
       10.30.30.0      255.255.0.0         On-link        10.30.30.1    261
       10.30.30.1  255.255.255.255         On-link        10.30.30.1    261
   172.29.255.255  255.255.255.255         On-link        10.30.30.1    261
        224.0.0.0        240.0.0.0         On-link         127.0.0.1    306
        224.0.0.0        240.0.0.0         On-link        10.10.10.1    261
        224.0.0.0        240.0.0.0         On-link        10.20.20.1    261
        224.0.0.0        240.0.0.0         On-link        10.30.30.1    261
        224.0.0.0        240.0.0.0         On-link     169.254.1.105    261
  255.255.255.255  255.255.255.255         On-link         127.0.0.1    306
  255.255.255.255  255.255.255.255         On-link        10.10.10.1    261
  255.255.255.255  255.255.255.255         On-link        10.20.20.1    261
  255.255.255.255  255.255.255.255         On-link        10.30.30.1    261
  255.255.255.255  255.255.255.255         On-link     169.254.1.105    261
===========================================================================
Persistent Routes:
  Network Address          Netmask  Gateway Address  Metric
          0.0.0.0          0.0.0.0     10.10.10.254  Default

- I have tried to disable the adapters for cluster communication in Failover Cluster Manager, but the setting does not stick: when I disable the interface and then recheck the setting, it is enabled again. The traffic still keeps arriving at my firewall.
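
As a sanity check on the routing table above, here is a minimal Python sketch of an ordinary longest-prefix route lookup. It is only a model, not how Windows implements routing: the `ROUTES` list and the `best_route` helper are made up for illustration, and the backup network is entered as /24 (as in the NIC list) rather than with the netmask shown in the pasted output.

```python
import ipaddress

# (destination network, gateway or None for on-link, interface address, metric)
# Hand-copied subset of the route print; the backup network is assumed to be /24.
ROUTES = [
    ("0.0.0.0/0", "10.10.10.254", "10.10.10.1", 261),
    ("10.10.10.0/24", None, "10.10.10.1", 261),
    ("10.20.20.0/24", None, "10.20.20.1", 261),
    ("10.30.30.0/24", None, "10.30.30.1", 261),
]


def best_route(destination, routes=ROUTES):
    """Return the matching route with the longest prefix, then the lowest metric."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r[0])]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, -r[3]),
    )


for target in ("10.30.30.2", "10.30.30.3", "8.8.8.8"):
    network, gateway, interface, _metric = best_route(target)
    print(f"{target} -> {network} via {gateway or 'On-link'} on {interface}")
```

With an unconstrained lookup like this, the backup addresses resolve to the on-link route on 10.30.30.1, which is why the drops on the default gateway look wrong at first sight.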

I have been seeing similar traffic from a Hyper-V cluster, where packets destined for the heartbeat and live migration VLANs are sent from the server VLAN as well, so I think this is a generic Cluster service issue rather than something Exchange-related.
Since I am experiencing a lot of network performance issues, I suspect that packets sent from the wrong interfaces play a part in that too, but this example keeps me puzzled. Can anyone explain why this is happening?
Amit

You need to add a persistent route on each server so that this traffic uses the specific NIC. Why?
Read this article on the strong host model: https://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx

On each server run this command:
Route add -p 10.38.16.0  MASK 255.255.255.0 <GatewayIP> METRIC 1 IF 12

12 is the interface index from the interface list. If you still have doubts, send me the route print output and the NIC configuration details for all servers. To capture the NIC details, run: ipconfig /all > ip.txt

ASKER

Hi Amit, thank you so much for pointing out the strong/weak host sends and receives to me. It's something I wasn't yet aware of in these terms. I've added the output you asked for, and some more, in the attached xlsx. This is a test setup, so none of the included information is security-sensitive.

When I try to add the route you suggest, I get the following response:

C:\Windows\system32>route add 10.30.30.0 mask 255.255.255.0 10.30.30.6 METRIC 1 IF 18
The route addition failed: The object already exists.

C:\Windows\system32>route add -r 10.30.30.0 mask 255.255.255.0 10.30.30.6 METRIC 1 IF 18
 OK!

I fail to see what the benefit would be of adding this permanently to the static routes table, given that the route is already added dynamically anyway. Could you please elaborate on that?

Having said that, I've been reading the article and, if I read it correctly, it means that I need to have weak host sends/receives disabled. As you can see, both are disabled on this interface, and I've checked that the other interfaces are set the same:

Weak Host Sends                    : disabled
Weak Host Receives                 : disabled

I don't see any attachment.

ASKER

First time to add files on EE... apparently forgot to click the upload button...

Experts-Exchange.xlsx

Can you check the DAG replication settings in the EMC? You should enable replication only on the replication NIC and disable it on the MAPI/Backup NICs.

ASKER

I see that all networks are configured for replication by default. I'll check with the network team tomorrow whether this traffic has indeed stopped. That might explain where the packets are coming from in this case, but it doesn't explain why they are sent anyway.

The way I understand it, the Cluster service (I have seen this same behavior on Hyper-V clusters and file server clusters too) deliberately uses the server LAN IP address as the source address of its cluster packets. Since the host acts as a strong host sender on all interfaces, it only considers the routes bound to that interface. Finding the default gateway as a usable route on this interface, it sends the cluster packet out through the server LAN adapter. If so, would I also be able to solve this by explicitly enabling weak host sends on the server LAN adapter? And if so, what would the security ramifications of such a change be?
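
To make that reasoning concrete, here is a minimal Python sketch of a route lookup that is constrained by the source address. It models my reading of the strong host send behavior, not the actual Windows implementation; `ROUTES`, `select_route`, and the `weak_host_send` flag are illustrative names, and the backup network is assumed to be /24 as in the NIC list.

```python
import ipaddress

# (destination network, gateway or None for on-link, interface address, metric)
# Hand-copied subset of the route print; the backup network is assumed to be /24.
ROUTES = [
    ("0.0.0.0/0", "10.10.10.254", "10.10.10.1", 261),
    ("10.10.10.0/24", None, "10.10.10.1", 261),
    ("10.20.20.0/24", None, "10.20.20.1", 261),
    ("10.30.30.0/24", None, "10.30.30.1", 261),
]


def select_route(destination, source=None, weak_host_send=False, routes=ROUTES):
    """Pick a route, optionally constrained to the interface owning `source`."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r[0])]
    if source is not None and not weak_host_send:
        # Assumed strong host send rule: only routes on the interface
        # that owns the source address are considered.
        candidates = [r for r in candidates if r[2] == source]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r[0]).prefixlen, -r[3]),
    )


cases = [
    ("no source bound", {}),
    ("source 10.10.10.1, strong send", {"source": "10.10.10.1"}),
    ("source 10.10.10.1, weak send", {"source": "10.10.10.1", "weak_host_send": True}),
]
for label, kwargs in cases:
    network, gateway, interface, _metric = select_route("10.30.30.2", **kwargs)
    print(f"{label}: {network} via {gateway or 'On-link'} on {interface}")
```

Under these assumptions, a packet bound to 10.10.10.1 and destined for 10.30.30.2 only matches the default route and is handed to 10.10.10.254, which would match the drops seen on the firewall. With weak host sends in this model, the same packet would leave through the 10.30.30.1 interface while keeping the 10.10.10.1 source address.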

I don't see any issue in enabling it. However, I would suggest you open a case with Microsoft to get more clarity on this issue.
ASKER CERTIFIED SOLUTION
Vuurvos
This solution is only available to members.
Thanks for sharing.

ASKER

During the discussion in the Microsoft Partner Channel, I came across the PlumbAllCrossSubnetRoutes property and how it affects the path discovery of the Cluster service. With this post I'd like to share what I found there with other readers.