  • Status: Solved
  • Priority: Medium
  • Security: Public

Microsoft Cluster Server with LACP network cards constantly disconnect

I have two Windows 2003 servers running in a cluster and I am attempting to increase the bandwidth to them. I recently purchased two Extreme Networks X450a core switches and stacked them. Until now I have had the two LAN NICs on my cluster servers (each server has 2x Intel PRO/1000 PM) teamed using ALB, but I started testing LACP. I created the LACP group on the switches and turned on link aggregation on the teams, and everything seemed to run fine. But now I keep getting event log entries showing that the secondary link on the server disconnects and then rejoins.

First I get a notification that the cluster lost connection with my LAN on the opposite server, then I get event log entries for iANSMiniport IDs 13 and 14 indicating:
13: The Intel(R) PRO/1000 PM Network Connection has been deactivated from the team.
14: Secondary Adapter has rejoined the Team: Intel(R) PRO/1000 PM Network connection.

Each time it does this I get momentary packet loss to the server. Eventually, after about 24 hours or so, the server stops responding on its cluster IP, and when I look at my team, one of the cards has entered Standby even though it should be active.
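
To see how often the link flaps and whether a deactivation was never followed by a rejoin (which would match the card ending up in Standby), the ID 13 and ID 14 events can be paired up. This is only a minimal sketch: the timestamps and the tuple layout below are hypothetical sample data, not taken from the real event log, and `find_flaps` is an illustrative helper name.

```python
from datetime import datetime

# Hypothetical sample rows: (timestamp, event ID, source).
# The timestamps are invented for illustration; only the event IDs
# (13 = deactivated from team, 14 = rejoined team) come from the post.
SAMPLE_EVENTS = [
    ("2000-01-01 09:00:05", 13, "iANSMiniport"),
    ("2000-01-01 09:00:08", 14, "iANSMiniport"),
    ("2000-01-01 13:42:10", 13, "iANSMiniport"),
    ("2000-01-01 13:42:12", 14, "iANSMiniport"),
    ("2000-01-02 09:15:00", 13, "iANSMiniport"),  # never rejoins
]


def find_flaps(events):
    """Pair each ID 13 event with the next ID 14 event.

    Returns (flaps, pending), where flaps is a list of
    (time_down, seconds_until_rejoin) tuples and pending is the time of
    a deactivation that has no matching rejoin yet (or None).
    """
    flaps = []
    down_at = None
    for stamp, event_id, _source in events:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if event_id == 13:
            down_at = when
        elif event_id == 14 and down_at is not None:
            flaps.append((down_at, (when - down_at).total_seconds()))
            down_at = None
    return flaps, down_at


flaps, pending = find_flaps(SAMPLE_EVENTS)
for time_down, seconds in flaps:
    print(f"link left the team at {time_down}, rejoined after {seconds:.0f}s")
if pending is not None:
    print(f"link left the team at {pending} and has not rejoined")
```

Comparing these times against the switch log for the same ports would show whether the switch also saw the link drop.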

At first I thought it was a physical problem with one server in the cluster, so I failed it over, and as soon as I failed over to the backup server the same thing occurred there. At this point I'm better off going back to ALB, but I want to resolve this issue to get more bandwidth to the server.
Asked by: Umbra-IT
2 Solutions
 
from_exp Commented:
Hi!
Please provide the logs from the X450a.

My initial assumption is that you have a problem with the LACP configuration.
You could try configuring the Extremes as LACP master (active) and the server as LACP slave (passive).
Please also provide your current LACP config on the Extremes. Do you use ports from different stack units for the LAG? This may be a stacking issue (check whether your firmware is current), so try using ports from the same stack unit for the LAG (yes, it is not ideal, but you can try that for testing purposes).

Stacking on the X450a is a rather new feature, so I expect it may contain some flaws...
 
Umbra-IT (Author) Commented:
Yes, I have the same port on two different switches in the stack in the LACP configuration. The way I set it up on the Extreme stack was just:
enable sharing 1:34 grouping 1:34, 2:34 algorithm address-based L2 lacp

I think I just noticed an issue when looking at the server. It appears that I have an untagged VLAN running on almost all ports of the two switches, except that it seems to have excluded 2:34. I wonder if that is the issue.

I was playing around with the LACP setup, and it seems that when you add a port to sharing on the Extreme, it takes the port out of the VLAN it is in.
 
Umbra-IT (Author) Commented:
I actually just tried to re-create the sharing group, and sure enough, when I create the link aggregation it takes the secondary port out of my VLAN. When I attempt to re-add the secondary port to the VLAN I get:

* Slot-1 Stack.17 # config vlan "UMBNET" add ports 2:33
Error: Port 2:33 is in load sharing mode. Perform all configuration changes to load share master (1:33).
Configuration failed on backup MSM, command execution aborted!

I'm wondering if this has anything to do with the secondary card going offline and back online. If the secondary network link is not in the same VLAN, wouldn't that cause issues if the primary link failed? Also, how would I be able to get full throughput? Unless, because it's considered a shared port, the VLAN is automatically on the secondary port even though the config doesn't show it.
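
On the throughput question: with an address-based L2 algorithm, the link used for a frame is chosen from its source and destination MAC addresses, so a given pair of hosts always lands on the same member link and the extra bandwidth only shows up across many clients. The following is a minimal sketch of that idea, assuming a simple XOR-and-modulo hash. It is not Extreme's actual hashing implementation, and the MAC addresses are made up for illustration.

```python
def mac_to_int(mac):
    """Convert a MAC address string such as 00:15:17:00:00:01 to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def pick_link(src_mac, dst_mac, num_links=2):
    """Illustrative L2 hash (an assumption, not the vendor's algorithm):
    XOR the two addresses and take the result modulo the link count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % num_links


# Hypothetical addresses, used only to demonstrate the distribution.
SERVER = "00:15:17:00:00:01"
CLIENTS = ["00:1b:21:00:00:%02x" % i for i in range(1, 9)]

for client in CLIENTS:
    # The same client/server pair maps to the same link every time.
    print(client, "-> link", pick_link(client, SERVER))
```

With only one client talking to the server, every frame in that direction hashes to one link, so a single transfer cannot exceed the speed of one NIC.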

 
from_exp Commented:
What does show log say about the problem?
Please try using ports within the same switch for testing.
I have experience with X450a LAGs without stacking, on software 11.6.
I would configure the VLANs on both ports first and only then add them to the LAG.
And again, please note that LACP can be master or slave. Please try playing with that: configure the server as slave and the switch as master.
 
Umbra-IT (Author) Commented:
I'm running version 12.0.1.11. Maybe I need a firmware upgrade; I think 12.0.3 is generally available.

I did first put both ports into the VLAN and then created the sharing group on the switch, which resulted in the master port remaining in the VLAN but the secondary port being removed. I will try setting it up on a single switch to see if that makes any difference. I'm also currently updating the Intel drivers to see if that helps.
 
from_exp Commented:
OK, I wish you good luck with that.
I would suggest upgrading the firmware on the switches.
Please come back with the results.
 
Umbra-IT (Author) Commented:
It turns out that, after everything discussed above, we installed two new Intel PRO/1000 cards and disabled the onboard cards, which seems to have resolved the issue. At this point I believe it was a physical issue with the motherboard's onboard network cards.