Microsoft Cluster Server: LACP-teamed network cards constantly disconnect

Posted on 2008-11-04
Last Modified: 2013-11-09
I have two Windows 2003 servers running in a cluster and I am trying to increase the bandwidth to them. I recently purchased two Extreme Networks x450a core switches and stacked them. Until now, the two LAN NICs on each cluster server (each server has 2x Intel PRO/1000 PM) were teamed using ALB, but I started testing LACP. I created the LACP group on the switches and turned on link aggregation on the teams, and everything seemed to run fine. But now I keep getting event log entries showing that the secondary link on the server disconnects and then rejoins.

First I get a notification that the cluster lost its connection to the LAN on the opposite server, then I get event log entries from iANSMiniport with IDs 13 and 14, indicating:
13: The Intel(R) PRO/1000 PM Network Connection has been deactivated from the team.
14: Secondary Adapter has rejoined the Team: Intel(R) PRO/1000 PM Network connection.
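
To compare how often the team flaps before and after a change (driver update, single-switch LAG, new cards), the two events can be tallied from an exported log. This is only a rough sketch: it assumes the events have been saved as text lines in the same "<id>: <message>" form quoted above, and the function name `summarize_flaps` is made up for illustration.

```python
import re
from collections import Counter

# Event IDs as quoted in the post: 13 = adapter deactivated from the team,
# 14 = secondary adapter rejoined the team.
DEACTIVATED, REJOINED = 13, 14

# Assumed line format "<id>: <message>", matching how the events are quoted above.
EVENT_RE = re.compile(r"^\s*(\d+):\s*(.*)$")


def summarize_flaps(lines):
    """Count team deactivate/rejoin events and report the last state seen."""
    counts = Counter()
    last_id = None
    for line in lines:
        match = EVENT_RE.match(line)
        if not match:
            continue  # skip anything that is not an "<id>: <message>" line
        event_id = int(match.group(1))
        if event_id in (DEACTIVATED, REJOINED):
            counts[event_id] += 1
            last_id = event_id
    return {
        "deactivated": counts[DEACTIVATED],
        "rejoined": counts[REJOINED],
        # True when the most recent event was a deactivation with no rejoin after it
        "still_out_of_team": last_id == DEACTIVATED,
    }


if __name__ == "__main__":
    sample = [
        "13: The Intel(R) PRO/1000 PM Network Connection has been deactivated from the team.",
        "14: Secondary Adapter has rejoined the Team: Intel(R) PRO/1000 PM Network connection.",
        "13: The Intel(R) PRO/1000 PM Network Connection has been deactivated from the team.",
    ]
    print(summarize_flaps(sample))
```

A deactivation count that exceeds the rejoin count, with the last event being ID 13, would match the state where one card is left out of the team.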

Each time this happens I get momentary packet loss to the server. Eventually, after about 24 hours or so, the server stops responding on its cluster IP, and when I look at my team, one of the cards has entered Standby even though it should be active.

At first I thought it was a physical problem with one server in the cluster, so I failed it over, and as soon as I failed over to the backup server the same thing occurred there. At this point I'm better off going back to ALB, but I want to resolve this issue to get more bandwidth to the server.
Question by: Umbra-IT

    Expert Comment

    Please provide the logs from the x450a.

    My initial assumption is that you have a problem with the LACP configuration.
    You could try configuring the Extremes as LACP master (active) and the server as LACP passive (slave).
    Please also post your current LACP config on the Extremes. Do you use ports from different stack units for the LAG? This may be a stacking issue (check that your firmware is current), so try using ports from the same stack unit for the LAG (not ideal, but fine for testing purposes).

    Stacking on the X450a is a rather new feature, so I expect it may contain some flaws...

    Author Comment

    Yes, I have the same port on two different switches in the stack in the LACP configuration. The way I set it up on the Extreme stack was just:
    enable sharing 1:34 grouping 1:34, 2:34 algorithm address-based L2 lacp

    I think I just noticed an issue when looking at the server. It appears that I have an untagged VLAN running on almost all ports of the two switches, except that it seems to have excluded 2:34. I wonder if that is the issue.

    I was playing around with the LACP, and it seems that when you add a port to sharing on the Extreme it takes the port out of the VLAN it is in.

    Author Comment

    I actually just tried to re-create the sharing group, and sure enough, when I create the link aggregation it takes the secondary port out of my VLAN. When I attempt to re-add the secondary port to the VLAN I get:

    * Slot-1 Stack.17 # config vlan "UMBNET" add ports 2:33
    Error: Port 2:33 is in load sharing mode. Perform all configuration changes to load share master (1:33).
    Configuration failed on backup MSM, command execution aborted!

    I'm wondering if this has anything to do with the secondary card going offline and back online. If the secondary network link is not in the same VLAN, wouldn't that cause issues if the primary link failed? Also, how would I be able to get full throughput? Unless, because it is considered a shared port, the VLAN is automatically on the secondary port even though the config doesn't show it.
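
On the throughput question: with address-based L2 distribution the switch chooses a LAG member from the frame's MAC addresses, so one conversation between two MACs stays on a single link and the extra bandwidth only shows up across several peers. The sketch below is just a toy model of that idea. The XOR-and-modulo hash is a stand-in and not the actual ExtremeXOS algorithm, the MAC addresses are made up, and `pick_member` is a hypothetical helper name.

```python
def mac_to_int(mac):
    """Convert a MAC address string such as 02:00:00:00:00:10 to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def pick_member(src_mac, dst_mac, members):
    """Pick a LAG member port from the source and destination MAC addresses.

    Toy hash (XOR of both MACs, modulo the member count). This is an
    illustration only, NOT the switch's real distribution algorithm.
    """
    index = (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % len(members)
    return members[index]


if __name__ == "__main__":
    lag_members = ["1:34", "2:34"]      # the two ports from the sharing group
    server = "02:00:00:00:00:10"        # made-up MAC for the server team
    clients = ["02:00:00:00:00:01", "02:00:00:00:00:02"]  # made-up client MACs

    for client in clients:
        # Every frame of a given client/server pair maps to the same port.
        print(client, "->", pick_member(client, server, lag_members))
```

In this model a single client never uses more than one link, while different clients can spread across both.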

    Expert Comment

    What does "show log" say about the problem?
    Please try using ports within the same switch for testing.
    I have experience with x450a LAGs without stacking, on software 11.6.
    I would configure the VLANs on both ports first and only then add them to the LAG.
    Again, please note that LACP can be master or slave, so try experimenting with that: configure the server as slave and the switch as master.

    Author Comment

    I'm running version  Maybe I need a firmware upgrade; I think 12.0.3 is generally available.

    I did first put both ports into the VLAN and then created the sharing group on the switch, which resulted in the master port remaining in the VLAN but the secondary port being removed. I will try setting it up on a single switch to see if that makes any difference. I'm also currently updating the Intel drivers to see if that helps.

    Assisted Solution

    OK, good luck with that.
    I would suggest upgrading the firmware on the switches.
    Please come back with the results.

    Accepted Solution

    It turns out that, after everything that was said, we installed two new Intel PRO/1000 cards and disabled the onboard cards, which seems to have resolved the issue. At this point I believe it was a physical issue with the motherboard's onboard network cards.
