buck57005 asked:

HP network teaming fails after server reboot

Hi.

I am having an issue with network teaming on an HP ProLiant DL380 G3 server (running Windows Server 2003 R2 Standard), connected to a Cisco C2950 switch.

Teaming works fine until the HP server is rebooted.  When the server is rebooted, the network team fails to initialize.  If I log on to the server at the console, open network connections and look at the teaming connection, I either see a message saying that the cable is unplugged, or the acquiring IP address message (despite the server having a static IP).  

If I get the first message (network cable is unplugged) and I leave the network connections window long enough, the second message will appear.  The only way I can restore the teaming interface is to disable it through the network connections window, and then re-enable it.  The interface comes up instantly and all is fine until the next reboot.

In all cases, the individual interfaces (i.e. the team members) display "connected" in the network connections window.  The only item that is selected in the interface properties window of the team members is the HP Network Configuration Utility.

Before I started this morning, I was getting the following error in the Windows System log:

Event Type:      Warning
Event Source:      CPQTeamMP
Event ID:      434
LBSRV03: PROBLEM: A non-Primary Network Link is not receiving. Receive-path validation has been enabled for this Team by selecting the Enable receive-path validation Heartbeat Setting.  ACTION: Please check your cabling to the link partner. Check the switch port status, including verifying that the switch port is not configured as a Switch-assist Channel. Generate Broadcast traffic on the network to test whether these are being received. Also make sure all teamed NICs are on the same broadcast domain. Run diagnostics to test card. Drop the NIC from the team, determine whether it is receiving broadcast traffic in that configuration.

However, with my current configuration, I am now getting the following error in the System log:

Event Type:      Warning
Event Source:      CPQTeamMP
Event Category:      None
Event ID:      461
Description:
Team ID: 0
Aggregation ID: 1
Team Member ID: 1
 PROBLEM: 802.3ad link aggregation (LACP) has failed. ACTION: Ensure all ports are connected to LACP-aware devices.

I have attached a file showing the output of a "show run" and "show etherchannel 3 detail" command on the CISCO switch.  Etherchannels 1 and 2 are working.  These are connected to two other servers.  The extra config on the Etherchannel 3 and FE0/5 and FE0/6 interfaces was added this morning to try and cure the issue.  

All Etherchannels and interfaces are in the native VLAN on the switch.  I tried setting the VLAN ID in the HP config utility to VLAN1 but this didn't seem to make any difference.

I believe I have installed the latest drivers and firmware for the network cards from the page below:

http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=en&cc=us&prodNameId=3288130&prodTypeId=15351&prodSeriesId=316529&swLang=8&taskId=135&swEnvOID=1005#11395

I installed the 14.0.0.7 HP NC-Series Broadcom driver and the 2.1.5.7 broadcom firmware update.

The version of the HP Network Configuration Utility is 10.00.0.12.

I have been searching the net all morning and despite trying several things, the issue remains.  I don't really know what else to try and if anyone has any suggestions, they would be greatly appreciated.

Let me know if you need any other info.

Thanks, Shaun


HP Network Configuration Utility settings:

 hp01.tif

 hp02.tif

 hp03.tif

(All offload settings disabled)

 hp04.tif

 hp05.tif

 hp06.tif

 hp07.tif
ciscooutput.txt
Steve

Are you trying to set up teaming on both the server and the switch?
You should only use aggregation on the server OR the switch.
Member_2_231077

Have you tried taking out the "switchport mode access" and "spanning-tree portfast" from channel-group 3 to make it similar to the two working channel-groups?
What version of the HP Network teaming software are you using?
 
buck57005 (ASKER)

Hi.  Thanks for your comments.


Totallytonto, I am configuring both the switch and server so as to load balance across the two Ethernet ports.  As far as I understand it, you need to have a managed switch and software on the server configured to load balance across a connection.  I might be wrong but I think this is the reason that you cannot configure teaming on a cheap £5 jobbie switch.  Certainly the way I am trying to configure teaming on this server is the same (as far as I can see) as on the working server.  That said, any alternative suggestions are always welcome.


D Vante, the only software I am using is the HP Network Configuration Utility which is version 10.00.0.12.  I have noticed when I click on About in the config utility that the Network Teaming Intermediate Driver (NTID) is version 10.00.00.0.


Andyalder, initially, the Etherchannel 3, FE0/5 and FE0/6 interfaces matched the working interfaces exactly.  However, I still got the same issue.

Just to satisfy my curiosity, I just removed the following commands from the FE0/5 and FE0/6 interfaces:

switchport mode access
spanning-tree portfast

I also removed the following command from the Etherchannel 3 interface:

switchport mode access

I rebooted the server and got the same behaviour.  However, I have an extra event in the System log on the server:

Event Type:      Information
Event Source:      CPQTeamMP
Event Category:      None
Event ID:      439
Description:
LBSRV03: PROBLEM: A non-Primary Network Link is being Closed. This is typically because of a PnP action, possibly it was reconfigured through Network-Properties or through HP Network Configuration Utility? Possibly it was Disabled? Possibly it is being dropped from a Team or the Team is being Dissolved? ACTION: No action is required if the described behavior is expected. Otherwise, investigate the PnP reason, possibly re-enable the miniport.

This is in addition to the 461 events that I listed above.

I also just noticed that there are two 462 events in the System log:

Event Type:      Information
Event Source:      CPQTeamMP
Event Category:      None
Event ID:      462
Description:
Team ID: 0
Aggregation ID: 0
Team Member ID: 0
 802.3ad link aggregation(LACP) has been restored.

The other event is identical except that it references team member 1.  I checked back in the System log: these events occur each time the server is booted, and no negative teaming-related events appear once they have been generated.

It sounds like LACP is unsuccessful initially, and then succeeds, but the teaming connection in Windows never recovers.  

Does anyone know if it's possible to script the disabling and enabling (or possibly a repair) of a network connection using VBScript?  I'm wondering if that may be a possible workaround: publish a computer startup script and put in a delay of 2 or 3 minutes to allow the LACP process to succeed.

Thanks, Shaun
You can use the netsh command from the command line to perform most functions on network cards.

Also, you don't normally set teaming up on the switch AND the server. Just one or the other.
If you set both, you'll find that the server and the switch fight to control the traffic, and the team will fail.
Hi Totallytonto.

Thanks for your message.  

I probably need to clarify: I am enabling teaming on the server, and the majority of the config is done there.  However, surely the switch needs to be LACP-aware, otherwise how would it know how to load balance the traffic?

Cheers
What kind of teaming are you setting up on the server?
-Transmit load balance with fault tolerance
-network fault tolerance
-switch assisted loadbalancing with fault tolerance?
He's using 802.3ad = LACP, it's a form of switch assisted load balancing. It's initiated from the switch since that's active and the server is set to automatic (passive). Could swap them around so that the switch is passive and the server active in the team setup.
Would that be by using the following command for the FE0/5 and FE0/6 interfaces?

channel-group 3 mode passive

Cheers
Yes, that's right, with 'channel-group 3 mode passive' the switch sits there waiting for the server to initialise the LACP channel group.
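
For reference, the change being discussed could be entered on the switch roughly as follows. This is only a sketch: it assumes the team's two ports are FastEthernet0/5 and FastEthernet0/6 in channel-group 3, as described earlier in the thread, and that the switch's IOS release supports LACP. The "show" commands at the end are just one way to check whether the bundle has formed. Note that LACP only negotiates if at least one end is active, so a passive switch needs an active server team.

```
! Sketch only. Port numbers and channel-group number are taken from the thread.
configure terminal
 interface range FastEthernet0/5 - 6
  ! passive = wait for the link partner (the server team) to start LACP
  channel-group 3 mode passive
 end
! After rebooting the server, check whether both ports have joined the bundle
show etherchannel summary
show etherchannel 3 detail
```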
I will certainly give that a try tonight and will let you know how it goes.

Cheers
Don't forget to change the server setting to active in the dropdown box shown in hp02.tif.
ASKER CERTIFIED SOLUTION
buck57005
