Network/LAN performance of a server

I am a developer concerned about the network performance of one server.  We have about 8 servers set up at a data center in their own cabinet/LAN.  If I copy a 10M file between most of these servers, it takes about 1 second.

When copying the file to/from our main web server, it takes well over 5 minutes.  But when copying from the database server to this web server, it goes for a couple minutes, then stops with this error:    
     Cannot Copy (FileName).   The Specified Network Name is No Longer Available

I see orphaned JVM requests slowly increasing on this web server until it runs out of JVM memory and I have to restart the service.  I assume this is a result of the problem and not the problem itself.

The server is a Dell PowerEdge 2850 running Windows Server 2003 Web Edition.  It has teamed 1 Gbps NICs; please see the attached images.

Any idea where to look or how to troubleshoot this?  Please give me an idea.  Remember I am a developer, not a server admin, so use small words :)

Are you going across a WAN, or through any type of optimization service? The traffic could be getting optimized on one end.  I have seen problems with the TOE (TCP offload engine) chip on some network cards that cause this problem. Try disabling TOE in the device properties of your NIC driver. Have you tried using just one of the NICs instead of teaming?
There are a number of possibilities:

- the network segment that the machine is on is heavily oversubscribed
- there is a fault on the switch port
- there is bad cabling
- the driver is corrupt (might be worth re-setting up your teaming)
- you have a faulty network card, or bad card settings (rx/tx offloads, cache, etc.)

Start with the easy things like switch ports and cables.

Do you always test between the same two machines?
Can you put a crossover cable between, say, your laptop and the machine in question? How does that perform?
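
To make the comparison between machines repeatable, the copy can be timed by a small script instead of a stopwatch. The following is only a minimal sketch: the function name `timed_copy` and the file names are made up for illustration, it assumes a reasonably recent Python is available on the machine running the test, and the demo copies to a local temp folder. For a real test, the destination would be a share on the other server.

```python
import os
import shutil
import tempfile
import time


def timed_copy(src, dst):
    """Copy src to dst and return (seconds, megabits_per_second)."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast copies of small files.
    elapsed = max(elapsed, 1e-9)
    mbps = (size_bytes * 8) / elapsed / 1_000_000
    return elapsed, mbps


if __name__ == "__main__":
    # Hypothetical paths: a 10 MB scratch file copied within a temp folder.
    # Point dst at a share on the remote server to test the network path.
    workdir = tempfile.mkdtemp()
    src = os.path.join(workdir, "test_10mb.bin")
    dst = os.path.join(workdir, "copy_10mb.bin")
    with open(src, "wb") as f:
        f.write(os.urandom(10 * 1000 * 1000))
    seconds, mbps = timed_copy(src, dst)
    print(f"copied in {seconds:.3f} s, about {mbps:.1f} Mbit/s")
    shutil.rmtree(workdir)
```

Running the same script against each server, and again over the crossover cable, gives numbers that can be compared directly.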

gdemaria (Author) commented:
Scarrison, thanks for your response.
>the network segment that the machine is on, is over subscribed heavily
Does my first image help with this?  I am wondering why a 1 Gbps card shows its speed as 100 Mbps.  Also, the send/receive count is 178,xxx,xxx.  Is that a big or normal load?

> do you always test between the same two machines?
I have tested between this server and three other servers.  One server gives that error, the other two just take 5+ minutes.   I also tested between other servers (not involving this problem one) and they copy the file in 1 second.

CohosEvamy, thanks for your response as well.  I don't know what a TOE is but I will do some research on it.

If the Gigabit Ethernet cards are not plugged into a Gigabit Ethernet switch, they will only operate in 100 Mbps mode.
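
For a sense of scale, here is a rough back-of-the-envelope calculation of how long the test file should take at each link speed. It is only a sketch: it assumes "10M" means 10 megabytes (10,000,000 bytes), ignores protocol overhead, and the function names are made up for illustration.

```python
def ideal_transfer_seconds(size_bytes, link_mbps):
    """Best-case time to move size_bytes over a link of link_mbps megabits/s."""
    return (size_bytes * 8) / (link_mbps * 1_000_000)


def effective_mbps(size_bytes, seconds):
    """Throughput actually achieved when size_bytes took the given seconds."""
    return (size_bytes * 8) / seconds / 1_000_000


FILE_BYTES = 10 * 1_000_000  # assumption: "10M" means 10 megabytes

for label, mbps in (("100 Mbps", 100), ("1 Gbps", 1000)):
    print(f"{label}: {ideal_transfer_seconds(FILE_BYTES, mbps):.2f} s")
# prints "100 Mbps: 0.80 s" and "1 Gbps: 0.08 s"

# A copy that takes 5 minutes (300 s) is moving data at roughly:
print(f"{effective_mbps(FILE_BYTES, 300):.2f} Mbit/s")  # prints "0.27 Mbit/s"
```

Under these assumptions, even a 100 Mbps link would move the file in under a second, so the lower link speed on its own would not account for a copy that takes 5+ minutes.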

The send and receive numbers have nothing to do with each other, and they say nothing about the load, as those packets could be 16 bits or 64400 bits.  The packet counts shown there cover the entire "connected" duration of 156 days and 8+ hours.

That is a rather long time to maintain a connection; if possible, see whether rebooting the server helps.
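
To put the counter in perspective, the total can be averaged over the connected time. This is a rough sketch only: the exact count is not given in the thread (it reads 178,xxx,xxx), so 178,000,000 is used as a lower bound, the "8+ hours" is taken as 8, and the function name is made up for illustration.

```python
def average_packets_per_second(packets, days, hours):
    """Average packet rate over a connection lasting days + hours."""
    uptime_seconds = days * 86_400 + hours * 3_600
    return packets / uptime_seconds


# Assumption: 178,000,000 as a lower bound for the 178,xxx,xxx counter.
rate = average_packets_per_second(178_000_000, 156, 8)
print(f"about {rate:.1f} packets/s on average")  # prints "about 13.2 packets/s on average"
```

An average like this hides bursts, so it says little about peak load, which is the point made above about the counters.
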
Sounds like intermittent comms. What service pack are you running on the problem child server? SP1 can flood a single NIC because of a bug that improperly configures the MTU settings. If using SP1, consider going to SP2.

Also, your NIC teaming could have failed. I have seen that before.
gdemaria (Author) commented:
> What service pack are you running on the problem child server

My OS is Windows Server 2003 Web Edition  5.2.3790  Service Pack 0.0
I'm a bit surprised I don't have any service packs installed as I have it set on automatic updates.  

> SP1 can flood a single nic

Is there some way to view the traffic on each NIC to see if this is happening?
Looking at the "status" window of each NIC (the first image at the top of the screen), one NIC shows 200,000,000 and the other shows 57,000,000.  One seems to be getting roughly 3.5x the traffic of the other.  Perhaps that's normal, with low-volume times using just the one NIC?

> Also, your nic teaming could have failed. I have seen that before.

Is there a way to test to see if this is happening?  I see that the send/receive packet counts are increasing on both NICs.

Well, how many nodes do you have on the network?

If less than 250 nodes, consider breaking the NIC team and using one NIC. Multihomed servers are problematic, at best.

gdemaria (Author) commented:
Chief, sorry to have disappeared. I've been traveling a lot and didn't even know another response had been logged.

> If less than 250 nodes, consider breaking the NIC team and using one NIC. Multihomed servers are problematic, at best.

This is really interesting.  There are just a handful of servers located at a data center: two database servers and several web servers.  Interestingly, this is the only server that uses a teamed NIC configuration.  It was recently recommended to me to change the database server to team the NICs in order to double the throughput and provide fail-over.

You're suggesting that teaming the NICs could cause this problem?  
I always thought that all servers used teamed NICs for redundancy...

Should I try unteaming them?
I see you have your NICs in fault tolerant mode, not load balancing.  Using the Intel network tool, can you switch which network card is "master"?
gdemaria (Author) commented:
Thanks S.C. for your reply!
I've attached a few more images; it looks like no primary is set.  The second NIC has some usage, but very little.

Given that only one card is really being used, perhaps this isn't my problem?  Is there really any way to know, or is it just trial and error?

Do you recommend any changes to this configuration?
Simply to isolate the problem, it would be useful to see whether removing one network card from the team solves the issue (just unplug the active one; this should force the secondary to take over).
gdemaria (Author) commented:
Ironically, this issue was finally resolved on Friday (two days ago).  We still don't know why it happened, but it seems to be fixed.  We replaced the switch and set the NICs to auto-detect; one of those things resolved the problem.  Thanks for all your ideas.
gdemaria (Author) commented:
I am objecting to the auto close with the sole purpose of making the link to "accept multiple solutions" appear.  There is an EE bug that hides this link when the autoclose is in progress.  I will object and split points.  Moderator, nothing to do here except report the bug so it is resolved.