cmaohio
asked on
How to maximize network utilization
My network topology is like this:
Dual Cisco 4507 Core Switches
Cisco 3750-E Server Switches & Access Switches Ports at 1Gbps
10Gbps backbone between Access/Core/Server Switches
My main storage server is a HP DL360 G4p connected at 1Gbps to the Server Switches
It connects to an EMC AX4 via an iSCSI connection at 1Gbps
The other day we had someone copying data from an SD card directly to the storage server over the network. So, removable storage dragged to share drive.
When the person was copying the data from the SD card (about 6 GB I think) it took up 36Mbps on our MRTG graphs and essentially shut everyone else out. Nobody could get to files or do anything. It was really bad. I told her to stop the copy and everything returned to normal. I assumed that it was the slow read rate of the SD card that caused the problem.
My question is this: Why, with a 1Gbps network and a 10Gbps backbone did that take down my network? I mean, all the other servers stopped the ability to talk, for the most part.
Second, I’m having similar issues today when the utilization on the network is reaching about 36Mbps to the APP server and things are dying. What can I do to help this? Why does it barely fill the capacity of the server and it kills everything?
I have verified that the connections are in fact 1000/full duplex connections.
Would a more powerful server help? Is my network not optimized somehow?
When you have some time at night or on a weekend, replicate the user's setup and benchmark your network speed with Jperf. It's a nice little program that can test network transfer speeds.
http://code.google.com/p/xjperf/
Then transfer some files back and forth while monitoring your connection with Wireshark. Wireshark has a variety of statistics and analysis reports that can help you see what problems the network is having.
http://www.wireshark.org/
Also, test the transfer times of a large file from a hard drive versus an SD card and see if you notice any network problems while doing so. There's no reason for a single slow connection to take down the entire network.
If Jperf and Wireshark don't help, can you post a network diagram and the configs from the core switches?
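For the hard drive versus SD card comparison, a small script can time a sequential read of the same file from each source, so the two numbers are measured the same way. This is only a rough sketch and not part of Jperf or Wireshark; the function name `read_throughput_mbps` and the scratch file in the demo are made up for illustration.

```python
import os
import tempfile
import time


def read_throughput_mbps(path, chunk_size=1024 * 1024):
    """Read a file sequentially and return the rate in megabits per second."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS cache still applies.
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total_bytes * 8 / elapsed / 1_000_000


if __name__ == "__main__":
    # Hypothetical demo: a 4 MB scratch file stands in for the real test file.
    with tempfile.NamedTemporaryFile(delete=False) as scratch:
        scratch.write(os.urandom(4 * 1024 * 1024))
    try:
        print(f"{read_throughput_mbps(scratch.name):.1f} Mbps")
    finally:
        os.remove(scratch.name)
```

Point it at the same large file on the hard drive and on the SD card. Use a file that has not been read recently, since a second read usually comes from the operating system's cache and will look much faster than the device really is.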
Sounds like the bottleneck is the IO on the server, not the network.
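The numbers in the question back this up. A quick bit of arithmetic, sketched here with two made-up helper functions, shows how little of the link was in use and how long the copy would take at the observed rate. It assumes the 36Mbps MRTG figure and the roughly 6 GB copy size from the question, and decimal units (1 GB = 8000 megabits).

```python
def link_utilization_percent(observed_mbps, link_mbps):
    """Share of the link in use, as a percentage."""
    return observed_mbps / link_mbps * 100


def transfer_seconds(size_gb, rate_mbps):
    """Time to move size_gb decimal gigabytes at rate_mbps megabits per second."""
    megabits = size_gb * 8000  # 1 GB = 8000 megabits (decimal units)
    return megabits / rate_mbps


if __name__ == "__main__":
    # Figures taken from the question: 36 Mbps observed on a 1 Gbps port.
    utilization = link_utilization_percent(36, 1000)
    minutes = transfer_seconds(6, 36) / 60
    print(f"Utilization: {utilization:.1f}%")
    print(f"Copy time at that rate: {minutes:.1f} minutes")
```

That works out to about 3.6% of a 1Gbps port and roughly 22 minutes for the copy, so the port itself was nowhere near full when things fell over.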
ASKER
Yes Irmoore, it does sound like that. How do I increase that? Trunk two Ethernet ports? Still, I was at less than 33% utilization on the interface.
ASKER
Ah, I see. So, that must be it. They are SATA 7200RPM drives. 4 in a RAID 5 configuration. So, that's my bottleneck. Man, that is unfortunate. All this stellar technology we spent money on and the DRIVES are the problem! damn!
ASKER
Essentially there is no solution but the answer to the question was complete.
No-cost workaround (only costs you time and some drive space):
Backup.
Create 4-drive RAID-10.
Move data back.
You'll take a hit in drive space (4-drive RAID-10 usable space = 2n, versus 4-drive RAID-5 usable space = 3n, where n is the size of one drive), but the performance boost should be noticeable.
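The trade-off can be put into numbers. The sketch below uses the commonly quoted write penalties for small random writes (4 back-end I/Os per write on RAID-5, 2 on RAID-10). The function names are made up for illustration, and the 80 IOPS per drive is only a ballpark assumption for a 7200RPM SATA disk, not a measured value for the drives in this thread.

```python
# Back-end I/Os needed per small random write (commonly quoted values).
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def usable_drives(level, drive_count):
    """Number of drives' worth of usable space for the given RAID level."""
    if level == "raid5":
        if drive_count < 3:
            raise ValueError("RAID-5 needs at least 3 drives")
        return drive_count - 1  # one drive's worth goes to parity
    if level == "raid10":
        if drive_count < 4 or drive_count % 2:
            raise ValueError("RAID-10 needs an even count of 4 or more drives")
        return drive_count // 2  # every drive is mirrored
    raise ValueError(f"unsupported level: {level}")


def random_write_iops(level, drive_count, per_drive_iops):
    """Rough estimate of small random write IOPS for the whole array."""
    return drive_count * per_drive_iops / WRITE_PENALTY[level]


if __name__ == "__main__":
    ASSUMED_DRIVE_IOPS = 80  # assumption: ballpark for one 7200RPM SATA drive
    for level in ("raid5", "raid10"):
        space = usable_drives(level, 4)
        iops = random_write_iops(level, 4, ASSUMED_DRIVE_IOPS)
        print(f"{level}: usable space = {space}n, est. write IOPS = {iops:.0f}")
```

Whatever the real per-drive figure is, the ratio stays the same under these assumptions: the 4-drive RAID-10 handles about twice the small random writes of the 4-drive RAID-5, at the cost of one drive's worth of space.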
As for the original issue: that can happen when a transfer involves multiple locations (here, the SD card). The server is pulling a file that sits on another system's SD card reader, and it will try to finish the transfer as quickly as possible. I personally would not go that route. Instead, have her copy the data to her local system first, then transfer it from the local system to the server.
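If this comes up often, the two-step copy can be scripted. This is a minimal sketch using only the Python standard library; the function name `staged_copy` and the temporary folders in the demo are hypothetical stand-ins for the SD card, the local disk, and the share.

```python
import shutil
import tempfile
from pathlib import Path


def staged_copy(source, staging_dir, destination_dir):
    """Copy source to a local staging folder first, then on to the destination."""
    source = Path(source)
    staging_dir = Path(staging_dir)
    destination_dir = Path(destination_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    destination_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: slow device (e.g. SD card) to local disk.
    staged = Path(shutil.copy2(source, staging_dir / source.name))
    # Step 2: local disk to the server share.
    final = Path(shutil.copy2(staged, destination_dir / source.name))

    # Basic sanity check before removing the staged copy.
    if final.stat().st_size != source.stat().st_size:
        raise OSError(f"size mismatch after copying {source.name}")
    staged.unlink()
    return final


if __name__ == "__main__":
    # Hypothetical demo: temporary folders stand in for the card, disk, and share.
    with tempfile.TemporaryDirectory() as root:
        card = Path(root) / "sdcard"
        card.mkdir()
        photo = card / "example.bin"
        photo.write_bytes(b"\x00" * 1024)
        result = staged_copy(photo, Path(root) / "local", Path(root) / "share")
        print(f"Copied to {result}")
```

The point of the staging step is that the slow read from the card happens entirely on the user's machine, so the server only sees a short, fast transfer from local disk.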