dtk12
asked on
How to improve Cisco 6509E port-channel load balancing when only 2 IP addresses are used
We have what I think is an unusual situation for port-channel load balancing. We use a 6509E between our NetApp SAN and our VM clusters, with port-channels of 4 or 6 1G links connecting the hosts and the SAN. Port-channel load balancing is set to src-dst-ip, and this works reasonably well for the customer port-channels, but the storage port-channels aren't performing well at all. In our setup each host has one IP address for attachment to the SAN, and the SAN has one IP per datastore. This creates a very limited number of source/destination pairs, resulting in very poor distribution across the links.
Storage Port-Channel 189
PC 189 IN 17415128909
G1/1/26 160482 0.0009%
G1/2/22 49 0.0000%
G2/1/11 17417839321 100.0156%
G2/2/22 439737 0.0025%
PC 189 OUT 25555359398
G1/1/26 3274093321 12.8%
G1/2/22 88373277 0.3%
G2/1/11 3376972167 13.2%
G2/2/22 18861999453 73.8%
Note that on the inbound side only one port is being used, because everything connects to a single IP. Outbound packets are distributed based on the IP of the datastore being connected to.
I have been told by our SAN folks that we can't add more IPs on either side. We are hoping to find another method of distributing packets other than src/dst IP. MACs won't work either, due to the small number of MAC addresses on the link.
Does anyone know of another approach to load balancing on a 6509E?
Thanks
It depends. Are you using iSCSI or NFS? How is the NetApp configured: EtherChannel or LACP?
Most of the SANs I've worked with only allow active/passive connections. You can't use LACP with a lot of these devices. You must use MPIO instead.
@craigbeck: With the NetApp it depends, since it can be iSCSI, which supports MPIO, or NFS, which does not, in which case you have to use datastore load balancing. LACP is link aggregation, which is supported on the NetApp side but not on the VMware side (an EtherChannel port may be used instead).
You can use multiple vmkernel ports on the iscsi switch, bind to initiator, and configure round robin on the datastore for MPIO.
ASKER
More info: These are NFS datastores and we are using Etherchannel.
Thanks
There is no way to do MPIO with NFS. The best you can do is use a separate IP for each datastore so that you at least get multiple sessions across the EtherChannel. Typically you don't want to use too many NICs; they will not all be used. It appears that you're not using more than 2 NICs' worth of throughput. Have you looked at your IOPS? What FAS model?
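To illustrate why a separate IP per datastore helps: a simplified sketch of XOR-based src-dst-ip link selection (not Cisco's exact hash; the addresses and 4-link count are hypothetical) shows that each additional destination IP creates a distinct hash result, so flows can land on more member links.

```python
# Sketch: with src-dst-ip hashing, each extra datastore IP adds a
# distinct (src, dst) pair, so traffic can use more member links.
# The hash below is a simplification, not Cisco's actual algorithm.
N_LINKS = 4
host = 0x0A000001  # single host-side IP (hypothetical addressing)

def pick_link(src, dst):
    # XOR the two addresses and reduce to a member-link index
    return (src ^ dst) % N_LINKS

# One datastore IP: every flow hashes to the same link
one_store = {pick_link(host, 0x0A000064)}

# Eight consecutive datastore IPs: hashes cover all four links
many_stores = {pick_link(host, 0x0A000064 + i) for i in range(8)}

print(len(one_store))    # 1 -> a single link carries everything
print(len(many_stores))  # 4 -> every member link can be used
```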
For the In / Out that you pasted, is that from the perspective of the switch, or the NetApp?
Either way, you have poor balance, but understand that each device is responsible for its *outbound* only. That means:
1) 6509E -> NetApp traffic flow is balanced by the 6509E
2) NetApp -> 6509E traffic flow is balanced by the NetApp itself
Regarding flow #1, assuming that is represented by "PC 189 OUT" in your original post, you may be able to improve it by enabling layer 4 (TCP/UDP port) hashing. Can you please show us the output of the command "show etherchannel load-balance" from your 6509E, so that we can see how it's currently hashing? You can experiment with changes by using different "port-channel load-balance" options. Depending on linecard/sup/IOS specifics there is a plethora of options available (try ? in config mode).
Regarding flow #2, as the previous experts have noted, there are protocol and NetApp specific restrictions in place. I'd recommend you break this problem into two separate issues for each of the two directions and perhaps post the question on the NetApp -> clients load balancing in the storage area.
- Dani
ASKER
Thanks for the posts:
paulsolvo: We are running 3140 controllers with OS version 8.0.1
IOPS over the past 6 hours have peaked at 2000 on FC and 1600 on SATA, with an average of 750.
We have 4 nics from each VM host to the switch per PC
We have 6 nics from NetApp to switch per PC
NetFixr-Dani:
Current load balance settings
DC6509E-01#sh etherchannel lo
EtherChannel Load-Balancing Configuration:
src-dst-ip enhanced
mpls label-ip
EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address
MPLS: Label or IP
Sorry I didn't make it clear in my original post. The following is from the switch to a VM host.
Storage Port-Channel 189 for one of our hosts
PC 189 IN 17415128909
G1/1/26 160482 0.0009%
G1/2/22 49 0.0000%
G2/1/11 17417839321 100.0156%
G2/2/22 439737 0.0025%
PC 189 OUT 25555359398
G1/1/26 3274093321 12.8%
G1/2/22 88373277 0.3%
G2/1/11 3376972167 13.2%
G2/2/22 18861999453 73.8%
This is from the switch to the NetApp
Storage Port-Channel 80 NA1
PC 80 IN 41242088849
G1/1/1 5773506867 14%
G1/1/23 8054914833 20%
G1/2/1 27155342513 66%
G2/1/1 2221493330 5%
G2/1/23 14631225674 35%
G2/2/1 41670238187 101%
241%
PC 80 OUT 34858364466
G1/1/1 1186532258 3%
G1/1/23 2313039547 7%
G1/2/1 11523741679 33%
G2/1/1 8574406154 25%
G2/1/23 18393368611 53%
G2/2/1 37339404314 107%
228%
I just ran the numbers on the above. I double-checked them, and as you can see the sum of packets through the individual links exceeds the packets through the port-channel by 200+%. Do you think the Cisco show commands are in error?
This totally confuses me.
The VM switches for the storage network are as follow:
Load Balancing: Route based on ip hash
I will also post this on Netapp and VM as suggested.
Thanks
Ok, a couple things:
- Cisco load balancing is optimized for member counts that are powers of 2 (2, 4, or 8), so you will not gain any throughput by using 6 links; the distribution will be uneven.
- Try and change the "port-channel load-balance" method to one which uses tcp/udp ports in the hash, experiment with different options until you find the one which yields the best balance.