
How to load balance a Cisco 6509E port-channel when only 2 IP addresses are used

We have what I think is an unusual situation for port-channel load balancing. We use a 6509E between our NetApp SAN and VM clusters. We have port-channels of 4 or 6 1 Gb cables connecting the hosts and the SAN. The port-channel load balancing is set to src-dst-ip, and this works pretty well for customer port-channels. Storage port-channels aren't performing well at all. In our setup, each host has 1 IP address for attachment to the SAN, and the SAN has 1 IP per datastore. This creates a very limited number of source and destination pairs, resulting in very poor distribution across cables.

Storage Port-Channel 189

PC 189    IN     17415128909
G1/1/26          160482         0.0009%
G1/2/22          49             0.0000%
G2/1/11          17417839321    100.0156%
G2/2/22          439737         0.0025%

PC 189    OUT    25555359398
G1/1/26          3274093321     12.8%
G1/2/22          88373277       0.3%
G2/1/11          3376972167     13.2%
G2/2/22          18861999453    73.8%

Note that for inbound packets only 1 port is being used, because the traffic connects to only 1 IP.
Outbound packets are distributed based on the IP of the datastore being connected to.
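
To illustrate why so few links get used, here is a minimal Python sketch of an XOR-based hash. It is a toy model only: the real 6509E hash is not reproduced here, and the addresses and the `pick_link` helper are hypothetical. The point is that every packet between the same two IPs always lands on the same link, so the number of links in use can never exceed the number of distinct IP pairs.

```python
import ipaddress

def pick_link(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Toy model (not Cisco's algorithm): XOR both addresses, reduce to a link index."""
    x = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return x % n_links

# Hypothetical addresses: one storage IP on the host, one IP per datastore.
host = "10.0.0.11"
datastores = ["10.0.0.101", "10.0.0.102", "10.0.0.103"]

# Each (host, datastore) pair maps to exactly one link, no matter how much traffic it carries.
links_used = {pick_link(host, ds, 4) for ds in datastores}
print(f"{len(datastores)} IP pairs use {len(links_used)} of 4 links")
```

With a single host IP and a single datastore IP, the set above would contain exactly one link, which matches the inbound counters.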

I have been told by our SAN folks that we can't add more IPs on either side. We are hoping to find another method of distributing packets other than src-dst IP. MACs won't work either, due to the small number of MACs on the link.

Does anyone know of another approach to load balancing on a 6509E?

Thanks

Paul Solovyovsky

It depends. Are you using iSCSI or NFS? How is the NetApp configured: EtherChannel or LACP?

Craig Beck

Most of the SANs I've worked with only allow active/passive connections. You can't use LACP with a lot of these devices. You must use MPIO instead.

Paul Solovyovsky

@craigbeck: With the NetApp it depends, since it can be iSCSI, which supports MPIO, or NFS, which does not, in which case you have to use datastore load balancing. LACP is link aggregation, which is supported on the NetApp side but not on the VMware side (an EtherChannel port may be used instead).

You can use multiple vmkernel ports on the iSCSI switch, bind them to the initiator, and configure round robin on the datastore for MPIO.

dtk12

ASKER

More info: These are NFS datastores and we are using Etherchannel.

Thanks

Paul Solovyovsky

There is no way to do MPIO on NFS. The best you can do is use a separate IP for each datastore so that you at least have multiple sessions on the EtherChannel. Typically you don't want to use too many NICs, since they will not be used. It appears that you're not using more than 2 NICs' worth of throughput. Have you looked at your IOPS? What FAS model?

NetFixr-Dani

For the In / Out that you pasted, is that from the perspective of the switch, or the NetApp?

Either way, you have poor balance, but understand that each device is responsible for its *outbound* only. That means:

1) 6509E -> NetApp traffic flow is balanced by the 6509E

2) NetApp -> 6509E traffic flow is balanced by the NetApp itself

Regarding flow #1, assuming that is represented by "PC 189 OUT" in your original post, you may be able to improve it by enabling layer 4 (TCP/UDP port) hashing. Can you please show us the output of the command "show etherchannel load-balance" from your 6509E, so that we can see how it's currently hashing? You can experiment with changes by using different "port-channel load-balance" options. Depending on linecard/sup/IOS specifics there are a plethora of options available (try ? in config mode).
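
As a rough illustration of what layer 4 hashing changes, the Python sketch below compares an IP-only hash with one that also mixes in the TCP ports. Both functions are simplified stand-ins, not the switch's actual algorithm, and the addresses and source-port range are made up; 2049 is the standard NFS port.

```python
import ipaddress

def hash_l3(src_ip, dst_ip, n_links):
    """Toy IP-only hash: XOR of the two addresses, reduced to a link index."""
    x = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return x % n_links

def hash_l4(src_ip, dst_ip, src_port, dst_port, n_links):
    """Toy hash that also mixes in the TCP/UDP ports."""
    x = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return (x ^ src_port ^ dst_port) % n_links

# Hypothetical flows: one host IP, one datastore IP, eight TCP connections to NFS.
src, dst, nfs_port = "10.0.0.11", "10.0.0.101", 2049
src_ports = range(50000, 50008)

l3_links = {hash_l3(src, dst, 4) for _ in src_ports}
l4_links = {hash_l4(src, dst, sp, nfs_port, 4) for sp in src_ports}
print(f"IP-only hash uses {len(l3_links)} link(s); IP+port hash uses {len(l4_links)}")
```

The benefit depends on there being several distinct connections between the two IPs. If a host opens only one TCP connection per datastore, the port fields add little to the hash.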

Regarding flow #2, as the previous experts have noted, there are protocol and NetApp specific restrictions in place. I'd recommend you break this problem into two separate issues for each of the two directions and perhaps post the question on the NetApp -> clients load balancing in the storage area.

- Dani

dtk12

ASKER

Thanks for the posts:

paulsolvo: We are running 3140 controllers with OS version 8.0.1

IOPS over the past 6 hours have peaked at 2000 on FC and 1600 on SATA, with an average of 750.

We have 4 nics from each VM host to the switch per PC
We have 6 nics from NetApp to switch per PC

NetFixr-Dani:

Current load balance settings

DC6509E-01#sh etherchannel lo
EtherChannel Load-Balancing Configuration:
src-dst-ip enhanced
mpls label-ip

EtherChannel Load-Balancing Addresses Used Per-Protocol:
Non-IP: Source XOR Destination MAC address
IPv4: Source XOR Destination IP address
IPv6: Source XOR Destination IP address
MPLS: Label or IP

Sorry I didn't make it clear in my original post. The following is from Switch to VM Host

Storage Port-Channel 189 for 1 of our Hosts

PC 189    IN     17415128909
G1/1/26          160482         0.0009%
G1/2/22          49             0.0000%
G2/1/11          17417839321    100.0156%
G2/2/22          439737         0.0025%

PC 189    OUT    25555359398
G1/1/26          3274093321     12.8%
G1/2/22          88373277       0.3%
G2/1/11          3376972167     13.2%
G2/2/22          18861999453    73.8%

This is from Switch to NetApp

Storage Port-Channel 80 NA1

PC 80     IN     41242088849
G1/1/1           5773506867     14%
G1/1/23          8054914833     20%
G1/2/1           27155342513    66%
G2/1/1           2221493330     5%
G2/1/23          14631225674    35%
G2/2/1           41670238187    101%
                                241%

PC 80     OUT    34858364466
G1/1/1           1186532258     3%
G1/1/23          2313039547     7%
G1/2/1           11523741679    33%
G2/1/1           8574406154     25%
G2/1/23          18393368611    53%
G2/2/1           37339404314    107%
                                228%

I just ran the numbers on the above and double-checked them. As you can see, the per-cable packet counts add up to more than 200% of the port-channel total. Do you think the Cisco show commands are in error?
This totally confuses me.
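
One way to sidestep the mismatch when judging balance is to compute each member's share against the sum of the member counters rather than against the port-channel counter. The Python sketch below does that for the PC 80 inbound figures posted above; the `member_shares` helper is hypothetical, and this does not explain why the port-channel counter disagrees with its members.

```python
# Per-member inbound packet counters for port-channel 80, copied from the figures above.
members_in = {
    "G1/1/1": 5773506867,
    "G1/1/23": 8054914833,
    "G1/2/1": 27155342513,
    "G2/1/1": 2221493330,
    "G2/1/23": 14631225674,
    "G2/2/1": 41670238187,
}

def member_shares(counters):
    """Return each member's percentage of the summed member counters."""
    total = sum(counters.values())
    return {port: 100.0 * count / total for port, count in counters.items()}

shares = member_shares(members_in)
# Print the busiest link first.
for port, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{port:8s} {pct:5.1f}%")
```

Computed this way the shares always total 100%, so the links can be compared with each other even if the aggregate counter was cleared at a different time or counts differently.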

The VM switches for the storage network are as follows:

Load Balancing: Route based on ip hash

I will also post this on Netapp and VM as suggested.

Thanks

NetFixr-Dani

Ok, a couple things:

- Cisco load balancing is optimized for port counts in powers of 2: 2 / 4 / 8, and therefore you will not gain any throughput by using 6 links.

- Try and change the "port-channel load-balance" method to one which uses tcp/udp ports in the hash, experiment with different options until you find the one which yields the best balance.
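
The power-of-2 point can be sketched in Python under one assumption: that the hash produces 8 buckets which are then dealt out across the member links. The round-robin assignment and the `bucket_shares` helper below are illustrative only, not taken from the switch.

```python
def bucket_shares(n_links, n_buckets=8):
    """Deal n_buckets hash buckets across n_links round-robin; return buckets per link."""
    counts = [0] * n_links
    for bucket in range(n_buckets):
        counts[bucket % n_links] += 1
    return counts

for n in (2, 4, 6, 8):
    print(n, "links ->", bucket_shares(n))
```

Under that assumption, 2, 4, or 8 links each receive an equal number of buckets, while 6 links split 2:2:1:1:1:1, so two of the six links are offered twice as many hash results as the other four.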

ASKER CERTIFIED SOLUTION

Paul Solovyovsky
