Clariion CX4 iSCSI Performance


Hoping there are some people here with good experience implementing iSCSI on an EMC Clariion CX4.

Basically the problem is an excessive number of outbound discards on the switch ports connected to hosts in an iSCSI environment.

A brief overview of the configuration...

- 2 x 10Gb ports per SP (one subnet per port, two subnets across the entire SAN)
- 2 x Catalyst 3750X in a stack (dedicated to iSCSI, no VLANs configured, all ports in native VLAN 1)
- Server 2008 R2 connected using 2 x Intel ET adapters

               SPA                 SPB
           10Gb  10Gb          10Gb  10Gb
             |     |             |     |
            Cisco Catalyst 3750X stack
                      |     |
           Server (Microsoft iSCSI initiator)

Port counters on the switch interfaces connected to the server are showing a high number of outbound discards, particularly when performing large sequential reads from the SAN (backups etc.). My theory is that the SAN is sending data at 10Gb speeds (although obviously limited by the disks in the back end), which is overwhelming the 1Gb ports. SNMP on the switch is not showing the 1Gb ports as over-utilised, but I suspect this could be microbursts, which are not visible to SNMP monitoring.

My question is: if the diagnosis above is accurate, how do you prevent the 1Gb ports from being overwhelmed by the 10Gb ports? I have 'flowcontrol receive desired' configured on the switch interfaces. Will flow control only function correctly if the ports are set to full auto-negotiation? The 1Gb ports are set to full auto, but the 10Gb ports on the SAN cannot be set to auto and have to be hard-coded to 10Gb. Also, I'm not seeing any PAUSE frames in the 'show flowcontrol' output on the Catalyst.
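For reference, a sketch of the flow-control configuration being described (interface numbers are hypothetical; adjust to your own cabling):

```
! On 3750-X copper ports only the receive direction is configurable.
! 'receive on' honours inbound PAUSE frames unconditionally;
! 'receive desired' only if the link partner negotiates it.
interface GigabitEthernet1/0/1
 description iSCSI host NIC
 flowcontrol receive desired
!
! Verify the negotiated state and PAUSE counters with:
! show flowcontrol interface GigabitEthernet1/0/1
```

Note that this only lets the switch *honour* PAUSE frames from the host; it does not make the switch send PAUSE frames toward the 10Gb SAN ports.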

Other factors to note are...

- Jumbo frames of 9000 bytes configured (can ping the SP ports from the Windows host with a packet size of 8972 using ping -f)
- TcpDelAckTicks set to 0 in the registry (emc150702)
- TcpAckFrequency set to 1 in the registry (emc150702)
- iSCSIDisableNagle created in the registry (emc150702)
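The two TCP tweaks above live under the per-interface TCP/IP key. A .reg sketch of what was applied ({INTERFACE-GUID} is a placeholder for the GUID of each iSCSI NIC; the iSCSIDisableNagle value is set per the emc150702 article and is not reproduced here):

```
Windows Registry Editor Version 5.00

; Substitute {INTERFACE-GUID} with the GUID of each iSCSI NIC,
; listed under ...\Tcpip\Parameters\Interfaces.
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{INTERFACE-GUID}]
; ACK every received segment immediately instead of every other one
"TcpAckFrequency"=dword:00000001
; Set the delayed-ACK timer to zero
"TcpDelAckTicks"=dword:00000000
```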

I'm not expecting an answer from the limited detail I've provided above but really hoping I can get onto someone who has a lot of knowledge in this area so it points me in the right direction.

Thanks in advance for your help!
rfc1180 commented:
Enabling jumbo frames will not decrease the number of OutDiscards, and neither will fixing duplex mismatches.

>My theory is that the SAN is sending data at 10Gb speeds (although obviously determined by the disks in the back-end) which is overwhelming the capabilities of the 1Gb ports

Correct. This is very common in iSCSI, and there is very little you can do. If the discards are low, TCP will recover from a few dropped frames; however, if the discards are very frequent, then yes, this will impact TCP performance.

> SNMP on the switch is not showing that the 1Gb ports are being overutilized but I suspect this could be microburts which are not displayed by SNMP monitoring.

I would put my next paycheck on it being microbursts. Get an SNMP grapher such as PRTG; once installed, change the polling interval to every second and then monitor.

Good Luck

P.S. I do not make that much, plus I am married, so I do not see a cent of that paycheck.
Regardless of the controller speed on the SAN (10Gb), your connection speed is determined by the speed of the switch port. I believe the Catalyst 3750 does not have 10Gb switch ports (not sure, though).

I recommend you identify the switch port speed first, then set the NIC speeds on the server, the SAN controllers, and the switch ports to match precisely.
E.g. if the switch port runs at 1Gb, set it to 1Gb full duplex, the NICs to 1Gb full duplex, and the SAN controllers to 1Gb full duplex. Enable jumbo frame support in the NIC drivers in Windows and check that it is also supported by your SAN controllers (most likely it is).

Check for discards again..
TKS25 (Author) commented:
Thanks for your suggestions.

The Catalyst 3750X is a 10G-capable switch via an additional module, so I can confirm that the SAN is definitely connected to the switch at 10Gbps.

rfc1180 - What would you call a low number of discards? I've not really worked on an issue like this before, so I don't know what to expect. We're getting discards in the thousands during a large data transfer. Would this situation be eased by a switch with very large port buffers? I believe the Catalyst has a 3MB buffer that's shared between all 24 1G ports in the switch (the 10G ports have a separate ASIC).
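The microburst theory can be sanity-checked with back-of-envelope arithmetic. A sketch, assuming the 3MB shared-buffer figure above and ignoring the real ASIC's per-port buffer carving, which will make things worse, not better:

```python
# How quickly a 10 Gb/s sender can overflow the shared egress
# buffer in front of a 1 Gb/s host port, and why a 1-second SNMP
# average never sees it.

INGRESS_BPS = 10e9            # SAN transmits at 10 Gb/s
EGRESS_BPS = 1e9              # host port drains at 1 Gb/s
BUFFER_BITS = 3 * 2**20 * 8   # ~3 MB shared buffer, in bits

# The buffer fills at the difference between arrival and drain rates.
fill_rate = INGRESS_BPS - EGRESS_BPS        # 9 Gb/s
time_to_overflow = BUFFER_BITS / fill_rate  # seconds until discards start

print(f"buffer overflows after ~{time_to_overflow * 1e3:.1f} ms")

# Even a burst long enough to overflow the buffer carries only a
# small fraction of what the 1 Gb/s port can move in one second,
# so a 1 s polling interval averages it into the noise.
burst_bits = INGRESS_BPS * time_to_overflow
share_of_one_second = burst_bits / (EGRESS_BPS * 1.0)
print(f"that burst is only {share_of_one_second:.0%} of one second at 1 Gb/s")
```

So with these assumed numbers the buffer survives roughly 3ms of a line-rate burst, yet the same burst barely registers on a one-second utilisation graph, which matches what the SNMP counters are (not) showing.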

Already running PRTG, so I will change the scan interval as suggested and see if it reveals more.

I'd imagine this issue is prevalent in any iSCSI solution, especially with 10Gb being pushed by vendors. I'd be interested to know how people in an Equallogic environment address this, as there's no option to switch to FC! I tell you, whoever said iSCSI is simpler than FC can't have had the experiences I've had with iSCSI.

Jaydeep_verma commented:
Hi there,

The specification for the Cisco switch reads as follows:


    * 24 and 48 10/100/1000 PoE+ and non-PoE models
    * PoE+ with 30W power on all ports in 1 rack unit (RU)
    * Optional four 1 Gb Ethernet SFP (Small Form-Factor Pluggable) or two 10 Gb Ethernet SFP+ uplink network modules

That would mean there are two uplinks from the EMC CX to the Cisco switch. I am not quite sure how the CX 10Gb ports are connected, as the only 10Gb ports are the uplink ports on the Cisco switch. The difference between an uplink port and a switch port is that uplinks are usually used to connect one switch to another to form a single switch; if normal Ethernet cables are called straight cables, then uplink Ethernet cables are crossovers. Please check whether the uplinks are supposed to be used as switch ports.

Dell has a solution for integrating 1Gb and 10Gb infrastructure. It involves a Dell blade server with two 10Gb switches and two 1Gb switches. The 10Gb switches are uplinked to the two 1Gb switches using 10Gb uplinks, and the two outside 1Gb switches are used to connect to the 1Gb Equallogic arrays. The two outside 1Gb switches are also stacked together, and the two 10Gb switches are stacked or have a LAG created between them.

I am attaching the full white paper explaining how Equallogic addresses this issue with iSCSI SAN.
Qlemo (EE Topic Advisor) commented:
This question has been classified as abandoned and is being closed as part of the Cleanup Program. See my comment at the end of the question for more details.
Question has a verified solution.
