Solved

EMC CX SAN IO Performance issues

Posted on 2014-03-19
Last Modified: 2014-03-28
Hello EE,

I need expert advice fast. We are rolling out Microsoft Dynamics; it currently runs on an EMC CX4-120, and performance is slow.

LUN                     Read Rand IOPS  Read Rand MBps  Write Rand IOPS  Write Rand MBps  Read Seq IOPS  Read Seq MBps  Write Seq IOPS  Write Seq MBps
CX SQL Data (RAID5)            2856.28           22.31           786.57             6.14       16742.91          130.8          561.69            4.38
CX SQL Log (RAID 1+0)          1726.54           13.48           763.36             5.96       18085.56         141.29         3561.35           27.82
VNX SSD RAID5 (TEST01)         22036.2          172.15          13235.9            103.4        19576.9         152.94         15733.4          122.91
VNX RAID5 (BUILD01)           61369.34          479.44           100.53             0.78          66381          518.6        10759.17           84.05

We’re being asked to dramatically increase SQL I/O performance by the end of this week. Adding 6TB of RAM or trading in/buying a new SAN isn’t an option. Intra-business-day downtime for SQL is an option. The SSDs (we have 5) seem to be performing well. Any suggestions?

Thanks!
Question by:bergquistcompany
 
LVL 30

Expert Comment

by:Duncan Meyers
These are absolutely amazing performance numbers for a CX4-120 (assuming your numbers are accurate)
This one in particular is HUGE:

VNX RAID5 (BUILD01)      Read IOPS: 61369.34      Read MB/s: 479.44

If you really are pushing that sort of I/O through a CX4-120 then pat yourself on the back.

Can you post response times for
CX SQL Data (RAID5)
CX SQL Log (RAID 1+0)
VNX SSD RAID5 (TEST01)
VNX RAID5 (BUILD01)
please?
 Ideally, can you use Unisphere Analyzer?
Also, please check Storage Processor CPU utilisation.
 

Author Comment

by:bergquistcompany
Is the CX a lower-performing unit than the VNX 5300? I am concerned about the difference in IOPS between the two. The CX seems extremely slow.

I can run the Unisphere Analyzer, but don't know how to read it without EMC support.

Here are the CX report wizard statistics:

Read Cache State: Enabled (both SPs)
Write Cache State: Disabled (both SPs)
Mirrored Write Cache: Enabled (both SPs)
Free Memory: 66 MB
Read Cache Size: 529 MB
Write Cache Size: 3 MB
Cache Page Size: 4 KB
Low Watermark: 60%
High Watermark: 80%
HA Cache Vault: Disabled
Opt RAID 3 Memory: 0 MB
 
LVL 30

Expert Comment

by:Duncan Meyers
>Is the CX a lower-performing unit than the VNX 5300? I am concerned about the difference in IOPS between the two. The CX seems extremely slow.
Well, yes. The VNX is a generation newer and significantly faster. However, the performance numbers are meaningless without response times in ms and an understanding of how the array is configured and how many drives are in the LUNs. A CX4-120 is theoretically capable of more than 20,000 IOPS with enough disks and a performance-oriented configuration, so it's still pretty capable.
 

Author Comment

by:bergquistcompany
No, the CX should be able to do what the VNX does. The VNX is on 1Gb iSCSI and the CX is on 4Gb Fibre Channel, which I think would have an impact as well. I hear your thoughts; is there a better way to measure?
 
LVL 30

Accepted Solution

by:
Duncan Meyers earned 500 total points
No, the CX *is emphatically not* able to do what the VNX does.
The VNX is a generation newer, with faster processors and more cache, and is closer in performance to a CX4-240 than to a CX4-120.
The connection speed is not the limiting factor here as you can see from your own figures - the VNX is pushing an aggregate 479.44MB/s across 1Gb iSCSI. The limiting factor is the number of hard drives in the RAID groups. To put this into perspective:
For random workloads:
1 x 15K rpm SAS or FC drive can push about 200 IOPS or about 13MB/sec
1 x 10K rpm SAS or FC drive can push about 140 IOPS or about 12MB/sec
1 x 7200 rpm NL-SAS or SATA drive can push about 60 IOPS or about 11MB/sec
The drives are actually capable of 2 to 2.5 times that base performance as workload increases, but beyond about 2.5x, response times go through the roof.

So a 4+1 RAID 5 set of 15K drives is capable of 1,000 - 2,500  IOPS and 65MB/sec.
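
As a rough sketch of that arithmetic (not a sizing tool), the per-drive rules of thumb above can be multiplied out for any RAID group. The function name, the dictionary keys and the `burst_factor` parameter are made up for illustration; the only inputs are the per-drive figures and the 2.5x ceiling quoted in this post.

```python
# Per-drive random-I/O rules of thumb quoted in the post above.
PER_DRIVE = {
    "15k": {"iops": 200, "mbps": 13},
    "10k": {"iops": 140, "mbps": 12},
    "7200": {"iops": 60, "mbps": 11},
}

def raid_group_estimate(drive_type, drive_count, burst_factor=2.5):
    """Estimate random-I/O capability of a group of spinning drives.

    burst_factor is the assumed ceiling (2.5x base) beyond which
    response times are expected to climb sharply.
    """
    drive = PER_DRIVE[drive_type]
    base_iops = drive["iops"] * drive_count
    return {
        "base_iops": base_iops,
        "peak_iops": int(base_iops * burst_factor),
        "mbps": drive["mbps"] * drive_count,
    }

# 4+1 RAID 5 set of 15K drives, i.e. five spindles
print(raid_group_estimate("15k", 5))
```

For five 15K drives this gives 1,000 base IOPS, 2,500 peak IOPS and 65MB/sec, matching the figures above.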

Things get murkier with SSDs as the performance of SSDs changes with I/O pattern, I/O size and whether the device is reading, writing, reading after writing or writing after reading. Awesome. So - I use a rule-of-thumb of 3,000 IOPS per SSD which is hugely conservative, but keeps you out of trouble with real-world performance.

The bottom line to all this is that the number of drives in the array is the limiting factor for performance, not the connectivity technology (except in special cases of large sequential I/O).

>I hear your thoughts is there a better way to measure?
The 100% best place to measure array performance is on the array using Unisphere Analyzer. To measure disk performance on the host, you can use perfmon or iostat. Key metrics to look at are:
Physical Disk:
IOPS / Throughput / Transactions per second / Disk Transfers/sec (all largely synonymous)
MB/sec / Bandwidth / Disk Bytes/sec
Response time in ms / Avg. Disk sec/Transfer
ABQL / Average Busy Queue Length / Current Disk Queue Length

Response time should be 20ms or less - short peaks to higher numbers are quite acceptable unless host performance is affected. In this age of SSDs, response times are typically under 2ms, which is really, really cool. But less than 20ms provides excellent host response (beware, incidentally, the hybrid array vendor saying that you have a performance problem with response times of more than 5ms. It's nonsense.)

ABQL should be 2 - 3 per drive in the RAID array. It will peak higher - again, completely acceptable. High response time and high ABQL means you need to add more disk drives (SSD or spinning). You use the IOPS and bandwidth numbers to confirm this.
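
The two thresholds above can be turned into a quick sanity check, sketched here under some assumptions: the inputs are sustained averages (not short peaks) that you have already collected from Analyzer, perfmon or iostat, and the function name and the "investigate" middle verdict are my own invention, not anything the tools report.

```python
def assess_lun(response_ms, abql, drive_count,
               max_response_ms=20.0, max_abql_per_drive=3.0):
    """Classify a LUN from sustained response time and queue length.

    Thresholds follow the rules of thumb in the post: 20 ms response
    time and roughly 2 - 3 queued I/Os per drive in the RAID group.
    """
    slow = response_ms > max_response_ms
    queued = abql > max_abql_per_drive * drive_count
    if slow and queued:
        return "add drives"      # high response time AND high ABQL
    if slow or queued:
        return "investigate"     # hypothetical label: only one signal is high
    return "ok"

# Hypothetical sample readings for a 5-drive RAID group
print(assess_lun(response_ms=35, abql=40, drive_count=5))
print(assess_lun(response_ms=8, abql=10, drive_count=5))
```

Treat the result as a prompt to go and look at the IOPS and bandwidth numbers, not as a verdict on its own.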

Now - speaking of IOPS, you need to calculate the array workload based on host IOPS. All RAID constructs have a write penalty - that is, a host write generates more than one disk I/O, depending on the RAID type you use. For example, mirrored RAID (RAID 1, RAID 1/0) generates two disk writes for every host write. RAID 5 generates four disk I/Os for every host write (read old data and parity, then write new data and parity) and RAID 6 generates six - so you can see that the RAID type you choose directly affects the number of disks you need. Here's an example:

Database generating 10,000 host IOPS, 25% write, 75% read
= 2,500 write, 7,500 read
Say you choose RAID 5 because you want disk space efficiency. You'll generate this many IOPS:
(2,500 x 4) + 7,500 = 17,500 IOPS. (2,500 x 4 is write IOPS x write penalty)
Say you choose 15K drives. You'll need:
17,500/200 = 87.5, so 88 drives (always round up!). But as you are using 4+1 RAID 5, you'll need a multiple of 5 drives, so 90 15K drives.
If you choose SSDs, then:
17,500/3,000 = 5.8, so 6 drives. Same thing applies: as you are using 4+1 RAID 5, you'll need a multiple of 5 drives, so 10 SSDs.

What if you chose RAID 1/0? The write penalty reduces from 4 to 2, so:
(2,500 x 2) + 7,500 = 12,500 IOPS. (2,500 x 2 is write IOPS x write penalty)
Say you choose 15K drives. You'll need:
12,500/200 = 62.5, so 63 drives (always round up!). But as you are using RAID 1/0, you'll need an even number of drives, so 64 15K drives.
If you choose SSDs, then:
12,500/3,000 = 4.2, so 5 drives. Same thing applies: as you are using RAID 1/0, you'll need an even number of drives, so 6 SSDs.
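
If you want to rerun that sizing with your own numbers, here is a minimal sketch of the same calculation. The function and parameter names are illustrative only; the write penalties, the 200 and 3,000 IOPS per-drive figures and the group sizes are the ones used in the worked example above.

```python
import math

def drives_needed(host_iops, write_fraction, write_penalty,
                  iops_per_drive, group_size):
    """Return (back-end IOPS, drive count rounded up to whole RAID groups)."""
    writes = host_iops * write_fraction
    reads = host_iops - writes
    # Each host write costs `write_penalty` disk I/Os; reads cost one.
    backend_iops = round(writes * write_penalty + reads)
    raw_drives = math.ceil(backend_iops / iops_per_drive)   # always round up
    drives = math.ceil(raw_drives / group_size) * group_size
    return backend_iops, drives

# 10,000 host IOPS, 25% write, as in the example above
for label, penalty, group in (("RAID 5 (4+1)", 4, 5), ("RAID 1/0", 2, 2)):
    for drive, per_drive in (("15K", 200), ("SSD", 3000)):
        backend, count = drives_needed(10_000, 0.25, penalty, per_drive, group)
        print(f"{label}, {drive}: {backend} back-end IOPS -> {count} drives")
```

This reproduces the 90 and 10 drive counts for RAID 5 and the 64 and 6 drive counts for RAID 1/0. It ignores cache hits and sequential I/O, so treat it as a first pass only.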

Everyone thinks RAID 5 is cheaper - and it is if you're calculating for space. If you size your array for performance, RAID 1/0 is usually more efficient. But best of all is automatic tiering between SSD, SAS and NL-SAS with SSD caching over the top of the lot - and that's where hybrid arrays (like the VNX) are awesome.
 
LVL 30

Expert Comment

by:Duncan Meyers
Strewth. That was a long post.  :-)
 

Author Closing Comment

by:bergquistcompany
Great post thanks!