Scripting_Guy asked:

EMC Clariion CX300 with VmWare - to Raid 6 or not to Raid 6?

Hi

We're using an EMC Clariion CX300 with 3 enclosures. Enclosure 0 is filled with 15x72GB disks, enclosures 1 and 2 are filled with 15x300GB disks each.

As of FLARE 02.26, EMC introduced RAID 6 support for the CX300. The EMC is used only by three VMware ESX 3.5 servers.

My current plan looks as follows (all servers are virtual):
- The file server gets a 9x 300GB disk RAID 6 (~1.75TB of space)
- The SQL Server gets a 4x 72GB RAID 10 (120GB) for logs and a 9x 72GB disk RAID 6 (~450GB) for data files
- The Exchange server gets a 5x 300GB RAID 5 (~1TB)
- For all other servers (about 10 more VMs), I provide two 5x 300GB RAID 5 groups
- One hot-spare disk per enclosure (1x 72GB, 2x 300GB)
- No plan so far for the remaining 1x 72GB disk and 4x 300GB disks

The space provided by this configuration should be more than sufficient until June 2013, when the support for our EMC expires and we need a new storage solution anyway.

The file, SQL and Exchange servers have by far the highest average disk usage (File 5000KBps; SQL 2100KBps; Exchange 2400KBps). Peaks are hit during backup (File, SQL) and database maintenance (Exchange), which happen at night with almost zero user load. All other VMs have very low average disk usage, meaning <= 200KBps, and no peaks except during backups.

My questions are:
- Do you recommend Raid 6 usage for the two 9-disk RAIDs? If not, please add your configuration proposal.
- How does the performance of RAID 6 compare to RAID 5 on (our) EMC system?
- Generally, is it a good idea to make a RAID containing 9 disks?
- Any other objections about my current plan of RAID configurations?

Thank you in advance for your comments.
Paul Solovyovsky replied:

I am not sure about RAID 5 vs. RAID 6 in an EMC environment, but you have to be careful with the 1.75TB figure: 2TB is the maximum size of a datastore, and you have to take into account roughly 5% loss for encapsulating NTFS into a VMDK, the swap file for VM memory, and room for snapshots if you're using applications that rely on VMware's snapshot capability.

For instance:

1.75TB NTFS = 87.5GB overhead for VMDK

Let's say the VM has 4GB of RAM, which requires another 4GB for the swap file; I would reserve 6GB of space for it.

Datastore creation removes about 500MB when the FC or iSCSI LUN is encapsulated as a datastore.

If you're using VMware snapshots for VCB or 3rd-party backup apps, another 10GB would normally suffice.

If you keep the VM in a single datastore, this wouldn't leave you much room.
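
To put rough numbers on all of that, here's a quick back-of-the-envelope sketch in Python. The inputs (the ~5% NTFS-to-VMDK overhead, the 6GB swap reserve, the ~500MB VMFS formatting loss and the 10GB snapshot reserve) are the assumptions from this post, not measured values:

```python
# Back-of-the-envelope datastore sizing using the figures above. All of the
# reserves are assumptions from this thread, not measured values.

datastore_gb   = 1750   # the ~1.75TB LUN from the 9-disk RAID group
vmfs_format_gb = 0.5    # lost when the LUN is formatted as a VMFS datastore
swap_reserve   = 6      # covers the 4GB .vswp (equal to VM RAM) plus a margin
snapshot_gb    = 10     # only if VMware snapshots / VCB are in use
vmdk_overhead  = 0.05   # ~5% encapsulation overhead on the guest NTFS disk

# Largest guest (NTFS) disk that still fits in the datastore:
max_guest_gb = (datastore_gb - vmfs_format_gb - swap_reserve - snapshot_gb) / (1 + vmdk_overhead)
print(f"Largest guest disk that still fits: {max_guest_gb:.0f} GB")
# -> roughly 1650GB, i.e. noticeably less than the raw 1750GB LUN
```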

My $.02
Scripting_Guy (Asker) replied:

Very interesting $.02, thank you paulsolov. I'm aware of the 2TB limit, as I hit it last time I wanted to make a 2.5TB datastore for our servers. I was not aware, however, that VMware creates a swap file the size of the VM's RAM, nor did I know that there's a ~5% overhead going from NTFS to VMDK.

We do not use VMware snapshots, especially not on the file server, as the snapshot would grow too large, I guess. We back up using Backup Exec.

I guess you're coming from the VMware corner here, so one more quick question. In order to be able to award you full points, I will open a new question in the VMware zone and leave this one open for the EMC people. Please check out this link: https://www.experts-exchange.com/questions/24356049/VMware-ESX-3-5-Recommended-Block-size-for-a-1-75TB-datastore-for-a-single-file-server.html
- Do you recommend Raid 6 usage for the two 9-disk RAIDs? If not, please add your configuration proposal.
Nope. RAID 5 gives much better performance. There are some very specific guidelines around the use of RAID 6 on a CLARiiON. RAID 6 gives you marginally improved protection from a double drive failure (statistically very low likelihood) at the cost of poor performance.

- How does the performance of RAID 6 compare to RAID 5 on (our) EMC system?
Parity RAID works like this: For every host write, you get 4 disk operations for RAID 5 and 6 for RAID 6:

For RAID 5:
1. Read original data
2. Read original parity
Create new parity (not a disk operation)
3. Write new parity
4. Write new data

For RAID 6:
1. Read original data
2. Read original parity
3. Read second parity
Create new parity (not a disk operation)
4. Write new parity
5. Write second new parity
6. Write new data

So you can see that there is a whole lot more going on at the disks in RAID 6. CLARiiON has RAID 5 write optimizations that also help improve performance that don't apply under a RAID 6 schema. For a VMware environment, use large RAID 5 or RAID 1/0 RAID groups, or use striped metaLUNs.
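
To illustrate roughly what those write penalties mean for front-end performance, here is a small sketch. The per-disk IOPS figure (~140 for a 10K FC drive) and the 70/30 read/write mix are rule-of-thumb assumptions, not CX300 specifications:

```python
# Rough host-visible IOPS estimate from the write penalties described above
# (4 back-end I/Os per host write for RAID 5, 6 for RAID 6).

def host_iops(disks, disk_iops, read_ratio, write_penalty):
    """Host IOPS a RAID group can sustain for a given read/write mix."""
    backend_capacity = disks * disk_iops          # raw back-end IOPS
    write_ratio = 1 - read_ratio
    # Each host read costs 1 disk I/O; each host write costs write_penalty I/Os.
    return backend_capacity / (read_ratio + write_ratio * write_penalty)

for raid, penalty in (("RAID 5", 4), ("RAID 6", 6)):
    iops = host_iops(disks=9, disk_iops=140, read_ratio=0.7, write_penalty=penalty)
    print(f"9 x 10K FC disks, 70/30 read/write, {raid}: ~{iops:.0f} host IOPS")
```
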
- Generally, is it a good idea to make a RAID containing 9 disks?
Yes. With Fibre Channel disks, no problem, though you should not exceed 9 disks per RAID 5 group with SATA disks. The more disks, the more performance you get. VMware by its nature presents a highly random I/O load, so you should configure the array with that in mind. The rest of the disk workload is also highly random.

- Any other objections about my current plan of RAID configurations?
Yep.
 - 1 hot spare per 30 disks is fine. I'd allocate two 300GB drives as hot spares and make the most of the 73GB drives.
 - On the 300GB drives, create 4 x 7-drive RAID 5 sets and use striped metaLUNs across all the RAID groups. You'll have just a little short of 4000 IOPS available in that configuration if you've got 10K disks, and about 5600 IOPS for 15K disks (see the quick check after this list).
 - The first 5 drives in enclosure 0 will have about 6GB usable each. That's it. So best remove them from the calculations.
 - Do not put databases (or any other random I/O load) on RAID 6 or SATA drives.
 - Best practice is 500GB VMFS LUNs.
 - The disk usage figures you quote are a measure of bandwidth (MB/s). Throughput (IOPS) is far more important in your environment. Generally speaking, high throughput = low bandwidth and vice versa.
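
Here's the quick check mentioned above on those IOPS figures - simply raw back-end capacity for 4 x 7-drive RAID 5 groups. The per-disk numbers (~140 IOPS for 10K FC, ~200 IOPS for 15K FC) are the usual rules of thumb, not EMC specifications:

```python
# Raw back-end IOPS capacity of 4 x 7-disk RAID 5 groups on the 300GB drives.
raid_groups, disks_per_group = 4, 7
for speed, per_disk in (("10K", 140), ("15K", 200)):
    total = raid_groups * disks_per_group * per_disk
    print(f"{speed} FC drives: {raid_groups * disks_per_group} disks x {per_disk} IOPS = ~{total} IOPS")
# -> 10K: ~3920 IOPS ("a little short of 4000"); 15K: ~5600 IOPS
```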

Windows Perfmon will give you both throughput and bandwidth stats. Use the following counters:

Physical Disk object (select the drives that you want to monitor):

Disk Reads/sec and Disk Writes/sec give throughput.
Disk Read Bytes/sec and Disk Write Bytes/sec give bandwidth.
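
If you save those counters to a Perfmon log exported as CSV, a short script can summarise them into average IOPS and bandwidth. The file name and exact column headers below are hypothetical - adjust them to match your own export:

```python
# Summarise a Perfmon counter log (CSV export) using the counters listed above.
import csv
from statistics import mean

reads, writes, read_bps, write_bps = [], [], [], []
with open("fileserver_disk.csv", newline="") as f:          # hypothetical file name
    for row in csv.DictReader(f):
        # Perfmon headers look like "\\HOST\PhysicalDisk(0 C:)\Disk Reads/sec"
        for counter, bucket in (("Disk Reads/sec", reads),
                                ("Disk Writes/sec", writes),
                                ("Disk Read Bytes/sec", read_bps),
                                ("Disk Write Bytes/sec", write_bps)):
            for header in row:
                if header.endswith(counter):
                    try:
                        bucket.append(float(row[header]))
                    except ValueError:
                        pass   # Perfmon leaves blanks for missing samples

print(f"Average throughput: {mean(reads) + mean(writes):.0f} IOPS")
print(f"Average bandwidth : {(mean(read_bps) + mean(write_bps)) / 1024:.0f} KB/s")
```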

You can also collect statistics from the CX300. Right-click on the array serial number in Navisphere and select Properties, then tick the Statistics Logging box on the General tab. You may also have Navisphere Analyzer installed, a stats monitoring package (although this would be unusual for a CX300): right-click on the array serial number in Navisphere and select Analyzer. If the options are greyed out, then Analyzer is not installed, although you can still collect stats in an encrypted form. If the options are available, you can pull more stats off the CX300 than you can possibly analyze....  :-)


Hope this helps!
Thank you meyersd! This is some good input. I do need a little further explanation of some of the things you've said, though - I feel a bit newbish :)

"- On the 300GB drives, create 4 x 7 drive RAID 5 sets and use striped metaLUNs across all the RAID groups. You'll have just a little short of 4000 IOPS available in that configuration if you've got 10K disks, and about 5600 IOPS for 15K disks"
We have the 10K FC disks, but this is not really your point I guess. What do striped metaLUNs do? How do I create them in EMC Navisphere? And:

"- Best practice is 500GB VMFS LUNs."
Is this not a contradiction of what you said above? 4 x 7-drive RAID 5 groups give much larger LUNs than 500GB. But perhaps I am just misunderstanding you :)

"- The first 5 drives in enclosure 0 will have about 6GB useable each. That's it. So best remove them from the calculations"
I think it's 4 drives, isn't it? Those are the disks where the EMC stores its configuration stuff, right? Is this noticeable performance-wise? I was planning to use those 4 disks as RAID 1/0 for my SQL logs, because the space is more than enough.

Also, I should probably mention that some disks are currently in use by VMware and therefore cannot just be deleted and recreated; I need to migrate the VMs stored on those LUNs first. Eight 300GB disks and six 72GB disks are currently storing my VMs.
ASKER CERTIFIED SOLUTION
Duncan Meyers
(The accepted solution text is available to Experts Exchange members only.)
meyersd, you're the man! Really!

So basically, a striped metaLUN is something like a "RAID of RAIDs", and I can use all 28 disks for every striped metaLUN. Sounds really nice. The downside seems to be that if one of those four RAID 5 groups fails, all my data is gone. But I think I can live with that.

One last question: can I extend a (striped) metaLUN?

I currently use Enclosure 1 disks 6-13 (RAID 5) as 1.75TB of storage, where most of my VMs are stored. (Let's forget about enclosure 0, I can handle that.) How would you practically "migrate around" those VMs to get to the desired solution you presented above? Can I just start with enclosure 2 and a striped metaLUN over those two RAID groups, then move the VMs to the new metaLUN, then prepare enclosure 1 and then extend the striped metaLUN?
So basically, a striped metaLUN is something like a "RAID of RAIDs", and I can use all 28 disks for every striped metaLUN. Sounds really nice. The downside seems to be that if one of those four RAID 5 groups fails, all my data is gone. But I think I can live with that.

One last question: can I extend a (striped) metaLUN?
Yes indeed. You can add another component of the same size as the original components (so 125GB in the example above), or you can just add a chunk-o'-disk with a concatenate expansion - although you've then undone all your good work with the striped meta...

I currently use Enclosure 1 disks 6-13 (RAID 5) as 1.75TB of storage, where most of my VMs are stored. (Let's forget about enclosure 0, I can handle that.) How would you practically "migrate around" those VMs to get to the desired solution you presented above?
You could use Storage VMotion to redistribute the VMs to the new LUNs. There is a plug-in for VI 3.5 (http://sourceforge.net/projects/vip-svmotion/) that is much easier to use than the Storage VMotion CLI.

Can I just start with enclosure 2 and a striped metaLUN over those two RAID groups, then move the VMs to the new metaLUN, then prepare enclosure 1 and then extend the striped metaLUN?
Yes indeed - but be prepared for the stripe meta expansion to take some time. Given that constraint, I'd be inclined to take two 250GB LUNs, make them into 500GB metaLUNs, and carefully distribute the load across the disk shelves manually by choosing where to put the VMs - it's a bit more manual work, but the end result will be just as satisfactory.
Oh, and thank you for your kind words. Glad I could help.  :-)