hlam40 asked:
RAID config for Exchange 2007

I'm building a new server for Exchange 2007 to replace my old Exchange 2003 box. Here is what I have: 11 SAS drives (eight 300GB 15K drives and three 600GB 10K drives).

I'm planning to configure the RAID as follows:

OS:   RAID 1 (300GB 15K x 2) = 278GB
Logs: RAID 5 (600GB 10K x 3) = 1TB
Data: RAID 5 (300GB 15K x 6) = 1.3TB

We have about 420 mailboxes with a limit of 2.5GB per mailbox. What do you recommend for a deployment of this size? Should I use RAID 1 instead of RAID 5? Please advise. Thanks!
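
Before any RAID-level debate, it helps to sanity-check the raw capacity against the quota. Here is a minimal sketch (Python; nominal decimal drive sizes, with no allowance for Exchange overhead such as database white space, the dumpster, or transaction log growth) comparing the proposed layout with the worst case of every mailbox hitting its 2.5GB limit:

```python
# Back-of-the-envelope capacity check for the proposed layout.
# Drives are sold in decimal GB; usable space is reported in GiB,
# which is why a 300GB mirror shows up as ~278GB in the OS.

GIB = 2 ** 30

def usable_bytes(raid_level, drive_gb, count):
    """Rough usable capacity, ignoring controller/filesystem overhead."""
    raw = drive_gb * 1e9
    if raid_level == 1:            # mirror: capacity of one drive
        return raw
    if raid_level == 5:            # one drive's worth of parity
        return raw * (count - 1)
    if raid_level == 6:            # two drives' worth of parity
        return raw * (count - 2)
    raise ValueError(f"unhandled RAID level: {raid_level}")

volumes = {
    "OS   (RAID 1, 2 x 300GB)": usable_bytes(1, 300, 2),
    "Logs (RAID 5, 3 x 600GB)": usable_bytes(5, 600, 3),
    "Data (RAID 5, 6 x 300GB)": usable_bytes(5, 300, 6),
}
needed = 420 * 2.5e9               # every mailbox at its 2.5GB quota

for name, size in volumes.items():
    print(f"{name}: {size / GIB:6.0f} GiB usable")
print(f"Worst-case mailbox data: {needed / GIB:.0f} GiB")
print(f"Data volume headroom:    "
      f"{(volumes['Data (RAID 5, 6 x 300GB)'] - needed) / GIB:+.0f} GiB")
```

At full quota the data volume keeps roughly 400 GiB of headroom, which Exchange overhead will eat into, so the fit looks workable but not generous.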
Mike:
That's how I would do it.
dlethe:
RAID5?  No.  It might take days to rebuild and you will be exposed.  (Unless this is light usage).
Consider going RAID6 for all data.

Then RAID1 for the O/S & write logs. Buy a pair of small SSDs and use native software RAID1 to mirror them. Move all your index files and any "hot" files with high random, non-sequential I/O onto them.

For $500 or so, you can get a pair of SSDs and probably sustain 40,000 random I/Os per second, compared to a few hundred for any of those RAID groups. RAID6 has a greater performance hit on writes than RAID5, but not on reads with a decent controller.

The SSDs can be paid for by going with larger disks, and I would use a 6-disk RAID6 because the I/O will be balanced better. If your users do a lot of searching and use Exchange as a database with heavy indexing, the SSDs will make the config scream and will more than make up for the RAID6 performance hit.

But this all depends on how much I/O your users generate. Run Performance Monitor on your existing config, and describe how the RAID is configured now and what kind of IOPS and throughput you are seeing. No point throwing money away if these people check their email once an hour vs. once every 2 minutes.
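
If you want a quick scripted look at those counters, here is a rough sampler sketch. It assumes Python with the third-party psutil package installed; on Windows, Perfmon's "Disk Transfers/sec" and "Disk Bytes/sec" counters are the authoritative source, and this only approximates them over a short window:

```python
# Sample per-disk I/O counters twice and report the delta as IOPS and MB/s.
# psutil.disk_io_counters() wraps the OS's cumulative disk counters.
import time
import psutil

INTERVAL = 10  # seconds per sample; run during a busy mail period

before = psutil.disk_io_counters(perdisk=True)
time.sleep(INTERVAL)
after = psutil.disk_io_counters(perdisk=True)

for disk, b in before.items():
    a = after[disk]
    iops = ((a.read_count - b.read_count) +
            (a.write_count - b.write_count)) / INTERVAL
    mbps = ((a.read_bytes - b.read_bytes) +
            (a.write_bytes - b.write_bytes)) / INTERVAL / 1e6
    print(f"{disk}: {iops:8.1f} IOPS, {mbps:7.2f} MB/s")
```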
While I agree with most of what dlethe says, I would suggest reading this Ziff Davis article before making a hasty decision: http://www.zdnet.com/blog/storage/why-raid-6-stops-working-in-2019/805

And another calculator at:  http://www.ibeast.com/content/tools/RaidCalc/RaidCalc.asp

Unfortunately it only goes up to RAID 5, but it offers some interesting commentary as well.
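
For reference, the article's core argument is straightforward arithmetic: rebuilding a degraded array means reading every surviving bit, and each bit carries a small chance of an unrecoverable read error (URE). A rough sketch of that math, using commonly quoted vendor URE specs (1 in 1e16 bits for enterprise SAS, 1 in 1e14 for consumer SATA) as assumptions:

```python
import math

def p_ure_during_rebuild(data_read_tb, ure_rate_bits):
    """Chance of at least one URE while reading the surviving disks.

    Computes 1 - (1 - p)^bits via expm1/log1p so the tiny per-bit
    probability doesn't vanish in floating point.
    """
    bits = data_read_tb * 1e12 * 8
    return -math.expm1(bits * math.log1p(-1.0 / ure_rate_bits))

# Rebuilding the asker's 6 x 300GB RAID 5 means reading the 5 survivors: ~1.5TB.
for label, rate in [("enterprise SAS (1 in 1e16)", 1e16),
                    ("consumer SATA (1 in 1e14)", 1e14)]:
    p = p_ure_during_rebuild(1.5, rate)
    print(f"{label}: {p:.2%} chance of hitting a URE during rebuild")
```

On enterprise SAS at this array size the risk stays small; the article's doomsday numbers come from multi-terabyte consumer SATA arrays.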
There are a lot of holes in that article: the writer didn't know about nearline SATA, which has the same ECC as SAS drives, and didn't treat performance as a function of I/O type and RAID level. But the big points still ring true:
 - RAID5 is nuts due to MTBDL (mean time between data loss), so go RAID6 if downtime and data loss are expensive (see the sketch after this list).
 - One can't determine the "best" storage config without knowing the I/O requirements.
 - SSDs can make huge differences in price and price/performance, but they are not a universal fix for all sites.
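
To put rough numbers on the MTBDL point, here is a sketch using the textbook mean-time-to-data-loss formulas. The absolute values are wildly optimistic because the model assumes independent exponential failures and ignores UREs and correlated wear; the takeaway is the relative gap between RAID5 and RAID6, and how both degrade as rebuilds stretch out:

```python
# Textbook MTTDL approximations. The drive MTBF and rebuild times below
# are illustrative assumptions, not measured values.

def mttdl_raid5(n, mtbf_h, mttr_h):
    # data loss = any second drive failing during one rebuild window
    return mtbf_h ** 2 / (n * (n - 1) * mttr_h)

def mttdl_raid6(n, mtbf_h, mttr_h):
    # data loss = a third failure during two overlapping rebuild windows
    return mtbf_h ** 3 / (n * (n - 1) * (n - 2) * mttr_h ** 2)

HOURS_PER_YEAR = 8766
n, mtbf = 6, 1_000_000                 # 6-drive group, 1M-hour vendor MTBF
for mttr in (24, 72):                  # one-day vs multi-day rebuild
    r5 = mttdl_raid5(n, mtbf, mttr) / HOURS_PER_YEAR
    r6 = mttdl_raid6(n, mtbf, mttr) / HOURS_PER_YEAR
    print(f"rebuild {mttr:2d}h: RAID5 ~{r5:,.0f} years, RAID6 ~{r6:,.0f} years")
```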
Have a look at this link from the Exchange Product Team: http://blogs.technet.com/b/exchange/archive/2007/01/15/3397742.aspx and use the storage calculator. It's the best thing to do: it will give you the IOPS requirements, and you can then work out the best disk and RAID level from there.
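
For a feel for the kind of arithmetic that calculator does under the hood, here is a stripped-down sketch. Every input is an illustrative assumption (the 0.32 IOPS-per-mailbox "heavy" profile, the 2:1 read:write mix, 175 IOPS per 15K spindle), not the calculator's actual figures; use the real spreadsheet for sizing:

```python
# Front-end mailbox load -> back-end disk load, with RAID write penalties.
# Each host write costs 2 disk I/Os on RAID1 (two mirror writes), 4 on
# RAID5 (read data + read parity + write data + write parity), 6 on RAID6.

MAILBOXES        = 420
IOPS_PER_MAILBOX = 0.32    # assumed "heavy" Exchange 2007 user profile
READ_FRACTION    = 0.67    # assumed 2:1 read:write mix
DISK_IOPS        = 175     # rough figure for one 15K SAS spindle

frontend = MAILBOXES * IOPS_PER_MAILBOX
reads = frontend * READ_FRACTION
writes = frontend - reads

for level, penalty in {"RAID1": 2, "RAID5": 4, "RAID6": 6}.items():
    backend = reads + writes * penalty
    print(f"{level}: {backend:5.0f} backend IOPS -> "
          f"needs ~{backend / DISK_IOPS:.1f} spindles at minimum")
```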
@dlethe Your comments about the article are good. Most published articles are lacking in completeness; I simply pointed to it as a reference, for some food for thought.

For example, some people insist that you should put your OS on its own physical disks in a simple mirrored (RAID 1) array while putting your programs and data on a separate array, usually RAID 5 or better.

Others believe that disks have improved to the point that it is now better to put everything on one physical array, RAID 5 or better.

Many times it comes down to personal preference, cost, and needs. It may actually be that the best option is to put your email in the cloud instead of hosting it yourself; this can offer an even higher level of redundancy, with multiple servers geographically separated.
Nicolus:
This has become the thread to watch! As technology changes, so do mindsets: mechanical disk vs. SSD, cost vs. performance. Does anyone give a crap either way? If we go with plan A vs. plan B, will we see any "real" benefits? Does that +/- 2 milliseconds make a world of difference when there are only 15 users in the office? Is the cloud the way to go, thereby making the author's question moot?

We are at the cusp of some amazing times here, ladies and gentlemen... When I got into IT nearly 15 years ago, RAID of any form was the mandatory Holy Grail, yet in the past 7 years (due to enhancements in magnetic media) I have yet to see an internal hard drive fail. We often overlook a critical fact when we talk about RAID. For example, in a 3-disk RAID 5 configuration we are constantly assured that we're OK "if one drive fails." What is NEVER talked about is: "Hey buddy, ALL 3 disks are by the same manufacturer, were put in service at the same time, and saw the same environmental effects (temperature, shock, surge), so who is to say that in your 3-, 4-, 5-, 6-, or N-disk RAID config you're not going to lose several of those disks within a 2-day period?" And now you're left sitting on the can, head in hands, telling yourself, "FML, I thought RAID was going to save me!"

I've come to learn from some colleagues who are storage managers (one for a top data warehouse and the other for a major cellular carrier) that the best practice when using RAID is to create the array, then do one of two things: either power down some disks (while still keeping redundancy) and operate the RAID in a diminished state for about a year, then power those disks up again so you now have fresh disks in the RAID; or, after a year, replace a few disks in the array with brand-new ones. Either way, you end up with a RAID that has a far smaller chance of multiple disks failing at the same time due to across-the-board wear and tear from their natural life or exposure.
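
The batch-wear argument is easy to demonstrate with a toy simulation. The lifetime model below (normally distributed around five years) is invented purely to show the mechanism, not real field data: the tighter the same-batch lifetime spread, the likelier a second failure lands inside a short rebuild window, and staggering drive ages suppresses it:

```python
# Toy Monte Carlo of correlated drive wear vs. staggered drive ages.
import random

def p_second_failure(n_disks, sigma_days, window_days,
                     stagger_days=0, trials=50_000):
    hits = 0
    for _ in range(trials):
        lifetimes = sorted(
            random.gauss(5 * 365, sigma_days) + i * stagger_days
            for i in range(n_disks)
        )
        # did another disk die within the window after the first failure?
        if lifetimes[1] - lifetimes[0] <= window_days:
            hits += 1
    return hits / trials

for sigma in (30, 180):        # tight vs loose batch lifetime spread
    same    = p_second_failure(6, sigma, window_days=2)
    rotated = p_second_failure(6, sigma, window_days=2, stagger_days=365)
    print(f"sigma {sigma:3d}d: same batch {same:.1%}, "
          f"yearly rotation {rotated:.1%}")
```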

We are also constrained by software developers... I had a month-long conversation with QuickBooks engineers and Intel RAID guys about the deployment of a new QB server in our enterprise. These two groups were at each other's throats: QB said RAID slows down their performance, as their DB engine sees the data as fragmented even though the OS sees the RAID as one disk; the Intel guys kept saying, nope, high I/O = RAID config.

Well, in the end I let the Intel guy convince me to go RAID, and sure as shinola, QB performance was embarrassing to say the least. As soon as we took the DB off the RAID, put it on a SINGLE high-RPM, high-throughput, high-cache drive, got departmental credit for the unused drives, and spent the recovered money on more RAM for the server, QB performance was flying! We did use an identical disk in a RAID 1 just for everyone's peace of mind, but not to digress: here is an example where ANY combination of RAID did nothing but take performance in the wrong direction.

Like I said, this is the thread to watch!
Nicolus: Interesting read, with some very good points. I especially like the one about the "same environmental effects".

However, I must argue for the other side. I have been on the IT side of the financial industry since 1978, when we still ran everything on the Big Blue mainframes. Over the past year we have had maybe 3 drives fail, all on different servers. In 2 of these cases we were running on RAID 5 and were able to recover easily; the third was on an AIX box and recovered as well.

I have heard of situations like the one you had with QB. We often go through the same with vendors when we try to explain that we run in a full Terminal Services environment. They often want to install a client on a workstation (if not the entire program); however, we run thin clients, which prevents this. This is one of the reasons for having a full test environment.

Once again, it all comes down to cost vs. risk.
Dear pony10us,

I promise not to let my commentary hijack this thread... But if permitted, I would like to reflect on your mention of "Big Blue mainframes"... Those were the days! Hours upon hours of flooring and structural-support surveys... and then to see them roll in, box after box, with the integration specialists putting it all together... :::heaves a sigh::: those were the days! Now I can't live without my tablet! LOL And I manage my servers, firewalls, and VPN routing all from the comfort of a hammock on a lazy Sunday.

To clarify my comments in the original post: in no way was I diminishing the importance of RAID. The point I was trying to make was that most people think that by SIMPLY having a RAID they are protected, since (as we were taught in school) "in the event of a disk failure, data from a RAID can be recovered." However, I never hear a textbook state: "Hey buddy, all those disks in that RAID will have nearly identical wear and tear at any given time X, and more than one disk failing within a short window of another is a plausible possibility." So (A) make a rotation schedule to inject new drives into your array before the array fails, and (B) although there are computed performance metrics for RAID, unless your application actually benefits from them, all you're really getting is redundancy, not a performance gain.

Thank you again for the nod on the content!  :-)  Always appreciated.
Nicolus:

I fully agree, and I remember having a CE on-site to "reprogram" the big box by moving wires around on posts. :)

Anyway, back to the subject again. Over the years I have seen the following scenarios where having a RAID 5 configuration still didn't do any good:

1. A RAID 5 with 3 drives where 2 of them fail, either at the same time or because the first failure isn't noticed until the second one occurs (this fits closely with what you are saying). Without a good backup, you have nothing.

2. All the work of building a RAID 5 for redundancy, but with only one controller card, which then fails.

3. A single power supply.

4. Dual power supplies, but both plugged into the same source, such as one UPS or one wall outlet.

5. A single power source. (We have a UPS on one leg that maintains power until the generator kicks in if the power goes out; the generator is dual-fueled by natural gas and/or propane.)

6. A single server. You need a disaster recovery location that is geographically separate. If you live in an area prone to some form of natural disaster, such as earthquakes in the west, hurricanes in the east, or floods just about anywhere, this is very important.

As both Nicolus and I are trying to stress, the RAID level you use is only a small part of securing the data.

RAID 1 with dual controllers is arguably the safest, yet the most costly, since you only get use of 50 percent of the total drive space.

RAID 5 is lower cost, as you only give up the use of approximately one drive's worth of capacity for parity; however, you run the risk(s) mentioned in previous posts.

There are other RAID levels; however, since these are the two asked about, I will only cover them right now. The sketch below puts rough numbers on the capacity trade-off.
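
To quantify that trade-off, here is a quick sketch of the usable fraction of raw capacity by RAID level and drive count (RAID 6 added for comparison, since it came up earlier in the thread):

```python
# Usable fraction of raw capacity: mirroring always costs half; parity
# overhead shrinks as the drive count grows.
def efficiency(level, n):
    return {"RAID1": 0.5,
            "RAID5": (n - 1) / n,
            "RAID6": (n - 2) / n}[level]

for n in (4, 6, 12):
    row = ", ".join(f"{lvl} {efficiency(lvl, n):.0%}"
                    for lvl in ("RAID1", "RAID5", "RAID6"))
    print(f"{n:2d} drives: {row}")
```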

Not to beat a dead horse, but again it comes down to risk vs. cost. At this organization's size, I would seriously not rule out the cloud as well.