RAID 5 for Combined OS and Data Volumes: Bad Idea?

Quick synopsis
----
Basically, I work for a company which provides software that requires a database server. The standard set-up for these servers is always RAID 5, using something like 5 physical disks and 1 hot spare.

We then partition a C: and a D: drive across the RAID container, with the OS on the C: drive and the data on the D: drive.

I encountered an issue the other day when one of the disks in the RAID failed: Windows blue-screened and crashed. On trying to reboot, the server reported "Unable to find bootable device", which would suggest no Master Boot Record could be found.

We then added the hot spare to the RAID 5 and let it start rebuilding from parity bits. Whilst it was rebuilding, at around 7% and again at around 10%, we tried rebooting and received the same "Unable to find bootable device" error. We eventually waited for the disk to fully rebuild, and then Windows booted with no problems.

The Questions
------

Now, having the OS volume and the data volume on the same RAID is normally a bad idea in my opinion. Would you agree that RAID 1 for the OS and RAID 5 for the data volume would be better and, if so, why?

I presume the OS crashed and wouldn't load because elements of the OS were held on the disk which failed, and that it would only boot once the array had rebuilt and it could access what it needed again. However, isn't RAID 5 supposed to provide this data using parity bits whilst the disk is faulty or rebuilding?

Ben
Asked by: benowens

andyalder (Saggar maker's bottom knocker) commented:
RAID 1 for the OS and RAID 10 for the data is preferable; it may cost more but it's much faster. RAID 5 write performance is poor, and in your current single RAID 5 you've got the pagefile on RAID 5, which is always advised against.
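To put rough numbers on the cost-versus-speed trade-off, here is a minimal Python sketch. It assumes equal-sized disks and uses the commonly quoted small-write penalties (4 back-end I/Os per write for RAID 5, 2 for RAID 10). The function names, the disk counts and the 150 IOPS per disk figure are illustrative assumptions, not measurements from this server.

```python
# Rough comparison of RAID 5 and RAID 10: usable capacity and random-write IOPS.
# All figures are illustrative assumptions, not measurements.

# Commonly quoted back-end I/Os needed per small random write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def usable_disks(level, n_disks):
    """Return how many disks' worth of capacity is usable for data."""
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return n_disks - 1  # one disk's worth of space holds parity
    if level == "raid10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return n_disks // 2  # every disk is mirrored
    raise ValueError(f"unknown RAID level: {level}")


def random_write_iops(level, n_disks, iops_per_disk):
    """Estimate sustained random-write IOPS for the whole array."""
    return n_disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Hypothetical layouts that give the same usable capacity (4 disks' worth).
    for level, n in (("raid5", 5), ("raid10", 8)):
        print(level, n, "disks:",
              usable_disks(level, n), "usable,",
              random_write_iops(level, n, 150), "write IOPS")
```

Under these assumptions, matching the usable space of a 5-disk RAID 5 takes 8 disks in RAID 10, which is where the extra cost comes from, while the estimated random-write throughput is several times higher.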

RAID 5 performance with a failed disk can be so slow that the server's unusable.

The OS crash shouldn't be down to the single failed disk in the RAID, but you may have encountered an unrecoverable read error on one of the remaining disks.
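As a back-of-the-envelope check on how likely an unrecoverable read error is while the array is degraded, the sketch below assumes independent bit errors, a full read of every surviving disk, and a quoted error rate of 1 in 10^14 bits. The disk size and error rate are assumptions for illustration; the thread doesn't say what drives were in this server.

```python
import math


def p_ure_during_rebuild(surviving_disks, disk_bytes, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error (URE)
    when every surviving disk is read end to end.

    Assumes independent bit errors at a constant rate (a simplification).
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # Hypothetical: 4 surviving disks of 1 TB each (not taken from the thread).
    p = p_ure_during_rebuild(surviving_disks=4, disk_bytes=10**12)
    print(f"Chance of at least one URE: {p:.1%}")
```

With those assumed figures the chance comes out at roughly 27%, and it grows with disk size and disk count, so a second read problem during a degraded period is not far-fetched.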
 
lnkevin commented:
"however isn't RAID 5 supposed to provide this data using parity bits whilst the disk is faulty or rebuilding?"

It should. In your case, I suspect you had more than one failed disk. Normally, when only one disk fails and a hot spare is already configured, the array goes ahead and rebuilds onto the hot spare automatically (you shouldn't need to add the hot spare manually). In your case, when you added the hot spare, it rebuilt the second failed drive, and your hot spare is no longer available.
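For anyone wondering how the parity actually gives the data back, here is a minimal Python sketch of single-parity reconstruction. It keeps the parity block at the end of the stripe for simplicity (real RAID 5 rotates parity across the disks), and the block contents and function names are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe, missing_index):
    """Rebuild the block at missing_index from every other block in the stripe.

    The stripe holds the data blocks plus one parity block, so XOR-ing
    all the survivors yields the missing block.
    """
    survivors = [b for i, b in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    # Hypothetical 4 data blocks, one per disk, plus a parity block on a 5th.
    data = [b"BOOT", b"LOAD", b"NTFS", b"DATA"]
    stripe = data + [xor_blocks(data)]

    lost = 1  # pretend the second disk has failed
    assert reconstruct(stripe, lost) == data[lost]
    print("Recovered block:", reconstruct(stripe, lost))
```

Note that every read of a missing block needs a read from all the surviving disks, which is why a degraded RAID 5 is slower, and why a read error on any survivor leaves that stripe unrecoverable.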

I agree with Andy on the RAID 1 and RAID 10 configuration, since you are running a database that requires more write access for logs.

K
benowens (Author) commented:
RAID 10 is a good point. I remember seeing this recommendation for Progress databases and guessed it would be a standard for most database systems. RAID 10 is quite expensive though, isn't it?

It was only 1 faulty disk; you could see that in the RAID configuration, and we only had to replace one disk for sure!

I guess I just wanted someone to confirm that using RAID 5 for an OS partition is stupid because, if the data is inaccessible on a disk, then obtaining that data from parity bits is going to be slow. The OS doesn't like that it can't get the information immediately, and the OS will crash as a result. It was only when the data had been recreated by the disk rebuild that the OS was happy again. Does that sound like a correct summary? Can someone confirm and back that up if it's true?
lnkevin commented:
It's a fair summary, and I believe Andy mentioned that as well.

K
andyalder (Saggar maker's bottom knocker) commented:
I wouldn't expect the BSOD. In a boot-from-SAN environment you're meant to be able to fail over paths, which can take several seconds; of course, the driver parameters tell it this, and so the OS knows to wait. So it's possible later drivers will fix the BSOD problem.

During boot, though, drivers aren't loaded and it's using primitive int13h extensions and boot loaders, so I could well expect problems there.