Best way to replace drives in RAID5 array

Posted on 2014-12-30
Last Modified: 2016-11-23

I'm running Windows Server 2008 R2 on my Dell PowerEdge. I've got a Perc300 controller and three WD500 RE4 drives in a RAID 5 array. Drive 1 just failed. I know it's a good rule of thumb to replace the other hard drives too, since they may fail soon; the server and drives are about 5 years old.

I needed to get the server back online, and the only drive I had available as a temporary replacement was a WD 1TB Red. I've installed it and successfully rebuilt the RAID 5 array. I'm not comfortable leaving this drive in for long and I'd like to go back to WD500 RE4 drives, so I've ordered three WD500 RE4s to replace all three drives.

What's the best way to do this? I was thinking of removing one drive at a time and rebuilding onto its replacement: swap the first drive and rebuild, then the second, then the third. The server is in service 7 am to 8 pm, 6 days a week, so I don't have enough downtime on a normal day to take it offline for hours at a time, which is why I was thinking about this approach. Will this work, or is there a better way to get this done? Thanks for your help!
Question by:kendalltech

Accepted Solution

schaps earned 500 total points
ID: 40525019
I assume you mean one disk at a time overnight? I would take the time to do a full backup before starting the replacement in case something goes wrong, but it's probably the best way to get the job done without having to spend a lot of after-hours time onsite.
Otherwise, plan to spend a few hours one night to do that full backup, replace all the drives, rebuild the array, and restore the data. That said, doing it one drive at a time is pretty safe; it's just what the RAID system is designed to do. You can swap a drive each night and go home when the rebuild starts, and you can check the data integrity between those overnight swaps.
If you have a fourth slot, you might consider adding a fourth drive if you do the all-at-once method. And, in that case, consider whether RAID10 would do the job better for you (it depends on your needs, really). Even if you stick with RAID5, I like a four-drive array better than three.
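
To put rough numbers on the three-drive versus four-drive and RAID5 versus RAID10 comparison, here is a minimal sketch of the usable-capacity arithmetic. It assumes nominal decimal gigabytes, ignores controller and filesystem overhead, and the function names are made up for illustration. It also shows why the temporary 1TB Red adds no space: each member only contributes as much as the smallest drive in the set.

```python
def raid5_usable_gb(drive_sizes_gb: list[int]) -> int:
    """Usable RAID 5 capacity: one member's worth of space holds parity,
    and every member is limited to the size of the smallest drive."""
    if len(drive_sizes_gb) < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)


def raid10_usable_gb(drive_sizes_gb: list[int]) -> int:
    """Usable RAID 10 capacity: drives are mirrored in pairs, so half of
    the members hold copies."""
    count = len(drive_sizes_gb)
    if count < 4 or count % 2:
        raise ValueError("RAID 10 needs an even number of drives, four or more")
    return (count // 2) * min(drive_sizes_gb)


if __name__ == "__main__":
    # Nominal sizes in GB; the mixed set mirrors the temporary 1TB Red setup.
    print("RAID5, 500+500+1000:", raid5_usable_gb([500, 500, 1000]), "GB")
    print("RAID5, 3 x 500:     ", raid5_usable_gb([500, 500, 500]), "GB")
    print("RAID5, 4 x 500:     ", raid5_usable_gb([500] * 4), "GB")
    print("RAID10, 4 x 500:    ", raid10_usable_gb([500] * 4), "GB")
```

So a fourth drive in RAID5 adds 500 GB of usable space, while a four-drive RAID10 gives the same usable space as the current three-drive RAID5 with a different redundancy and performance trade-off.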

Expert Comment

ID: 40525488
I would avoid doing numerous rebuilds on a RAID5 for this purpose. If there are any bad blocks on the old drives that are encountered during rebuilds, you're going to end up with RAID stripes being "punctured" which can lead to data corruption.

I'd recommend getting a full validated backup of all the data, then create a completely new RAID set on the new drives to restore to.

You might want to go ahead and replace the failed drive with one of those new ones just to get the RAID5 in a healthy / non-degraded state, but only if you don't have the option to get a full backup and take the downtime needed to restore to a new RAID set.

Expert Comment

by:Glenn M
ID: 40525667
The process you're suggesting should work fine *if* after each rebuild you verify that all hard disk drive members are properly accounted for, fully functional individually, and the RAID controller posts correctly and without errors.

Plan around the degraded performance you'll get during the rebuild. And think about adding an additional drive to that array as an online spare, for some additional peace of mind in the future.
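
As a rough planning aid, the sketch below estimates whether one rebuild fits in the overnight window. The 11-hour window comes from the 8 pm to 7 am gap in the question; the rebuild rates are pure assumptions (the thread does not say how fast this controller rebuilds), so time the first rebuild and plug in the real figure.

```python
def rebuild_hours(drive_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to rewrite one member drive at a sustained rate.

    Uses decimal units (1 GB = 1000 MB). The rate is an assumed figure,
    not a measured or published one.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return drive_gb * 1000 / rate_mb_per_s / 3600


def fits_window(drive_gb: float, rate_mb_per_s: float, window_hours: float = 11.0) -> bool:
    """True if the estimated rebuild finishes inside the maintenance window."""
    return rebuild_hours(drive_gb, rate_mb_per_s) <= window_hours


if __name__ == "__main__":
    for rate in (10, 30, 50):  # hypothetical sustained rebuild rates in MB/s
        hours = rebuild_hours(500, rate)
        verdict = "fits" if fits_window(500, rate) else "does NOT fit"
        print(f"{rate} MB/s: about {hours:.1f} h, {verdict} in an 11 h window")
```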

Good luck.

Expert Comment

ID: 40526351
A rebuild on 5-year-old disks, when one of them has already died, is too high a risk. You are at risk of losing everything and are on borrowed time. You need to get a 1TB drive and do an image backup of the RAID5 (boot to Unix and just use dd, or your favorite bare-metal backup).

Then yank the drives and replace them with all new drives.  Initialize the RAID, then do an image restore from the 1TB drive.   This will not put ANY data at risk.
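
For anyone who wants the image to be verifiable rather than just copied, here is a minimal sketch of the same idea as a dd-style copy, with a SHA-256 checksum computed during the copy and checked again by re-reading the image. It is an illustration, not a replacement for dd or a bare-metal backup product: the function names are made up, any device or file path you pass in is your own choice, reading a raw device needs administrative rights from a live boot environment, and the copy stops at the first read error instead of skipping bad sectors.

```python
import hashlib
import os

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per read; an arbitrary choice


def copy_with_hash(source_path: str, image_path: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy source to image in chunks and return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            image.write(chunk)
            digest.update(chunk)
        image.flush()
        os.fsync(image.fileno())  # push the image to disk before trusting it
    return digest.hexdigest()


def hash_file(path: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Return the SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def image_and_verify(source_path: str, image_path: str) -> str:
    """Image the source, re-read the image, and fail loudly on a mismatch."""
    expected = copy_with_hash(source_path, image_path)
    if hash_file(image_path) != expected:
        raise OSError("image does not match what was read from the source")
    return expected
```

Keep the returned digest with the image; hashing the image again just before the restore confirms the backup drive still holds the same bytes.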

Expert Comment

ID: 40526666
To clarify why some of us warn against the RAID5 rebuild... 2 big risks:
1. The additional I/O load on the remaining two drives may end up triggering a failure on one of them, causing the RAID set to fail completely
2. Any bad blocks encountered during the rebuild on the two drives are going to cause a "puncture" at minimum (which, strangely, can seem to become contagious and spread bad blocks to other drives, triggering drive failures and additional corruption), or might even cause the S300 (not the best RAID card around; it's a cheap model that's prone to problems) to fail the rebuild, or even panic and fail the RAID set.
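
To give the second risk a rough scale, here is a back-of-the-envelope sketch of the chance of hitting at least one unrecoverable read error while reading both surviving 500 GB drives end to end. Everything beyond the drive sizes is an assumption: the error rates are the kind of figures quoted on spec sheets (roughly 1 in 10^14 bits for desktop-class drives, 1 in 10^15 for enterprise-class), bit errors are treated as independent, and 5-year-old drives with developing bad sectors may behave considerably worse than this model suggests.

```python
import math


def p_read_error(bytes_read: float, bits_per_error: float) -> float:
    """Probability of at least one unrecoverable read error.

    Assumes each bit fails independently with probability 1/bits_per_error,
    so P = 1 - (1 - 1/bits_per_error) ** bits. log1p/expm1 keep the result
    accurate when the per-bit probability is tiny.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


if __name__ == "__main__":
    surviving_bytes = 2 * 500e9  # two 500 GB members read in full per rebuild
    for rate in (1e14, 1e15):    # assumed spec-sheet rates, not measured values
        p_one = p_read_error(surviving_bytes, rate)
        p_three = 1 - (1 - p_one) ** 3  # three rebuilds of the same read size
        print(f"1 error per {rate:.0e} bits: {p_one:.2%} per rebuild, "
              f"{p_three:.2%} across three rebuilds")
```

The point is not the exact percentages but that the exposure is repeated on every rebuild, which is why a validated backup comes first.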

If your #1 priority is the safety of the data and you don't have a validated backup yet, you should do that before attempting anything else. Then if data availability is the next biggest priority, you could go ahead w/ trying the rebuild, and trust the card to do its job... if it works fine, maybe you can do the whole process online (though I don't know if the S300 supports expanding the RAID size live like the H-series controllers do) with more purposeful failure/rebuilds. Typically it's not recommended to try it that way anyway though, and live drive upgrades are only done with H-series controllers using the "replace disk" function that mirrors data to the replacement drive without putting the RAID set in a degraded state.

Good luck!

Author Closing Comment

ID: 40578784
After three nights of rebuilding, it worked great.  Now I have a server with new RAID5 hard drives.  Thanks!

Question has a verified solution.