Best way to replace drives in RAID5 array

Posted on 2014-12-30
Last Modified: 2016-11-23

I'm running Windows Server 2008 R2 on my Dell PowerEdge. I've got a Perc300 controller and three WD500 RE4 drives in a RAID 5 array. I just had drive 1 fail. I know it's a good rule of thumb to replace the other hard drives too, as they may be failing soon. The server and drives are about 5 years old.

I needed to get the server back online, and the only drive I had available as a temporary replacement was a WD 1TB Red drive. I've installed this drive and successfully rebuilt the RAID 5 array. I'm not comfortable leaving this drive in for long and I'd like to go back to WD500 RE4 drives, so I've ordered three WD500 RE4s to replace all three drives.

What's the best way to do this? I was thinking I would remove one drive at a time and rebuild onto its replacement. When that's done, I'll take out the next drive and rebuild again. Finally I'll remove the third drive, replace it, and do one more rebuild. This server is in service 7 am to 8 pm, 6 days a week, so I don't have enough downtime during a normal day to take it down for hours at a time. That's why I was thinking about this approach. Will this work, or is there a better way to get this done? Thanks for your help!
Question by:kendalltech
LVL 10

Accepted Solution

schaps earned 500 total points
ID: 40525019
I assume you mean one disk at a time overnight? I would take the time to do a full backup before starting the replacement in case something goes wrong, but it's probably the best way to get the job done without having to spend a lot of after-hours time onsite.
Otherwise, plan to spend a few hours one night to do that full backup, replace all the drives, rebuild the array, and restore the data. However, doing it one drive at a time is pretty safe; it's just what the RAID system is designed to do. You can swap a drive each night and go home once the rebuild starts, and you can check the data integrity between those overnight swaps.
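One possible way to check data integrity between swaps is to hash the files before a rebuild and compare afterwards. The following is only a sketch in Python: the function names are made up for illustration, and it is only meaningful for data that is not expected to change overnight (files modified by normal server activity will show up as changed).

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash one file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before, after):
    """Return (missing, changed): paths gone after the rebuild, and paths whose hash differs."""
    missing = sorted(set(before) - set(after))
    changed = sorted(k for k in before if k in after and before[k] != after[k])
    return missing, changed
```

Typical use would be to call build_manifest on a data folder before pulling a drive, call it again after the rebuild completes, and pass both results to compare_manifests. Two empty lists mean nothing in that folder went missing or changed.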
If you have a fourth slot, you might consider adding a fourth drive if you do the all-at-once method. And, in that case, consider whether RAID10 would do the job better for you (it depends on your needs, really). Even if you stick with RAID5, I like a four-drive array better than three.
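To put numbers on the three-drive versus four-drive and RAID5 versus RAID10 comparison, here is a small Python sketch of the standard usable-capacity arithmetic. It assumes the WD500 RE4 drives are 500 GB each and ignores controller metadata and formatting overhead, so treat the results as nominal figures.

```python
def raid5_usable_gb(drive_count, drive_gb):
    """RAID5 spends one drive's worth of space on parity."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drive_count - 1) * drive_gb


def raid10_usable_gb(drive_count, drive_gb):
    """RAID10 mirrors every drive, so half the raw space is usable."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    return (drive_count // 2) * drive_gb


DRIVE_GB = 500  # assumed nominal size of one WD500 RE4

print("3-drive RAID5 :", raid5_usable_gb(3, DRIVE_GB), "GB")   # 1000 GB
print("4-drive RAID5 :", raid5_usable_gb(4, DRIVE_GB), "GB")   # 1500 GB
print("4-drive RAID10:", raid10_usable_gb(4, DRIVE_GB), "GB")  # 1000 GB
```

So a four-drive RAID10 gives the same nominal space as the current three-drive RAID5. RAID5 survives exactly one drive failure, while RAID10 survives one and can survive a second if it lands in the other mirror pair.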

Expert Comment

ID: 40525488
I would avoid doing numerous rebuilds on a RAID5 for this purpose. If there are any bad blocks on the old drives that are encountered during rebuilds, you're going to end up with RAID stripes being "punctured" which can lead to data corruption.

I'd recommend getting a full validated backup of all the data, then create a completely new RAID set on the new drives to restore to.

You might want to go ahead and replace the failed drive with one of those new ones just to get the RAID5 in a healthy / non-degraded state, but only if you don't have the option to get a full backup and take the downtime needed to restore to a new RAID set.

Expert Comment

by:Glenn M
ID: 40525667
The process you're suggesting should work fine *if*, after each rebuild, you verify that all hard disk drive members are properly accounted for and fully functional individually, and that the RAID controller POSTs correctly and without errors.

Plan around the degraded performance you'll get during the rebuild. And think about adding an additional drive to that array as an online spare for some additional peace of mind in the future.

Good luck.

LVL 47

Expert Comment

ID: 40526351
A rebuild on 5-year-old disks when one of them has already died is too high a risk. You are at risk of losing everything and are on borrowed time. You need to get a 1TB drive and do an image backup of the RAID5 (boot to Unix and just use dd, or your favorite bare-metal backup tool).

Then yank the drives and replace them with all new drives.  Initialize the RAID, then do an image restore from the 1TB drive.   This will not put ANY data at risk.
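For anyone who wants to see what the image step amounts to, below is a rough Python sketch of a dd-style chunked copy that also hashes what it read, so the image can be validated afterwards. This is not the commenter's exact procedure: the function names are invented for illustration, any device path you pass in (for example a RAID volume seen from a live environment) is your own assumption, and reading a raw device normally requires administrator rights. Unlike a tuned dd run, this sketch simply stops on the first read error.

```python
import hashlib
import os


def copy_image(source_path, image_path, chunk_size=4 * 1024 * 1024):
    """Copy source to image in chunks; return (bytes_copied, sha256 of the data read)."""
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
        # Push the data to the backup drive before reporting success.
        dst.flush()
        os.fsync(dst.fileno())
    return copied, digest.hexdigest()


def sha256_of(path, chunk_size=4 * 1024 * 1024):
    """Re-read a file and return its SHA-256, used to validate the written image."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

After copy_image returns, comparing its digest with sha256_of(image_path) confirms that what landed on the backup drive matches what was read from the array. The restore is the same copy with source and destination swapped.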

Expert Comment

ID: 40526666
To clarify why some of us warn against the RAID5 rebuild... 2 big risks:
1. The additional I/O load on the remaining two drives may end up triggering a failure on one of them, causing the RAID set to fail completely
2. Any bad blocks encountered during the rebuild on the two drives are going to cause a "puncture" at minimum (which strangely seems to become contagious and spread bad blocks to other drives, triggering drive failures and additional corruption), or might even cause the S300 (not the best RAID card around; it's a cheap model that's prone to problems) to fail the rebuild, or even panic and fail the RAID set.
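To get a rough feel for risk 2, here is a back-of-the-envelope Python sketch of the chance of hitting at least one unrecoverable read error while reading both surviving drives in full. The error rates used are illustrative assumptions (commonly quoted datasheet-style figures, not numbers from this thread or measured on these drives), and the model treats bit errors as independent, which 5-year-old disks may well not honor.

```python
import math


def p_read_error(bytes_read, bit_error_rate):
    """Probability of at least one unrecoverable bit error over bytes_read.

    Computes 1 - (1 - rate)^bits using log1p/expm1 to stay accurate for tiny rates.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-bit_error_rate))


def p_any_failure(p_single, rebuilds):
    """Probability that at least one of several independent rebuilds hits an error."""
    return 1.0 - (1.0 - p_single) ** rebuilds


# A 3-drive RAID5 rebuild reads both surviving members: assumed 2 x 500 GB.
SURVIVING_BYTES = 2 * 500 * 10**9

for rate in (1e-14, 1e-15):  # assumed error rates per bit read
    single = p_read_error(SURVIVING_BYTES, rate)
    triple = p_any_failure(single, 3)
    print(f"rate {rate:g}: one rebuild {single:.2%}, three rebuilds {triple:.2%}")
```

Under these assumptions a single rebuild is a fairly small gamble, but three back-to-back rebuilds roughly triple it, which is the arithmetic behind the advice to have a validated backup first.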

If your #1 priority is the safety of the data and you don't have a validated backup yet, you should get one before attempting anything else. Then, if data availability is the next biggest priority, you could go ahead with trying the rebuild and trust the card to do its job. If it works fine, maybe you can do the whole process online with more purposeful failures and rebuilds (though I don't know if the S300 supports expanding the RAID size live like the H-series controllers do). Typically it's not recommended to do it that way, though; live drive upgrades are normally only done with H-series controllers using the "replace disk" function, which mirrors data to the replacement drive without putting the RAID set in a degraded state.

Good luck!

Author Closing Comment

ID: 40578784
After three nights of rebuilding, it worked great.  Now I have a server with new RAID5 hard drives.  Thanks!

Question has a verified solution.
