Best way to replace drives in RAID5 array

Posted on 2014-12-30
Last Modified: 2016-11-23

I'm running Windows Server 2008 R2 on my Dell PowerEdge. I've got a Perc300 controller and 3 WD500 RE4 drives in a RAID 5 array. I just had drive 1 fail. I know that it's a good rule of thumb to replace the other hard drives too, as they may be failing soon. The server and drives are about 5 years old.

I needed to get the server online, and the only drive I had available for temporary replacement was a WD 1TB Red drive. I've installed this drive and successfully rebuilt the RAID5 array. I'm not comfortable leaving this drive in for long and I'd like to put back WD500 RE4 drives, so I've ordered 3 WD500 RE4s to replace all three of the drives.

What's the best way to do this? I was thinking that I needed to remove one drive at a time and perform a rebuild on that drive. When that's done, I'll take out the next drive and perform a rebuild. Then finally I'll remove the third drive, replace it, and do another rebuild. This server is in service 7 am to 8 pm, 6 days a week, so I don't have enough downtime during a normal day to take the server down for hours at a time. That's why I was thinking about this solution. Will this work, or is there a better way to get this done? Thanks for your help!
Question by:kendalltech

Accepted Solution

schaps earned 500 total points
ID: 40525019
I assume you mean one disk at a time overnight? I would take the time to do a full backup before starting the replacement in case something goes wrong, but it's probably the best way to get the job done without having to spend a lot of after-hours time onsite.
Otherwise, plan to spend a few hours one night to do that full backup, replace all the drives, rebuild the array, and restore the data. However, doing it one drive at a time is pretty safe; it's just doing what the RAID system is designed to do. You can swap a drive each night and go home when the rebuild starts, and you can check the data integrity between those overnight swaps.
If you have a fourth slot, you might consider adding a fourth drive if you do the all-at-once method. And, in that case, consider whether RAID10 would do the job better for you (it depends on your needs, really). Even if you stick with RAID5, I like a four-drive array better than three.
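
For a rough sense of how these layouts compare on space, here is a small Python sketch of the usable-capacity arithmetic. It assumes every member is counted at the size of the smallest drive in the set and ignores controller metadata overhead; the function names are made up for illustration and are not part of any RAID tool.

```python
def raid5_usable_gb(drive_sizes_gb):
    """Usable capacity of a RAID5 set: one drive's worth of space holds parity."""
    if len(drive_sizes_gb) < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    # Every member is treated as if it were the smallest drive in the set.
    return (len(drive_sizes_gb) - 1) * min(drive_sizes_gb)


def raid10_usable_gb(drive_sizes_gb):
    """Usable capacity of a RAID10 set: half the drives hold mirror copies."""
    count = len(drive_sizes_gb)
    if count < 4 or count % 2:
        raise ValueError("RAID10 needs an even number of drives, at least 4")
    return (count // 2) * min(drive_sizes_gb)


print(raid5_usable_gb([500, 500, 500]))        # 1000: the original three-drive set
print(raid5_usable_gb([1000, 500, 500]))       # 1000: the temporary 1TB drive adds no space
print(raid5_usable_gb([500, 500, 500, 500]))   # 1500: four-drive RAID5
print(raid10_usable_gb([500, 500, 500, 500]))  # 1000: four-drive RAID10
```

With four 500GB drives, RAID10 gives the same usable space as the current three-drive RAID5, while a four-drive RAID5 gives 500GB more.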

Expert Comment

ID: 40525488
I would avoid doing numerous rebuilds on a RAID5 for this purpose. If there are any bad blocks on the old drives that are encountered during rebuilds, you're going to end up with RAID stripes being "punctured" which can lead to data corruption.

I'd recommend getting a full validated backup of all the data, then creating a completely new RAID set on the new drives to restore to.

You might want to go ahead and replace the failed drive with one of those new ones just to get the RAID5 in a healthy / non-degraded state, but only if you don't have the option to get a full backup and take the downtime needed to restore to a new RAID set.

Expert Comment

by:Glenn M
ID: 40525667
The process you're suggesting should work fine *if* after each rebuild you verify that all hard disk drive members are properly accounted for, fully functional individually, and the RAID controller POSTs correctly and without errors.

Plan around the degraded performance you'll get during the rebuild. And think about adding an additional drive to that array as an online spare - some additional peace of mind for the future.

Good luck.

Expert Comment

ID: 40526351
A rebuild on 5-year-old disks when one of them has already died is too high a risk. You are at risk of losing everything and are on borrowed time. You need to get a 1TB drive and do an image backup of the RAID5 (boot to Unix and just use dd, or your favorite bare-metal backup).

Then yank the drives and replace them with all new drives.  Initialize the RAID, then do an image restore from the 1TB drive.   This will not put ANY data at risk.
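
If you want the image copy to be checkable rather than just trusted, one option is to hash the data as it is copied and then re-read the image to confirm it matches. The Python sketch below is only an illustration of that idea, not a replacement for dd or a bare-metal backup product. The function names are made up, and it assumes the source volume is not changing during the copy (for example, the server was booted from a live environment as suggested above).

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # read 4 MiB at a time


def copy_with_hash(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src to dst in chunks and return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def file_hash(path, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 of a file, read back in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def image_and_verify(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Image src to dst, then re-read dst and confirm both hashes agree."""
    hash_while_copying = copy_with_hash(src_path, dst_path, chunk_size)
    hash_after_reread = file_hash(dst_path, chunk_size)
    return hash_while_copying == hash_after_reread
```

On a Linux live boot this might be called as `image_and_verify("/dev/sda", "/mnt/backup/raid5.img")`, where both paths are hypothetical placeholders for the RAID volume and the 1TB backup drive. Note that this only confirms the image matches what was read; it does not prove the image will restore and boot, so a test restore is still worth doing.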

Expert Comment

ID: 40526666
To clarify why some of us warn against the RAID5 rebuild... 2 big risks:
1. The additional I/O load on the remaining two drives may end up triggering a failure on one of them, causing the RAID set to fail completely
2. Any bad blocks encountered during the rebuild on the two drives are going to cause a "puncture" at minimum (which strangely seems to become contagious and spread bad blocks to other drives, triggering drive failures and additional corruption), or might even cause the S300 (not the best RAID card around - cheap model that's prone to problems) to fail the rebuild, or even panic and fail the RAID set.
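
To make the second risk concrete, here is a simplified Python model of a single stripe on a three-drive RAID5. It only shows the XOR parity idea: the missing block can be recomputed only if every surviving block in the stripe is readable. Real controllers rotate parity across drives and handle punctures in their own firmware-specific ways, none of which is modeled here; the function names and sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving_blocks):
    """Recompute the replaced drive's block for one stripe.

    A surviving block of None stands for an unreadable (bad) block.
    Returns None when the stripe cannot be reconstructed.
    """
    if any(block is None for block in surviving_blocks):
        return None  # nothing left to compute the missing data from
    return xor_blocks(surviving_blocks)


# One stripe: two data blocks plus their parity block.
d0 = b"ACCOUNTS"
d1 = b"PAYROLL!"
parity = xor_blocks([d0, d1])

# The drive holding d0 is swapped out; the other two blocks are readable.
print(rebuild_block([d1, parity]))    # b'ACCOUNTS'

# Same swap, but the old drive holding d1 has a bad block in this stripe.
print(rebuild_block([None, parity]))  # None
```

In the second case the stripe has lost both d0 and d1 even though only one drive was "missing", which is why bad blocks on aging drives matter so much during a rebuild.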

If your #1 priority is the safety of the data and you don't have a validated backup yet, you should do that before attempting anything else. Then if data availability is the next biggest priority, you could go ahead w/ trying the rebuild, and trust the card to do its job... if it works fine, maybe you can do the whole process online (though I don't know if the S300 supports expanding the RAID size live like the H-series controllers do) with more purposeful failure/rebuilds. Typically it's not recommended to try it that way anyway though, and live drive upgrades are only done with H-series controllers using the "replace disk" function that mirrors data to the replacement drive without putting the RAID set in a degraded state.

Good luck!

Author Closing Comment

ID: 40578784
After three nights of rebuilding, it worked great.  Now I have a server with new RAID5 hard drives.  Thanks!
