Solved

Drive in RAID 5 array in PowerEdge 2800 failed

Posted on 2010-11-11
Last Modified: 2012-10-05
I have a disk in a RAID-5 array (1 of 4 disks) that is showing as failed on our PowerEdge 2800. We have a PERC 4e/Di RAID Controller that shows Physical Disks 0, 2 and 3 all online but 1 doesn't appear (this is the one blinking amber at me).

I assume the disk has failed and as a result I need to replace it. But I'm looking for a few instructions as to how this should be done.

Is it simply a case of powering down the server, removing the disk and then inserting the new one and all is well again, or do I need to go into the RAID via OpenManage and tell it to rebuild that disk?

Any advice appreciated as always.

Thanx
Question by:Steven O'Neill
9 Comments
 
LVL 59

Accepted Solution

by:
Darius Ghassem earned 1200 total points
ID: 34111484
You can keep the server online, remove the failed disk, then insert the replacement. The RAID will rebuild itself once the drive is placed into the system.

 
LVL 23

Assisted Solution

by:jakethecatuk
jakethecatuk earned 400 total points
ID: 34111721
This may be stating the obvious, but you need to make sure that the drive you put in is the same model (speed, size, connector etc.) as the one coming out. Although you could put in a larger, slower drive, doing that would have a severe impact on your RAID array.

Then, as dariusg says: out with the old, in with the new, and monitor the rebuild.
 
LVL 47

Assisted Solution

by:David
David earned 400 total points
ID: 34111831
I will just clarify something Jake said..
 - Assuming the replacement disk is QUALIFIED for this controller, and at least equal in capacity to the old drive, then there is nothing wrong with the disk being faster, as you will get an incremental performance gain (which you will likely only see on a benchmark). Conversely, if it is slower, you will have an incremental performance hit.

 - Best practice, if you do NOT have the replacement drive now, is to take a full backup.  Not only do you have no protection against another drive failure, but even a bad block (WHICH YOU MAY HAVE RIGHT NOW) results in partial data loss.   The less I/O you do on this system while you wait, after backing up, the better.

 - Assuming all disks were bought at same time, consider they have all had the same I/O load, operating hours, environmentals, and were built in same manufacturing run.  It is not unusual for drive failures to be in groups, so buy 2 disks, and if you have slots, make one of them a hot spare.
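
To illustrate why a degraded RAID 5 set has no protection left, here is a minimal Python sketch of single-parity XOR, the general scheme RAID 5 is built on. It is not the PERC 4e/Di's actual firmware logic: the block contents and the `xor_blocks` helper are made up for the example, and a real controller also rotates parity across the disks stripe by stripe.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a 4-disk RAID 5 set: three data blocks plus one parity block.
# The contents are hypothetical; only the equal lengths matter.
data = [b"disk0 data", b"disk2 data", b"disk3 data"]
parity = xor_blocks(data)

# Simulate losing one member: its block is recomputed from the
# two surviving data blocks plus the parity block.
lost = data[1]
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == lost
```

Every surviving block is needed to recompute the missing one, which is why a second failure, or even one unreadable block on a surviving disk, costs data until the rebuild completes.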


 
LVL 2

Author Comment

by:Steven O'Neill
ID: 34112089
Hi guys

Thanx for all the advice. We are always backing up the servers here using Acronis Backup and Recovery SBS 10 (and Server 10 for the others) so I know the backups are ok and validated.

The drive we have is a Seagate Cheetah 146.9GB 10K U320 but they are not available right now so I've had to order a couple (yeah I already thot of that thanx) Seagate Cheetah 146.8GB 15K U320 disks. So slightly concerned about what jakethecatuk has said (as I didn't think it truly mattered).

So I assume there's nothing left for me to do but wait for the disk, remove the 'bad' one, insert the new one and let it rebuild (again I assume nothing needed from me).

I would also assume that the rebuild will hit the performance of the server as well? Would I simply use the OpenManage Server Administrator during the rebuild, and am I best doing this out of hours (with no users around)?

Thanx again
 
LVL 23

Expert Comment

by:jakethecatuk
ID: 34112109
Glad you are getting sorted.  My comment about size and speed only referred to slower drives.  dlethe expanded on that by confirming that faster and/or larger would not cause a problem.
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 34113289
It's been stated, but let me emphasize, to save you headaches, that the replacement should never be done on a system with hot-swappable drives with the server powered off - especially when the drive has been used in a previous array.

That said, yes, the drive should begin to rebuild automatically and its progress can be monitored by OMSA.  If for some reason the rebuild doesn't happen automatically (within about 2 minutes), you can start it manually in OMSA.  

If the server is heavily used, you may consider rebuilding after hours, as there will be an amount of system resources dedicated to its rebuild.
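
If you would rather script the monitoring than watch the OMSA GUI, something along these lines might work. It is only a sketch: it assumes the report text is laid out as `Key : Value` lines with a blank line between disks, and the `SAMPLE` text, field values and helper names below are made up for illustration, so check them against what your own OMSA version prints.

```python
def parse_report(text):
    """Split 'Key : Value' report text into one dict per blank-line-separated record."""
    records, current = [], {}
    for line in text.splitlines():
        if ":" in line:
            # Split on the first colon only, so values like 0:0:1 stay intact.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
        elif not line.strip() and current:
            records.append(current)
            current = {}
    if current:
        records.append(current)
    return records


def rebuilding(records):
    """Return (ID, Progress) for every disk whose State is 'Rebuilding'."""
    return [(r["ID"], r.get("Progress", ""))
            for r in records if r.get("State") == "Rebuilding"]


# Hypothetical sample text, not captured from a real server.
SAMPLE = """\
ID       : 0:0:0
State    : Online
Progress : Not Applicable

ID       : 0:0:1
State    : Rebuilding
Progress : 42% complete
"""

print(rebuilding(parse_report(SAMPLE)))
```

On a live system the text would come from the OMSA command line instead of `SAMPLE`, for example the output of `omreport storage pdisk controller=0` captured with `subprocess.run`, assuming the OMSA CLI is installed and the PERC is controller 0.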
 
LVL 47

Expert Comment

by:David
ID: 34113326
That is why I mentioned faster; that is one of the nice things about storage: every year it gets faster, cheaper, better.  Yes, pop it in, but one important point: make sure you get this from Dell or an authorized distributor.  The firmware on the drives is a big deal.  You save money buying a vanilla disk, but it won't have the proper configurable settings that deal with cache, XOR logic and error recovery timing, so it puts your data at risk.

There is a configurable setting on most of the controllers that lets you prioritize rebuild vs application I/O.  I wouldn't make it prioritize rebuild any higher than 25%, and if systems are relatively idle at night, then the rebuild will use all it can anyway.  If it is busy during the night, then just make a judgement call.

It will likely finish overnight if you shut the system down and do the rebuild from the BIOS, and even if it has not finished, you can just boot the computer when you get in, and the rebuild continues at the lower priority.
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 34113437
The default on the PERC 4 is 30% priority, but I wouldn't set it any higher (if you're thinking of playing with it :), as I've seen 50% slow the server to a nearly unusable state.
 
LVL 2

Author Closing Comment

by:Steven O'Neill
ID: 34119169
Thanx again for all your info guys. Disks arrived this morning and one has been inserted to replace the problem disk and it has begun rebuilding as mentioned.

Just monitoring it now to make sure it rebuilds fully.


Question has a verified solution.
