Solved

Drive in RAID 5 array in PowerEdge 2800 failed

Posted on 2010-11-11
Medium Priority
1,441 Views
Last Modified: 2012-10-05
I have a disk in a RAID-5 array (1 of 4 disks) that is showing as failed on our PowerEdge 2800. We have a PERC 4e/Di RAID Controller that shows Physical Disks 0, 2 and 3 all online but 1 doesn't appear (this is the one blinking amber at me).

I assume the disk has failed and as a result I need to replace it. But I'm looking for a few instructions as to how this should be done.

Is it simply a case of powering down the server, removing the disc and then inserting the new one and all is well again, or do I need to go into the RAID via OpenManage and tell it to rebuild that disk?

Any advice appreciated as always.

Thanx
Question by:Steven O'Neill
9 Comments
 
LVL 59

Accepted Solution

by:
Darius Ghassem earned 1200 total points
ID: 34111484
You can keep the server online, remove the failed disk, then insert the replacement. The RAID will rebuild itself once the new drive is placed into the system.

 
LVL 23

Assisted Solution

by:jakethecatuk
jakethecatuk earned 400 total points
ID: 34111721
this may be stating the obvious - but you need to make sure that the drive you put in is the same model (speed, size, connector etc) as the one coming out. although you could put in a larger slower drive, doing that would have a severe impact on your raid array.

then as dariusg says...out with the old, in with the new and monitor the rebuild.
 
LVL 47

Assisted Solution

by:David
David earned 400 total points
ID: 34111831
I will just clarify something Jake said..
 - Assuming the replacement disk is QUALIFIED for this controller, and at least equal in capacity to the old drive, then there is nothing wrong with the disk being faster, as you will get an incremental performance gain (which you will likely only see on a benchmark). Conversely, if it is slower, you will have an incremental performance hit.

 - Best practice, if you do NOT have the replacement drive now, is to take a full backup.  Not only do you have no protection against another drive failure, but even a bad block (WHICH YOU MAY HAVE RIGHT NOW) results in partial data loss.   The less I/O you do on this system while you wait, after backing up, the better.

 - Assuming all disks were bought at the same time, consider that they have all had the same I/O load, operating hours, environmentals, and were built in the same manufacturing run.  It is not unusual for drive failures to come in groups, so buy 2 disks, and if you have slots, make one of them a hot spare.
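
The replacement criteria in the list above can be sketched as a small check. This is only an illustration in Python: the `Drive` fields, the `check_replacement` name and the example values are assumptions made up for the sketch, not anything the PERC controller or OpenManage exposes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    capacity_gb: float  # capacity as labeled on the drive
    rpm: int            # spindle speed
    interface: str      # e.g. "U320"
    qualified: bool     # True if the drive is qualified for the controller


def check_replacement(old: Drive, new: Drive) -> list[str]:
    """Return notes about the replacement; a note starting with BLOCK means do not use it."""
    notes = []
    if not new.qualified:
        notes.append("BLOCK: drive is not qualified for this controller")
    if new.interface != old.interface:
        notes.append("BLOCK: interface differs from the failed drive")
    if new.capacity_gb < old.capacity_gb:
        notes.append("BLOCK: capacity is smaller than the failed drive")
    if new.rpm < old.rpm:
        notes.append("WARN: slower spindle, expect an incremental performance hit")
    elif new.rpm > old.rpm:
        notes.append("OK: faster spindle, at most an incremental gain")
    return notes


# Hypothetical example values, not the drives from this thread.
old = Drive(capacity_gb=146.0, rpm=10000, interface="U320", qualified=True)
new = Drive(capacity_gb=146.0, rpm=15000, interface="U320", qualified=True)
print(check_replacement(old, new))
```

The check only encodes the points made above (qualified, same connector, capacity at least equal, speed is a soft concern); it cannot tell you whether a drive's firmware really suits the controller.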

 
LVL 2

Author Comment

by:Steven O'Neill
ID: 34112089
Hi guys

Thanx for all the advice. We are always backing up the servers here using Acronis Backup and Recovery SBS 10 (and Server 10 for the others) so I know the backups are ok and validated.

The drive we have is a Seagate Cheetah 146.9GB 10K U320 but they are not available right now so I've had to order a couple (yeah I already thot of that thanx) Seagate Cheetah 146.8GB 15K U320 disks. So slightly concerned about what jakethecatuk has said (as I didn't think it truly mattered).

So I assume there's nothing left for me to do but wait for the disk, remove the 'bad' one, insert the new one and let it rebuild (again I assume nothing needed from me).

I would also assume that the rebuild will hit the performance of the server as well? Would I simply use the OpenManage Server Administrator during the rebuild, and am I best doing this out of hours (with no users around)?

Thanx again
 
LVL 23

Expert Comment

by:jakethecatuk
ID: 34112109
glad you are getting sorted.  my comment about size and speed only referenced slower drives.  dlethe expanded on that by confirming that faster and/or larger would not cause a problem.
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 34113289
It's been stated, but let me emphasize, to save you headaches, that the replacement should never be done on a system with hot-swappable drives with the server powered off - especially when the drive has been used in a previous array.

That said, yes, the drive should begin to rebuild automatically and its progress can be monitored by OMSA.  If for some reason the rebuild doesn't happen automatically (within about 2 minutes), you can start it manually in OMSA.  

If the server is heavily used, you may consider rebuilding after hours, as there will be an amount of system resources dedicated to its rebuild.
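
The "wait about 2 minutes, then start it manually" logic can be sketched in Python as below. This is a rough illustration only: `get_state` and `start_rebuild` are hypothetical callables you would have to supply yourself (for example by wrapping whatever OMSA reports for the disk), and the `"Rebuilding"` state string is an assumption about what that wrapper returns.

```python
import time
from typing import Callable


def wait_for_rebuild_start(
    get_state: Callable[[], str],
    start_rebuild: Callable[[], None],
    grace_seconds: float = 120.0,
    poll_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the rebuild starts; trigger it manually once the grace period is over.

    Returns "automatic" or "manual" depending on how the rebuild began.
    """
    waited = 0.0
    while waited < grace_seconds:
        if get_state() == "Rebuilding":
            return "automatic"
        sleep(poll_seconds)
        waited += poll_seconds
    # One last look before falling back to a manual start.
    if get_state() == "Rebuilding":
        return "automatic"
    start_rebuild()
    return "manual"
```

Passing `sleep` in as a parameter keeps the function easy to try out without actually waiting two minutes.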
 
LVL 47

Expert Comment

by:David
ID: 34113326
That is why I mentioned faster. That is one of the nice things about storage: every year it gets faster, cheaper, better.  Yes, pop it in, but this is important: make sure you get it from Dell or an authorized distributor.  The firmware on the drives is a big deal.  You save money buying a vanilla disk, but it won't have the proper configurable settings that deal with cache, XOR logic and error recovery timing, so vanilla disks put your data at risk.

There is a configurable setting on most of the controllers that lets you prioritize rebuild vs application I/O.  I wouldn't make it prioritize rebuild any higher than 25%, and if systems are relatively idle at night, then the rebuild will use all it can anyway.  If it is busy during the night, then just make a judgement call.

It will likely finish overnight if you shut the system down and do the rebuild from the BIOS, and even if it has not finished, you can just boot the computer when you get in, and the rebuild continues at the lower priority.
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 34113437
The default on the PERC 4 is 30% priority, but I wouldn't set it any higher (if you're thinking of playing with it :), as I've seen 50% slow the server to a nearly unusable state.
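
If the rebuild priority is ever set from a script, a guard like this keeps it from creeping past the ceiling discussed here. It is a minimal Python sketch: `pick_rebuild_rate` is a made-up helper name, and the 30% default ceiling simply mirrors the PERC 4 default mentioned above.

```python
def pick_rebuild_rate(requested_pct: int, ceiling_pct: int = 30) -> int:
    """Clamp a requested rebuild priority (in percent) to a ceiling."""
    if not 0 <= requested_pct <= 100:
        raise ValueError("rebuild rate must be a percentage between 0 and 100")
    return min(requested_pct, ceiling_pct)
```

Calling it with `ceiling_pct=25` would follow the more conservative limit suggested earlier in the thread.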
 
LVL 2

Author Closing Comment

by:Steven O'Neill
ID: 34119169
Thanx again for all your info guys. Disks arrived this morning and one has been inserted to replace the problem disc and it has begun rebuilding as mentioned.

Just monitoring it now to make sure it rebuilds fully.
