Solved

Drive in RAID 5 array in PowerEdge 2800 failed

Posted on 2010-11-11
Last Modified: 2012-10-05
I have a disk in a RAID-5 array (1 of 4 disks) that is showing as failed on our PowerEdge 2800. We have a PERC 4e/Di RAID Controller that shows Physical Disks 0, 2 and 3 all online but 1 doesn't appear (this is the one blinking amber at me).

I assume the disk has failed and as a result I need to replace it. But I'm looking for a few instructions as to how this should be done.

Is it simply a case of powering down the server, removing the disk and then inserting the new one and all is well again, or do I need to go into the RAID via OpenManage and tell it to rebuild that disk?

Any advice appreciated as always.

Thanx
Question by:Steven O'Neill
9 Comments
 
LVL 59

Accepted Solution

by:
Darius Ghassem earned 300 total points
ID: 34111484
You can keep the server online, remove the failed disk, then insert the replacement. The RAID will rebuild itself once the drive is placed into the system.

 
LVL 23

Assisted Solution

by:jakethecatuk
jakethecatuk earned 100 total points
ID: 34111721
this may be stating the obvious - but you need to make sure that the drive you put in is the same model (speed, size, connector etc) as the one coming out. although you could put in a larger slower drive, doing that would have a severe impact on your raid array.

then as dariusg says...out with the old, in with the new and monitor the rebuild.
 
LVL 47

Assisted Solution

by:dlethe
dlethe earned 100 total points
ID: 34111831
I will just clarify something Jake said..
 - Assuming the replacement disk is QUALIFIED for this controller, and at least equal in capacity to the old drive, then there is nothing wrong with the disk being faster, as you will get an incremental performance gain (which you will likely only see on a benchmark). Conversely, if it is slower, you will have an incremental performance hit.

 - Best practice, if you do NOT have the replacement drive now, is to take a full backup.  Not only do you have no protection against a drive failure, but even a bad block (WHICH YOU MAY HAVE RIGHT NOW) results in partial data loss.   The less I/O you do on this system while you wait, after backing up, the better.

 - Assuming all disks were bought at same time, consider they have all had the same I/O load, operating hours, environmentals, and were built in same manufacturing run.  It is not unusual for drive failures to be in groups, so buy 2 disks, and if you have slots, make one of them a hot spare.
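To illustrate why a degraded RAID 5 set has no protection left, here is a minimal Python sketch of XOR parity on a single stripe. It is a simplification, not how the PERC firmware is implemented: the block contents and the `xor_blocks` helper are made up for illustration, and a real controller rotates parity across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe on a 4-disk RAID 5: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

# Disk 1 fails: its block is recomputed from the three surviving blocks.
survivors = [block for index, block in enumerate(stripe) if index != 1]
rebuilt = xor_blocks(survivors)

# If disk 2 also returns a bad block while the array is degraded, the two
# remaining blocks only give "block 1 XOR block 2", not either block itself.
two_left = xor_blocks([stripe[0], stripe[3]])
```

With one disk missing, every surviving block is needed to recompute the lost one, which is why a single unreadable block on another disk during this window means partial data loss.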

 
LVL 2

Author Comment

by:Steven O'Neill
ID: 34112089
Hi guys

Thanx for all the advice. We are always backing up the servers here using Acronis Backup and Recovery SBS 10 (and Server 10 for the others) so I know the backups are ok and validated.

The drive we have is a Seagate Cheetah 146.9GB 10K U320, but they are not available right now, so I've had to order a couple (yeah I already thought of that thanx) of Seagate Cheetah 146.8GB 15K U320 disks. So slightly concerned about what jakethecatuk has said (as I didn't think it truly mattered).

So I assume there's nothing left for me to do but wait for the disk, remove the 'bad' one, insert the new one and let it rebuild (again I assume nothing needed from me).

I would also assume that the rebuild will hit the performance of the server as well? Would I simply use the OpenManage Server Administrator during the rebuild, and am I best doing this out of hours (with no users around)?

Thanx again
 
LVL 23

Expert Comment

by:jakethecatuk
ID: 34112109
glad you are getting sorted.  my comment about size and speed only referenced slower drives.  dlethe expanded on that by confirming that faster and/or larger would not cause a problem.
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 34113289
It's been stated, but let me emphasize, to save you headaches: on a system with hot-swappable drives, the replacement should never be done with the server powered off - especially when the replacement drive has been used in a previous array.

That said, yes, the drive should begin to rebuild automatically and its progress can be monitored by OMSA.  If for some reason the rebuild doesn't happen automatically (within about 2 minutes), you can start it manually in OMSA.  

If the server is heavily used, you may consider rebuilding after hours, as there will be an amount of system resources dedicated to its rebuild.
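As a rough sketch of watching the rebuild from a script rather than the OMSA web interface, the Python below polls the OMSA command line. It assumes the OMSA CLI is installed on the server, that `omreport storage pdisk controller=<id>` is available in your OMSA version, and that the PERC is controller 0; all three should be checked on your own system. The function names are hypothetical.

```python
import subprocess
import time


def pdisk_report_command(controller_id=0):
    """Build the OMSA CLI call that lists physical disks on one controller.

    controller_id=0 is an assumption; check which ID your PERC actually has.
    """
    return ["omreport", "storage", "pdisk", f"controller={controller_id}"]


def poll_pdisks(controller_id=0, checks=3, interval_s=60,
                run=subprocess.run, sleep=time.sleep):
    """Print the physical-disk report several times so the rebuild can be watched."""
    outputs = []
    for attempt in range(checks):
        result = run(pdisk_report_command(controller_id),
                     capture_output=True, text=True)
        outputs.append(result.stdout)
        print(result.stdout)
        if attempt < checks - 1:
            sleep(interval_s)  # wait before asking the controller again
    return outputs


# Example (run on the server itself): poll_pdisks(controller_id=0, checks=10)
```

The sketch only prints the raw report and does not try to parse it, since the exact output layout varies between OMSA versions.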
 
LVL 47

Expert Comment

by:dlethe
ID: 34113326
That is why I mentioned faster. That is one of the nice things about storage: every year it gets faster, cheaper, better.  Yes, pop it in, but important .. make sure you get this from Dell or an authorized distributor.  The firmware on the drives is a big deal.  You save money buying a vanilla disk, but it won't have the proper configurable settings that deal with cache, XOR logic, error recovery timing .. so it puts your data at risk.

There is a configurable setting on most of the controllers that lets you prioritize rebuild vs application I/O.  I wouldn't make it prioritize rebuild any higher than 25%, and if systems are relatively idle at night, then the rebuild will use all it can anyway.  If it is busy during the night, then just make a judgement call.

It will likely finish overnight if you shut the system down and do the rebuild from the BIOS, and even if it has not finished, you can just boot the computer when you get in, and the rebuild continues at the lower priority.
 
LVL 33

Expert Comment

by:PowerEdgeTech
ID: 34113437
The default on the PERC 4 is 30% priority, but I wouldn't set it any higher (if you're thinking of playing with it :), as I've seen 50% slow the server to a nearly unusable state.
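If you do want to script the rebuild priority, a small Python sketch could build the OMSA command and refuse values above the 30% discussed here. It assumes `omconfig storage controller action=setrebuildrate` is supported by your OMSA version and controller, and that the PERC is controller 0; the function name and the `ceiling` parameter are hypothetical, so verify against the CLI documentation for your install before running anything.

```python
def set_rebuild_rate_command(rate_percent, controller_id=0, ceiling=30):
    """Build (but do not run) the OMSA call that sets the rebuild rate.

    ceiling=30 mirrors the advice in this thread and is not a controller
    limit; controller_id=0 is an assumption about this server.
    """
    if not 0 <= rate_percent <= 100:
        raise ValueError("rebuild rate must be between 0 and 100")
    if rate_percent > ceiling:
        raise ValueError(f"refusing rate above {ceiling}%: server may become very slow")
    return [
        "omconfig", "storage", "controller",
        "action=setrebuildrate",
        f"controller={controller_id}",
        f"rate={rate_percent}",
    ]


# Example: pass the list to subprocess.run() on the server once you have
# confirmed the command against your OMSA documentation.
command = set_rebuild_rate_command(30)
```

Keeping the cap in the helper makes it harder to set something like 50% by accident during business hours.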
 
LVL 2

Author Closing Comment

by:Steven O'Neill
ID: 34119169
Thanx again for all your info guys. Disks arrived this morning and one has been inserted to replace the problem disk and it has begun rebuilding as mentioned.

Just monitoring it now to make sure it rebuilds fully.
