BeGentleWithMe-INeedHelp

asked on

6 year old Dell PowerEdge T110 with 1 of 4 hard drives failing. What would you do?

On a Dell PowerEdge T110 shipped in November 2011, 1 of the 4 hard drives is reporting a predicted failure. I think I saw that it's set up as RAID 6 with no spares. With RAID 6 you can lose 2 drives before the array fails, right? Not that I want to get to that : )
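For reference, the RAID 6 arithmetic behind that question can be sketched as below. This is only an illustration, not anything from Dell's tooling: the `raid_summary` helper is a made-up name, and the 1 TB drive size is an assumption, since the thread never states the actual capacity.

```python
# Minimal sketch: fault tolerance and usable capacity for parity RAID levels.
# PARITY_DRIVES / MIN_DRIVES / raid_summary are hypothetical names for this example.

PARITY_DRIVES = {5: 1, 6: 2}  # RAID level -> drives' worth of parity
MIN_DRIVES = {5: 3, 6: 4}     # RAID level -> minimum drive count


def raid_summary(level: int, drives: int, drive_tb: float) -> dict:
    """Return how many drive failures the array survives and its usable capacity."""
    if level not in PARITY_DRIVES:
        raise ValueError("only RAID 5 and RAID 6 are modelled here")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    parity = PARITY_DRIVES[level]
    return {
        "tolerated_failures": parity,             # array survives this many failed drives
        "usable_tb": (drives - parity) * drive_tb,
    }


# Four drives in RAID 6; the 1 TB size is an assumed value.
print(raid_summary(6, 4, 1.0))  # {'tolerated_failures': 2, 'usable_tb': 2.0}
```

So a 4-drive RAID 6 array keeps running with two failed drives, and a third failure loses the array.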

Thoughts?

Replace the drive? Right away, or can you wait? Is there any indication of whether "failure predicted" means today vs. next month vs. later?
If you are going to replace the drive, I thought I heard that you have to get it from Dell? Do they build (or did they build) their systems to require Dell-branded hard drives?
Where do you get a drive like that? I don't see it on the Dell website.

Or is this the argument to replace the server?  1 of 4 drives failing.... the others are as old, the whole machine is 6 years old (SBS 2011 standard OS)
SOLUTION
Sam Simon Nasser
I assume it is out of warranty.  If so, replace the server.

It doesn't really change my answer, but a few thoughts:

Have you tested your backup?  An untested backup is as good as no backup.  You don't want to find out your restore strategy was flawed after the machine fails.  No good there. (same logic as not running a machine out of warranty....your DR strategy involves getting parts off ebay...)

Replacing the hard drive might buy you time.  Or it could put enough stress on the other drives during the "rebuild" that they also fail.  Drives in OEM servers tend to all be from the same manufacturing batch and that means they tend to fail close together.

SBS 2011 is nearing end of extended support. Use this opportunity to migrate.  Sooner is better.

Use this opportunity to discuss a budget for I.T. and get your servers on a regular replacement cycle that can be adhered to.  Buy extended warranties to match whatever replacement cycle you choose.

These are all small steps that even the *smallest* of small businesses can stomach. It just requires the hard conversations to explain the need.

ASKER

Cliff - yes out of warranty.  Good points, thanks.

Sam - I think the word I should have used is 'certified' for the ones that have to go into the server.

Looking at this from 2014,

https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Do-I-need-certified-drives/td-p/4321318

it says Dell doesn't require Dell-certified drives.

But this from 2010,

https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Dell-certified-drives-only-with-PERC-H700/td-p/3449730

'Just got a new R510 Server with H700 Raid Card and yes unfortunately it is true you can only use Dell certified drives with the new Raid cards'

Maybe around 2010 Dell WAS requiring Dell-certified drives but by 2014 had moved away from that?  This server is from 2011 : (
itguy: thanks also.  Totally agree on just replacing it, moving to Office 365, etc., but it's not my money, and they are penny wise and pound foolish : )

My post and yours crossed.  Yeah, Dell doesn't make the drive, but from some quick googling it seems that the drive maker can customize the firmware for manufacturers, if nothing else to make it unique, so the controller can check the firmware to see that you bought their brand / version of hard drive?

The card in this server:  PERC 6/i SAS internal RAID adapter, PCI-Express

Several pages of people say they have had problems with drives they didn't get from Dell / that aren't Dell certified. It might work, but it's not supported. More support for the idea to just replace the server?! : )

https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Non-certified-drives-throwing-Faults/td-p/4139185

https://www.dell.com/community/PowerEdge-HDD-SCSI-RAID/Does-the-PERC-6-i-SAS-5-work-with-non-certified-Dell-HDDs/m-p/5114438
Personally, I would take care of the immediate issue and replace the drive. Then I would in my annual budget plan to replace the box because it is a HUGE liability.
Yes, a certified server hard drive is one thing; requiring Dell-only hard disks in a Dell server is something else.
Check this article, "The 5 Most Reliable Hard Drives According to Server Companies": https://www.makeuseof.com/tag/most-reliable-hard-drives/
As for replacing the drive, I would pull one of my existing drives and see what the Make and model is and replace with the same make and model drive. The drive technically should not need to be Dell Certified, but going with the same make and model and size drive you can't go wrong.
I personally will reiterate my one concern here.  Replacing the drive *will* stress the other drives and actually (SIGNIFICANTLY) increases their chance of failure.  On a server under warranty with good backups, not a big deal.  On an out-of-warranty server, now you are buying a new server anyway, finding a way to restore a backup onto dissimilar hardware, and very likely facing a "no returns" policy on the drive you bought.

While I certainly don't have warm fuzzy feelings about running a RAID setup with a failed drive, I think replacing the drive could compound the problem, not resolve it.  

The underlying problem is, of course, that the server is out of warranty.  But crying over spilt milk and all that. In an effort not to dwell on the problem but to look for the solution, I still have to recommend just biting the bullet, ensuring you have a GOOD backup, and buying a new server and migrating ASAP.  Heck, moving to Office 365 doesn't even require a new server, so you could start *that* now.   I, for one, do not advocate replacing the drive.
it guy:  

You say pull one of my existing drives and see what the Make and model is and replace with the same make and model drive

Good advice, except what would you do when they don't make that drive anymore : ) that is a 6 year old model.
I have to disagree with you there. While rebuilding a degraded RAID will tax the system for a short time, I do not believe that it will cause additional drives to fail without warning. That is possible, but not probable.
Drives failing in batches is very well documented.  Anecdotal experience is a small sample to draw conclusions from, but draw from a large enough pool and what I said is demonstrably true.  The older the drives, the higher the probability (drive makers actually publish MTBF figures), and real-world experiences in the field (not mine, but those of the various IT communities at large) have borne that out.  Proceed at your own risk.
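To put rough numbers on this disagreement, here is a back-of-the-envelope sketch. Everything in it is an assumption rather than data from this server: the annualized failure rates (AFR), the 24-hour rebuild window, and above all the idea that the three remaining drives fail independently at a constant rate. The same-batch and rebuild-stress arguments above are precisely claims that those assumptions do not hold for old drives, so treat the output as a naive baseline, not a prediction.

```python
import math

# Naive model: each remaining drive fails independently, at a constant rate
# derived from an assumed annualized failure rate (AFR). Function names and
# all input values are hypothetical, chosen only for illustration.

HOURS_PER_YEAR = 8760.0


def p_fail_in_window(afr: float, hours: float) -> float:
    """Probability that a single drive fails within `hours`, given its AFR."""
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    return 1.0 - math.exp(-rate_per_hour * hours)


def p_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent drives fail (binomial tail)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


remaining_drives = 3   # 4-drive array with one drive being replaced
rebuild_hours = 24.0   # assumed rebuild duration

for afr in (0.02, 0.10):  # assumed AFRs: a healthy drive vs. an aged one
    p = p_fail_in_window(afr, rebuild_hours)
    print(
        f"AFR {afr:.0%}: "
        f"P(at least 1 more failure) = {p_at_least(1, remaining_drives, p):.4%}, "
        f"P(at least 2 more, i.e. RAID 6 data loss) = {p_at_least(2, remaining_drives, p):.6%}"
    )
```

If failures are correlated (same manufacturing batch) or the rebuild itself raises the failure rate, the real probabilities will be higher than this independent model suggests; raising the assumed AFR is a crude way to explore that.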
ASKER CERTIFIED SOLUTION
There was only one version of PERC firmware that insisted on Dell branded drives.

Thanks!  At least I know I wasn't going completely crazy!!

not using hdtune - i meant to ask that.  thanks!