Hard drive failure - any hope? Western Digital SATA drive.

Customer brought me a WD 160GB WD1600JS-75NCB1 SATA drive.
It's making the clicking noises. I tried putting it in the freezer and then using BartPE and GetDataBack for NTFS. Usually when a disk is dead it just keeps on making the same clicking. With this one I can feel it powering up. It clicks 4-5 times and then feels like it's starting to spin, then stops, then two more clicks, spins and stops. In the BIOS it shows that there's a WD drive at SATA port 5. Again, usually with a dead drive the BIOS doesn't even identify it. When the system is up it feels like it's spinning inside, but that might just be the power going to the device. When I get onto the system GetDataBack can't see it. I put it in a Dell Vostro 220 desktop running XP SP3 and also tried it with the BartPE disk. I've also attached it as a USB device with no luck.
Any chance with this drive?
Thanks,
Al
Alan SilvermanOwnerAsked:
DavidPresidentCommented:
This is not one you can recover with software. Every moment the disk is powered up can do further damage. If you want recovery, it will require a firm with a cleanroom and such, plus $500+ give or take. Sorry.
akahanCommented:
looks like a job for gillware:  http://www.gillware.com
nobusCommented:
If the problem comes from the logic board, you can try to replace that.
Be sure to have one of the same model AND firmware!
ocanada_techguyCommented:
Since the drive is recognized it might be that the logic board is OK but the drive is spinning down due to excessive bad tracks (the clicking). Refrigerating the drive is often done so that chips on the logic board that may have been subjected to overheating will hopefully run long enough to recover the data before they reach temperatures where thermal failures start to occur. In my experience "freezing" the drive doesn't help much with the physical part, with scratches on the platters. One should be careful to seal the drive when refrigerating so as to avoid condensation: ice or frost will worsen physical crashing, and water when it warms back up isn't advisable either.
All that said, since you get clicking and it seems the drive "spins down" so quickly, it seems it is still malfunctioning. Why? Perhaps the logic card is still failing, so swapping in an identical logic board with the same firmware on it from an identical drive could be tried. However, the clicking is more often a sign of bad spots encountered and the drive retrying, and yours gives up rather quickly, suggesting either that there are a lot of bad sectors and/or that the "spare" area for remapping bad sectors has been used up.
If you use a utility to check the S.M.A.R.T. status of the drive (assuming it supports that) then you might discover whether the threshold has been reached/exceeded. (Not every BIOS bothers to check drive SMART status on POST (power-on self-test), and sometimes SMART or the checking is turned off.)
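
As a rough illustration of what "threshold reached/exceeded" means: each SMART attribute carries a normalized value and a vendor-set threshold, and the attribute counts as failed once the value drops to or below that threshold. The sketch below assumes you can get an attribute table in the column layout that smartmontools' `smartctl -A` prints. The sample rows are made up for illustration and are not readings from this drive, and the helper names are hypothetical.

```python
from typing import NamedTuple

# Made-up sample in the column layout that `smartctl -A` prints.
# These are NOT readings from the drive discussed in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   120   120   140    Pre-fail  Always   FAILING_NOW 631
197 Current_Pending_Sector  0x0012   198   198   000    Old_age   Always       -       412
"""


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int      # normalized value, higher is healthier
    worst: int      # lowest normalized value ever recorded
    threshold: int  # vendor-set failure threshold
    raw: str        # vendor-specific raw counter, kept as text


def parse_attributes(text: str) -> list[SmartAttr]:
    """Parse data rows of an attribute table; header and other lines are skipped."""
    attrs = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a numeric attribute ID and have at least 10 columns.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attrs.append(SmartAttr(int(parts[0]), parts[1], int(parts[3]),
                               int(parts[4]), int(parts[5]), parts[9]))
    return attrs


def at_or_below_threshold(attrs: list[SmartAttr]) -> list[SmartAttr]:
    """Attributes whose normalized value has reached the threshold.

    A threshold of 0 means the attribute is informational and never 'fails'.
    """
    return [a for a in attrs if a.threshold > 0 and a.value <= a.threshold]


for attr in at_or_below_threshold(parse_attributes(SAMPLE)):
    print(f"{attr.name}: value {attr.value} <= threshold {attr.threshold} (raw {attr.raw})")
```

In the made-up sample only the reallocated-sector attribute is flagged, which is the "spare area used up" situation described above. None of this helps if the operating system cannot see the drive at all.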

I have seen where even though there was clicking when encountering bad spots, a program like SpinRite 6 has been able to identify the bad sectors, try very hard to recover the data off the bad sectors and move said data block file chains to other sectors (it retries hundreds of times whereas chkdsk gives up after only a couple and almost never recovers the data being lost) and then will set aside the bad sectors never to be used again, whether it recovered data from them or not.  You can visit the GRC website to learn more about the tool and the situation.

I can say, that recovering most of the data off a failing drive in this way has been successful as often as not, maybe more often really.

I've had numerous good experiences with SpinRite 6, although one time I encountered a bug where it had difficulty dealing with a particularly large partition. Someone else has recommended HDD Regenerator. I could be wrong, but I got the impression that with HDD Regenerator you need to be able to successfully back up the data first, because what that one does is wipe and redo all bad-tracking and create an entirely new "spare area" for new bad tracks to be switched out to (perfect if the only problem is that the bad-track space threshold has been reached), thus making an otherwise "reached end of life" HD, as far as SMART is concerned, as though new again. I don't have enough personal experience to say for sure; I can only personally recommend SpinRite at this point. SpinRite most definitely is aware of the drive's built-in SMART, but then it deals directly with the drive and uses complex analysis algorithms to try to recover the data.

However, be WARNED that "pressing your luck" by putting software to work trying and retrying to recover data off of bad spots does have a potential drawback. If the bad spots were caused by a physical "crash" serious enough to shave filings off the magnetic surface, then compared to the microscopic gap between the surface of the platter and the read/write heads, said "shavings" are the size of boulders, and if they catch against the head they could start scraping even more surface off the drive, making things much worse. This is usually audible as awful scraping noises. If you hear that, STOP immediately.

The alternative, as commented, is to send the drive to a data recovery lab. They open the seal on the drive, carefully remove the platters, clean out any shavings and clean the surfaces; the platters are then put into a special "drive" that is told what the tracking, sectoring, and blocking of the platters are, and then the information is read off the damaged platters to an image, which they then try to recover the data from. That has to be done in a "clean room", with the guys in the white bunny suits and such, and special equipment, which is why there's a bit of expense to data recovery services at that level.

If the partition tables or file allocation tables (FAT, or the Master File Table on NTFS) are lost or corrupt, they would then use a tool pretty much like EasyRecovery Pro 6.04 (made, not coincidentally, by OnTrack, a leader in drive technology and recovery services).

Many "data recovery services" have an option whereby they will "assess" the drive for free, giving you an idea of what data they will be able to recover, and then IF you decide to go ahead, well, the service is more expensive (they have to pay for all those free assessments somehow; mostly they're pretty good at it and hope the majority will opt to proceed and pay). Some services have a little bit of negotiation room on the price, like say if you skip the assessment and just go for the recovery. I've had good experiences with OnTrack and Data Recovery Labs (although I have not very often had to resort to them).

You should know what backups you have and how out of date they are. If you had to re-enter the data from that point to now, could you? Is there a paper, audit, or transaction trail? What would that entail and cost? You should already be on the task of verifying whether your backups are readable / recoverable.

So it generally comes down to a question of cost / benefit. They can give you a ballpark of how much it would cost to send the drive to a data recovery laboratory, and you can decide if getting that data is worth it.

Good luck.
ocanada_techguyCommented:
Three additional thoughts and an example case:
Sometimes the drive tries to spin up and then spins down because it fails to achieve the proper RPM. That can happen if the bearings are seizing or contaminated (or again some glitch with the electronics, so again swapping the logic board can be tried). Because you get the "clicking" I sort of ruled that out: going by your description, it gets up to speed and tries to read some stuff, then spins down.
SpinRite estimates how much time it will take to go through the partition, however the time estimate FAILS to take into account the many minutes or even hours it can spend once it encounters a bad spot and has to retry hundreds of times. I've seen it estimate 2 hrs and then take 2 days to complete one full pass when the number of delays due to dealing with bad sectors was high. This is an unavoidable shortcoming.
Also, vis-a-vis the "damage getting worse" thing. I actually had the 2-day scan finish, then I tried to Ghost to another drive (with Ghost set to ignore read errors) and Ghost still failed due to excessive errors, as the drive was still "ka-chunking" a bit more. I thought uh oh, might have to open the wallet wide. I had a three-week-old backup, so I pressed on. I then reran SpinRite, telling it to re-scan just the % start to % finish area that had a lot of errors the first time, and it found some more, presumably newly created, errors. I did that a couple of times, taking a few more days, and then finally at the end of a week's time (effort time was less: checking in on it every few hours to see how it was going and in case of terrible grinding, scraping noises) I was able to Ghost it to a new drive. Were some of the programs and files on the destination drive corrupt after all that? Yes. I did a reinstall of Windows over that Windows, reinstalled an app or two, and everything has been fine since (that system is still running, it's 10 years old) and 99% of my data was fine.
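
To put rough numbers on the time-estimate point above, here is a back-of-the-envelope sketch. It is not how SpinRite computes anything; the scan rate, bad-sector count, retry count, and time per retry are all assumptions picked only to show how a 2-hour estimate can stretch to about 2 days.

```python
def scan_hours(total_sectors: int, sectors_per_second: float,
               bad_sectors: int = 0, retries_per_bad: int = 0,
               seconds_per_retry: float = 0.0) -> float:
    """Hours for one pass: clean sequential read time plus time lost to retries."""
    clean_seconds = total_sectors / sectors_per_second
    retry_seconds = bad_sectors * retries_per_bad * seconds_per_retry
    return (clean_seconds + retry_seconds) / 3600


# Assumed figures for a 160 GB drive with 512-byte sectors.
TOTAL_SECTORS = 312_500_000   # 160e9 bytes / 512
SCAN_RATE = 43_403            # sectors/second, chosen so a clean pass takes about 2 h

estimate = scan_hours(TOTAL_SECTORS, SCAN_RATE)
actual = scan_hours(TOTAL_SECTORS, SCAN_RATE,
                    bad_sectors=1_100, retries_per_bad=300, seconds_per_retry=0.5)

print(f"estimate ignoring retries: {estimate:.1f} h")
print(f"with retries on bad spots: {actual:.1f} h")
```

With these assumed inputs, about a thousand bad sectors out of more than 300 million are enough to turn the 2-hour pass into roughly 48 hours, because each bad sector costs minutes rather than microseconds.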
Alan SilvermanOwnerAuthor Commented:
nobus,
how can I find out the firmware version on the bad drive?
Al
DavidPresidentCommented:
Like I wrote before, if the BIOS cannot "see" the disk, you can't run any recovery software, like SpinRite. You can't determine the firmware programmatically either. WD does, however, put a firmware rev on the label (most if not all models), but this is a pointless path because you won't be able to flash the firmware. And, of course, this assumes a firmware update would be of any use in this situation, which it won't.

So either get mentally prepared to spend $500 or so, or to junk the disk, or to cannibalize another HDD of the same make/model (and preferably firmware), and hope for the best. But really, forget about software if the BIOS is not seeing the disk.
nobusCommented:
it is printed on the disk label - if unsure, post a picture
ocanada_techguyCommented:
Seems there is confusion on whether you "see" the drive, since you write "usually" it doesn't even identify it, so I take it the opposite that yes your BIOS sees it, but no "getdataback" does not.

By the way, I'd disagree that "usually" it just keeps making clicking; plenty of times I've seen them try and then give up, "spin down". So yep, the logic board talks to the disk controller so it is "seen", but the logic board is going click click or ka-chunk ka-chunk, WHOA, excessive errors, I'm done, and it gives up and refuses any more read requests.

Next, I think you're using the wrong tool. I looked up "GetDataBack" and this seems to be a tool for undeleting files, reassembling files from their raw cluster block links, and recuperating accidentally deleted partitions, probably from the secondary FAT, MFT, and partition tables (there are two for redundancy). It's possible those kinds of losses occur if a bad block occurs and is not retrievable, but you'd be treating a symptom, not dealing with the real problem. That it offers to "image" the drive so you can then do the reassembly on the image is a nice touch.

As for BartPE, OK, you boot from CD to a rudimentary tiny OS, and....? You're not going to format, and I hope you don't chkdsk, which is OK for correcting file allocation errors but its /R option for dealing with bad blocks is the WORST: it gives up on bad blocks after about six tries, and you lose the data, usually corrupting whatever that block was part of. By the sounds of your problems, that would be A LOT.

For diagnosing the drive's firmware, logic board problems, and what's ailing the drive, those things are manufacturer specific, so you'll need the manufacturer's tool that corresponds to the disk, in this case Western Digital.
http://www.tacktech.com/display.cfm?ttid=287

You may find this http://www.beyondlogic.org/solutions/smart/smart.htm
an informative tool now and in the future, for instance you might get an idea which errors raw increased from one try to the next.
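
To illustrate the "which raw values increased from one try to the next" idea, here is a minimal sketch that compares two snapshots of raw SMART counters. The attribute names and numbers are invented for the example; in practice you would copy them by hand from whatever SMART tool you use.

```python
def raw_increases(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return how much each raw counter grew between two snapshots.

    Only attributes present in both snapshots that actually increased are reported.
    """
    return {name: after[name] - before[name]
            for name in after
            if name in before and after[name] > before[name]}


# Hypothetical readings taken before and after a recovery attempt.
snapshot_1 = {"Reallocated_Sector_Ct": 631, "Current_Pending_Sector": 412, "Power_Cycle_Count": 903}
snapshot_2 = {"Reallocated_Sector_Ct": 671, "Current_Pending_Sector": 424, "Power_Cycle_Count": 903}

for name, delta in raw_increases(snapshot_1, snapshot_2).items():
    print(f"{name} grew by {delta}")
```

Counters for reallocated or pending sectors that keep climbing between attempts suggest the damage is still spreading, which is a reason to stop and reconsider rather than keep retrying.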

Good luck
Alan SilvermanOwnerAuthor Commented:
Wonderful suggestions, especially since they’ve educated me to the actual processes.  I tried spinrite but it couldn’t see the drive.  This customer is not wealthy, so he won't go for the clean room.  

I am going to look for a cheap WD1600JS – 75NCB1 with B1 firmware.  Nobus, does the DCM matter also?  

Thanks,
Al
nobusCommented:
what is DCM ?
Alan SilvermanOwnerAuthor Commented:
Here's one part of the label:

MDL: WD1600JS - 75NCB1
DATE: 05 JAN 2006
DCM: HSBHCT2AA

Then a bit further down and to the right it says:  Firmware: B1

Before I saw the firmware version I thought it referred to that.
nobusCommented:
i must say i have no idea what it refers to.
DavidPresidentCommented:
DCM is the part # for the circuit board.  
DavidPresidentCommented:
But to re-re-state.  Odds are very low that replacing the logic board will solve the problem.  The motor assembly is also suspect, and it is sealed inside the media.

But I know you are in a bind, so rather than lecture you, let me help by explaining a few things, so you know what is in stone, and what gives you a little wiggle room.  

The DCM also includes the specific build number/firmware; it is more than just the hardware part number. If you can't match the DCM, then don't bother trying. Part of the DCM number includes the P/N and revision of the head assembly. This is vital. If the DCM is not an exact match, forget it. It will not work, and not only that, it will most likely fry your data.

Also, for most WD disks, not necessarily this particular drive, because I don't have it in the database, the last 3 digits of the "Drive Parameters:  LBA xxxxxxxnnn" need to match.  If not, even if you repair the problem and the O/S sees the disk, your data will get messed up.

75NCB1 means distributor is #75, NC = revision, B1=firmware

If you cannot source an exact match, then you can get away with everything else being a match except the firmware revision. You will have to remove the firmware chip and solder in the one from the old HDD. This, of course, assumes that you are running factory firmware and never upgraded it.
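
The matching rules above can be written down as a small checklist. This is only a sketch of the rules as stated in this thread, not a WD-documented procedure: the `DriveLabel` fields and function names are hypothetical, the model suffix is split as distributor / revision / firmware the way it is described above, and the LBA figure in the example is a placeholder rather than a value read off this drive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveLabel:
    model: str      # "MDL" line, e.g. "WD1600JS - 75NCB1"
    dcm: str        # "DCM" line
    firmware: str   # "Firmware" line
    lba: str        # digits from "Drive Parameters: LBA ..."


def split_model(model: str) -> tuple[str, str, str, str]:
    """Split a model string into (base, distributor, revision, firmware)."""
    compact = "".join(model.split()).upper().replace("\u2013", "-")
    base, suffix = compact.split("-", 1)
    return base, suffix[:2], suffix[2:4], suffix[4:]


def donor_verdict(patient: DriveLabel, donor: DriveLabel) -> str:
    """Apply the thread's rules: DCM, model (minus firmware) and LBA tail must match."""
    hardware_ok = (
        split_model(patient.model)[:3] == split_model(donor.model)[:3]
        and patient.dcm.strip().upper() == donor.dcm.strip().upper()
        and patient.lba[-3:] == donor.lba[-3:]
    )
    if not hardware_ok:
        return "reject"
    if patient.firmware.strip().upper() == donor.firmware.strip().upper():
        return "exact match"
    return "firmware differs: chip transplant needed"


# Label values from this thread, except the LBA, which is a placeholder.
patient = DriveLabel("WD1600JS - 75NCB1", "HSBHCT2AA", "B1", "312581808")
same = DriveLabel("WD1600JS-75NCB1", "HSBHCT2AA", "B1", "312581808")
other_fw = DriveLabel("WD1600JS-75NCB2", "HSBHCT2AA", "B2", "312581808")
other_dcm = DriveLabel("WD1600JS-75NCB1", "HSBHCT2AB", "B1", "312581808")

for donor in (same, other_fw, other_dcm):
    print(donor.model, donor.dcm, "->", donor_verdict(patient, donor))
```

Comparing the fields this way before buying a donor drive is cheap; a listing that does not show the DCM and firmware lines of the label cannot be checked at all.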

Alan SilvermanOwnerAuthor Commented:
dlethe,
If I can find a drive just like it for a reasonable cost I’ll give it a try, just for the experience.  When I set up computers I set up automatic backups to a secondary drive, just to avoid this sort of thing.  
Thanks,
Al    
DavidPresidentCommented:
Great, nothing wrong with the learning experience, you may get lucky, and it isn't as if you will have to spend a lot of money on a replacement board, the trick is finding  the right one.
Alan SilvermanOwnerAuthor Commented:
Thanks again.
nobusCommented:
i replaced boards on different drives, so i can tell it works - if you have the correct board.
and thanks dlethe, for the info on DCM (i still wonder what it stands for)
DavidPresidentCommented:
Data Control Module .. made it up, I don't know.  It has always been called DCM, and for all the meetings I've had with WD FAEs over the years, the meaning of the acronym never came up.  I guess i wasn't curious enough to care.

nobusCommented:
good guess, tx for bringing it up !
Storage Hardware


Question has a verified solution.
