Computer intermittently losing hard drives

Posted on 2006-11-08
Last Modified: 2012-08-14
Hello, Experts.

My computer has been intermittently losing 1 or 2 hard drives ever since I built it.  It is a home-built computer running Windows XP. The relevant hardware is an Antec 450W power supply, an Intel D945Pvs system board (82801 GR I/O controller hub with ICH7-R), and 3 Maxtor 250GB SATA hard drives of various models (including 6L250S0, 6V250F0, 7V250F0, and 6L25020), with most of the space in a RAID-5 configuration.  About every 2-6 weeks, 1 or 2 drives in the RAID fail.

When just 1 drive goes down, the computer reboots because the swap file is no longer available to read or write. (I have an 8GB RAID-0 partition divided equally between the 3 drives for my swap file, and the remaining space on the 3 drives is used in a RAID-5 configuration.)  Sometimes it runs just long enough to tell me that a drive from the RAID set is missing and to give me some errors about the swap file.  After the reboot, during the POST, the hardware RAID shows 1 drive missing, and then the computer comes back up in Windows with the swap file disabled and runs in a degraded state.  Restarting the computer does not solve the problem, but shutting the computer all the way down and powering it back up allows the drive to be seen, and it gets rebuilt and works again for a few weeks.
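The layout described above can be worked through as simple arithmetic. This is only a rough sketch: it assumes the nominal 250GB per drive (the formatted capacity will differ), and the function name `raid_layout` is made up for illustration. It also shows why a single lost drive takes out the swap file but only degrades the RAID-5 volume.

```python
# Rough capacity arithmetic for a mixed RAID-0 / RAID-5 layout.
# Assumption: nominal drive size in GB; real formatted sizes differ.

def raid_layout(num_drives, drive_gb, raid0_total_gb):
    """Return (RAID-0 slice per drive, RAID-5 slice per drive, RAID-5 usable GB)."""
    raid0_per_drive = raid0_total_gb / num_drives   # RAID-0 stripes evenly across drives
    raid5_per_drive = drive_gb - raid0_per_drive    # remaining space on each drive
    raid5_usable = (num_drives - 1) * raid5_per_drive  # one drive's worth goes to parity
    return raid0_per_drive, raid5_per_drive, raid5_usable

# Number of simultaneous drive losses each level survives.
FAILURES_TOLERATED = {"RAID-0": 0, "RAID-5": 1}

r0_slice, r5_slice, r5_usable = raid_layout(num_drives=3, drive_gb=250, raid0_total_gb=8)
print(f"RAID-0 slice per drive: {r0_slice:.2f} GB")
print(f"RAID-5 slice per drive: {r5_slice:.2f} GB")
print(f"RAID-5 usable space:    {r5_usable:.2f} GB")

for level, tolerated in FAILURES_TOLERATED.items():
    for lost in (1, 2):
        state = "survives" if lost <= tolerated else "is lost"
        print(f"{level} with {lost} drive(s) missing: volume {state}")
```

With 1 drive missing, the RAID-0 swap partition is gone while the RAID-5 volume keeps running degraded; with 2 missing, both are unavailable until the drives are seen again.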

When 2 drives go down, they appear to go down at the exact same time.  The computer reboots and during POST the RAID shows 2 drives missing.  Pressing the RESET button or warm booting does not solve the problem.  A power cycle does allow both drives to be seen, and I have to go into the RAID configuration and tell it to recover the volumes.  Then it boots to Windows and rebuilds the RAID set.  After this happened about the 3rd time, I started taking notes.

Sometimes it happens when I am using the computer and sometimes it happens when I am not at my computer.
The problem does not follow any particular drive or drives.
The problem does not follow any particular SATA port or ports.

The first thing I did was upgrade the BIOS to the latest version. Did not fix the problem.
Then I replaced the SATA cables. Did not fix the problem.
Then I replaced the drive that was failing the most with a brand new drive (still Maxtor, but different model). Did not fix the problem.
Then I replaced the system board with a brand new Intel D945Pvs, and replaced the SATA cables again at the same time. Did not fix the problem.
Then I replaced all 3 drives at once. Did not fix the problem.
Then it happened to the same 2 drives twice in a row, and those drives happened to be on the same SATA power cable coming from the power supply.  So I swapped power connectors on the drives, but it happened again on one of the same drives on the new power connector.
The hottest spot on the exterior of the hottest drive in the cage is 33 degrees Celsius (I can't read the SMART info because of the hardware RAID), so I don't think it is a thermal issue.
I also replaced a couple more hard drives in between these steps with various models.
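One way to check whether failures follow a particular drive, SATA port, or power connector is to tally the notes from each incident. The sketch below is only an illustration: the entries in `events` are hypothetical placeholders, not the actual notes, and the labels (`drive_A`, `port0`, `pwr1`) are invented.

```python
from collections import Counter

# Hypothetical incident notes. Each incident lists every drive that dropped
# out, as (drive label, SATA port, power connector). Replace with real notes.
events = [
    [("drive_A", "port0", "pwr1"), ("drive_B", "port1", "pwr1")],
    [("drive_C", "port2", "pwr2")],
    [("drive_A", "port1", "pwr2")],
]

def tally(incidents):
    """Count how often each drive, port, and power connector was involved."""
    counts = {"drive": Counter(), "port": Counter(), "power": Counter()}
    for incident in incidents:
        for drive, port, power in incident:
            counts["drive"][drive] += 1
            counts["port"][port] += 1
            counts["power"][power] += 1
    return counts

counts = tally(events)
for component, counter in counts.items():
    print(component, counter.most_common())
```

If one label dominates a column, that component is the first suspect; an even spread across drives, ports, and connectors points toward something they all share, such as the controller or the power supply.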

I am about out of ideas.  Sorry for the long post, but I wanted to include all information I thought relevant.  Please let me know if you have any more ideas or things to try.
Question by:GuruGary
LVL 34

Expert Comment

ID: 17903887
First, I'd just like to say that in the future you should stay away from Maxtor. They used to make good drives, but in the last few years their quality has dropped badly and they are failing left and right. I think the main issue is how the RAID is set up. In my experience it's a bad idea to create multiple arrays on the same set of drives, and worse if they are different RAID types.  Have you tried creating only 1 array (1 big RAID 5)?
LVL 88

Accepted Solution

rindi earned 350 total points
ID: 17904174
I'm also one of those who is never again going to buy a new Maxtor if I can help it. All the Maxtors I've seen have been very reliable in one thing: they never last more than 3 months! Having said that, I have no idea whether they have improved their quality in the last 2 years, and I recently heard a rumour that Maxtor was taken over by Seagate, and Seagates are very reliable!

Also, don't use different raid levels on the same HD, rather get extra disks if you need other raid levels. There isn't much to gain (if there is anything at all) by using raid 0 and 5 on the same three disks. Raid 0 is usually good for speed, and raid 5 for redundancy, but because both are using the same hardware at the same time you are very unlikely to get a higher speed on the raid 0 array when there is also an active raid 5 array using the same hardware. Get extra disks for the raid 0 array.

The actual problem could be caused by bad cables. I've had a similar problem with a RAID 5 array on the same controller you are using. It was in a Shuttle XPC, and sometimes a disk would go offline, and then the array would have to be rebuilt. Once, 2 disks went offline and I had to reinstall and restore from backups. I got in touch with Shuttle and they sent me new SATA data cables, and since then I haven't had any more problems. Maybe you need to get high-quality SATA cables. If that doesn't help, I'd suspect a bad power supply.

LVL 10

Author Comment

ID: 17904436
Thanks for the tips.  I agree that Maxtor has had some serious reliability issues over the past couple of years.  And yes, Seagate did acquire Maxtor a few years ago.  In our shop we keep stacks of the failed drives we have replaced.  About 2-3 years ago, Maxtor was the lowest failed stack.  Over the past 2 years or so, Maxtor has grown to be the largest stack.  250 GB Maxtors just happened to be the only large drives we had a good stock of when I built the computer.

I haven't had problems with mixing RAID-0 and RAID-5 before, but I'll try making it all just one RAID level.  I doubt it is the cables since I have replaced them a few times already ... but if it is still failing after the RAID consolidation and new power supply I guess a 4th set of SATA cables can't hurt.

If there are any more ideas, please let me know!
LVL 88

Expert Comment

ID: 17904460
Mixing RAID levels probably isn't the main cause of the problem, but from my point of view you don't get any advantage from mixing. The main reason for using RAID 0 is the speed gained because all the disks are accessed at the same time. But if you are using RAID 5 on the same disks at the same time, you lose that speed advantage again, because the RAID 5 array will be accessing the disks at the same time. This will reduce the RAID 0 speed to less than the speed of your RAID 5 array.
LVL 34

Assisted Solution

jamietoner earned 150 total points
ID: 17905608
If consolidating the drives doesn't fix it, I would suggest replacing the RAID controller or, if it's an integrated controller, just adding a PCI RAID controller.
LVL 15

Expert Comment

ID: 17905797
So you're really only left with a power issue?
Have you tried a different power supply? And do you get any electrical surges or power failures?

LVL 10

Author Comment

ID: 17942022
I have not yet tried a different power supply.  It crossed my mind when I swapped power connectors on the drives, but since the issue didn't follow the power connector and the management software didn't detect any fluctuations in voltages, I didn't replace the power supply at that time.  The computer is plugged into an APC Smart-UPS which has power conditioning.  The power at this location is usually very reliable and clean, and I think any surges, brownouts, etc. would be corrected by the UPS.

For the next step, I will try replacing the power supply.  If that doesn't work then I will replace the system board which has the RAID support built-in.  I'll report back on the progress.
LVL 10

Author Comment

ID: 18081550
The power supply has been replaced, and so far there have been no problems, but it often takes longer than this to fail ... so if the problem does not occur in the next few weeks, I'll assume the problem is fixed.  For now I'll just wait to see if it fails again.
LVL 10

Author Comment

ID: 18083051
I think I jinxed myself.  About 2 hours after I posted that everything had been running fine since the new power supply was installed, it failed.  I took some notes, power cycled, let the RAID rebuild itself and now it is back online again.  Since the RAID-0 has been taken out and everything is running as RAID-5, and the power supply has been replaced, the only other suggestions I think I have left are replacing the SATA cables (which I have done once already) and replacing the RAID controller (which is built into the system board and has been replaced already).  

I'll replace all the SATA cables again with brand new quality cables, and see what happens.  If there are any other suggestions, please let me know.
LVL 34

Expert Comment

ID: 18084815
Instead of replacing the motherboard for the integrated RAID, you may want to try adding a PCI SATA RAID controller. They are usually a lot more stable than integrated driver-based RAID controllers (most integrated SATA RAID is driver-based). A controller card like this one should work and is about the same price as a D945pvs. http://www.newegg.com/Product/Product.asp?Item=N82E16816115029

LVL 10

Author Comment

ID: 18216859
Well, I ended up replacing all the drives (again) with WD RE2 drives, and also replaced all the cables (again) with brand new cables.  I haven't seen the error since, but it has only been 3 weeks so it may still happen.  Either way, I guess I have the information needed to fix the issue.  Thanks for the help and ideas, and hopefully the problem is fixed for good!

Question has a verified solution.