Preventing hard disk failure

Hello,
I'm looking for best practices for preventing hard disk failures on Windows 2003 servers. I'm concerned about several Oracle databases installed on different servers, and I would like to know how to tell whether the hard disks in those servers are about to fail, and how to prevent it.

I'm a DBA, not a system or network administrator, so I don't have detailed information about the servers' hard disks, but I can access the servers as a Windows administrator.

I suppose the answer depends on the server hardware. In my case, the servers are HP. One server, for example, has an HP StorageWorks MSA2012fc, which is a disk array. That means that if one disk fails, the other disks keep working and no data is lost.

I suppose that detecting disk failure depends on HP firmware and HP software utilities. However, I'm interested in other tools or advice on how to prevent it.

- Is there any software tool that I could run periodically to detect or prevent disk failures?
- Should I check the Windows event logs every morning? What should I look for?
- Should I depend only on HP tools?
Asked by: miyahira
 
hathehariken commented:
90% of HDDs in operation fail for one single reason: elevated temperature.
As a rule of thumb, I never let my HDDs cross the 26 °C threshold. I have never had an operational HDD fail on me. Never.

To keep temperatures in check, there is a variety of software on the market, but I prefer HWMonitor.
http://www.cpuid.com/softwares/hwmonitor-pro.html (you can pay for support, but it is not mandatory)
Air paths in the cabinets should be clean, and the air conditioning should be working.

SMART monitoring is also a useful practice, whether you have 20 drives or 2000.
There are network SMART monitors that are very robust and have a low false-positive rate:
http://www.urltoy.com/asc.htm
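
As a rough illustration of what a periodic SMART check could look like, here is a minimal Python sketch. It assumes you have saved the attribute table printed by `smartctl -A` (from smartmontools, which is not mentioned above and is only one possible tool) into a string. The sample values are made up, the function names are hypothetical, and the 26 °C limit is simply the rule of thumb from this comment. Disks behind a RAID controller or an external array may not expose SMART data to the operating system at all, in which case the vendor tools are the only source.

```python
import re

# Threshold taken from the rule of thumb above; adjust for your environment.
MAX_TEMP_C = 26

def parse_smart_attributes(report):
    """Map attribute name -> integer raw value from `smartctl -A` style text."""
    attrs = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the 10th column is the raw value.
        if len(fields) >= 10 and fields[0].isdigit():
            match = re.match(r"\d+", fields[9])
            if match:
                attrs[fields[1]] = int(match.group())
    return attrs

def find_warnings(attrs, max_temp_c=MAX_TEMP_C):
    """Return human-readable warnings for a few commonly watched attributes."""
    warnings = []
    temp = attrs.get("Temperature_Celsius")
    if temp is not None and temp > max_temp_c:
        warnings.append(f"temperature {temp} C exceeds {max_temp_c} C")
    for name in ("Reallocated_Sector_Ct", "Current_Pending_Sector"):
        if attrs.get(name, 0) > 0:
            warnings.append(f"{name} is {attrs[name]}")
    return warnings

# Made-up sample in the usual smartctl attribute-table layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
194 Temperature_Celsius     0x0022   031   045   000    Old_age   Always       -       31 (Min/Max 20/45)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    for warning in find_warnings(parse_smart_attributes(SAMPLE)):
        print("WARNING:", warning)
```

A script like this could be run from a scheduled task and its output mailed to whoever looks after the servers.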
 
pjam commented:
That would be the HP Array Configuration Utility (ACU). Run Diagnostics and it should tell you whether a disk is likely to fail in the future. Also, each disk should have an LED, and it should be green. I have a small MSA20 and have had practically new SAS drives die.
 
miyahira (author) commented:
I'm a little curious about the failure of your SAS drives.

Would you have been able to prevent it, or did it just happen?

After you bought them, were they running for one month, or one year, before they failed?
 
pjam commented:
They are covered for 3 years, I think, and I had two die at about 1 year. Actually, they didn't die, but ACU reported "Warning Predictive Failure" for them.
There was nothing to prevent, sorry to say. They were flashing yellow, though, so it's good that I visit the server room several times a week.
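
If visiting the server room that often is not practical, a saved diagnostic report can be scanned for the warning text instead. The sketch below is only an illustration: the report layout is hypothetical (it is not actual ACU output), and the function name and search phrases are assumptions. The only detail taken from this thread is the "Predictive Failure" wording.

```python
import re

# Phrases worth flagging; "Predictive Failure" is the wording quoted above,
# "failed" is an additional assumption.
PATTERN = re.compile(r"predictive failure|failed", re.IGNORECASE)

def flagged_lines(report):
    """Return the lines of a text report that mention a failure condition."""
    return [line.strip() for line in report.splitlines() if PATTERN.search(line)]

# Hypothetical report text, for illustration only.
SAMPLE_REPORT = """\
drive bay 1: OK
drive bay 2: Warning Predictive Failure
drive bay 3: OK
"""

if __name__ == "__main__":
    for line in flagged_lines(SAMPLE_REPORT):
        print("CHECK:", line)
```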