Windows replaced bad clusters in file xx on an HP SCSI RAID 5 array: how to identify the defective drive?

Checking file system on C:
The type of the file system is NTFS.

A disk check has been scheduled.
Windows will now check the disk.
Cleaning up minor inconsistencies on the drive.
Cleaning up 57 unused index entries from index $SII of file 0x9.
Cleaning up 57 unused index entries from index $SDH of file 0x9.
Cleaning up 57 unused security descriptors.
CHKDSK is verifying Usn Journal...
Usn Journal verification completed.
CHKDSK is verifying file data (stage 4 of 5)...
Windows replaced bad clusters in file 87
of name \mssql\MSSQL$~1\Data\DISTRI~1.MDF.
Windows replaced bad clusters in file 7220
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\201004~1\TB5CD1~1.BCP.
Windows replaced bad clusters in file 26077
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\201004~1\TBLPDF~1.BCP.
Windows replaced bad clusters in file 32542
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\201003~1\TB5CD1~1.BCP.
Windows replaced bad clusters in file 34123
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\200802~1\TB50D9~1.BCP.
Windows replaced bad clusters in file 59114
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\200904~1\TB4CD1~1.BCP.
Windows replaced bad clusters in file 66747
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\200904~1\TBLPDF~1.BCP.
Windows replaced bad clusters in file 306249
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\200608~1\TB50D9~1.BCP.
Windows replaced bad clusters in file 313926
of name \mssql\MSSQL$~1\REPLDATA\unc\INSIGH~1\200608~2\TB50D9~1.BCP.
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)...
Free space verification is complete.
The size specified for the log file is too small.

213371743 KB total disk space.
137811912 KB in 82347 files.
42892 KB in 6088 indexes.
0 KB in bad sectors.
962587 KB in use by the system.
23040 KB occupied by the log file.
74554352 KB available on disk.

4096 bytes in each allocation unit.
53342935 total allocation units on disk.
18638588 allocation units available on disk.



Windows has finished checking your disk.
Please wait while your computer restarts.


For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

~~~~

This is on my domain controller, which runs on an HP RAID 5 array consisting of four 72 GB SCSI disks. How can you get bad clusters on a RAID volume? How can I tell which physical drive is failing?

Did I actually lose any data or get any data corruption?

I have backups, of course. The problem is if it's a hardware failure: I am going to migrate from Windows 2003 to Windows 2008 R2, but it will still take some time to get that started. Buying a single replacement SCSI drive might be viable, but if I can't identify the failing drive and have to get four SCSI drives and rebuild the array one disk at a time, that would be problematic, not to mention prone to disaster.

Just to add, this is an HP ProLiant ML350 G3, around 4+ years old.
I would also like to know what kind of remedial/repair action I can take:
a) Get 4 x 72 GB SCSI drives and rebuild the array disk by disk (which I suspect is prone to failure).
b) Mirror/ghost the drive (any recommendation? I'm thinking Macrium Reflect), then plug in the new 4 x 72 GB SCSI drives or 2 x 300 GB in RAID 0 and do a bare-metal restore?
c) Do nothing, as the error appears to be logical rather than physical?

I am also planning an infrastructure refresh from Win2003 + SQL 2000 to Win2008 R2 + SQL 2008, so I think keeping the repair costs down is probably best.

chrisloup asked:

David (President) commented:
First, this is filesystem corruption, not RAID corruption. Treat this problem as if you had a single HDD, and ignore the RAID entirely.  

The way to test for RAID "corruption" is to check the XOR parity. The controller can run a consistency check, which goes through each block on each drive and makes sure that the XOR is correct. In the process it repairs any unreadable block on any physical disk by recalculating what is supposed to be there from the redundant data and rewriting it.
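To illustrate the XOR idea only, here is a minimal Python sketch. It is not what the HP controller firmware actually runs; the function names, the four-byte block size, and the sample bytes are all made up for the example. It models one stripe of a four-disk RAID 5 as three data blocks plus one parity block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity):
    """Consistency check: the stored parity must equal the XOR of the data."""
    return xor_blocks(data_blocks) == parity


def rebuild_missing(surviving_blocks):
    """Recompute the single missing block (data or parity) from all the others."""
    return xor_blocks(surviving_blocks)


# One stripe on a hypothetical 4-disk RAID 5: three data blocks plus parity.
d0 = b"\x01\x02\x03\x04"
d1 = b"\x10\x20\x30\x40"
d2 = b"\xaa\xbb\xcc\xdd"
parity = xor_blocks([d0, d1, d2])

# A healthy stripe passes the check.
print(stripe_is_consistent([d0, d1, d2], parity))

# If d1 becomes unreadable, it can be recomputed from the remaining three blocks.
print(rebuild_missing([d0, d2, parity]) == d1)
```

Note that the recovery only works when exactly one block in the stripe is missing. With two blocks gone there is nothing left to XOR against, which is why losing a second disk mid-rebuild is fatal.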

a) One does not rebuild the array disk by disk. Bad, awful idea. Lose a disk during the rebuild and you have 100% data loss. Get an unreadable block during the process and you have partial loss. The correct way to replace the disks is to back up, replace all disks, initialize the new array, then restore.

b) Do not use a 2 x 300 GB RAID 0 in the interim, unless you have the ability to predict the future and know that neither 300 GB disk will fail or pick up a bad block. Use RAID 1.

c) It is logical, but that does not mean that there is not a physical cause. Do you have a UPS with battery backup? Do you perform proper shutdowns? Try disabling the RAID write cache if it is enabled. This will cause a performance hit, but it ensures data is written to the disk drives on every I/O. Read the event logs. Check for memory problems. Use ECC memory if you are not already.

But bottom line, ignore the RAID controller for purposes of correcting the situation.

chrisloup (Author) commented:
Yes, I have confirmed it is an NTFS error due to improper shutdowns (the whole computer hung/stalled because of some issues with a USB drive).
