Bad Block error in Event Viewer (Source "disk" with Event "7"), will mirroring the drive fix it?

We have a server that is reporting numerous bad block errors in Event Viewer, with an event ID of 7 and a source of Disk. The machine has restarted itself several times over the weekend, and our offline backup will not complete (it freezes at 99%, then the server restarts). The error is reported on Disk 0 (the C: drive). We have three other disks (1, 2, and 3) in this server configured as RAID-5, but we have never used them (they form the E: drive and there is no data on it). So we're thinking of deleting the RAID-5 volume, taking one of those drives, and mirroring the C: drive to the newly available drive. Will this work? We ran chkdsk /r remotely, but it didn't fix the error or the bad sector(s), so would we be copying over the problem?
Possibly related: the log of our backup system reports an error on an msdblog.ldf file, block 24, every time this backup fails, so we didn't know if block 24 was the one and only cause of this error...
Thanks for the help!
mdcr1 Asked:
Gary Coltharp (Sr. Systems Engineer) Commented:
If you have a utility like Ghost that can do a raw copy of the volume, you can replicate the install to one of the other disks. If there are bad sectors, you tell it to ignore them.

Once the copy is complete, a chkdsk will repair any issues.

Since this is a server, I would strongly suggest making it fault tolerant. Once you get your boot volume stable, take another of those extra drives, convert your disks to dynamic, and mirror them. It's not the best solution, but at least there will be some degree of fault tolerance.

HTH
Gary
alicain Commented:
The bad blocks being reported are at the physical level on the disk, whereas chkdsk works primarily at the logical level for NTFS.

The attempt to mirror the disk may or may not be successful, depending upon the state of the file system and any logical corruption there might be. While chkdsk isn't reporting any, which is good, the LDF is having issues, which might suggest some corruption.

It is not possible to "copy" corruption that exists within an NTFS filesystem as part of a backup and restore procedure: the corrupt files (or the index entries for files) would just fail to be read from disk. However, if there is corruption within NTFS, mirroring the disk may mirror that corruption too.

The .ldf is probably large and therefore more likely to sit on a part of the disk that is bad. Some consistency checks of that database would be prudent.

I would say that the safest approach would be to back up the disk, remove the single disk, create a new C: as a hardware mirror of the same size using the spare disks, and then restore from tape.
Mount the other disk and you can try to get data off it if needed.

Regards,
Alastair.
mdcr1 (Author) Commented:
Gary,
  If we were to use Ghost (which we don't have - yet), would it be v11.x, or the Ghost Solution Suite 2.5 (checking their website, it looks like the Suite is what is being offered currently)?  If it's offered for download, that is what we'd look to do immediately.  Thanks
Gary Coltharp (Sr. Systems Engineer) Commented:
Any recent version should work. The suite will have a lot of tools that you don't necessarily need right now. You just need the utility.
Gary Coltharp (Sr. Systems Engineer) Commented:
BTW you need to boot from some removable media, so if the suite includes a boot disk, that would simplify things. Essentially, you need to boot to a CD or bootable flash drive, then run the Ghost utility from flash or <cough> floppy... Once the clone is complete, remove the drive with the bad sectors and place the cloned drive in its position.
alicain Commented:
Using Ghost or similar is a good suggestion but note that if there is corruption at the file system level, you will be duplicating that.  There are suggestions that there might be some of that in the backup freezes.  You may see any file system corruption snowball in the future.  A backup and restore to a clean filesystem would address that.

Regards,
Alastair.
Gary Coltharp (Sr. Systems Engineer) Commented:
It is true that the corruption is duplicated. However, you are replicating to a viable disk that can then be repaired by chkdsk.
alicain Commented:
My concern is that chkdsk is apparently not repairing the issues currently being encountered. It would be useful to see the log file it generated, to know whether anything else is not being fixed by chkdsk and to have some reassurance that ghosting will not duplicate any irreparable corruption.

I'm a cautious type, but depending upon the value of the data and your appetite for risk, it may be worth trying...

Regards,
Alastair.
mdcr1 (Author) Commented:
Alastair - completely agree; running chkdsk without any switches right now... and it's come back with:
"Windows has checked the file system and found no problems". I don't know if that is a good or bad thing, given that I'm having problems with this server...

When we do restart the server to do a chkdsk C: /r, does anyone have any idea how long a 225 GB hard drive would take? Getting ready to notify users... Thanks!
Gary Coltharp (Sr. Systems Engineer) Commented:
If your server is booted, anything sitting on a bad block is recoverable from media. Cloning the drive to viable media and then running chkdsk puts the filesystem state to clean.

Bad blocks aren't recovered with chkdsk; they are just marked as bad. Unless you are in the middle of a cascade failure, subsequent chkdsks would report no trouble. Yet if you have one bad block, you will likely have more in the near future.

As to the LDF: log files can easily be truncated or shrunk to move activity off of the failed block. You will lose database transaction history, but this is not usually a big deal.
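
On the point above that one bad block usually means more are on the way: one rough way to tell an isolated bad block from a cascade is to tally the disk Event ID 7 entries per day. Below is a minimal Python sketch, not a tested tool. It assumes the events are pulled with the built-in wevtutil command as XML; the function names, the "Events" root element name, and the 500-event limit are made up for illustration.

```python
import subprocess
import xml.etree.ElementTree as ET
from collections import Counter

# XML namespace used by Windows event records.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Filter for the events discussed in this thread: source "disk", Event ID 7.
DISK_EVENT_QUERY = "*[System[Provider[@Name='disk'] and (EventID=7)]]"


def build_wevtutil_command(max_events=500):
    """Return the wevtutil argument list that dumps matching events as XML."""
    return [
        "wevtutil", "qe", "System",
        "/q:" + DISK_EVENT_QUERY,
        "/f:xml",
        "/e:Events",              # wrap all events in one root element (name is arbitrary)
        "/c:" + str(max_events),  # arbitrary cap on how many events to read
    ]


def count_events_per_day(xml_text):
    """Count events per calendar day from the XML produced by the command above."""
    root = ET.fromstring(xml_text)
    per_day = Counter()
    for created in root.findall("e:Event/e:System/e:TimeCreated", NS):
        # SystemTime is an ISO-style timestamp; the first 10 characters are the date.
        per_day[created.attrib["SystemTime"][:10]] += 1
    return dict(sorted(per_day.items()))


def bad_block_trend():
    """Query the local System log (Windows only) and return the per-day counts."""
    result = subprocess.run(
        build_wevtutil_command(), capture_output=True, text=True, check=True
    )
    return count_events_per_day(result.stdout)
```

Calling bad_block_trend() on the affected server returns a date-to-count mapping. Counts that keep climbing from day to day would point toward the cascade case rather than a single bad block that chkdsk has already marked.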
mdcr1 (Author) Commented:
Apparently it was a cascading failure... After running chkdsk and then restarting, the server never got past crcdisk.sys on startup, and restarting normally resulted in a BSOD upon startup. The server was rebuilt and data restored from backup...
Seth Simmons (Sr. Systems Administrator) Commented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.