System crashes when running chkdsk on RAID 0

Posted on 2006-06-23
Last Modified: 2008-01-09
OK, I'm at my wit's end with this one.

I have a customer who has brought in a computer that we built for him (spiffy gaming machine).  He is running an ASUS A8N-SLI Deluxe (nForce4 SLI chipset) with a RAID 0 (striped) array on twin WD 250GB SATA hard disks.

The problem is this:

Due to an unrelated issue (crappy video drivers that have since been replaced), Windows was shut down improperly.  Naturally, chkdsk came up when the computer booted and attempted to scan as it typically does.  During the first phase (verifying files) the computer locks at 7%, then restarts a few seconds later.  It is ALWAYS 7%.  I can clear the dirty bit with no problem, and we have done so each time it was brought in for this same problem, but as soon as he needs to run a chkdsk or Windows craps out (as it typically does), we'll be right back where we are now (as we have been four times now).

So far we have:

Destroyed and rebuilt the RAID array, with a format and reload
Low-level formatted the drives using WD Data Lifeguard
COMPLETELY tested the drives using WD Data Lifeguard and Microscope
Replaced the drives
Replaced the SATA cables
Replaced the mainboard
Replaced the memory
Replaced the video card
Removed EVERYTHING non-vital to operation, with new memory and new SATA cables
Set the speed-cap jumpers on the hard disks to limit the interface to 150 MB/s
Enabled spread-spectrum clocking on both drives
Spent 5 hours on the phone with ASUS, who sent us two replacement motherboards, both of which have exhibited the same problem
Attempted to call Microsoft, only to have them demand money for support
Called Western Digital, who, so far, have not said anything useful
Had several heart-to-hearts with Google to find others who have had a similar problem
Ran out of ideas

This customer is driving us absolutely insane.  We have had his computer for about three months now, and he calls us multiple times a day (I probably would, too).  We can't just disable the RAID controller, because he is adamant that he wants to keep his RAID setup.  We can't put any more money into this computer, because once lost labor time is factored in, we have already lost hundreds on this deal.  We stand behind our machines, but this one has us dumbfounded.  ASUS claimed they tested the RAID capability and its ability to run a chkdsk; however, it still doesn't work.

Any help with this would be immensely appreciated.


Question by:dapsychous
LVL 44

Expert Comment

ID: 16974287
The answer to this is: do NOT run RAID 0 on these drives.  RAID 0 is not fault tolerant, a simple CHKDSK can corrupt the entire array, and it is not worth losing your data for an outdated RAID concept that was never reliable.  Use RAID 1 in future, where you have some redundancy, and do not run CHKDSK when you have any problems.  CHKDSK can corrupt the system FAT faster than you can say it, and it is not a reliable tool for repairing disk problems.

I recommend that you use RAID 1 with 2 disks and that you do not run CHKDSK on the drives; let them run as is.
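
To make the redundancy point concrete, here is a minimal sketch of how striping and mirroring differ on a two-disk array. It is only an illustration: the function names are made up for this example, and it assumes a stripe unit of one block, which is not necessarily what the nForce4 controller uses.

```python
def raid0_locate(logical_block, num_disks=2):
    """Map a logical block to (disk index, block on that disk).

    Assumes simple round-robin striping with a stripe unit of one block.
    """
    return logical_block % num_disks, logical_block // num_disks


def array_survives(raid_level, failed_disks, num_disks=2):
    """Return True if the array's data is still readable after disk failures."""
    if raid_level == 0:
        # Every disk holds a unique slice of the data, so any loss is fatal.
        return failed_disks == 0
    if raid_level == 1:
        # Every disk holds a full copy, so one surviving disk is enough.
        return failed_disks < num_disks
    raise ValueError("only RAID 0 and RAID 1 are modelled here")


if __name__ == "__main__":
    # Consecutive logical blocks alternate between the two disks.
    for block in range(4):
        print(block, raid0_locate(block))
    print("RAID 0 with one failed disk:", array_survives(0, 1))
    print("RAID 1 with one failed disk:", array_survives(1, 1))
```

Because consecutive blocks alternate between disks, nearly every file on a RAID 0 volume depends on both drives, which is where the speed comes from and also why there is no redundancy.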

Author Comment

ID: 16975287
Well, that's all well and good, but my customer REQUIRES that this be a RAID 0.  We have tried to take him off of it, but he always pitches a fit.  Additionally, I'm not running chkdsk; it's autorunning at boot because of a flagged dirty bit.  Simply clearing the dirty bit won't help, because it just happens again the next time an autocheck is scheduled due to a Windows glitch or something.
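
For anyone following along, the dirty bit and the boot-time check can be inspected from an elevated command prompt with the stock Windows tools `fsutil dirty query` and `chkntfs /x`. The sketch below just wraps those two commands in Python; the helper names and the `C:` volume are assumptions for illustration, not details from this machine.

```python
import subprocess
import sys


def dirty_query_cmd(volume):
    """Command that reports whether the volume's dirty bit is set."""
    return ["fsutil", "dirty", "query", volume]


def exclude_from_autochk_cmd(volume):
    """Command that excludes the volume from the boot-time check."""
    return ["chkntfs", "/x", volume]


def run(cmd):
    """Run a command and return its text output (Windows only, needs admin rights)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__" and sys.platform == "win32":
    volume = "C:"  # assumed: the striped array holds the system volume
    print(run(dirty_query_cmd(volume)))
    print(run(exclude_from_autochk_cmd(volume)))
```

Note that `chkntfs /x` only stops the automatic check at boot. It does not clear the dirty bit, and it does nothing about whatever makes chkdsk lock up at 7%.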

Author Comment

ID: 16975292
Plus, he doesn't have important data on it anyway; it's his gaming machine.  He keeps his data on his other computer, so this isn't a data issue, it's a speed and whether-or-not-it-works issue.

LVL 30

Expert Comment

ID: 16976608
Have you tried a different (non-ASUS) machine or different (non-WD) drives?
Sounds like a bug in the firmware on the ASUS board.

If a dirty bit is being set, it sounds as if the file system is not being shut down properly (which could be a driver and/or firmware issue).
LVL 30

Accepted Solution

pgm554 earned 500 total points
ID: 16976622
Some disks can be flaky depending upon controllers and chipsets.
I had some old WD 4 GB drives back in the '90s that didn't get recognized by certain controller chipsets.
WD does not test every mobo and chipset with their drives, so try another brand of drive or disk controller.

Also maybe try turning off DMA or write caching on the controller and see what happens.
LVL 44

Expert Comment

ID: 16977244
Educate him on the problems with RAID 0 or discontinue support; that is my only other suggestion.  Just because someone is ignorant (that is, unaware) of the problems with RAID 0 doesn't mean you can solve them.  To run RAID 0 for a gaming machine is just about as stupid as you can get. A nerd with much reading, but little understanding, perhaps?

Expert Comment

ID: 16977868
Clear the dirty flag and rename the chkdsk executable so it can't run anymore :)

I saw this once years ago: NT4 on a Compaq 1600 running RAID 5.  Same sort of thing: we ran a chkdsk over a weekend and came back to find it hadn't moved beyond 32%.  It was our PDC too (which we then moved, of course).
I did a parallel install to it, and then backed up the entire volume (kinda like ghosting the OS without Ghost).  Restored it to an identical bit of kit in the lab and ran chkdsk (our plan was to swap hardware out to fix the issue).  Well, I'd be darned if the clone didn't exhibit the exact same issue.
Unfortunately we never did resolve the problem; even running chkdsk from the parallel install would hang.  However, we could assume the issue was related to software rather than hardware.

Are you backing up the customer's machine and then restoring after re-jigging the hardware etc.?  If you are, maybe try a fresh install and NO restore.

Question has a verified solution.
