

System crashes when running chkdsk on RAID 0

Posted on 2006-06-23
Medium Priority
Last Modified: 2008-01-09
OK, I'm at my wit's end with this one.

I have a customer who has brought in a computer that we built for him (spiffy gaming machine).  He is running an ASUS A8N-SLI Deluxe (nForce4 SLI chipset) with a RAID 0 (striped) array on twin WD 250GB SATA hard disks.

The problem is this:

Due to an unrelated issue (crappy video drivers that have since been replaced), Windows was shut down improperly. Naturally, chkdsk came up when the computer booted and attempted to scan, as it typically does.  During the first phase (verifying files) the computer locks up at 7%, then restarts a few seconds later.  It is ALWAYS 7%.  I can clear the dirty bit with no problem, and we have done so each time it was brought in for this same problem, but as soon as he needs to run a chkdsk or Windows craps out (as it typically does), we're right back where we are now (this has happened four times so far).

So far we have:

destroyed and rebuilt the RAID array, with a format and reload
low-level formatted the drives using WD Data Lifeguard
COMPLETELY tested the drives using WD Data Lifeguard and Microscope
replaced the drives
replaced the SATA cables
replaced the mainboard
replaced the memory
replaced the video card
removed EVERYTHING non-vital to operation, with new memory and new SATA cables
set the speed cap jumpers on the hard disks to cap speed at 150 MB/s
enabled spread-spectrum clocking on both drives
spent 5 hours on the phone with ASUS, who sent us two replacement motherboards, both of which have exhibited the same problem
attempted to call Microsoft, only to have them demand money for support
called Western Digital, who, so far, have not said anything useful
had several heart-to-hearts with Google to find others who have had a similar problem
run out of ideas

This customer is driving us absolutely insane.  We have had his computer for about three months now, and he calls us multiple times a day (I probably would, too).  We can't just disable the RAID controller, because he is adamant that he wants to keep his RAID setup.  We can't put any more money into this computer, because once you factor in lost labor time, we have lost hundreds on this deal.  We stand behind our machines, but this one has us dumbfounded.  ASUS claimed they tested the RAID capability and its ability to run chkdsk, but it still doesn't work.

Any help with this would be immensely appreciated.


Question by:dapsychous
LVL 44

Expert Comment

ID: 16974287
The answer to this is -- do NOT run RAID 0 on these drives.  RAID 0 is not fault tolerant, a simple CHKDSK can corrupt the entire array, and it is not worth losing your data for an outdated RAID concept that was never reliable.  Use RAID 1 in future, where you have some redundancy, and also do not run CHKDSK when you have any problems.  CHKDSK can corrupt the system FAT faster than you can say it, and it is not a reliable tool for repairing disk problems.

I recommend you use RAID 1 with 2 disks, and that you do not run CHKDSK on the drives; let them run as is.
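
For readers unfamiliar with why RAID 0 has no fault tolerance, here is a rough sketch of how striping spreads a volume across the two disks. The 64 KiB stripe size and the function names are assumptions for illustration only; the actual stripe size of the array in question is not stated anywhere in this thread.

```python
STRIPE_SIZE = 64 * 1024  # assumed stripe size in bytes (not given in the thread)
NUM_DISKS = 2            # the two-disk array described in the question


def locate(offset, stripe_size=STRIPE_SIZE, num_disks=NUM_DISKS):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    stripe = offset // stripe_size          # which stripe the byte falls in
    disk = stripe % num_disks               # stripes alternate between disks
    disk_offset = (stripe // num_disks) * stripe_size + offset % stripe_size
    return disk, disk_offset


def disks_touched(start, length, stripe_size=STRIPE_SIZE, num_disks=NUM_DISKS):
    """Return the set of disks holding any part of the byte range."""
    first = start // stripe_size
    last = (start + length - 1) // stripe_size
    return {s % num_disks for s in range(first, last + 1)}
```

Under these assumptions, any file larger than one stripe has pieces on both disks, so losing either disk loses the file. There is no second copy or parity to rebuild from, which is the difference from RAID 1.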

Author Comment

ID: 16975287
Well, that's all well and good, but my customer REQUIRES that this be a RAID 0.  We have tried to take him off it, but he always pitches a fit.  Additionally, I'm not running chkdsk; it's autorunning at boot because of a flagged dirty bit.  Simply clearing the dirty bit won't help, because it just does it again the next time an autocheck is scheduled due to a Windows glitch or something.
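
For anyone following along, a minimal sketch of checking the dirty bit and excluding a volume from the boot-time autocheck might look like the following. It wraps the Windows commands `fsutil dirty query` and `chkntfs /x`; the helper names are hypothetical, and the "is Dirty" / "is NOT Dirty" wording the parser looks for is what I would expect on an English-language Windows install, not something confirmed in this thread. Note that `chkntfs /x` only skips the boot-time check; it does not clear the dirty bit or fix whatever makes chkdsk hang.

```python
import subprocess


def parse_dirty(output):
    """Return True if `fsutil dirty query` output reports a dirty volume.

    Assumes English output such as "Volume - C: is Dirty" or
    "Volume - C: is NOT Dirty".
    """
    text = output.upper()
    if "NOT DIRTY" in text:
        return False
    if "DIRTY" in text:
        return True
    raise ValueError("unrecognized fsutil output: %r" % output)


def query_command(volume):
    """Command line that asks Windows whether the volume is flagged dirty."""
    return ["fsutil", "dirty", "query", volume]


def exclude_command(volume):
    """Command line that excludes the volume from the boot-time check."""
    return ["chkntfs", "/x", volume]


def is_dirty(volume="C:"):
    """Run fsutil (Windows only, needs an elevated prompt) and parse the result."""
    result = subprocess.run(query_command(volume),
                            capture_output=True, text=True, check=True)
    return parse_dirty(result.stdout)
```

On the affected machine you would call `is_dirty("C:")` and, if needed, run `subprocess.run(exclude_command("C:"), check=True)`. This masks the symptom rather than curing it, so it is a stopgap at best.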

Author Comment

ID: 16975292
Plus, he doesn't have important data on it anyway; it's his gaming machine.  He keeps his data on his other computer, so this isn't a data issue, it's a speed and whether-or-not-it-works issue.

LVL 30

Expert Comment

ID: 16976608
Have you tried a different machine (non-ASUS) or different (non-WD) drives?
Sounds like a bug in the firmware on the ASUS board.

If a dirty bit is being set, it sounds as if the file system is not being shut down properly (which could be a driver and/or firmware issue).
LVL 30

Accepted Solution

pgm554 earned 1000 total points
ID: 16976622
Some disks can be flaky depending on the controller and chipset.
I had some old WD 4 GB drives back in the '90s that didn't get recognized by certain controller chipsets.
WD does not test its drives against every motherboard and chipset, so try another brand of drive or disk controller.

Also maybe try turning off DMA or write caching on the controller and see what happens.
LVL 44

Expert Comment

ID: 16977244
Educate him on the problems with RAID 0 or discontinue support; that is my only other suggestion.  Just because someone is ignorant (that is, unaware of the problems with RAID 0) doesn't mean you can solve it.  To run RAID 0 for a gaming machine is just about as stupid as you can get -- a nerd with much reading, but little understanding, perhaps?

Expert Comment

ID: 16977868
Clear the dirty flag and rename the chkdsk executable so it can't run anymore :)

I saw this once years ago: NT4 on a Compaq 1600 running RAID 5.  Same sort of thing: we ran a chkdsk over a weekend and came back to find it hadn't moved beyond 32%.  It was our PDC too (which we then moved, of course).
I did a parallel install on it, and then backed up the entire volume (kinda like ghosting the OS without Ghost).  I restored it to an identical bit of kit in the lab and ran chkdsk (our plan was to swap hardware out to fix the issue).  Well, I'd be darned if the clone didn't exhibit the exact same issue.
Unfortunately we never did resolve the problem; even running from the parallel install would hang.  However, we could assume the issue was related to software rather than hardware.

Are you backing up the customer's machine and then restoring after re-jigging the hardware etc.?  If you are, maybe try a fresh install and NO restore.


Question has a verified solution.
