Faster ChkDsk?

To do a read-only chkdsk it's just "chkdsk <driveletter>:" and then there are a few other options in Server 2008 R2. They are

/I
/C
/F
/R

My understanding is that /R takes the longest and is the most exhaustive. /F is the next slowest. And adding /I and /C can reduce the time even more and are usually pretty good at fixing the corruption issues shown in task manager.

Here's my goal: I manage a remote system. I need to run chkdsk on a non-boot volume that has 10 TB of data across hundreds of millions of files. I need it to finish as fast as possible with the best chance of fixing corruption. I think what I want to do is "chkdsk G: /f /i /c", right?
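For reference, the switch combination asked about here can be sketched as a small Python helper. This is only an illustrative sketch: `build_chkdsk_command` and `run_chkdsk` are hypothetical names, not part of any Windows tooling. The only real command involved is `chkdsk` with its /F, /R, /I and /C switches (/I and /C apply to NTFS volumes only, and /R implies /F).

```python
import string
import subprocess


def build_chkdsk_command(drive_letter, fix=False, skip_index_checks=False,
                         skip_cycle_checks=False, bad_sector_scan=False):
    """Return the chkdsk argument list for one volume (hypothetical helper)."""
    letter = drive_letter.rstrip(":").upper()
    if len(letter) != 1 or letter not in string.ascii_uppercase:
        raise ValueError("expected a single drive letter, e.g. 'G'")

    args = ["chkdsk", letter + ":"]
    if bad_sector_scan:
        # /R also reads every sector looking for bad ones; it implies /F.
        args.append("/R")
    elif fix:
        # /F fixes file system errors; without it chkdsk is read-only.
        args.append("/F")
    if skip_index_checks:
        # /I (NTFS only): less vigorous check of index entries.
        args.append("/I")
    if skip_cycle_checks:
        # /C (NTFS only): skip checking for cycles in the folder structure.
        args.append("/C")
    return args


def run_chkdsk(args):
    """Run chkdsk and return its exit code (Windows only, needs elevation)."""
    return subprocess.call(args)


if __name__ == "__main__":
    command = build_chkdsk_command("G", fix=True, skip_index_checks=True,
                                   skip_cycle_checks=True)
    print(" ".join(command))  # prints "chkdsk G: /F /I /C"
```

Building the argument list separately from running it makes it possible to log or review the exact command before anything touches the volume.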

To help, I will have stopped any service that touches the drive, stopped AV watching the volume, etc. Is there anything else I can do to help it go as fast as possible? We've maxed out the RAM in the system. If I have it run at boot-up, is that faster? I potentially could have remote hands hook up a monitor to it and check in with them from time to time. Last time I ran this remotely over RDP it took 15 hours on a smaller volume with a similar makeup.
MrVault Asked:
Neil Russell, Technical Development Lead, Commented:
You have 10tb of data that is NOT on a raid array?
Neil Russell, Technical Development Lead, Commented:
Bear in mind that CHKDSK is a DOS-based app that is NOT RAID aware. If you run this on a RAID disk you risk destroying the RAID array beyond repair.
Never run chkdsk on a raid array disk.
MrVault (Author) Commented:
It is on a raid array. I have never heard anyone ever tell someone to not run chkdsk on a raid array. any production system in any respectable company is composed of raid arrays. who would store their critical data on an unprotected disk set?

in any case, this is on a SAN and we have run chkdsk on them in the past. just trying to get a better understanding of the chkdsk options vs performance.
MrVault (Author) Commented:
btw - I read some articles and at least one seemed to suggest the risk in running chkdsk on a RAID array has more to do with the /R switch than the others, particularly /F. Do you find that to be the case? Of course Msft doesn't seem to want to give an official stance either.

Neil Russell, Technical Development Lead, Commented:
"Read this http://www.dataclinic.co.uk/raid-server-faults.htm

You should never run chkdsk in /F or /F/R mode on a RAID array. Chkdsk is
not RAID aware, and will simply try to analyse and fix the NTFS tree as if
dealing with a normal disk. In doing so, it can write over information in a
degraded RAID array, that might have been recoverable at a lower level. You
can run it in read mode, but I would use the RAID controllers software.  You
should be able access during machine boot up process.  This should have
several options on checking for disk failure and recovery.

Chkdsk might be able to fix file system errors but it cannot do a surface
scan since it does not have access to any of the physical surfaces.  I have
lost an array doing chkdsk on a raid array.  That is why proper power
backup, data backup and recovery procedures are such a critical part of any
server setup."


An array need not show as degraded to be "degraded". If all parity is not in perfect shape then the disk should be considered unhealthy, and unhealthy disks lead to a degraded array. Consensus is unclear; as you say, some say you can, some say you can't, some say "I DID", and others say "DAMN! I DID, and now I have no array."

The choice is yours at the end of the day but I would advise you to at the very least do an array verification from your controller first to make sure you are in as healthy a condition as you can be.

Quote was from http://www.eggheadcafe.com/microsoft/Windows-Server/32708508/raid-1-and-5-chkdsk.aspx
MrVault (Author) Commented:
Thanks. I did see that entry (though not indexed by egghead).

I'm not sure what an array verification is on our controller. We are using Dell EqualLogic arrays connected over iSCSI, not internal disks.

What other options do we have when our system crashes and we start getting tons of chkdsk errors, NTFS errors, etc. in Windows? Reading specific corrupted files is not working and our app cannot perform without that data. This cannot be good, and with 10 TB and hundreds of millions of files, needless to say, doing some sort of Acronis imaging of it is not feasible. We can take a snapshot of the array on the SAN beforehand, which we do, and which should let us recover fully if something were to get messed up. Unfortunately, at the data storage sizes we're dealing with we can only recover up to an hour ago, not days. So if the system crashes and the storage array takes its next snapshot before we can stop it, then we lose that backup. It's not the greatest strategy, I know, but frankly resources prevent anything better at this point.

MrVault (Author) Commented:
Besides those questions, the question still remains: what is the fastest way to run chkdsk and is "chkdsk g: /f /i /c" the correct way for those switches?
Neil Russell, Technical Development Lead, Commented:
If you were running against a standard hard disk and wanted the quickest check, then I would say yes, that's the quickest chkdsk you can run to do what you want.
kevinhsieh Commented:
A better long-term strategy may be to not use such a large volume. Break it up into smaller LUNs and use a DFS namespace or mount points to make it look like a unified space if needed. You may want to reduce your snapshot schedule while doing the chkdsk so you can keep a good snapshot. Sounds like you need more capacity. I found the PS4100X to be a great performer in terms of IOPS, capacity, and price. With 24 900 GB drives it is like buying a small PS6500 but at a fraction of the price.

I believe that I heard that Windows Server 8 will run a chkdsk while the volume is online.
MrVault (Author) Commented:
yes, in 2008 R2 we can run chkdsk while it's online. we turn off all access to it from apps/users, but we don't have to take the server down.

we would only do a snapshot right before starting chkdsk manually and not schedule another one until we were sure the volume was finished, back being used, etc. that should protect us against any chkdsk screwups right?
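The ordering described here (pause the snapshot schedule, take one manual snapshot, run chkdsk, and only resume scheduling once the volume is known to be good) can be sketched roughly as below. Everything in it is a hypothetical placeholder: the four callables stand in for whatever your SAN tooling and chkdsk wrapper actually provide. It also assumes chkdsk exit codes 0 (no errors found) and 1 (errors found and fixed) mean the volume is usable, which you should confirm against Microsoft's chkdsk documentation for your OS version.

```python
def guarded_chkdsk(pause_snapshots, take_snapshot, run_chkdsk, resume_snapshots):
    """Run chkdsk behind a manual snapshot; return True if the volume looks good.

    All four arguments are caller-supplied callables (hypothetical hooks):
      pause_snapshots()  -> stop the SAN's scheduled snapshots
      take_snapshot()    -> take one manual snapshot as the rollback point
      run_chkdsk()       -> run chkdsk and return its exit code
      resume_snapshots() -> re-enable the snapshot schedule
    """
    # Pause first, so a scheduled snapshot cannot replace the good one mid-run.
    pause_snapshots()
    take_snapshot()

    exit_code = run_chkdsk()

    # Assumption: 0 = no errors found, 1 = errors found and fixed.
    volume_ok = exit_code in (0, 1)

    # Resume only on success; on failure the schedule stays paused so the
    # pre-chkdsk snapshot remains available for a rollback.
    if volume_ok:
        resume_snapshots()
    return volume_ok
```

Leaving the schedule paused on failure is the point of the sketch: it keeps the next automatic snapshot from overwriting the only known-good copy before someone has looked at the result.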