Solved

help finding root cause of file system corruption

Posted on 2014-02-24
Last Modified: 2014-03-18
We ship a PC/software bundle, and I have a customer whose system failed to start up (a quick bluescreen/reboot/bluescreen/reboot loop). The Last Known Good Configuration and recovery options didn't work.

We pulled a disk, chkdsk shows thousands of bad files, and a virus scan shows no virus.  (This is Win7 Pro with dual SSD drives using Windows Dynamic Disk mirroring, and both drives are equally affected.)

The customer wants to know what happened, and I'm at a loss. Sure I can say "the OS corrupted the file-system" but that seems incomplete. There was no sudden loss of power that the customer remembers.

Can anybody suggest a way I can actually find root cause?
Question by:PMH4514
12 Comments
 
LVL 46

Expert Comment

by:noxcho
ID: 39882860
If it had a BSOD, then it generated a minidump file. Go to C:\Windows\Minidump and get the dump files from there. Upload the latest one.
Then tell me what type of errors CHKDSK was showing.
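
For anyone who wants to script that check, here is a minimal Python sketch that picks the most recently modified .dmp file from a folder. The function name `latest_minidump` is made up for illustration, and the folder path is an assumption: on a disk pulled from the machine and attached through a dock, the Windows folder will sit under whatever drive letter the dock gets, not C:.

```python
from pathlib import Path
from typing import Optional


def latest_minidump(dump_dir: Path) -> Optional[Path]:
    """Return the most recently modified .dmp file in dump_dir, or None."""
    if not dump_dir.is_dir():
        return None
    dumps = [
        p for p in dump_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".dmp"
    ]
    # default=None covers the "folder exists but holds no dumps" case
    return max(dumps, key=lambda p: p.stat().st_mtime, default=None)


# Path taken from the comment above; adjust the drive letter for a docked disk.
print(latest_minidump(Path(r"C:\Windows\Minidump")))
```

If this prints `None`, either the folder is missing or it contains no dump files, which is itself worth reporting back.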

Author Comment

by:PMH4514
ID: 39883056
Hi. No BSOD, so no minidump file.

CHKDSK output was mostly lines like this (sorry for German)

Beschädigter Attributeintrag (128, "") wird
vom Datensatzsegment 1253237 gelöscht.

(i.e., Damaged attribute record (128, "") is deleted from file record segment 1253237.)

several others related to "index entry deleted"..

ending with (again translated to English):

Free space verification is complete.
Errors in the Master File Table (MFT) are being corrected.
Errors in the BITMAP attribute of the Master File Table (MFT) are being corrected.
Errors in the volume bitmap are being corrected.
Windows has made corrections to the file system.


Following CHKDSK, still no ability to boot.
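
With thousands of lines like the ones quoted above, it can help to pull out just the file record segment numbers and look at how many distinct records were touched. The following is a rough Python sketch, assuming the CHKDSK output was saved as text and that every relevant line contains either the German "Datensatzsegment" or the English "file record segment" wording shown in this post; other message wordings are not handled. The name `damaged_segments` is made up for illustration.

```python
import re
from typing import List

# Matches the German and English wordings quoted in this thread (assumption:
# other CHKDSK messages may phrase the segment reference differently).
SEGMENT_RE = re.compile(
    r"(?:Datensatzsegment|file record segment)\s+(\d+)", re.IGNORECASE
)


def damaged_segments(chkdsk_output: str) -> List[int]:
    """Return sorted, de-duplicated segment numbers found in the output."""
    return sorted({int(m.group(1)) for m in SEGMENT_RE.finditer(chkdsk_output)})


sample = (
    'Beschädigter Attributeintrag (128, "") wird\n'
    "vom Datensatzsegment 1253237 gelöscht.\n"
)
segments = damaged_segments(sample)
print(len(segments), segments[:5])
```

Whether the numbers cluster in one range or are scattered across the whole table is only a hint, not proof of any particular cause.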
LVL 80

Expert Comment

by:David Johnson, CD, MVP
ID: 39883074
replace the hard drive.. maybe the computer was knocked over.. Either way it is a failing hard drive

Author Comment

by:PMH4514
ID: 39883184
well of course, the drive has been replaced, needed data is recovered.. There is no issue I need help "fixing" here.. rather,  I was asked by the customer to explain what happened. I have no idea, and so am just looking for resources to help me figure it out.
LVL 46

Expert Comment

by:noxcho
ID: 39883266
German is not a problem, I speak it as well. So the errors you got point to a corrupt MFT, which would leave the whole file system helpless.
You said the drive was an SSD; did its size in bytes get smaller after this problem? A possible cause is a dead page on the SSD, and unluckily the MFT was located in exactly this page. A page is like a cluster on an HDD.
What caused this? The "quick bluescreen": the output of this bluescreen would give us a hint. Are you sure it did not create any .dmp file?
Have you checked in the given directory?

Author Comment

by:PMH4514
ID: 39883609
I would have thought the whole file-system would be helpless. But when I plugged into a dock to make it an external drive, I was able to recover from it most of the data.

The output of the blue-screen was unfortunately not described by the customer before they sent me the drive and I was not able to view it prior to the first CHKDSK attempt which was done by somebody else.  I find no .dmp files when I search the disk.

But I have to wonder about SSDs in the first place.. As I understand (rumor?) there is a limited number of read/write cycles..  Our product does a very high amount of I/O, potentially millions of small files written and read (but very rarely ever deleted.)  Maybe we hit the limit? I'm going to research some SSD diagnostic tools.
LVL 80

Expert Comment

by:David Johnson, CD, MVP
ID: 39883890
You're lucky; usually when an SSD fails there is nothing at all that can be done. The limit on write cycles (there is no limit on read cycles) is actually a very large number: each cell can only be written a certain number of times before it fails and is swapped out. Most SSDs have pretty decent wear leveling to help keep the drive from wearing out prematurely.
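
To put a rough number on "very large", here is a back-of-the-envelope Python sketch. It assumes ideal wear leveling, so total host writes before wear-out are approximately capacity × rated program/erase cycles ÷ write amplification. Every figure in the example (120 GB drive, 3000 cycles, 50 GB written per day, write amplification of 5 for a small-file workload) is a made-up placeholder, not a spec for any drive in this thread; the real values would come from the drive's datasheet and from measuring the application.

```python
def estimated_years(capacity_gb: float, pe_cycles: int,
                    daily_writes_gb: float,
                    write_amplification: float = 1.0) -> float:
    """Rough years until rated write endurance is used up.

    Assumes perfect wear leveling; real drives will differ.
    """
    if daily_writes_gb <= 0 or write_amplification <= 0:
        raise ValueError("daily_writes_gb and write_amplification must be > 0")
    # Host data that can be written before every cell reaches its rated cycles
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / daily_writes_gb / 365.0


# Hypothetical inputs only; replace with datasheet and measured values.
years = estimated_years(capacity_gb=120, pe_cycles=3000,
                        daily_writes_gb=50, write_amplification=5)
print(round(years, 1))  # 3.9
```

An estimate like this only says whether wear-out is plausible for the workload; it does not show that wear-out is what happened to a particular drive.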
LVL 46

Accepted Solution

by:
noxcho earned 250 total points
ID: 39884042
Let's think logically: this is not the only machine with an SSD and your software that you sell or work with, is it? If not, then this problem occurred on this machine only, right?
Then if only one disk had this problem, the cause is the disk itself and not worn-out pages. Hardware can fail, even when new. So don't jump to conclusions. Consider that this was a problem with a single drive only.
And don't forget that this problem could also have a software cause, such as a controller driver or an installed program that corrupted the first sector. That's why I was asking for the BSOD exception or minidump.
LVL 80

Expert Comment

by:David Johnson, CD, MVP
ID: 39884120
SSDs, like most electronics, tend to fail within the first 30 days of use as the parts get to operating temperature and burn in; otherwise they go on for years.
LVL 92

Assisted Solution

by:nobus
nobus earned 250 total points
ID: 39884917

Author Comment

by:PMH4514
ID: 39934998
sorry for the delay.
@noxcho - I think I understand what you're getting at. Yes, we sell a combined package of the PC/drives and software. They are all the same. So far it is only this PC that has shown the problem, but we know this customer has used it more than most. I'm just looking for ways to verify whether it's a fluke or a ticking time bomb for all the others.

@nobus - not Intel, but those are interesting looking tools.
LVL 92

Expert Comment

by:nobus
ID: 39936181
tx for feedback