Solved

help finding root cause of file system corruption

Posted on 2014-02-24
Last Modified: 2014-03-18
We ship a PC/software bundle, and I have a customer whose system failed to start up (quick bluescreen, reboot, bluescreen, reboot, and so on). The Last Known Good Configuration and recovery options didn't work.

We pulled a disk, chkdsk shows thousands of bad files, and a virus scan shows no virus.  (This is Win7 Pro with dual SSD drives using Windows Dynamic Disk mirroring, and both drives are equally affected.)

The customer wants to know what happened, and I'm at a loss. Sure I can say "the OS corrupted the file-system" but that seems incomplete. There was no sudden loss of power that the customer remembers.

Can anybody suggest a way I can actually find root cause?
Question by:PMH4514
12 Comments
 
LVL 46

Expert Comment

by:noxcho
ID: 39882860
If it had a BSOD, then it generated a minidump file. Go to C:\Windows\Minidump and get the dump files from there. Upload the latest one.
Also, what type of errors did CHKDSK show?
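For a drive that has been pulled and mounted in another machine, the dump search can be scripted. The following is only a minimal sketch, not part of the original advice: the function name find_dump_files is made up for illustration, and the drive letter in the usage comment is a placeholder for whatever letter the docked drive receives. It walks the whole volume rather than only \Windows\Minidump, so it would also pick up a full dump such as \Windows\MEMORY.DMP if one exists.

```python
import os
from pathlib import Path


def _safe_mtime(path):
    """Return the file's modification time, or 0.0 if it cannot be read."""
    try:
        return path.stat().st_mtime
    except OSError:
        # Damaged file records can make stat() fail; sort those last.
        return 0.0


def find_dump_files(root):
    """Return every *.dmp file under root, newest first.

    Unreadable directories are skipped instead of aborting the walk,
    which matters on a volume with file-system damage.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            if name.lower().endswith(".dmp"):
                hits.append(Path(dirpath) / name)
    hits.sort(key=_safe_mtime, reverse=True)
    return hits


# Hypothetical usage, with the docked drive mounted as E:
#     for dump in find_dump_files("E:\\"):
#         print(dump)
```

An empty result only shows that no dump survived on that volume; it does not prove that no bluescreen occurred.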
 

Author Comment

by:PMH4514
ID: 39883056
Hi. No BSOD, so no minidump file.

CHKDSK output was mostly lines like this (sorry for German)

Beschädigter Attributeintrag (128, "") wird
vom Datensatzsegment 1253237 gelöscht.

(i.e., "Deleting corrupt attribute record (128, "") from file record segment 1253237.")

several others related to "index entry deleted"..

ending with (again translated to English):

Free space verification is complete.
Correcting errors in the Master File Table (MFT).
Correcting errors in the BITMAP attribute of the Master File Table (MFT).
Correcting errors in the volume bitmap.
Windows has made corrections to the file system.


Following CHKDSK, still no ability to boot.
 
LVL 78

Expert Comment

by:David Johnson, CD, MVP
ID: 39883074
Replace the hard drive. Maybe the computer was knocked over. Either way, it is a failing hard drive.
 

Author Comment

by:PMH4514
ID: 39883184
Well, of course: the drive has been replaced and the needed data has been recovered. There is no issue I need help "fixing" here. Rather, the customer asked me to explain what happened. I have no idea, so I am just looking for resources to help me figure it out.
 
LVL 46

Expert Comment

by:noxcho
ID: 39883266
German is not a problem, I speak it as well. The errors you got point to a corrupt MFT, which would leave the whole file system helpless.
You said the drive was an SSD; did its size in bytes get smaller after this problem? A possible cause is a dead page on the SSD and, unluckily, the MFT was located in exactly that page. Pages on an SSD are comparable to clusters on an HDD.
What did the "quick bluescreen" show? Its output would give us a hint. Are you sure it did not create any .dmp file?
Have you checked the directory I mentioned?
 

Author Comment

by:PMH4514
ID: 39883609
I would have thought the whole file-system would be helpless. But when I plugged into a dock to make it an external drive, I was able to recover from it most of the data.

The output of the blue-screen was unfortunately not described by the customer before they sent me the drive and I was not able to view it prior to the first CHKDSK attempt which was done by somebody else.  I find no .dmp files when I search the disk.

But I have to wonder about SSDs in the first place.. As I understand (rumor?) there is a limited number of read/write cycles..  Our product does a very high amount of I/O, potentially millions of small files written and read (but very rarely ever deleted.)  Maybe we hit the limit? I'm going to research some SSD diagnostic tools.
 
LVL 78

Expert Comment

by:David Johnson, CD, MVP
ID: 39883890
You're lucky; usually when an SSD fails there is nothing at all that can be done. The limit on write cycles (there is no limit on read cycles) is actually a very large number, and it applies per cell: each cell can only be written a certain number of times before it fails and is swapped out. Most SSDs have pretty decent wear leveling to keep the drive from wearing out prematurely.
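To judge whether a write-heavy workload could plausibly have reached that limit, a back-of-the-envelope estimate helps. This is only a sketch under stated assumptions: it assumes ideal wear leveling, the function name is made up for illustration, and every number in the usage example is hypothetical rather than taken from this drive. The rated program/erase cycle count comes from the drive's datasheet, and the write amplification factor depends on the workload; many small-file writes tend to push it up.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles,
                             host_writes_gb_per_day, write_amplification):
    """Rough SSD endurance estimate, assuming ideal wear leveling.

    capacity_gb            -- drive capacity in GB
    pe_cycles              -- rated program/erase cycles per cell (datasheet value)
    host_writes_gb_per_day -- data the OS writes per day, in GB
    write_amplification    -- NAND writes per host write (1.0 or higher)
    """
    if host_writes_gb_per_day <= 0 or write_amplification < 1.0:
        raise ValueError("need positive daily writes and amplification >= 1.0")
    # Total data the flash can absorb before cells reach their rated limit.
    total_nand_writes_gb = capacity_gb * pe_cycles
    # Data actually written to flash each day, including amplification.
    nand_writes_gb_per_day = host_writes_gb_per_day * write_amplification
    return total_nand_writes_gb / nand_writes_gb_per_day / 365.0


# Hypothetical numbers for illustration only:
# a 128 GB drive rated for 3000 cycles, 50 GB of host writes per day,
# and a write amplification factor of 3.
years = estimated_lifetime_years(128, 3000, 50, 3.0)
print(f"Estimated lifetime: {years:.1f} years")
```

With those made-up inputs the estimate comes out at roughly seven years. If the real figures for a unit give an estimate far beyond its actual age, worn-out cells become an unlikely explanation.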
 
LVL 46

Accepted Solution

by:
noxcho earned 250 total points
ID: 39884042
Let's think logically: this is not the only machine with an SSD and your software that you sell or work with, is it? If not, then this problem occurred on this machine only, right?
If only one disk showed this problem, then the cause is the disk itself, not worn-out pages. Hardware can fail, even new hardware, so don't jump to conclusions. Treat this as a problem with a single drive only.
And don't forget that this problem could also have a software cause, such as a controller driver or an installed program that corrupted the first sector. That's why I was asking for the BSOD exception or minidump.
 
LVL 78

Expert Comment

by:David Johnson, CD, MVP
ID: 39884120
SSDs, like most electronics, tend to fail within the first 30 days of use, as the parts get up to operating temperature and burn in; otherwise they go on for years.
 
LVL 91

Assisted Solution

by:nobus
nobus earned 250 total points
ID: 39884917
 

Author Comment

by:PMH4514
ID: 39934998
Sorry for the delay.
@noxcho - I think I understand what you're getting at. Yes, we sell a combined package of the PC/drives and software. They are all the same. So far, only this PC has shown the problem, but we know this customer has used it more than most. I'm just looking for ways to verify whether it's a fluke or a ticking time bomb for all the others.

@nobus - not Intel, but those are interesting looking tools.
 
LVL 91

Expert Comment

by:nobus
ID: 39936181
Thanks for the feedback.
