Event ID 55 NTFS error - blue screen

Hi experts.

Arrived at work this morning to quite a major issue.

Last night my main file server logged Event 55: "The file system structure on the disk is corrupt and unusable. Please run the chkdsk utility on the volume." It refers to my data partition.

It has crashed twice since 2 AM last night.

It's a pretty new machine with one mirrored array for the OS and four other disks in a RAID 5.

I have about 80 users on this box, but I have logged them all out to take a full backup, which is currently running because last night's backup failed with the crash.

The most recent patches are from last week, and about 10 days ago I installed Symantec Endpoint.

What should I do next? I ran chkdsk in read-only mode and it didn't report any errors. The allocation unit size is 4096.

I have seen some posts saying the drive needs to be reformatted because it could be a corrupt MFT, but surely there are better ways?

Thanks in advance
BGilhooleyAsked:
 
BGilhooleyAuthor Commented:

I contacted HP this morning and they told me to upgrade the P400 controller firmware and run the firmware maintenance CD and the ProLiant Support Pack. They think it will fix the BSOD on boot.

Has anyone out there had similar issues, and did the upgrades work?

 
fgolemoCommented:
Hi BGilhooley,
Please try running chkdsk in write-enabled mode first (with both options checked in the GUI, or from cmd with /f and /r).
If that turns up fixable errors, start dancing.
If that turns up unfixable errors, you've got a problem:
(a) Worst case: Bad sectors (maybe caused by age or heat)
(b) corrupted MFT: grab a free copy of "testdisk"
www.cgsecurity.org/wiki/TestDisk
and fix it
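
For anyone scripting this, the difference between the read-only pass and the repair pass can be sketched roughly as below. This is only an illustration: the drive letter "D:" and the helper names are assumptions, not anything taken from this server. The /f and /r switches are the standard chkdsk ones, and /r already includes what /f does.

```python
import subprocess


def chkdsk_command(volume, repair=False):
    """Build a chkdsk command line for a volume such as "D:" (hypothetical letter)."""
    cmd = ["chkdsk", volume]
    if repair:
        # /f fixes file system errors; /r also locates bad sectors and
        # recovers readable data (/r implies /f, both are listed for clarity).
        cmd += ["/f", "/r"]
    return cmd


def run_chkdsk(volume, repair=False):
    """Run chkdsk and return its exit code. Windows only, needs admin rights."""
    # A repair pass on a volume that is in use will ask to dismount it or to
    # schedule the check for the next reboot, so plan for downtime.
    return subprocess.run(chkdsk_command(volume, repair)).returncode
```

With no switches chkdsk only reports, which matches the read-only run described in the question. A repair pass on a large RAID 5 data volume can take hours, so take a good backup first.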
 
fgolemoCommented:
I've read that enabled write caching may be a severe problem too, so try disabling it.
 
fgolemoCommented:
Microsoft on how to do that:
http://support.microsoft.com/kb/259716/en-us/
 
BGilhooleyAuthor Commented:
Thanks for the suggestions, fgolemo. I am going to let an evening backup run and then try the chkdsk repair tonight. Thankfully the server has run OK all day and I have a good backup. Hopefully the chkdsk will work. I will post back tomorrow.
 
BGilhooleyAuthor Commented:

Update:

Ran chkdsk /r last night; it found no bad sectors and completed without an issue.
Next I uninstalled the Windows updates from 25/10/08 and also Symantec Endpoint 11.0, which I had installed about 10 days ago. I replaced it with the previously used Symantec AntiVirus 10.2.

I've posted a server dump from last night. I'm not experienced at reading these, but from what I can gather it could be anything from a bad HD to bad RAM to a corrupt ntoskrnl file.
If anyone is experienced at reading these dumps, your input is appreciated.

By the way, the initial error ID 55 hasn't appeared in the system log for over 48 hours. There is nothing in the system log of any



*********************************************************************
Unable to load image \WINDOWS\system32\ntoskrnl.exe, Win32 error 0n2
*** WARNING: Unable to verify timestamp for ntoskrnl.exe
*** ERROR: Module load completed but symbols could not be loaded for ntoskrnl.exe
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (4 procs) Free x86 compatible
Product: Server, suite: TerminalServer SingleUserTS
Kernel base = 0x80800000 PsLoadedModuleList = 0x808af9c8
Debug session time: Wed Nov  5 23:39:15.957 2008 (GMT+0)
System Uptime: 0 days 0:39:04.828




MODULE_NAME: nt

FAULTING_MODULE: 80800000 nt

DEBUG_FLR_IMAGE_TIMESTAMP:  48a2bc85

READ_ADDRESS: unable to get nt!MmSpecialPoolStart
unable to get nt!MmSpecialPoolEnd
unable to get nt!MmPoolCodeStart
unable to get nt!MmPoolCodeEnd
 00000000

CURRENT_IRQL:  d0000002

FAULTING_IP:
+0
00000000 ??              ???

CUSTOMER_CRASH_COUNT:  7

DEFAULT_BUCKET_ID:  WRONG_SYMBOLS

BUGCHECK_STR:  0xD1

LAST_CONTROL_TRANSFER:  from 00000000 to 80836df5

FAILED_INSTRUCTION_ADDRESS:
+0
00000000 ??              ???

STACK_TEXT:  
808a3528 00000000 badb0d00 894a0001 8a34ab88 nt+0x36df5


STACK_COMMAND:  kb

FOLLOWUP_IP:
nt+36df5
80836df5 833d40ee8a8000  cmp     dword ptr [nt+0xaee40 (808aee40)],0

SYMBOL_STACK_INDEX:  0

SYMBOL_NAME:  nt+36df5

FOLLOWUP_NAME:  MachineOwner

IMAGE_NAME:  ntoskrnl.exe

BUCKET_ID:  WRONG_SYMBOLS

Followup: MachineOwner
---------
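
As a rough aid for reading output like the above, the labelled fields can be pulled out with a few lines of Python. This is a minimal sketch, not a debugger replacement: the function names and the one-entry bugcheck table are made up for the example. 0xD1 is the documented code for DRIVER_IRQL_NOT_LESS_OR_EQUAL, which usually points at a driver rather than at the disk itself.

```python
import re

# Only the code seen in this dump; extend as needed (illustrative table).
BUGCHECK_NAMES = {0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL"}


def parse_analysis(text):
    """Collect 'KEY:  value' pairs from debugger analysis output."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^([A-Z_]+):\s+(\S.*)$", line)
        if m and m.group(1) not in fields:
            fields[m.group(1)] = m.group(2).strip()
    return fields


def summarize(fields):
    """Return short notes about what the parsed fields suggest."""
    notes = []
    code = fields.get("BUGCHECK_STR")
    if code:
        try:
            name = BUGCHECK_NAMES.get(int(code, 16), "not in this sketch's table")
        except ValueError:
            name = "not a plain hex code"
        notes.append(f"bugcheck {code}: {name}")
    if fields.get("DEFAULT_BUCKET_ID") == "WRONG_SYMBOLS":
        # Without symbols the debugger cannot name the real faulting module.
        notes.append("symbols were not loaded, so the stack and module blame are unreliable")
    return notes
```

Run against the dump above, this would flag the WRONG_SYMBOLS bucket, which means the blame on ntoskrnl.exe should not be taken at face value. In WinDbg, `.symfix` followed by `.reload` and `!analyze -v` is the usual way to load the Microsoft symbols and get a more trustworthy analysis.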

 
BGilhooleyAuthor Commented:

I should also say that since uninstalling the updates and Symantec Endpoint, the issue has changed somewhat.

Instead of a BSOD while the server is running, it now gives a BSOD during a reboot, and I have to physically power the server down and back on. Though obviously far from ideal, thank God it at least isn't crashing randomly while users are working (though I hope I haven't spoken too soon).

Question has a verified solution.