inf2300

Blue Screen during repair on 2003 server

I am unable to boot my Windows 2003 server.  Every time I boot I get a blue screen and the computer restarts.  The computer is my DC and file server, so I really need it to boot up!

Here are the things I tried to do since

-Booted in Safe Mode, Safe Mode with Networking, and Last Known Good Configuration
-I changed the motherboard and the RAM
-I ran a chkdsk /r

Same result...


I plugged in another hard drive with Windows 2003 and I was able to see all the files in my partitions and back them up.  I also did a fresh install on a different hard drive and it works well, so I'm pretty sure it's not a hardware problem.

I decided to do a Windows repair.  After Windows finishes copying the installation files to the drive, it tries to load the current Windows configuration files and then gives me a blue screen again!

Here are the blue screen details.  It is the same one I get when I try to boot into Windows.  I usually get the error message after the "applying network settings" box, but once in a while I'm also able to log on before it blue screens.

PAGE_FAULT_IN_NONPAGED_AREA


Technical Information

***STOP 0x00000050 (0XCDCDD9FD, 0x00000000, 0x80902BE9, 0x00000000)


I also get the error 0x0000008e sometimes.  The research I've done so far seems to indicate a defective memory chip, but I've changed the RAM and the motherboard.  I haven't changed the RAID card yet, but all the drives in my array appear as healthy.

Any help would be appreciated, I can't afford to rebuild my domain from scratch...

thanks!
nurein99

When did this start happening? It may be your array driver!
Avatar of Jeffrey Kane - TechSoEasy
Have you run a current Virus Scan on the system volume?  (ie, when you had the drive plugged into another system).

Check out: http://support.microsoft.com/kb/903251  for more info on the HaxVirus.

Otherwise, that error is NOT indicative of a RAM issue... you need to follow the troubleshooting path for 0x00000050 found here:  http://www.microsoft.com/resources/documentation/windowsserv/2003/standard/proddocs/en-us/troubleshooting_specific_stop_messages.asp.  But it really sounds like the virus issue.

Jeff
TechSoEasy



Avatar of inf2300

ASKER

Thanks for the quick replies.

It started happening a few weeks ago.  The computer would reboot by itself once or twice and would then run normally.  Last week, I had to reboot it about 25 times before it finally loaded.  I called Dell and got an updated driver for the RAID controller, but that did not change anything.  Now I can't boot at all because it blue screened during the repair...

I did run a virus scan with Symantec 10 and it didn't find anything. The files mentioned in the HaxVirus article are not on the drive, and I can't access the registry to verify the keys.

During the repair, the BSOD appears after loading hive***.inf files. (hivecls.inf is the last one I can see before the error)  Is there any way to repair that file?

0x80902BE9 is the failing instruction address, and the failing module is ntoskrnl, which is the Windows kernel. The crash may be caused by faulty RAM or a device driver error. The system event log and the minidumps have the most useful diagnostic information.
When Windows crashes with a blue screen, it writes system event 1001 or 1003 and a minidump to the folder \windows\minidump
Check system events 1001 and 1003; they contain the content of the blue screen.

Event ID: 1001
Source: Save Dump
Description:
The computer has rebooted from a bugcheck.The bugcheck was : 0xc000000a (0xe1270188, 0x00000002, 0x00000000, 0x804032100).
Microsoft Windows..... A dump was saved in: .......

Event Source: System Error
Event Category: (102)
Event ID: 1003
Description:
Error code 1000007f, parameter1 0000000d, parameter2 00000000, parameter3 00000000, parameter4 00000000

Control Panel -> Administrative Tools -> Event Viewer -> System -> Event 1001/1003. Copy the content and paste it back here.
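
Once the event or blue screen text has been copied out, the bugcheck code and its four parameters can be pulled apart mechanically. The following is only a rough sketch: the function name `parse_bugcheck` and the label table are made up for illustration, and the parameter meanings for 0x50 are the ones WinDbg prints in its bugcheck analysis later in this thread.

```python
import re

# Parameter meanings for bugcheck 0x50, as described by WinDbg's
# bugcheck analysis output (labels are shortened here).
PARAM_LABELS_0X50 = (
    "memory referenced",
    "0 = read operation, 1 = write operation",
    "instruction address which referenced the bad memory",
    "reserved",
)

# Matches both "STOP 0x00000050 (a, b, c, d)" from the blue screen and
# "The bugcheck was : 0x... (a, b, c, d)" from event 1001.
_PATTERN = re.compile(
    r"(?:STOP|bugcheck was)\s*:?\s*(0x[0-9a-f]+)\s*\(([^)]*)\)",
    re.IGNORECASE,
)


def parse_bugcheck(text):
    """Return (code, [param1..param4]) as integers, or None if no match."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


if __name__ == "__main__":
    line = "***STOP 0x00000050 (0XCDCDD9FD, 0x00000000, 0x80902BE9, 0x00000000)"
    code, params = parse_bugcheck(line)
    print(f"bugcheck 0x{code:X}")
    if code == 0x50:
        for label, value in zip(PARAM_LABELS_0X50, params):
            print(f"  0x{value:08X}  {label}")
```

For the STOP line in the original question this shows a read operation (second parameter 0) from the instruction at 0x80902BE9.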

Zip 5 to 6 minidumps into a zip file and upload it to any web space. I will study the dumps and find the culprit.

You can run memtest to stress the RAM. If memtest reports errors, the RAM is bad. However, memtest is not a perfect tool for testing memory, as some faulty RAM can pass it.

Suggestion
1. Check the temperature of the CPU and make sure that it is not overheating (i.e. temperature < 60C)
   Make sure that the CPU fan works properly
2. Reseat the memory stick in another memory slot. Reseat the video card as well.
3. Downclock the RAM. If your video card is overclocked, reset it to its default settings.
4. Clean the dust inside the computer case
5. Make sure that the ram is compatible to the motherboard
6. Check the BIOS memory timing settings and make sure that the modules are running at matching timings
   For example, here DIMM1 does not have the same timing as DIMM2 and DIMM3.
   DIMM1: Corsair CMX512-3200C2 512 MB PC3200 DDR SDRAM (2.5-3-3-8 @ 200 MHz) (2.0-3-3-7 @ 166 MHz)
   DIMM2: Corsair CMX512-3200C2 512 MB PC3200 DDR SDRAM (3.0-3-3-8 @ 200 MHz)
   DIMM3: Corsair CMX512-3200C2 512 MB PC3200 DDR SDRAM (3.0-3-3-8 @ 200 MHz)
7. Make sure that your PSU has adequate power to drive all the hardware, including USB devices
8. Run chkdsk /r at command prompt

If it still crashes, diagnose which memory stick is faulty.
Take out one memory stick. If Windows no longer crashes, the removed memory stick is faulty.


Avatar of inf2300

ASKER

I made an image of my C drive on another hard drive (IDE) that I plugged in a different computer (No raid).  I ran a repair and got the same error.  I think this proves it's definitely not hardware related.

I'm starting to think it's coming from the registry.  Is there a way to verify/fix it?
You CAN run a separate OS from the CD-ROM drive to diagnose and repair your install.  Understand that this is NOT supported by Microsoft, and it could cause further damage.

That being said, I've repaired a number of these types of issues, or at least was able to do some forensics to determine the cause... by being able to access the registry as well as the entire file structure.

Go to http://www.ubcd4win.com and see how to create your CD.  You can see a video overview of this here:  http://flynntargart.blogsite.org/build.htm.  It sounds a bit complicated, but if you follow the instructions, you can have the CD created in about 45 minutes or so.  (Then it will be your life saver forever!!).

If you want to check out the tools that are included with this build, see http://www.ubcd4win.com/contents.html.  Especially of interest are the Registry Tools:

Erunt       1.1i      Emergency Recovery Utility - Registry backup and restore
RegBrws     1.2.2     Browses local/remote system registry using a specified account
RegCleaner  4.3       Remove obsolete registry entries from software that you may have deleted
RegEditPE   0.9c      Registry Editor for PE, SourceForge project
RegResWiz   1.0.0.4   Wizard that restores the registry to previous saved states

As long as you have images of your drive, it can't really hurt to give these a quick try.

Jeff
TechSoEasy
cpc2004... your comment regarding ntoskrnl got me thinking... I didn't agree with your analysis, however, because usually the RAM errors are IRQL_NOT_LESS_OR_EQUAL instead of PAGE_FAULT_IN_NONPAGED_AREA

So, I Googled PAGE_FAULT_IN_NONPAGED_AREA and ntoskrnl --- lo and behold there's a hotfix KB that just came out a few weeks ago: http://support.microsoft.com/kb/832336.

I'm hoping this is your fix inf2300... because that would make it TOO easy.

Jeff
TechSoEasy
Avatar of inf2300

ASKER

Thanks... I will create the CD... In the meantime here is the log from Microsoft WinDbg

Microsoft (R) Windows Debugger  Version 6.5.0003.7
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Documents and Settings\Administrator\Desktop\Mem Dumps\Mini112605-18.dmp]
Mini Kernel Dump File: Only registers and stack trace are available

Symbol search path is: SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows Server 2003 Kernel Version 3790 MP (2 procs) Free x86 compatible
Product: LanManNt, suite: TerminalServer SingleUserTS
Built by: 3790.srv03_rtm.030324-2048
Kernel base = 0x804de000 PsLoadedModuleList = 0x8057b6a8
Debug session time: Sat Nov 26 12:06:15.296 2005 (GMT-5)
System Uptime: 0 days 0:02:28.953
Loading Kernel Symbols
.........................................................................................................
Loading unloaded module list
...
Loading User Symbols
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 50, {c8fb1ca8, 0, 80597b88, 0}


Could not read faulting driver name
Probably caused by : ntkrnlmp.exe ( nt!HvpGetCellMapped+7f )

Followup: MachineOwner
---------

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: c8fb1ca8, memory referenced.
Arg2: 00000000, value 0 = read operation, 1 = write operation.
Arg3: 80597b88, If non-zero, the instruction address which referenced the bad memory
      address.
Arg4: 00000000, (reserved)

Debugging Details:
------------------


Could not read faulting driver name

READ_ADDRESS:  c8fb1ca8

FAULTING_IP:
nt!HvpGetCellMapped+7f
80597b88 8b4604           mov     eax,[esi+0x4]

MM_INTERNAL_CODE:  0

CUSTOMER_CRASH_COUNT:  18

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

BUGCHECK_STR:  0x50

CURRENT_IRQL:  1

LAST_CONTROL_TRANSFER:  from 805ae410 to 80597b88

TRAP_FRAME:  f71a9860 -- (.trap fffffffff71a9860)
ErrCode = 00000000
eax=000003ff ebx=e1982a90 ecx=e1982c2c edx=85f271b8 esi=c8fb1ca4 edi=0000013c
eip=80597b88 esp=f71a98d4 ebp=f71a992c iopl=0         nv up ei ng nz na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010282
nt!HvpGetCellMapped+0x7f:
80597b88 8b4604           mov     eax,[esi+0x4]     ds:0023:c8fb1ca8=????????
Resetting default scope

STACK_TEXT:  
f71a992c 805ae410 e1982a90 ffffdc30 e1982a90 nt!HvpGetCellMapped+0x7f
f71a9948 805acc33 e1982a90 f71a9b8c ffffdc30 nt!CmpDoCompareKeyName+0x10
f71a9968 805aee46 e1982a90 f71a9b8c ffffffff nt!CmpCompareInIndex+0x104
f71a9998 805aeecc d3600024 d36319ec f71a9b8c nt!CmpFindSubKeyInRoot+0x34
f71a99c0 805ae176 ffffffff d3358cf0 f71a9b8c nt!CmpFindSubKeyByName+0x50
f71a9b94 8058d6c8 00357cd0 00357cd0 85a68690 nt!CmpParseKey+0x47f
f71a9c10 8058d9a9 00000abc f71a9c50 00000040 nt!ObpLookupObjectName+0x117
f71a9c64 8059830e 00000000 8659e688 00e2ed01 nt!ObOpenObjectByName+0xe8
f71a9d50 804dfd24 00e2efa0 00020019 00e2ecc0 nt!NtOpenKey+0x1bd
f71a9d50 7ffe0304 00e2efa0 00020019 00e2ecc0 nt!KiSystemService+0xd0
00e2ed00 00000000 00000000 00000000 00000000 SharedUserData!SystemCallStub+0x4


FOLLOWUP_IP:
nt!HvpGetCellMapped+7f
80597b88 8b4604           mov     eax,[esi+0x4]

SYMBOL_STACK_INDEX:  0

FOLLOWUP_NAME:  MachineOwner

SYMBOL_NAME:  nt!HvpGetCellMapped+7f

MODULE_NAME:  nt

IMAGE_NAME:  ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP:  3e8015c6

STACK_COMMAND:  .trap fffffffff71a9860 ; kb

FAILURE_BUCKET_ID:  0x50_nt!HvpGetCellMapped+7f

BUCKET_ID:  0x50_nt!HvpGetCellMapped+7f

Followup: MachineOwner
---------
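
As a reading aid for the STACK_TEXT above, here is a small sketch that picks out the frames belonging to registry code. It assumes the usual NT kernel naming, where functions starting with Cm/Cmp belong to the Configuration Manager (the registry) and Hv/Hvp to the hive layer beneath it; the function name `registry_frames` is made up for this example, and a prefix match is only a rough heuristic.

```python
import re

# Assumed naming convention: Cm*/Cmp* = Configuration Manager (registry),
# Hv*/Hvp* = hive management routines used by the registry.
REGISTRY_PREFIXES = ("Cm", "Hv")


def registry_frames(stack_text):
    """Return the nt! function names in a STACK_TEXT dump that look like registry code."""
    frames = []
    for line in stack_text.splitlines():
        match = re.search(r"\bnt!(\w+)", line)
        if match and match.group(1).startswith(REGISTRY_PREFIXES):
            frames.append(match.group(1))
    return frames


if __name__ == "__main__":
    sample = """\
f71a992c 805ae410 e1982a90 ffffdc30 e1982a90 nt!HvpGetCellMapped+0x7f
f71a9948 805acc33 e1982a90 f71a9b8c ffffdc30 nt!CmpDoCompareKeyName+0x10
f71a9c10 8058d9a9 00000abc f71a9c50 00000040 nt!ObpLookupObjectName+0x117
f71a9d50 804dfd24 00e2efa0 00020019 00e2ecc0 nt!NtOpenKey+0x1bd
"""
    for name in registry_frames(sample):
        print(name)
```

In this dump the faulting frame and the frames directly above NtOpenKey are all Cmp/Hvp routines, meaning the kernel faulted while walking a registry hive to open a key. That by itself does not say whether the hive data or the memory holding it was bad.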

Our posts must have crossed... so be sure to check out the hotfix.  If you hadn't replaced both the mobo and RAM, I'd probably bet it was either bad RAM or dust in the RAM slots.  Have you tested all the RAM you've used on this machine in another one to see if it errors there?

Jeff
TechSoEasy
Avatar of inf2300

ASKER

Hi Jeff,

Thanks for the replies... the RAM was tested in different machines & we have changed it... I'm 100% positive the ram is fine

Hi cpc2004

here is a link to the files

http://216.46.10.6/minidumps/minidumps.zip

Thanks for your help!
From your minidumps, I find that some Norton AV modules are at different levels.  Uninstalling or upgrading Norton AV may resolve the problem.

SYMEVENT SYMEVENT.SYS Thu Jan 15 10:02:13 2004 (4005F4A5)
naveng   naveng.sys   Wed Oct 26 10:01:15 2005 (435EE36B)
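
The value in parentheses after each module is its build timestamp in hex, counted in seconds since 1 January 1970 UTC. A quick sketch to decode the two values and measure the gap between them (the helper name `build_time` is made up for this example):

```python
from datetime import datetime, timezone


def build_time(hex_stamp):
    """Convert a hex module timestamp (seconds since 1970-01-01 UTC) to a datetime."""
    return datetime.fromtimestamp(int(hex_stamp, 16), tz=timezone.utc)


# Timestamps copied from the module list above.
modules = {"SYMEVENT.SYS": "4005F4A5", "naveng.sys": "435EE36B"}
built = {name: build_time(stamp) for name, stamp in modules.items()}
gap_days = (built["naveng.sys"] - built["SYMEVENT.SYS"]).days

if __name__ == "__main__":
    for name, when in built.items():
        print(f"{name:14s} {when:%Y-%m-%d %H:%M:%S} UTC")
    print(f"gap: {gap_days} days")
```

The decoded times are in UTC, so they differ by a fixed offset from the local times the debugger printed. A large gap between modules of the same product is only a hint of a mixed install, not proof that the product is the cause.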
Avatar of inf2300

ASKER

Hi cpc2004,

Well, like I wrote before, I'm at the point where I tried to do a repair and it failed after the file copy, when it loads some hive***.inf files just before the first reboot. So it's impossible for me to uninstall Symantec because the system doesn't boot. I added a HD and installed Windows 2003 so I can access the files and load the hives. Can I disable those services this way? Also, I noticed my software hive is 13.5 MB. Is this a problem?

Thanks
Avatar of inf2300

ASKER

Just to clarify the system blue screens during the repair just before the first reboot. So all the files are copied, then it says loading default configuration & finally it loads a couple hive files(the last one being HIVECLS.INF) and blue screens...
Avatar of inf2300

ASKER

Well just to give you an update... We called Microsoft Support... sent them our system & software hive files... They emailed us back our 2 files modified... We installed them and ran the repair and everything is all good... I'm really not sure what they did. They claim to have only run a chkreg /c but we did it and it didn't work so who knows......

Thanks for all your help....
Hooray... I've actually been saying lately that Microsoft PS has been rather good... what sounds like a lot of money at first turns out to be a great value... especially if they don't charge at all (usually, if your problem is business critical, they won't).


Congrats!

Jeff
TechSoEasy
Would you tell me which files fixed the problem?
Avatar of inf2300

ASKER

We sent them the system & software hive files. They sent them back to us. The file sizes were much smaller than the ones we sent. We replaced our files with them and everything worked right away... I'm really not sure what they did...
Avatar of inf2300

ASKER

This is Microsoft's answer if anyone's interested:

It was my pleasure to assist you with Stop 0x50 issue.  I hope that you were delighted with the service I provided.

Based on our conversation/correspondence it appears that this problem has been resolved.  If this is premature, or you are not satisfied with all aspects of the service you received, please let me know as soon as possible.  Otherwise, I will close this case by the close of business today.  Thank you for choosing Microsoft.

The following is a summary of the key points of the case for your records.  I hope that you will find it useful.

Problem:

=====================================================

 1. Corruption in Software HIVE.

Resolution:

=====================================================

1.      XP and Windows 2003 registry editor has an inbuilt feature to repair corruption of Hives.
2.      We loaded the HIVE in Windows 2003 machine but it was not able to detect corruption
3.      We loaded the same in XP machine and XP was able to detect and correct software hive corruption
4.      We compacted Software and System Hive using chkreg.exe
5.      Using new repaired Hives we were able to complete OS repair.
6.      After OS repair, everything was working fine.

It’s been a pleasure being able to help you and once again sorry for the long time taken to resolve this issue!

Thank You
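
For anyone who wants a first look at a hive copied off a dead system before loading it anywhere, here is a rough sketch of a header sanity check. The layout it relies on is an assumption taken from community documentation of the hive file format, not from Microsoft: a "regf" signature at offset 0, two sequence numbers at offsets 4 and 8 that match when the hive was written out cleanly, and a checksum at offset 508 that is the XOR of the first 127 little-endian dwords (with results of 0 and 0xFFFFFFFF replaced by 1 and 0xFFFFFFFE). The function names are made up for this example. It only looks at the header, so it would not necessarily catch the kind of corruption described above, which sat deeper inside the software hive.

```python
import struct

HEADER_SIZE = 512  # assumed size of the checked part of the hive base block


def header_checksum(header):
    """XOR of the first 127 little-endian dwords (assumed checksum scheme)."""
    xor = 0
    for (dword,) in struct.iter_unpack("<I", header[:508]):
        xor ^= dword
    if xor == 0xFFFFFFFF:
        return 0xFFFFFFFE
    if xor == 0:
        return 1
    return xor


def check_hive_header(header):
    """Return a list of problems found in the first 512 bytes of a hive file."""
    if len(header) < HEADER_SIZE:
        return ["header shorter than 512 bytes"]
    problems = []
    if header[:4] != b"regf":
        problems.append("missing 'regf' signature")
    seq1, seq2 = struct.unpack_from("<II", header, 4)
    if seq1 != seq2:
        problems.append("sequence numbers differ (hive not cleanly written)")
    (stored,) = struct.unpack_from("<I", header, 508)
    if stored != header_checksum(header):
        problems.append("header checksum mismatch")
    return problems
```

Usage would look like `check_hive_header(open(path, "rb").read(512))`, where `path` points at a copy of the hive file, never at the live one. An empty list only means the header looks sane.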
Thanks for your information