Avatar of PMH4514
PMH4514

Help track source of "bugcheck" reboot.

I'm having a problem with some of our machines. I'm not sure whether it's related to our custom software, and unfortunately we haven't been able to find a repeatable set of steps to reproduce it. On the machines where it happens (they're all configured the same, Windows 2000) it happens frequently; other machines never see it.

What happens is the computer just reboots (a "courtesy reboot", as we've been tongue-in-cheek calling it).

All we can see are event logs left behind. The event shows in the Event Viewer like this:

Source: Save Dump
 Category: None
 Event ID: 1001

 Description:
 The computer has rebooted from a bugcheck. The bugcheck was: 0x0000000a (0x00000000, 0x00000002, 0x00000001, 0x80448ee5). Microsoft Windows 2000 [v15.2195]. A dump was saved in: C:\WINNT\Minidump\Mini060404-01.dmp.

The output of DUMPCHK looks like:
Filename . . . . . . .Mini060404-01.dmp
Signature. . . . . . .PAGE
ValidDump. . . . . . .DUMP
MajorVersion . . . . .free system
MinorVersion . . . . .2195
DirectoryTableBase . .0x13f55000
PfnDataBase. . . . . .0x85410000
PsLoadedModuleList . .0x80484520
PsActiveProcessHead. .0x80485c68
MachineImageType . . .i386
NumberProcessors . . .2
BugCheckCode . . . . .0x0000000a
BugCheckParameter1 . .0x00000000
BugCheckParameter2 . .0x00000002
BugCheckParameter3 . .0x00000001
BugCheckParameter4 . .0x80448ee5

ExceptionCode. . . . .0x80000003
ExceptionFlags . . . .0x00000001
ExceptionAddress . . .0x8046987c

plus (in -v verbose mode) it then lists all the modules loaded:
Module ntoskrnl.exe loaded at 0x80400000
Module hal.dll loaded at 0x80062000
Module BOOTVID.dll loaded at 0xeb810000
.. and on and on and on.
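
For comparing several minidumps side by side, the DUMPCHK text can be turned into data. The following is only a rough sketch in Python: it assumes the output keeps the "Name . . . .Value" and "Module X loaded at 0x..." shapes shown above, and the function name `parse_dumpchk` is made up for illustration.

```python
import re

# "BugCheckCode . . . . .0x0000000a" -> ("BugCheckCode", "0x0000000a")
FIELD_RE = re.compile(r"^([A-Za-z0-9]+) ?\.[ .]*(\S.*)$")
# "Module ntoskrnl.exe loaded at 0x80400000" -> ("ntoskrnl.exe", "0x80400000")
MODULE_RE = re.compile(r"^Module (\S+) loaded at (0x[0-9a-fA-F]+)$")


def parse_dumpchk(text):
    """Split DUMPCHK output into header fields and module base addresses.

    Hypothetical helper: field values stay as strings, module bases
    are converted to integers. Lines matching neither shape are skipped.
    """
    fields = {}
    modules = {}
    for raw in text.splitlines():
        line = raw.strip()
        m = MODULE_RE.match(line)
        if m:
            modules[m.group(1)] = int(m.group(2), 16)
            continue
        m = FIELD_RE.match(line)
        if m:
            fields[m.group(1)] = m.group(2)
    return fields, modules
```

Running it over the saved output of each dump makes it easy to check whether BugCheckCode and the four parameters are the same from crash to crash.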

I assume the "ExceptionAddress" would point to the address of whatever failed, but that address appears nowhere within the long list of Module lines.

Thoughts?

thanks!
-Paul



Avatar of BigC666
BigC666

howdy,

right click My Computer -> Properties -> Advanced -> Startup and Recovery and uncheck auto reboot. this will hold the error on the screen until you do a manual reboot. see what it says and repost.

hope we can  help
Avatar of Luc Franken
Hi PMH4514,

Please use pstat.exe as explained here:
http://support.microsoft.com/default.aspx?kbid=192463

This will point out what driver/program would have caused the problem.

But since you say it happens frequently, I suggest you start by suspecting the RAM. Check it with a tool like http://www.memtest86.com (if the tool finds no errors, that doesn't mean the RAM is good; the only way to be 100% sure is to replace it. But if it finds any errors, you can be sure it's bad).

Greetings,

LucF
BigC666,
It'll say the following:

STOP: IRQL_NOT_LESS_OR_EQUAL
0x0000000a (0x00000000,0x00000002,0x00000001,0x80448ee5)
ok, as long as it is holding and not doing an auto reboot, what LucF says above is correct. the pstat program will let you check for a possible driver problem. however i've found that doing the memtest is quite misleading.
1) make sure that the cpu fan is running
2) pull all but one stick of ram and reboot; if it runs ok, then sub. other ram sticks until you find the one that's giving you problems and replace.
fought one of these for a week before just doing ram sub. found the bad stick in 2 reboots.

hope that this helps
Avatar of PMH4514

ASKER

Yeah, I had turned the auto-reboot option off. We normally keep it on because the blue screen of death scares customers should it happen. I'll see if I can duplicate it again when a box arrives from a customer later today. LucF - how do you know it'll show STOP: IRQL_NOT_LESS_OR_EQUAL ?

I'll have to find a copy of pstat, I don't know if we own the resource kit.
Avatar of PMH4514

ASKER

in what way is memtest misleading? if we have defective ram chips, we'll need a way to prove it to our vendor.
>>LucF -how do you know it'll show STOP: IRQL_NOT_LESS_OR_EQUAL <<
Pretty easy, 0x0000000A is IRQL_NOT_LESS_OR_EQUAL

You can find a whole list of bugcheckcodes and their names and possible solutions at:
http://www.aumha.org/win5/kbestop.htm
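
The lookup from stop code to name is mechanical, so it can be scripted against the text of the Save Dump event. What follows is a sketch, not a complete tool: the table only covers the four stop codes that come up in this thread, the function names are made up for illustration, and the parameter meanings for 0x0000000A follow Microsoft's published description of that bug check (address referenced, IRQL, read/write flag, address of the instruction that made the reference).

```python
import re

# Names for the stop codes mentioned in this thread (not a complete list).
BUGCHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x0000004E: "PFN_LIST_CORRUPT",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}

# Matches "The bugcheck was: 0x0000000a (0x..., 0x..., 0x..., 0x...)".
BUGCHECK_RE = re.compile(r"bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)")


def parse_bugcheck(description):
    """Return (code, [parameters]) from a Save Dump event description."""
    m = BUGCHECK_RE.search(description)
    if m is None:
        raise ValueError("no bugcheck found in description")
    code = int(m.group(1), 16)
    params = [int(p.strip(), 16) for p in m.group(2).split(",")]
    return code, params


def describe(code, params):
    """Build a short human-readable summary of a bugcheck."""
    name = BUGCHECK_NAMES.get(code, "UNKNOWN")
    lines = [f"STOP 0x{code:08X}: {name}"]
    # Only 0x0A is decoded here; other codes use different parameter layouts.
    if code == 0x0A and len(params) == 4:
        address, irql, access, instruction = params
        lines.append(f"memory referenced: 0x{address:08x}")
        lines.append(f"IRQL at time of reference: {irql}")
        lines.append("access type: " + ("write" if access & 1 else "read"))
        lines.append(f"instruction address: 0x{instruction:08x}")
    return "\n".join(lines)
```

Fed the description from the original question, this reports a write to address 0x00000000 at IRQL 2 by the instruction at 0x80448ee5.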

Btw, I just noticed, you won't need pstat.exe
The exception address is 0x8046987c
ntoskrnl.exe starts at 0x80400000 (and goes on to 0x80062000)

So this is where your problem exists.
If memory checking doesn't help, try the Windows System File Checker:

Description of the Windows 2000 System File Checker (Sfc.exe)
http://support.microsoft.com/?kbid=222471

LucF
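
The address-to-module matching above can be sketched in a few lines of Python. Note that the dumpchk listing is not sorted by address (hal.dll at 0x80062000 sits below ntoskrnl.exe at 0x80400000), so the rule used here is "the module with the highest base address that does not exceed the target". The listing gives no module sizes, so the result is a best guess: it assumes the address is not past the end of that module. Only the three base addresses posted in the question are used, and the function name is made up for illustration.

```python
# Base addresses copied from the dumpchk -v listing in the question;
# the real list is much longer.
MODULES = {
    "ntoskrnl.exe": 0x80400000,
    "hal.dll": 0x80062000,
    "BOOTVID.dll": 0xEB810000,
}


def containing_module(address, modules):
    """Return (name, offset) for the module with the highest base <= address.

    Returns None when every module is loaded above the address.
    Module sizes are unknown, so this is a candidate, not a proof.
    """
    candidates = [(base, name) for name, base in modules.items() if base <= address]
    if not candidates:
        return None
    base, name = max(candidates)
    return name, address - base


# ExceptionAddress from the dump header
print(containing_module(0x8046987C, MODULES))
# BugCheckParameter4, the instruction that made the bad reference
print(containing_module(0x80448EE5, MODULES))
```

With these three modules, both the ExceptionAddress and BugCheckParameter4 land in ntoskrnl.exe, at offsets 0x6987c and 0x48ee5.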
most of the time that i run the memtest programs that i have, they don't uncover leaks, and this is the major problem in these cases. they will find a dead chip but not the leaks. so that's why the ram sub suggestion; for me it was faster. also, to answer for LucF: that's the usual text that goes with the stop error that you provided.

hope that this helps
Avatar of PMH4514

ASKER

also,
>>2)pull all but one stick of ram and reboot if runs ok, then sub. other ram sticks until you find the one that's giving you problems and replace. fought one of these for a week before just doing ram sub. found the bad stick in 2 reboots.

this may be problematic, as the crash doesn't happen all of the time, and when it does, it's at seemingly random points in time. ie. I could pull all but one memory stick and reboot, and not see the problem, even if perhaps that stick was corrupted in some way.. know what i mean?

if you have a problem stick it will show rather quickly.
Avatar of PMH4514

ASKER

>>Btw, I just noticed, you won't need pstat.exe
>>The exception address is 0x8046987c
>>ntoskrnl.exe starts at 0x80400000 (and goes on to 0x80062000)
ahh, I was just about to ask that. I wasn't able to find 0x8046987c in the list from dumpchk, but I thought maybe "find the closest" would be it. You've verified that for me.

so ntoskrnl.exe is definitely the problem? We have several Minidump files from a few months time, they all show the same thing as far as pointing to ntoskrnl.exe

googling that finds:
http://support.microsoft.com/default.aspx?scid=kb;en-us;294690
That's a completely different bugcheck code (0x0000001E: KMODE_EXCEPTION_NOT_HANDLED).
Although it's related to 0xA, it isn't the same error.

But anyway, upgrading to the latest Service Pack is never a dumb idea in case you haven't done that yet.
Avatar of PMH4514

ASKER

re: service pack upgrades, I don't configure the machines as they go out, I understand they are configured with the latest service packs. I'll check though.

this is the right tech note? http://support.microsoft.com/default.aspx?scid=kb;EN-US;165456
Avatar of PMH4514

ASKER

that tech note I posted in the last comment says "Microsoft has confirmed this to be a problem in Windows NT version 4.0" - am I looking at the right thing? we're on Windows 2000.
haven't encountered this particular problem with win2k, on nt yes
ditto :)
Avatar of PMH4514

ASKER

ok thanks.. I've documented everything I've learned from reading your comments and the various posted links. I have to get the folks here who are supposed to be doing this as their job (rather than standing around all day chit-chatting about golf) to go ahead and do the related grunt work.. I have code to write on a deadline. I will report back, hopefully this afternoon if they get their acts together and let ya know what we find.  thanks!
Ok, good luck :)

LucF
yep, good luck
Avatar of PMH4514

ASKER

hmm.. so far, they've run the memtest for a few hours with no problems showing. they've installed the latest service pack as the MS tech note mentioned. The app did eventually crash with the same exact stop message. I believe they have been able to make "our app" crash, as well as MS Paint..  if the memory is testing OK and the hard-drives are OK, and windows is up to date, what else could it be??
as i said in an earlier post, the memtest is not definitive, the only sure way to test memory is to pull and run one stick at a time.
sorry
Avatar of PMH4514

ASKER

just an update. the guys have been trying to methodically replace things piece by piece. They have also encountered the bluescreen for these stop codes:

0x0000004E PFN_LIST_CORRUPT
0x000000D1 DRIVER_IRQL_NOT_LESS_OR_EQUAL (in portcls.sys)

I had thought they were already trying swapping ram chips, I'm told today they begin that..  what a pain :)

I'm thinking it could also be the onboard video driver.

ASKER CERTIFIED SOLUTION
Avatar of Luc Franken
Luc Franken
Avatar of PMH4514

ASKER

new RAM chips. haven't seen the problem yet!
Great to hear!

Glad to help,

LucF