Solved

Help analyzing Windows minidump files to troubleshoot Blue Screens

Posted on 2008-10-09
Last Modified: 2013-12-01
One of the Dell computers at our company has been getting BSODs for quite some time, although they are few and far between (about once per month).

The most recent message states:

The problem seems to be caused by the following file: Ntfs.sys
PAGE_FAULT_IN_NONPAGED_AREA
Stop: 0x00000050
NTFS.sys -Address F730663F base at F7304000, DATESTAMP 45XX56A7
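
For reference, the address and base in that stop message can be turned into an offset inside the driver image, which is the form WinDbg uses when it names a faulting location. A minimal Python sketch: the function name `module_offset` is made up for illustration, and only the two hex values come from the message above.

```python
def module_offset(fault_address: int, module_base: int) -> int:
    """Return how far into the loaded module the faulting address lies."""
    if fault_address < module_base:
        raise ValueError("address is below the module base")
    return fault_address - module_base


# Values copied from the stop message: "Address F730663F base at F7304000"
offset = module_offset(0xF730663F, 0xF7304000)
print(f"Ntfs.sys+0x{offset:x}")  # prints "Ntfs.sys+0x263f"
```

The offset only says where inside Ntfs.sys the fault was raised; it does not by itself show that Ntfs.sys is the component at fault.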

I am not able to find out what the previous ones said, but from analyzing the minidumps in WinDbg, I can see that the errors are not all the same.

Attached are the 9 most recent minidump files. I have tried opening some of these in WinDbg; a few of them say Memory Corruption, while others reference problems with other files, such as NTFS.sys and ntoskrnl. My initial impression is that the ones referencing specific files are probably caused by the underlying memory problem.

I ran the Dell diagnostic utility included in the boot menu for this computer, and it found no errors. I know that bad memory can still pass this test, but I wanted to make sure that there aren't other problems and that memory is the only problem.

Can we tell if those other errors are caused by bad memory as well? I am new to using WinDbg. EE will not let me attach a zip file containing the .dmp files, so I will try hosting it and updating this question.
Question by:bradl3y
11 Comments
 
LVL 6

Author Comment

by:bradl3y
ID: 22679926
 
LVL 6

Author Comment

by:bradl3y
ID: 22686299
We ran 20 loops of the Dell memory test, and all passed. Any idea what else could be wrong?
 
LVL 27

Expert Comment

by:Jonvee
ID: 22689877
I have used WinDbg to open four of the minidumps and I'm also getting random errors.
Examples:
IMAGE_NAME:  win32k.sys

FAILURE_BUCKET_ID:  0xD1_CODE_AV_BAD_IP_win32k!GreAcquireSemaphore+18

FAILURE_BUCKET_ID:  0x8E_win32k!EXLATEOBJ::bInitXlateObj+66

FAILURE_BUCKET_ID:  0x8E_SiSGRV+77b0

This is indicative of memory problems as you suspected.  Suspect RAM even though you had no failures.
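
One way to make "random errors" concrete is to tally the bucket IDs by stop code and module. The following is a minimal Python sketch using only the three IDs quoted above. The parsing rule (stop code before the first underscore, module name just before the `!` or `+`) is an assumption based on how these three strings look, not a documented format, and `parse_bucket` is a hypothetical helper.

```python
import re
from collections import Counter

# The three FAILURE_BUCKET_ID values quoted in this comment.
BUCKET_IDS = [
    "0xD1_CODE_AV_BAD_IP_win32k!GreAcquireSemaphore+18",
    "0x8E_win32k!EXLATEOBJ::bInitXlateObj+66",
    "0x8E_SiSGRV+77b0",
]


def parse_bucket(bucket_id: str) -> tuple[str, str]:
    """Split a bucket ID into (stop code, module) using a simple heuristic."""
    stop_code, rest = bucket_id.split("_", 1)
    # Keep everything before the symbol ("!") or the offset ("+").
    location = re.split(r"[!+]", rest, maxsplit=1)[0]
    # The module is assumed to be the last underscore-separated token.
    module = location.rsplit("_", 1)[-1]
    return stop_code, module


signatures = Counter(parse_bucket(b) for b in BUCKET_IDS)
for (stop_code, module), count in sorted(signatures.items()):
    print(stop_code, module, count)
print("distinct signatures:", len(signatures))
```

Three dumps giving three different stop code and module pairs is the kind of scatter that tends to point at hardware or a driver corrupting memory rather than one consistently buggy module.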

Recommend you try the excellent memtest86+  v1.7
http://www.memtest.org/
 
LVL 27

Expert Comment

by:Jonvee
ID: 22689988
I cannot be absolutely sure that faulty RAM is your *only* problem, but changing it seems the next logical step.
If you change it, we could take a look at more minidumps and look for some consistency in the errors.

Also suggest you check RAM socket(s) condition, and if you have more than one RAM stick, you could try removing all but one, then retest.

I presume the case and CPU cooling are OK, with no higher-than-normal temperatures or dust buildup?
I am also assuming that you're using the correct RAM type.
 
LVL 6

Author Comment

by:bradl3y
ID: 22690001
Thanks, I will give memtest86+ a try. I am afraid Dell is still going to be very stubborn about providing a replacement, as it will not be their utility that is reporting an error. I am considering just telling them that their test failed. The Dell representative did not agree that memory can still be defective after passing 10 loops of their memory test. I know that it can, and I know that no memory test will be able to detect an error 100% of the time.

That is the whole reason you get to choose the number of loops: to increase the chance that the error will occur and be detected, the keyword there being chance. Dell does not seem to want to accept that and was basically ignoring my WinDbg results, simply stating "blue screens and minidumps can be caused by a long list of problems, such as OS corruption, hard drive errors, driver errors, or memory errors; if it passes 10 loops of memory testing, you will need to reinstall the OS". That is what WinDbg is for: to narrow down that long list!

Sorry, end of rant.

I will give Memtest86+ a try tonight to see if it reproduces errors. I just wanted to make sure that I was correct, as I am not very experienced using WinDbg.
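
The argument about loops and chance can be put into numbers. This is a minimal Python sketch under two loud assumptions: each loop has the same probability of tripping the fault, and loops are independent of one another. The 5% per-loop figure is purely hypothetical, not a measured value for any memory tester.

```python
def detection_probability(p_per_loop: float, loops: int) -> float:
    """Chance that at least one of `loops` independent passes catches the fault."""
    if not 0.0 <= p_per_loop <= 1.0:
        raise ValueError("p_per_loop must be between 0 and 1")
    if loops < 0:
        raise ValueError("loops must not be negative")
    # P(at least one detection) = 1 - P(every loop misses)
    return 1.0 - (1.0 - p_per_loop) ** loops


# Hypothetical 5% chance per loop, for illustration only.
for loops in (1, 10, 20):
    print(loops, round(detection_probability(0.05, loops), 3))
```

With that hypothetical 5% figure, 10 loops would catch the fault only about 40% of the time, so a clean 10-loop run proves little. The flip side is that if the fault never shows up under the test's access pattern (a per-loop chance of zero), no number of loops will find it.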
 
LVL 27

Expert Comment

by:Jonvee
ID: 22690116
> I know that no memory test will be able to detect an error 100% of the time <
I absolutely agree with you, although if a RAM module *is* faulty, memtest will almost certainly indicate it. And I understand your rant!

If you'd like some assistance in analysing your own dump files, this should help:
"How to read the small memory dump files that Windows creates for debugging":
http://support.microsoft.com/kb/315263

The !analyze -v command will probably be your most used command.
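
With nine dumps to go through, running that command by hand gets tedious. Below is a minimal Python sketch that runs `!analyze -v` on one dump through `cdb`, the console debugger shipped alongside WinDbg, and pulls out the two fields quoted earlier in this thread. It assumes `cdb` is on the PATH and that a symbol path is already configured; the helper names are hypothetical.

```python
import subprocess
from pathlib import Path

# Field names as they appear in the !analyze -v output quoted in this thread.
FIELDS = ("IMAGE_NAME", "FAILURE_BUCKET_ID")


def build_command(dump_path: Path) -> list[str]:
    """cdb -z opens a dump file; -c runs the given commands, then q quits."""
    return ["cdb", "-z", str(dump_path), "-c", "!analyze -v; q"]


def extract_fields(analysis_text: str) -> dict[str, str]:
    """Collect the first occurrence of each field of interest."""
    found: dict[str, str] = {}
    for line in analysis_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in FIELDS:
            found.setdefault(name.strip(), value.strip())
    return found


def analyze_dump(dump_path: Path) -> dict[str, str]:
    """Run the debugger on one minidump and summarise the result."""
    result = subprocess.run(
        build_command(dump_path), capture_output=True, text=True, check=False
    )
    return extract_fields(result.stdout)
```

Calling `analyze_dump` on each `.dmp` file in the minidump folder would give one small dictionary per crash, which makes it easier to compare failing modules across dumps.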

You can download WinDbg from this Microsoft website:
http://www.microsoft.com/whdc/devtools/debugging/default.mspx

Perhaps the best article of all:
"Windows system crashes":
http://www.networkworld.com/news/2005/041105-windows-crash.html

Note that even with WinDbg, there's only about a 50% chance of a good result, but we've made a start!
 
LVL 27

Accepted Solution

by:
Jonvee earned 1050 total points
ID: 22692957
From your most recent Stop error, please view the "aumha" link below:
0x00000050: PAGE_FAULT_IN_NONPAGED_AREA

Here, faulty memory (including main memory, L2 RAM cache, and video RAM) is named as a possible cause, along with incompatible software, including remote control and antivirus software. Other hardware problems can also be responsible.
http://aumha.org/a/stop.htm

Notes: With only about one Stop occurrence a month, I have to admit that it does seem a little unusual if RAM is the cause. I would have expected the Stop error to appear more frequently.

From WinDbg you may notice that about half of the time the failing module is shown as win32k.sys or ntoskrnl.exe, but it's unlikely these are the culprits.

Analysed 4 more of your Minidumps but with similar results, as expected.

Finally, you may wish to look at your antivirus software on that one machine, & perhaps consider an AV uninstall/reinstall.
 
LVL 88

Assisted Solution

by:rindi
rindi earned 450 total points
ID: 22693679
Also test your HD with the HD manufacturer's diagnostic tool. You'll find it on the UBCD.

http://ultimatebootcd.com
 
LVL 6

Author Comment

by:bradl3y
ID: 22701923
Thanks for your help so far. I let Memtest86+ run all weekend; it passed 195 loops with no errors. So it seems the RAM isn't the cause. Video RAM is shared, I believe. Would a CPU stress test find problems with the L2 cache, or is there another way to test it?

I am running the Dell hard drive diagnostics from their utility partition right now. Is there any point in running this for more than one loop, or should one be enough? I've got the UBCD, so I will try the manufacturer's test next.

If everything passes with the hard drive, I will look into the AV software.
 
LVL 27

Expert Comment

by:Jonvee
ID: 22701940
I was just about to post this:

Reviewing our past comments ...
Referring to my earlier statement that Memtest will almost certainly indicate that RAM is suspect if indeed it is: I should have said that Memtest will *not necessarily* indicate that RAM is faulty when it is.

Assuming by now that Memtest has also given the green light to your RAM, and that you've considered our other proposals, perhaps a faulty driver is overwriting memory.
You could therefore try checking for driver updates.
 
LVL 88

Expert Comment

by:rindi
ID: 22702049
I wouldn't run the Dell HD test at all; Dell doesn't build disks.

Question has a verified solution.
