• Status: Solved

HP, Proliant ML150 G3, Server with MS SBS 2003 R2 Premium, Core Memory Dump Blue Screen

I get irregular, yet frequent, blue screen crashes with core memory dumps on my HP Proliant ML150 G3 running MS SBS 2003 R2 Premium with SQL and ISA Server installed. Everything worked fine until about 10 days ago.

I get a memory.dmp file and have downloaded the latest analysis and debugging tools from the MS website but I don't know where to start or what to do.

Other than MS updates and patches, there have been no system reconfigurations of any sort, no new hardware or software, nothing. My usually very reliable server is now rubbish!
Asked by: MarcusN
 
Keith Alabaster Commented:
Are you also running SP2?
 
rindi Commented:
Can you tell us the error code you get on the Blue Screen?

I'd also test the RAM with memtest86+, make sure all the fans are running, and test the disks using the manufacturer's utility. The tools are on the UBCD.

http://ultimatebootcd.com
 
MarcusN (Author) Commented:
Keith asked, "am I running SP2?"
Yes

Rindi asked, "what is the blue screen error code?"
How do I find it? Is it logged somewhere?

Rindi asked whether I've tested the RAM and disks.
I've not checked these but the message in the event log is as follows.

Event Type:      Error
Event Source:      System Error
Event Category:      (102)
Event ID:      1003
Date:            22/11/2007
Time:            19:58:26
User:            N/A
Computer:      SATURN
Description:
Error code 0000000a, parameter1 00004074, parameter2 d000001b, parameter3 00000001, parameter4 808312bd.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 53 79 73 74 65 6d 20 45   System E
0008: 72 72 6f 72 20 20 45 72   rror  Er
0010: 72 6f 72 20 63 6f 64 65   ror code
0018: 20 30 30 30 30 30 30 30    0000000
0020: 61 20 20 50 61 72 61 6d   a  Param
0028: 65 74 65 72 73 20 30 30   eters 00
0030: 30 30 34 30 37 34 2c 20   004074,
0038: 64 30 30 30 30 30 31 62   d000001b
0040: 2c 20 30 30 30 30 30 30   , 000000
0048: 30 31 2c 20 38 30 38 33   01, 8083
0050: 31 32 62 64               12bd  
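
The Description line of this event carries the stop code and its four parameters. A minimal Python sketch for pulling them out of such a line follows; it assumes the exact wording shown in the event above, and the names `STOP_LINE` and `parse_stop_line` are made up for illustration.

```python
import re

# Pattern for the Description line of a System Error event (Event ID 1003),
# matching the wording quoted in the post above.
STOP_LINE = re.compile(
    r"Error code (?P<code>[0-9a-fA-F]{8}), "
    r"parameter1 (?P<p1>[0-9a-fA-F]{8}), "
    r"parameter2 (?P<p2>[0-9a-fA-F]{8}), "
    r"parameter3 (?P<p3>[0-9a-fA-F]{8}), "
    r"parameter4 (?P<p4>[0-9a-fA-F]{8})"
)


def parse_stop_line(description):
    """Return (stop_code, [param1..param4]) as integers, or None if no match."""
    match = STOP_LINE.search(description)
    if match is None:
        return None
    code = int(match.group("code"), 16)
    params = [int(match.group(name), 16) for name in ("p1", "p2", "p3", "p4")]
    return code, params


line = ("Error code 0000000a, parameter1 00004074, parameter2 d000001b, "
        "parameter3 00000001, parameter4 808312bd.")
code, params = parse_stop_line(line)
print("STOP 0x%08X" % code, ["0x%08X" % p for p in params])
```

The first value is the same number that appears after "STOP:" on the blue screen, so the event log can stand in for the screen when the crash was not observed directly.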
 
Keith Alabaster Commented:
Do you know which MS updates were installed?
http://support.microsoft.com/default.aspx?scid=kb;EN-US;927695

This has been an issue on a number of 2003 R2 SP2 systems lately. It might be worth reviewing, as it is hitting ISA servers quite badly.
 
MarcusN (Author) Commented:
Keith mentioned a problem stemming from TCP receive side scaling.

Hmm, this is interesting on two counts.
a) The article makes no mention of core memory dump system crashes, so I don't see how it is really relevant to my problem.
b) If it is relevant, and I confess to not understanding it, could it have anything to do with my Intel PRO 1000/MT Dual Port NIC (the firmware of which I recently updated to version 8.9.1.0) or Adaptec AAR-2420SA 4 channel SATA RAID controller (of which I recently updated the driver to version 5.2.0.11737)?

Is there anything that the file C:\WINDOWS\Memory.dmp can help to identify?
 
Keith Alabaster Commented:
The reason why it is relevant is that it has been the cause of many issues for 2003 R2 SP2 based systems that use NAT functionality. This feature is a prime driver for ISA server as it is a core function.

I am also confused - I am assuming the changes you mention were some time ago, as they seem to conflict with the 'no changes to the configuration etc' type statement you made.

Yes, these updates could certainly be relevant or a trigger for the blue screens as they both directly affect how the hardware is accessed.
 
rindi Commented:
The error code is shown on the blue screen itself; it is usually the first number.
 
MarcusN (Author) Commented:
Keith queried my apparently contradicting statement about no changes.

Yes, the two changes I mention pre-date the start of the system instability by several weeks. They are the only two changes in the past year that I have consciously made that affect either the NIC or the RAID controller. That's why I mentioned them.

Rindi asked about the numbers on the Blue Screen.

Looking at the event log they would appear to be:
0000000a, parameter1 00004074, parameter2 d000001b, parameter3 00000001, parameter4 808312bd.

Interestingly, the event log entry for the crash that preceded the one I provided data for in an earlier post is as follows. It looks a little different.

Event Type:      Error
Event Source:      System Error
Event Category:      (102)
Event ID:      1003
Date:            21/11/2007
Time:            08:42:55
User:            N/A
Computer:      SATURN
Description:
Error code 00000019, parameter1 00000020, parameter2 88119ed0, parameter3 88119f40, parameter4 0a0e0005.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 53 79 73 74 65 6d 20 45   System E
0008: 72 72 6f 72 20 20 45 72   rror  Er
0010: 72 6f 72 20 63 6f 64 65   ror code
0018: 20 30 30 30 30 30 30 31    0000001
0020: 39 20 20 50 61 72 61 6d   9  Param
0028: 65 74 65 72 73 20 30 30   eters 00
0030: 30 30 30 30 32 30 2c 20   000020,
0038: 38 38 31 31 39 65 64 30   88119ed0
0040: 2c 20 38 38 31 31 39 66   , 88119f
0048: 34 30 2c 20 30 61 30 65   40, 0a0e
0050: 30 30 30 35               0005  
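
The Data section of these events is the same message stored as ASCII bytes, so it adds nothing beyond the Description line. A small Python sketch that rebuilds the text from the hex rows is shown below; it assumes the Event Viewer layout seen above (an offset, a colon, up to eight hex bytes, then an ASCII column), and `decode_event_data` is a made-up helper name.

```python
SAMPLE = """
0000: 53 79 73 74 65 6d 20 45   System E
0008: 72 72 6f 72 20 20 45 72   rror  Er
0010: 72 6f 72 20 63 6f 64 65   ror code
0018: 20 30 30 30 30 30 30 31    0000001
0020: 39 20 20 50 61 72 61 6d   9  Param
0028: 65 74 65 72 73 20 30 30   eters 00
0030: 30 30 30 30 32 30 2c 20   000020,
0038: 38 38 31 31 39 65 64 30   88119ed0
0040: 2c 20 38 38 31 31 39 66   , 88119f
0048: 34 30 2c 20 30 61 30 65   40, 0a0e
0050: 30 30 30 35               0005
"""


def decode_event_data(dump_text):
    """Rebuild the text stored in an Event Viewer 'Data:' hex dump."""
    raw = bytearray()
    for row in dump_text.splitlines():
        _offset, sep, rest = row.partition(":")
        if not sep:
            continue  # blank line or not a dump row
        # Assumption: at most eight byte columns per row, ASCII column after.
        for token in rest.split()[:8]:
            if len(token) != 2:
                break  # reached the ASCII column
            try:
                raw.append(int(token, 16))
            except ValueError:
                break  # two characters, but not a hex byte
    return raw.decode("ascii")


print(decode_event_data(SAMPLE))
```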


 
rindi Commented:
Both those error codes point to a bad driver or bad hardware. Reduce the server to the barebones minimum, test it and if it works add one device at a time until the device with the bad driver causes it to crash again.
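
For reference, the two stop codes quoted in the event log entries have symbolic names in Microsoft's bug check code reference: 0x0000000A is IRQL_NOT_LESS_OR_EQUAL and 0x00000019 is BAD_POOL_HEADER. A tiny Python lookup, covering only these two codes and using the made-up helper name `describe_stop`, might look like this:

```python
# Symbolic names for the two stop codes seen in this thread. The table is
# deliberately incomplete; any other code falls through to a placeholder.
BUGCHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x00000019: "BAD_POOL_HEADER",
}


def describe_stop(code_hex):
    """Turn an event-log code such as '0000000a' into a readable label."""
    code = int(code_hex, 16)
    name = BUGCHECK_NAMES.get(code, "(not in this table)")
    return "STOP 0x%08X %s" % (code, name)


for code_hex in ("0000000a", "00000019"):
    print(describe_stop(code_hex))
```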
 
MarcusN (Author) Commented:
Rindi recommends a barebones re-configuration.

Hmm, I would like to narrow down this activity first by analysing the minidump files that I have (lots of them now) and by analysing the current and next MEMORY.DMP file. What I need to know is how to do that.

As I said, I have downloaded the tools from the MS website but don't know what to do with them (and I have tried a fair bit of surfing to seek ideas). I want to narrow down the drivers which relate to the devices which are failing and then to remove those to test rather than remove a stack of things unnecessarily.

Are there any easy to understand references which help me to work with windbg.exe and the other debugging tools?
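
One common starting point with the Debugging Tools is to open the dump in the command-line kernel debugger and run `!analyze -v`. The Python sketch below only builds and prints such a command line. The install folder and the local symbol cache folder are assumptions to adjust, and the helper names are made up; the `-y`, `-z` and `-c` switches and the Microsoft public symbol server address are the standard ones.

```python
import subprocess

# Assumed locations: adjust to wherever the Debugging Tools were installed
# and to any folder that can be used as a local symbol cache.
KD_EXE = r"C:\Program Files\Debugging Tools for Windows\kd.exe"
SYMBOL_PATH = r"srv*C:\symbols*https://msdl.microsoft.com/download/symbols"


def build_kd_command(dump_path, kd_exe=KD_EXE, symbol_path=SYMBOL_PATH):
    """Build a kd.exe command line that analyses one dump and then quits."""
    return [
        kd_exe,
        "-y", symbol_path,        # where to find or download symbols
        "-z", dump_path,          # crash dump file to open
        "-c", "!analyze -v; q",   # run automatic analysis, then quit
    ]


def analyze_dump(dump_path):
    """Run kd.exe on the dump and return its text output (Windows only)."""
    result = subprocess.run(
        build_kd_command(dump_path),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Show the command only; call analyze_dump() on the server itself.
print(" ".join(build_kd_command(r"C:\WINDOWS\Memory.dmp")))
```

In the `!analyze -v` output, the "Probably caused by" line names the module the debugger suspects. The same steps can be done interactively in windbg.exe by opening the crash dump from the File menu and typing `!analyze -v` at the prompt.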