Axis52401
asked on
Virtual Server Blue Screen Error
I have several servers running in VMware. One of them seems to be blue-screening. At first it was just once, about 6 months ago, so I ignored it, but it has happened twice in the past week. The English portion of the blue screen error says 'fault in non paged area'.
Can anyone help? I don't think it's a hardware problem, since the other servers running on the ESXi server don't have this problem, so I think it's a Windows problem, but I don't know what.
You need to analyze the memory.dmp (or minidump.dmp) file generated when it BSODs.
This should get you started:
http://support.microsoft.com/kb/315263
Very odd, as virtual hardware does not often suffer hardware failures such as memory failure.
Any event log errors of relevance?
Is the VM stored on a SAN? Disk read and write latencies can cause virtual disk timeouts, which can cause BSODs.
Other than that, drivers: what NICs are you using in the VM? Although I've not seen it personally, some admins have reported issues with the E1000 driver, and switching to VMXNET2/3 NICs may help. You'll need to install VMware Tools to switch to this NIC hardware.
Is it always this VM?
ASKER
How do I analyze a dump file? And it's always with this VM and not the others on the same VM server.
Do you have crash dump files?
ASKER
Yes, there is one; it's 239M.
ASKER
There is another one I found in C:\WINDOWS\Minidump that is smaller and dated for today
Do you also have a good antivirus policy, and have you checked with Microsoft Security Essentials, Malwarebytes and SUPERAntiSpyware for trojans or malware?
ASKER
We use Trend Antivirus and I also ran a Malwarebytes scan, it doesn't appear to be virus related.
Follow the link I provided - http://support.microsoft.com/kb/315263. Basically, you'll load the file in windbg and run '!analyze -v'
Analysing crash dumps may be helpful in this scenario, but may not provide anything useful, and it involves a lot of work, such as loading the symbol files for the Service Pack you currently have installed.
It's quicker to check the obvious issues first.
1. Event log errors?
2. Any recent changes made to the server?
3. What is the server's function?
4. Does it provide terminal services to other users?
5. Latest VMware Tools installed?
6. E1000 driver in use?
7. VMXNET2 or VMXNET3 driver in use?
8. Any disk related timeout messages in the event log?
9. Any SAN related I/O read and write latency issues on the VMFS datastore?
10. Virus?
11. Does the server BSOD under backup or heavy network traffic?
12. Which storage driver is in use, BusLogic or LSI?
13. Was the server P2V-ed?
14. Then I would inspect the dump files, to see if anything useful is present.
ASKER
Do either of these make any sense to you?
Error code 00000050, parameter1 bcbe0038, parameter2 00000000, parameter3 bf8b83bf, parameter4 00000000.
The reason supplied by user domain\admini for the last unexpected shutdown of this computer is: System Failure: Stop error
Reason Code: 0x805000f
Bug ID:
Bugcheck String: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
Comment: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
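For reference, the bugcheck string in the event log entry can be decoded by hand: for stop 0x50 the first parameter is the memory address referenced, the second is 0 for a read or 1 for a write, and the third is the address of the instruction that made the reference. A minimal Python sketch of that decoding follows; the function name `decode_stop_0x50` is made up for illustration, and it only handles this one stop code.

```python
import re

# Parameter meanings for stop 0x50 (PAGE_FAULT_IN_NONPAGED_AREA).
# decode_stop_0x50 is a hypothetical helper name, not part of any tool.
OPERATIONS = {0: "read", 1: "write"}


def decode_stop_0x50(bugcheck_string):
    """Parse '0x00000050 (0x..., 0x..., 0x..., 0x...)' into named fields."""
    values = [int(v, 16) for v in re.findall(r"0x([0-9a-fA-F]+)", bugcheck_string)]
    if len(values) != 5:
        raise ValueError("expected a stop code followed by four parameters")
    code, address, operation, instruction, reserved = values
    if code != 0x50:
        raise ValueError("not a stop 0x50 bugcheck")
    return {
        "memory_referenced": address,
        "operation": OPERATIONS.get(operation, "unknown"),
        "faulting_instruction": instruction,
        "reserved": reserved,
    }


info = decode_stop_0x50("0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)")
print(hex(info["memory_referenced"]), info["operation"], hex(info["faulting_instruction"]))
```

For the values in this event that is a read of 0xbcbe0038 by the instruction at 0xbf8b83bf. It says what happened, not which driver is responsible; for that you still need the dump.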
ASKER
1. I just posted the relevant event errors.
2. There have been no recent additions or changes to the server.
3. The server is the domain controller and file server.
4. No
5. Yes
6. Yes
7. No
8. No
9. No
10. No
11. No, backups run at night and this has happened during the day.
12. LSI
13. Not sure what that means
I think you'll save a lot of time if you just read the dump files. It could point to the driver/application that is causing the problem rather than trying to "fix everything" by patching, updating, etc... If you learn to use Windbg for doing a simple '!analyze -v' you'll be better off in the long run as well.
Not seen a Stop 0x50 in a while. You are not using the server as a print server?
There might be some mileage in using the vmxnet3 driver.
ASKER
How do I use this Windbg program?
Open the program then do File->Open Crash Dump. Then type '!analyze -v'. Post the results.
We'll start there, if you need to load symbols we can show you how to do that also.
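If you would rather script this than click through the GUI, the Debugging Tools for Windows also include a console debugger, cdb.exe, that accepts the same commands. A rough Python sketch is below. It assumes cdb.exe is on the PATH and uses the public Microsoft symbol server with a local cache in C:\symbols; the helper names and the dump file name are hypothetical.

```python
import subprocess

# Assumption: public Microsoft symbol server, cached locally in C:\symbols.
SYMBOL_PATH = r"srv*C:\symbols*https://msdl.microsoft.com/download/symbols"


def build_cdb_command(dump_path, cdb_exe="cdb.exe", symbol_path=SYMBOL_PATH):
    """Return the argument list for a one-shot '!analyze -v' run."""
    return [
        cdb_exe,
        "-z", dump_path,         # open a crash dump instead of a live target
        "-y", symbol_path,       # where the debugger should look for symbols
        "-c", "!analyze -v; q",  # run the analysis, then quit
    ]


def analyze_dump(dump_path):
    """Run cdb and return its text output (needs the Debugging Tools installed)."""
    result = subprocess.run(build_cdb_command(dump_path),
                            capture_output=True, text=True, check=False)
    return result.stdout


# Hypothetical dump file name; only the command line is printed here.
print(build_cdb_command(r"C:\WINDOWS\Minidump\example.dmp"))
```

Calling `analyze_dump(...)` on the server (or on a workstation with a copy of the dump) should give the same text as typing '!analyze -v' in WinDbg, which you can then post here.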
Download the symbol files for your service pack version here:
http://msdn.microsoft.com/en-us/windows/hardware/gg463028
This may be quicker and easier:
http://www.smidgeonsoft.prohosting.com/pebrowse-crash-dump-analyzer.html
http://www.networkworld.com/news/2005/041105-windows-crash.html
http://thebackroomtech.com/2008/01/31/howto-use-the-windows-debugging-tools-to-analyze-a-crash-dump-bsod/
Online analysis, very quick and easy:
http://www.osronline.com/page.cfm?name=analyze
ASKER
I used the online analysis tool and got this (below). Can you interpret it?
Crash Dump Analysis provided by OSR Open Systems Resources, Inc. (http://www.osr.com)
Online Crash Dump Analysis Service
See http://www.osronline.com for more information
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: LanManNt, suite: TerminalServer SingleUserTS
Built by: 3790.srv03_sp2_gdr.100216-1301
Machine Name:
Kernel base = 0x80800000 PsLoadedModuleList = 0x808a6ea8
Debug session time: Sun May 1 11:01:20.407 2011 (UTC - 4:00)
System Uptime: 8 days 20:33:21.181
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: bcbe0038, memory referenced.
Arg2: 00000000, value 0 = read operation, 1 = write operation.
Arg3: bf8b83bf, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 00000000, (reserved)
Debugging Details:
------------------
Could not read faulting driver name
READ_ADDRESS: bcbe0038
FAULTING_IP:
win32k!DestroyThreadsObjects+4f
bf8b83bf 8b01 mov eax,dword ptr [ecx]
MM_INTERNAL_CODE: 0
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: DRIVER_FAULT_SERVER_MINIDUMP
BUGCHECK_STR: 0x50
PROCESS_NAME: cgiRqUpd.exe
CURRENT_IRQL: 1
TRAP_FRAME: b8f07a74 -- (.trap 0xffffffffb8f07a74)
ErrCode = 00000000
eax=bcbe0008 ebx=00000335 ecx=bcbe0038 edx=bc510002 esi=e80e5850 edi=0000267c
eip=bf8b83bf esp=b8f07ae8 ebp=b8f07b34 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
win32k!DestroyThreadsObjects+0x4f:
bf8b83bf 8b01 mov eax,dword ptr [ecx] ds:0023:bcbe0038=????????
Resetting default scope
LAST_CONTROL_TRANSFER: from 8085ed25 to 80827c83
STACK_TEXT:
b8f079e4 8085ed25 00000050 bcbe0038 00000000 nt!KeBugCheckEx+0x1b
b8f07a5c 8088c800 00000000 bcbe0038 00000000 nt!MmAccessFault+0xb25
b8f07a5c bf8b83bf 00000000 bcbe0038 00000000 nt!KiTrap0E+0xdc
b8f07af0 bf8b870c 84ef6db0 00000000 00000000 win32k!DestroyThreadsObjects+0x4f
b8f07b34 bf8b6fb1 00000001 b8f07b5c bf8b7e0e win32k!xxxDestroyThreadInfo+0x206
b8f07b40 bf8b7e0e 84ef6db0 00000001 00000000 win32k!UserThreadCallout+0x4b
b8f07b5c 8094c38a 84ef6db0 00000001 84ef6db0 win32k!W32pThreadCallout+0x3a
b8f07be8 8094c71d 00000000 00000000 85f56148 nt!PspExitThread+0x3b2
b8f07c00 8094c917 84ef6db0 00000000 00000001 nt!PspTerminateThreadByPointer+0x4b
b8f07c30 ba1292f6 00000000 00000000 8ad40f7c nt!NtTerminateProcess+0x125
WARNING: Stack unwind information not available. Following frames may be wrong.
b8f07c50 ba129ec4 00000002 b8f07d48 8094c7f2 tmevtmgr+0x72f6
b8f07d2c ba1268c4 b8f07d48 b8f07d50 ba12692d tmevtmgr+0x7ec4
b8f07d38 ba12692d 8ad40f7c b8f07d48 ffffffff tmevtmgr+0x48c4
b8f07d50 808897cc 8ad40f7c ffffffff 00000000 tmevtmgr+0x492d
b8f07d50 7c82860c 8ad40f7c ffffffff 00000000 nt!KiFastCallEntry+0xfc
0012ff10 00000000 00000000 00000000 00000000 0x7c82860c
STACK_COMMAND: kb
FOLLOWUP_IP:
tmevtmgr+72f6
ba1292f6 ?? ???
SYMBOL_STACK_INDEX: a
SYMBOL_NAME: tmevtmgr+72f6
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: tmevtmgr
IMAGE_NAME: tmevtmgr.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 4d674754
FAILURE_BUCKET_ID: 0x50_tmevtmgr+72f6
BUCKET_ID: 0x50_tmevtmgr+72f6
Followup: MachineOwner
cgiRqUpd.exe is part of Trend. You'll want to update everything you have to do with Trend and/or uninstall/reinstall. Then contact Trend and provide them with your dmp files if the problem continues.
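The lines in the analysis that point at Trend are PROCESS_NAME, MODULE_NAME, IMAGE_NAME and FAILURE_BUCKET_ID. If you end up with several dumps to compare, something like the Python sketch below can pull those fields out of the saved analysis text. The function name `summarize_analysis` is hypothetical, and the sample is just the relevant lines from the output posted above.

```python
import re

# Fields from '!analyze -v' output that usually name the suspect component.
WANTED = ("BUGCHECK_STR", "PROCESS_NAME", "MODULE_NAME",
          "IMAGE_NAME", "FAILURE_BUCKET_ID")


def summarize_analysis(text):
    """Collect the first 'KEY: value' occurrence of each wanted field."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([A-Z_]+):\s+(\S.*)$", line.strip())
        if match and match.group(1) in WANTED:
            fields.setdefault(match.group(1), match.group(2).strip())
    return fields


sample = """\
BUGCHECK_STR: 0x50
PROCESS_NAME: cgiRqUpd.exe
MODULE_NAME: tmevtmgr
IMAGE_NAME: tmevtmgr.sys
FAILURE_BUCKET_ID: 0x50_tmevtmgr+72f6
"""
summary = summarize_analysis(sample)
print(summary["PROCESS_NAME"], summary["IMAGE_NAME"])
```

If every dump from this VM comes back with the same image name, that is a much stronger case to take to the vendor than a single crash.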
ASKER
We use Worry-Free Business Security version 7. Thanks for interpreting that and giving me a place to start. I didn't see anything in the event logs except the entries below; maybe I'll try reinstalling the client on that server to see if that helps.
Error code 00000050, parameter1 bcbe0038, parameter2 00000000, parameter3 bf8b83bf, parameter4 00000000.
The reason supplied by user domain\admini for the last unexpected shutdown of this computer is: System Failure: Stop error
Reason Code: 0x805000f
Bug ID:
Bugcheck String: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
Comment: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
Yeah, at least you know the cause now. You'll probably find that an uninstall/reboot/reinstall will fix the problem. I run WFBS also and don't have any of my servers rebooting. So, hopefully it's just a corrupted install. Good luck...
No partial credit even?!?! Shame.
Check the Event Viewer logs and google any error messages that correspond with the BSOD time.