Axis52401

Virtual Server blue Screen Error

I have several servers running in VMware. One of them seems to be blue screening. At first it was just once, about 6 months ago, so I ignored it, but it has happened twice in the past week. The English portion of the blue screen error says 'fault in non paged area'.

Can anyone help? I don't think it's a hardware problem, since the other servers running on the ESXi server don't have this problem, so I think it's a Windows problem, but I don't know what.

Tony Giangreco

Start by checking for the latest hardware drivers. This resolves most BSOD situations. After that, make sure the server and VM are up to the current patch level.

Check the Event Viewer logs and Google any error messages that correspond with the BSOD time.
You need to analyze the memory.dmp (or minidump.dmp) file generated when it BSODs.

This should get you started:
http://support.microsoft.com/kb/315263
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
Very odd, as virtual hardware does not often suffer hardware failures such as memory failure.

Any event log errors of relevance?

Is the VM based on a SAN? Some disk read and write latencies can cause virtual disk timeout issues, which can cause BSODs.

Other than that, drivers: what NICs are you using in the VM? Although I've not seen it personally, some admins have reported issues with the E1000 driver, and switching to VMXNET2/3 NICs may help. You'll need to install VMware Tools to switch to this hardware NIC.
Axis52401

ASKER

How do I analyze a dump file? And it's always with this VM and not the others on the same VM server.

Yes, there is one, and it's 239M.
There is another one I found in C:\WINDOWS\Minidump that is smaller and dated for today
Do you also have a good antivirus policy, and have you checked with Microsoft Security Essentials, Malwarebytes and SUPERAntiSpyware for trojans or malware?
We use Trend Antivirus and I also ran a Malwarebytes scan, it doesn't appear to be virus related.
Follow the link I provided - http://support.microsoft.com/kb/315263.  Basically, you'll load the file in windbg and run '!analyze -v'
Analysing crash dumps may be helpful in this scenario, but may not provide anything useful, and it involves a lot of work, such as loading the symbol files for the Service Pack you currently have loaded.

It's quicker to check the obvious issues first.

1. Event log errors?
2. Any recent changes made to the server?
3. What is the server's function?
4. Does it provide terminal services to other users?
5. Latest VMware Tools installed?
6. E1000 driver in use?
7. VMXNET2 or VMXNET3 driver in use?
8. Any disk related timeout messages in the event log?
9. Any SAN related I/O read and write latency issues on the VMFS datastore?
10. Virus?
11. Does the server BSOD under backup or heavy network traffic?
12. Which storage driver is in use, BusLogic or LSI?
13. Was the server P2V-ed?
14. Then I would inspect the dump files, to see if anything useful is present.
Do either of these make any sense to you?

Error code 00000050, parameter1 bcbe0038, parameter2 00000000, parameter3 bf8b83bf, parameter4 00000000.


The reason supplied by user domain\admini for the last unexpected shutdown of this computer is: System Failure: Stop error
 Reason Code: 0x805000f
 Bug ID:
 Bugcheck String: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
 Comment: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
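As a rough illustration of how to read the event log entry above, here is a minimal Python sketch that splits a "Bugcheck String" line into the stop code and its four parameters. Stop 0x50 is PAGE_FAULT_IN_NONPAGED_AREA; for that code, parameter 1 is the memory address referenced, parameter 2 is 0 for a read or 1 for a write, and parameter 3 is the address of the instruction that made the reference. The function names and the one-entry lookup table are hypothetical, written only for this example.

```python
import re

# Only the stop code seen in this thread is listed; this table is illustrative.
BUGCHECK_NAMES = {0x50: "PAGE_FAULT_IN_NONPAGED_AREA"}


def parse_bugcheck(line):
    """Return (stop_code, [param1..param4]) from a 'Bugcheck String' line."""
    match = re.search(r"(0x[0-9a-fA-F]+)\s*\(([^)]*)\)", line)
    if match is None:
        raise ValueError("no bugcheck code found in: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


def describe_0x50(params):
    """Summarise the parameters of a stop 0x50 in plain English."""
    address, operation, instruction, _reserved = params
    access = "read" if operation == 0 else "write"
    return "%s of invalid address 0x%08x by instruction at 0x%08x" % (
        access, address, instruction)


if __name__ == "__main__":
    line = "Bugcheck String: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)"
    code, params = parse_bugcheck(line)
    print(BUGCHECK_NAMES.get(code, hex(code)))
    print(describe_0x50(params))
```

This only tells you what kind of access failed and where; identifying the driver responsible still needs the dump file.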
I just posted the relevant event errors
There have been no recent additions or changes to the server
The Server is the domain controller and File Server
4. No
5. Yes
6. Yes
7. No
8. No
9. No
10. No
11. No. Backups run at night and this has happened during the day.
12. LSI
13. Not sure what that means
I think you'll save a lot of time if you just read the dump files.  It could point to the driver/application that is causing the problem rather than trying to "fix everything" by patching, updating, etc...  If you learn to use Windbg for doing a simple '!analyze -v' you'll be better off in the long run as well.
Not seen a Stop 0x50 in a while. You are not using the server as a print server?

There might be some mileage in using the vmxnet3 driver.

How do I use this Windbg program?
Open the program then do File->Open Crash Dump.  Then type '!analyze -v'.  Post the results.  

We'll start there, if you need to load symbols we can show you how to do that also.
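If you would rather script the steps above than click through the GUI, the same analysis can be run from the command line with kd.exe, the console debugger that ships alongside WinDbg. The sketch below is only one possible way to do it: it assumes kd.exe is on the PATH, the dump file name is a placeholder, and the symbol cache folder C:\symbols is an arbitrary choice. The options used are `-z` (open a dump file), `-y` (symbol path) and `-c` (commands to run at startup).

```python
import subprocess

# Microsoft's public symbol server, cached in a local folder (the folder is an assumption).
DEFAULT_SYMBOLS = r"srv*C:\symbols*https://msdl.microsoft.com/download/symbols"


def build_kd_command(dump_path, kd_exe="kd.exe", symbol_path=None):
    """Build the kd.exe argument list that opens a dump and runs '!analyze -v'."""
    cmd = [kd_exe, "-z", str(dump_path)]
    if symbol_path:
        cmd += ["-y", symbol_path]
    # Run the analysis, then quit so the debugger does not wait for input.
    cmd += ["-c", "!analyze -v; q"]
    return cmd


def analyze_dump(dump_path, **kwargs):
    """Run kd.exe on the dump and return its text output (Windows only)."""
    result = subprocess.run(build_kd_command(dump_path, **kwargs),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # 'example.dmp' is a placeholder; use the actual file found in C:\WINDOWS\Minidump.
    print(build_kd_command(r"C:\WINDOWS\Minidump\example.dmp",
                           symbol_path=DEFAULT_SYMBOLS))
```

Calling `analyze_dump(...)` on the server, or on any Windows machine with the debugging tools installed, should return the same text you would see in the WinDbg window.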
I used the online analysis tool and got this (below). Can you interpret this?


Crash Dump Analysis provided by OSR Open Systems Resources, Inc. (http://www.osr.com)
Online Crash Dump Analysis Service
See http://www.osronline.com for more information
Windows Server 2003 Kernel Version 3790 (Service Pack 2) MP (2 procs) Free x86 compatible
Product: LanManNt, suite: TerminalServer SingleUserTS
Built by: 3790.srv03_sp2_gdr.100216-1301
Machine Name:
Kernel base = 0x80800000 PsLoadedModuleList = 0x808a6ea8
Debug session time: Sun May  1 11:01:20.407 2011 (UTC - 4:00)
System Uptime: 8 days 20:33:21.181
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except,
it must be protected by a Probe.  Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: bcbe0038, memory referenced.
Arg2: 00000000, value 0 = read operation, 1 = write operation.
Arg3: bf8b83bf, If non-zero, the instruction address which referenced the bad memory
      address.
Arg4: 00000000, (reserved)

Debugging Details:
------------------


Could not read faulting driver name

READ_ADDRESS:  bcbe0038

FAULTING_IP:
win32k!DestroyThreadsObjects+4f
bf8b83bf 8b01            mov     eax,dword ptr [ecx]

MM_INTERNAL_CODE:  0

CUSTOMER_CRASH_COUNT:  1

DEFAULT_BUCKET_ID:  DRIVER_FAULT_SERVER_MINIDUMP

BUGCHECK_STR:  0x50

PROCESS_NAME:  cgiRqUpd.exe

CURRENT_IRQL:  1

TRAP_FRAME:  b8f07a74 -- (.trap 0xffffffffb8f07a74)
ErrCode = 00000000
eax=bcbe0008 ebx=00000335 ecx=bcbe0038 edx=bc510002 esi=e80e5850 edi=0000267c
eip=bf8b83bf esp=b8f07ae8 ebp=b8f07b34 iopl=0         nv up ei pl zr na pe nc
cs=0008  ss=0010  ds=0023  es=0023  fs=0030  gs=0000             efl=00010246
win32k!DestroyThreadsObjects+0x4f:
bf8b83bf 8b01            mov     eax,dword ptr [ecx]  ds:0023:bcbe0038=????????
Resetting default scope

LAST_CONTROL_TRANSFER:  from 8085ed25 to 80827c83

STACK_TEXT:  
b8f079e4 8085ed25 00000050 bcbe0038 00000000 nt!KeBugCheckEx+0x1b
b8f07a5c 8088c800 00000000 bcbe0038 00000000 nt!MmAccessFault+0xb25
b8f07a5c bf8b83bf 00000000 bcbe0038 00000000 nt!KiTrap0E+0xdc
b8f07af0 bf8b870c 84ef6db0 00000000 00000000 win32k!DestroyThreadsObjects+0x4f
b8f07b34 bf8b6fb1 00000001 b8f07b5c bf8b7e0e win32k!xxxDestroyThreadInfo+0x206
b8f07b40 bf8b7e0e 84ef6db0 00000001 00000000 win32k!UserThreadCallout+0x4b
b8f07b5c 8094c38a 84ef6db0 00000001 84ef6db0 win32k!W32pThreadCallout+0x3a
b8f07be8 8094c71d 00000000 00000000 85f56148 nt!PspExitThread+0x3b2
b8f07c00 8094c917 84ef6db0 00000000 00000001 nt!PspTerminateThreadByPointer+0x4b
b8f07c30 ba1292f6 00000000 00000000 8ad40f7c nt!NtTerminateProcess+0x125
WARNING: Stack unwind information not available. Following frames may be wrong.
b8f07c50 ba129ec4 00000002 b8f07d48 8094c7f2 tmevtmgr+0x72f6
b8f07d2c ba1268c4 b8f07d48 b8f07d50 ba12692d tmevtmgr+0x7ec4
b8f07d38 ba12692d 8ad40f7c b8f07d48 ffffffff tmevtmgr+0x48c4
b8f07d50 808897cc 8ad40f7c ffffffff 00000000 tmevtmgr+0x492d
b8f07d50 7c82860c 8ad40f7c ffffffff 00000000 nt!KiFastCallEntry+0xfc
0012ff10 00000000 00000000 00000000 00000000 0x7c82860c


STACK_COMMAND:  kb

FOLLOWUP_IP:
tmevtmgr+72f6
ba1292f6 ??              ???

SYMBOL_STACK_INDEX:  a

SYMBOL_NAME:  tmevtmgr+72f6

FOLLOWUP_NAME:  MachineOwner

MODULE_NAME: tmevtmgr

IMAGE_NAME:  tmevtmgr.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  4d674754

FAILURE_BUCKET_ID:  0x50_tmevtmgr+72f6

BUCKET_ID:  0x50_tmevtmgr+72f6

Followup: MachineOwner
cgiRqUpd.exe is part of Trend.  You'll want to update everything you have to do with Trend and/or uninstall/reinstall.  Then contact Trend and provide them with your dmp files if the problem continues.
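One way to read the STACK_TEXT in the analysis above is to walk the frames from the top and look for the first one that does not belong to a core Windows module. The Python sketch below does that for the x86 frame layout shown in the dump. It is a simplification, not the debugger's actual blame logic: the set of "core" modules is an assumption made for this example, and the function name is hypothetical.

```python
import re

# Modules treated as part of Windows itself (an assumption for this sketch).
CORE_MODULES = {"nt", "win32k", "hal"}

# x86 frame line: ChildEBP, return address, three arguments, then the symbol.
FRAME_RE = re.compile(r"^[0-9a-f]{8} [0-9a-f]{8} (?:[0-9a-f]{8} ){3}(\S+)$")

# A few frames copied from the STACK_TEXT posted in this thread.
SAMPLE_STACK = """\
b8f07af0 bf8b870c 84ef6db0 00000000 00000000 win32k!DestroyThreadsObjects+0x4f
b8f07c30 ba1292f6 00000000 00000000 8ad40f7c nt!NtTerminateProcess+0x125
WARNING: Stack unwind information not available. Following frames may be wrong.
b8f07c50 ba129ec4 00000002 b8f07d48 8094c7f2 tmevtmgr+0x72f6
b8f07d50 808897cc 8ad40f7c ffffffff 00000000 tmevtmgr+0x492d
0012ff10 00000000 00000000 00000000 00000000 0x7c82860c
"""


def first_third_party_frame(stack_text):
    """Return the first stack symbol whose module is not in CORE_MODULES."""
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # skip warnings and anything that is not a frame line
        symbol = match.group(1)
        module = re.split(r"[!+]", symbol, maxsplit=1)[0]
        if module.startswith("0x"):
            continue  # raw address with no module name
        if module not in CORE_MODULES:
            return symbol
    return None


if __name__ == "__main__":
    print(first_third_party_frame(SAMPLE_STACK))
```

On the frames from this thread it reports the tmevtmgr frame, which matches the FOLLOWUP_IP and IMAGE_NAME (tmevtmgr.sys) in the analysis output.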
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
We use Worry Free Business Security version 7. Thanks for interpreting that and giving me a place to start. I didn't see anything in the event logs except below; maybe I'll try reinstalling the client on that server to see if that helps.



Error code 00000050, parameter1 bcbe0038, parameter2 00000000, parameter3 bf8b83bf, parameter4 00000000.


The reason supplied by user domain\admini for the last unexpected shutdown of this computer is: System Failure: Stop error
 Reason Code: 0x805000f
 Bug ID:
 Bugcheck String: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
 Comment: 0x00000050 (0xbcbe0038, 0x00000000, 0xbf8b83bf, 0x00000000)
Yeah, at least you know the cause now.  You'll probably find that an uninstall/reboot/reinstall will fix the problem.  I run WFBS also and don't have any of my servers rebooting.  So, hopefully it's just a corrupted install.  Good luck...
No partial credit even?!?!  Shame.