ponedog

Hyper-V Virtual Servers (Windows 2003) getting Blue Screen of Death - Troubleshooting Ideas...

Dell PowerEdge T710, 2 physical CPUs (X5677s), 24 GB memory, running Windows Server 2008 R2 Enterprise.

The RAID disk array is split into a C: drive (465 GB) and a D: drive (1.81 TB).

The physical server hosts 3 VMs: a Windows Server 2008 R2 Standard (64-bit) domain controller, a Windows Server 2003 (32-bit) file server, and a Windows Server 2003 (32-bit) Citrix server (used for a legacy 16-bit application).

The system has been running without major problems for many years. About a month ago, we started getting blue screens (BSODs) on both VMs running Server 2003. There are no entries on the physical server indicating a problem. We are able to restart the virtual servers, and they run for a while (sometimes days, sometimes hours) before they crash again. The VM running Windows Server 2008 (the domain controller) has never crashed this whole time. The minidumps have not pointed us to any particular driver or problem. We have swapped out memory and rolled back any software, including patches, installed in the last month or so.
Both 2003 VMs tend to crash at about the same time, within maybe a 20-minute window or so.

I am attaching a BlueScreenView report of one VM's minidumps: MiniDump.html. Any troubleshooting ideas?

PS: The attached file is HTML. If you copy its contents into Notepad and save the file with an .html extension, you will see the formatted report. :)
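
The observation that both 2003 VMs crash within roughly 20 minutes of each other can be checked against the actual crash times. The following is only a minimal sketch: the timestamps are made up for illustration (in practice they would come from each VM's minidump file dates or System event log), and the function name `paired_crashes` is hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical crash times, NOT taken from the real minidumps.
file_server_crashes = [
    datetime(2019, 10, 14, 9, 5),
    datetime(2019, 10, 18, 22, 40),
    datetime(2019, 10, 21, 13, 12),
]
citrix_crashes = [
    datetime(2019, 10, 14, 9, 17),
    datetime(2019, 10, 19, 3, 2),
    datetime(2019, 10, 21, 13, 1),
]

def paired_crashes(a, b, window=timedelta(minutes=20)):
    """Return (time_a, time_b) pairs whose crash times differ by at most `window`."""
    return [(ta, tb) for ta in a for tb in b if abs(ta - tb) <= window]

pairs = paired_crashes(file_server_crashes, citrix_crashes)
for ta, tb in pairs:
    print(f"file server {ta:%Y-%m-%d %H:%M}  |  Citrix {tb:%Y-%m-%d %H:%M}")
print(f"{len(pairs)} crash pair(s) fall within the 20-minute window")
```

If most crashes pair up like this, that would be consistent with a trigger shared by both guests (something on the host, or a common scheduled task) rather than two independent faults, although it does not prove either.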
Patrick Bogers

Hi

If the last crash was yesterday, it suggests there is a problem with ntfs.sys. Run chkdsk on the disks (preferably with repair), or have Windows check the integrity of its core components with sfc /scannow

Cheers

NB: If neither fixes the issue, maybe C:\WINDOWS\Minidump\Mini102119-13.dmp can tell us more.
ponedog

ASKER

Thanks, Patrick, for your suggestion. Both VMs and the physical host have had chkdsk run with repair. No problems found. Also, out of an abundance of caution, I have defragged both VMs (although it is claimed that does not matter for VMs).

Also, the various minidumps point to different "causes": ntfs.sys, fltmgr.sys, netvsc50.sys, ntkrnlpa.exe, wlbs.sys, tcpip.sys, netbt.sys, svc.sys, hal.dll...

So, it really is a head-scratcher!
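
One way to quantify how scattered the minidump "causes" are is to tally the module each dump blames. This is only a sketch: the list below reuses a subset of the module names mentioned above, but the per-dump assignment and counts are invented for illustration; the real values would come from the BlueScreenView report.

```python
from collections import Counter

# Hypothetical "caused by" module, one entry per minidump.
# Names are from the thread; the counts are NOT real data.
caused_by = [
    "ntfs.sys", "fltmgr.sys", "netvsc50.sys", "ntkrnlpa.exe",
    "wlbs.sys", "tcpip.sys", "netbt.sys", "hal.dll", "ntfs.sys",
]

tally = Counter(caused_by)
top_module, top_count = tally.most_common(1)[0]
share = top_count / len(caused_by)

for module, count in tally.most_common():
    print(f"{module:<14}{count}")
print(f"most frequent: {top_module} ({share:.0%} of dumps)")
```

A low share for the most frequent module fits the impression that no single driver is responsible, though it does not by itself say what the underlying cause is.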
What about System File Checker? sfc /scannow
ponedog

ASKER

I tried SFC /scannow on both VMs. On the Citrix VM it ran fine (no issues found). The other VM (the file server) would not let me run it: it wanted the 2nd installation CD for Windows Server 2003. So, I gave up on it. Not sure if it is worth spending the time trying to figure out where the CAB files are for the 2nd CD of Windows Server 2003...
Has the host been patched recently? Microsoft hasn't supported 2003 for many years, so they are not doing any testing for compatibility issues with your hypervisor.

Your 2008 R2 Enterprise license includes virtualization rights for 4 OSEs (VMs). You could replace the file server with 2008 R2, and replace the Citrix server with 2008 (32-bit). Those would at least keep you on supported operating systems until January 2020, and would be a little more modern.

You should be thinking about the long-term support issues of running an unsupported OS. Issues like this could become more frequent.

Have you tried disabling AV software on host and VMs? How exposed to risk are these servers?
ponedog

ASKER

Thanks, kevin! I was concerned that a Microsoft patch or security update was the culprit. However, we have rolled back all the recently applied patches and it did not solve the problem. I have uninstalled the antivirus; that doesn't help. I have scanned the VMs with different antivirus engines (a limited choice, since they are Windows Server 2003) and did not find any problems.

So... I decided to go with Windows Server 2012 R2 and set up RDS, then convert the application to run in a 64-bit environment. I am partway through the conversion and trust that the old system will remain stable until I am done.
ponedog

ASKER

The new terminal server is installed, the software is migrated, and a quick test run is done. Monday we go live.

We have shut off one of the 2 virtual 2003 machines on the Hyper-V physical server. I will be watching to see if the other 2003 VM (the file server) remains stable.

I will close this request after I update with more results/history. Thanks again to those who commented :)