ponedog
asked on
Hyper-V Virtual Servers (Windows 2003) getting Blue Screen of Death - Troubleshooting Ideas...
Dell PowerEdge T710, 2 physical CPU's (X5677's), 24 GB Memory, running Windows Server 2008 R2 Enterprise.
The RAID Disk Array is split into a C: drive (465 GB) and a D: drive (1.81 TB).
The physical server hosts 3 VM's. A Windows Server 2008 R2 Standard (64bit) Domain Controller, a Windows Server 2003 (32 bit) File Server, and a Windows Server 2003 (32 bit) Citrix Server (used for legacy 16bit application).
System has been running without major problems for many years. About a month ago, we started getting blue screens (BSOD's) on both VM's running Server 2003. There are no entries in the Physical Server's logs indicating a problem. We are able to restart the virtual servers and they run for a while (sometimes days, sometimes hours) before they crash again. The VM running Windows Server 2008 (the domain controller) has never crashed this whole time. The minidumps have not pointed us to any particular driver or problem. We have swapped out memory and rolled back any software, including patches, installed in the last month or so.
Both 2003 VM's tend to crash about the same time - maybe a 20 minute window or so.
I am attaching a report (using BlueScreenView) of one VM's minidumps. MiniDump.html Any troubleshooting ideas ?
PS: the attached file is HTML. If you copy its contents into Notepad and save it with an HTML extension, you will see the formatted screen... :)
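The "crash about the same time" observation can be checked by pairing crash timestamps from the two VM's. Below is a minimal Python sketch of that idea. The function name `paired_crashes` and the timestamps are made-up placeholders for illustration, not values taken from the attached report; the 20 minute window comes from the description above.

```python
from datetime import datetime, timedelta


def paired_crashes(crashes_a, crashes_b, window=timedelta(minutes=20)):
    """Return (a, b) pairs of crash times that lie within `window` of each other."""
    pairs = []
    for a in crashes_a:
        for b in crashes_b:
            if abs(a - b) <= window:
                pairs.append((a, b))
    return pairs


# Hypothetical crash times, for illustration only (NOT from the real minidumps).
file_server = [datetime(2000, 1, 7, 14, 5), datetime(2000, 1, 9, 3, 40)]
citrix_server = [datetime(2000, 1, 7, 14, 18), datetime(2000, 1, 9, 9, 0)]

for a, b in paired_crashes(file_server, citrix_server):
    print(f"File server crashed {a}, Citrix server crashed {b}")
```

If most crashes pair up like this, that would point toward something the two guests share (host, storage, virtual network) rather than something inside one VM, although the pairing alone does not prove it.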
ASKER
Thanks Patrick for your suggestion. Both VM's and the physical host have had chkdsk with repair run. No problems found. Also, out of an abundance of caution, I have defragged both VM's (although it is claimed that does not matter for VM's).
Also, the various minidumps point to different "causes" - ntfs.sys, fltmgr.sys, netvsc50.sys, ntkrnlpa.exe, wlbs.sys, tcpip.sys, netbt.sys, svc.sys, hal.dll....
So, it really is a head-scratcher !
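To see whether any one module is blamed more often than the others, the driver named in each minidump can be tallied. This is a minimal Python sketch; `tally_drivers` is a hypothetical helper, and the list of blamed modules is an illustrative sample built from the names above, not the actual per-dump data.

```python
from collections import Counter


def tally_drivers(dump_drivers):
    """Count how often each module is blamed, ignoring case, most frequent first."""
    counts = Counter(name.lower() for name in dump_drivers)
    return counts.most_common()


# Hypothetical sample: one entry per minidump (NOT the real report contents).
blamed = ["ntfs.sys", "fltmgr.sys", "netvsc50.sys",
          "ntkrnlpa.exe", "NTFS.SYS", "tcpip.sys"]

for driver, count in tally_drivers(blamed):
    print(f"{driver:15} {count}")
```

A tally spread evenly across unrelated modules would fit the "no particular driver" impression, but it is only a summary of what the dumps name, not a diagnosis.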
What about system file checker? Sfc /scannow
ASKER
I ran SFC /scannow on both VM's. On the Citrix VM it ran fine (no issues found). The other VM (being used as a file server) would not let me run it - it wanted the 2nd installation CD for Windows Server 2003. So, I gave up on it. Not sure if it is worth spending the time trying to figure out where the Cab files are for the 2nd CD of Windows Server 2003 ...
Has the host been patched recently? Microsoft hasn't supported 2003 for many years, so they are not doing any testing for compatibility issues with your hypervisor.
Your 2008 R2 Enterprise license includes virtualization rights for 4 OSEs (VMs). You could replace the file server with 2008 R2, and replace the Citrix Server with 2008 (32bit). Those would at least be supported until January 2020, and would be a little more modern.
You should be thinking about the long term support issues of running unsupported OS. Issues like this could become more frequent.
Have you tried disabling AV software on host and VMs? How exposed to risk are these servers?
ASKER
Thanks kevin ! I was concerned that a Microsoft Patch / Security Update was the culprit. However, we have rolled back all the patches recently applied and it did not solve the problem. I have uninstalled the antivirus - doesn't help. I have scanned the VM's with different antivirus engines (limited since they are Windows Server 2003) - did not find any problems.
So... I decided to go with Windows Server 2012 R2 and setup RDS - then convert the application to run in a 64bit environment. I am part way through the conversion and trust that the old system will remain stable until I am done.
ASKER
The new terminal server is installed, software migrated, and a quick test run... Monday we go live.
We have shut off one of the 2 virtual 2003 machines on the Hyper-V physical server - I will be watching to see if the other 2003 VM (being used as a file server) remains stable.
I will close this request after I update with more results/history. Thanks again for those that commented :)
If the last crash was yesterday, it suggests there is a problem with ntfs.sys. Have chkdsk check the disks (preferably with repair), or have Windows check the integrity of its core components with sfc /scannow
Cheers
NB: If both tries do not fix the issue maybe C:\WINDOWS\Minidump\Mini10