Solved

Possible Disk Errors on Server 2012 R2 Virtual Domain Controller?

Posted on 2016-11-07
Last Modified: 2016-11-20
Good afternoon,

I have a PDC running on Windows Server 2012 R2, which is running on ESXi on a PERC 700 series RAID 1 array.  I am using Windows Server Backup to run a nightly system state backup of it.  It looks like it got jammed up on Friday and stayed jammed up until today.  I have restarted the server and I am hoping the backup will just run tonight without any issues.

However, I am noticing some more ominous issues: more exotic and malevolent errors in the Windows Event Viewer.

THESE ERRORS OCCURRED around 2am when the server backup runs.

1.  The backup seemed to fail and get stuck on the following VSS writer: "Dhcp Jet Writer..."
2.   The backup operation that started at '2016-11-07T07:00:01.018843100Z' has failed because another backup or recovery operation is in progress. Please stop the conflicting operation, and then rerun the backup operation.
3.   The volume \\?\Volume*\ was not optimized because an error was encountered: Neither Slab Consolidation nor Slab Analysis will run if slabs are less than 8 MB. (0x8900002D)
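
A note on the timestamp in error 2: the trailing Z means it is in UTC, so it has to be converted before comparing it with the 2am backup window. Below is a minimal Python sketch of that conversion. The UTC-5 offset is purely an assumption for illustration (the post does not state the server's time zone), and `parse_event_timestamp` is a hypothetical helper, not part of any Windows tooling.

```python
from datetime import datetime, timedelta, timezone


def parse_event_timestamp(raw: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp with up to 9 fractional digits."""
    # Drop invisible left-to-right marks that can be copied along with the text.
    cleaned = raw.replace("\u200e", "").strip()
    if not cleaned.endswith("Z"):
        raise ValueError("expected a UTC timestamp ending in 'Z'")
    date_part, _, fraction = cleaned[:-1].partition(".")
    # datetime only stores microseconds, so keep at most 6 fractional digits.
    micro = int(fraction[:6].ljust(6, "0")) if fraction else 0
    base = datetime.strptime(date_part, "%Y-%m-%dT%H:%M:%S")
    return base.replace(microsecond=micro, tzinfo=timezone.utc)


# Assumption: the server clock is at UTC-5. Adjust to the real offset.
ASSUMED_LOCAL = timezone(timedelta(hours=-5))

utc_time = parse_event_timestamp("2016-11-07T07:00:01.018843100Z")
local_time = utc_time.astimezone(ASSUMED_LOCAL)
print(local_time.isoformat())  # 2016-11-07T02:00:01.018843-05:00
```

Under that assumed offset the failed job started at 02:00 local time, which would line up with the scheduled backup. With a different offset the local time shifts accordingly.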

THESE ERRORS OCCURRED RIGHT AFTER THE FIRST TIME I GRACEFULLY RESTARTED THE SERVER

4.  An error was detected on device \Device\Harddisk1\DR37 during a paging operation.
5.  The system failed to flush data to the transaction log. Corruption may occur in VolumeId: \\?\Volume*, DeviceName: \Device\HarddiskVolume76.
(A device which does not exist was specified.)
6.  The default transaction resource manager on volume \\?\Volume* encountered a non-retryable error and could not start.  The data contains the error code.
7.  Errors 4 to 6 rinse and repeat for a bit until this last error occurs which seems to shut them up.
Volume Shadow Copy Service error: Unexpected error calling routine RegOpenKeyExW(-2147483646,SYSTEM\CurrentControlSet\Services\VSS\Diag,...).  hr = 0x80070005, Access is denied.
8.  Some new errors started to occur which, from what I have googled, could indicate volume resizing errors (I didn't do this) or mouse pointer or touch display issues.  Event ID 265: A pointer device did not report a valid unit of angular measurement.

I restarted the server a second time after this and these frightening errors did not occur again, with the exception of error 8, which settles down and ceases after the computer has finished booting up.

What's going on here?  I have not changed anything on this server, other than day-to-day updating of user accounts and DNS.  It is running on a VMDK which is on a hardware-backed RAID 1 array.  I checked the hardware status for that ESXi host in vCenter Server and everything is green.  I also ran a check disk on the C: drive and no problems were found.

My anecdotal theory is that all these disk errors are related to the hung Server Backup job not being able to write to the Microsoft virtual hard disk file that it uses to do its backup, and that after the restart it gives up.  Or should I migrate the VM to another host and datastore?

Thanks for the info :-)
Question by:CnicNV
ID: 41877853
What application are you using to back up the VM?

Author Comment

by:CnicNV
ID: 41877863
Hi Andrew,

Currently I am not backing up the VM itself; rather, I am running the Windows Server Backup application from within the VM, which copies the system state over to a NAS nightly.  This is so I could in theory do an authoritative restore of the DC.  But I have another non-PDC DC that is actively replicating with it, so if it fails I have that of course.  In the worst case I could seize the FSMO roles on that one, I suppose.

I heard using VM-based backups for a DC is not the best idea, because if you restore, it can cause all sorts of issues with the other partner DCs.

I am just wondering what's going on.  If it's just a glitch or something more ominous.
ID: 41877880
It looks like the virtual machine disk, and the datastore it resides on, are having performance difficulties.

Two disks in RAID 1 do not give you much performance in terms of IOPS.
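
To put a rough number on that, here is a back-of-the-envelope Python sketch using the common rule-of-thumb formula for mirrored arrays, where every logical write costs two physical writes. The 150 IOPS per disk figure and the 50/50 read/write mix are assumptions for illustration only, not measurements from this array, and `raid1_effective_iops` is a hypothetical helper name.

```python
def raid1_effective_iops(per_disk_iops: float, read_fraction: float, disks: int = 2) -> float:
    """Estimate usable IOPS for a RAID 1 mirror with a given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_penalty = 2  # each logical write is written to both mirror members
    raw_iops = per_disk_iops * disks
    write_fraction = 1.0 - read_fraction
    # Reads can be served by either disk; writes pay the mirror penalty.
    return raw_iops * read_fraction + raw_iops * write_fraction / write_penalty


# Assumed figures: 150 IOPS per disk, half reads and half writes.
print(raid1_effective_iops(150, 0.5))  # 225.0
```

With those assumed inputs the mirror tops out at a few hundred IOPS, so a burst of backup reads on top of normal activity can saturate it even when only one VM lives on the datastore.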

It is far more beneficial to back up the VM at the host level and take advantage of block copy operations, rather than file and folder backup.

You could use Unitrends Backup for FREE!

Author Comment

by:CnicNV
ID: 41877887
Agreed, but note that this data store is only running the one VM and as you know, DCs on small networks don't have all that much overhead.

Do you know if there are any issues migrating a DC VM from one host (and data store) to another?

Is there a way to check the health of the hard disks from within vCenter Server for that host?  I.e., is going into the hardware status tab and looking at all the green check boxes for the hard drives good enough?  How would someone tell if the hard drives or RAID array are having issues?

Assisted Solution

by:Andrew Hancock (VMware vExpert / EE MVE)
ID: 41877893
"Agreed, but note that this data store is only running the one VM and as you know, DCs on small networks don't have all that much overhead."

But if the datastore cannot keep up with the read operations, it could struggle.

"Do you know if there are any issues migrating a DC VM from one host (and data store) to another?"

None.

"Is there a way to check the health of the hard disks from within vCenter server for that host?  I.e., is going into the hardware status tab and looking at all the green check boxes for the hard drives good enough?  How would someone tell if the hard drives or RAID array is having issues?"

Check performance...

HOW TO:  Performance Monitor vSphere 4.x or 5.0
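
As a rough illustration of what "check performance" could look like once latency figures have been exported from the performance charts, here is a small Python sketch that summarizes a list of latency samples. The sample values are made up, the 20 ms threshold is an assumed rule of thumb rather than an official limit, and `summarize_latency` is a hypothetical helper.

```python
from statistics import mean

# Assumed rule-of-thumb threshold in milliseconds; tune for your environment.
LATENCY_THRESHOLD_MS = 20.0


def summarize_latency(samples_ms, threshold_ms=LATENCY_THRESHOLD_MS):
    """Summarize datastore latency samples and count the slow ones."""
    if not samples_ms:
        raise ValueError("no latency samples supplied")
    slow = [s for s in samples_ms if s > threshold_ms]
    return {
        "average_ms": mean(samples_ms),
        "peak_ms": max(samples_ms),
        "samples_over_threshold": len(slow),
        "fraction_over_threshold": len(slow) / len(samples_ms),
    }


# Made-up samples with a spike, as might be seen during a backup window.
samples = [4.0, 6.0, 5.0, 48.0, 62.0, 7.0, 5.0, 3.0]
summary = summarize_latency(samples)
print(summary)
```

The point of looking at peaks as well as the average is that a green hardware status only says the disks have not failed; short latency spikes during the backup window would show up here and not there.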

Accepted Solution

by:CnicNV
ID: 41888220
It turned out to be as I thought (probably).  

Something got jammed up while running the system state backup.  I.e., the DHCP service was not responding for whatever reason (it does this maybe 4 times a year), and the backup failed while trying to back up this specific portion.  Subsequent backups were unable to run until I restarted the entire server.  I am guessing the disk errors were read errors on the backup disk .vhdx file.  Since the restart, the backups have been able to run without issue and I have not received any further disk errors on that server.  Do I know this conclusively?  No.  Could this be correlation rather than causation?  Sure.  But I strongly believe this is what was happening.

Thanks, Andrew, for your help as well.  The issue could still possibly be related to what you are saying, but I have too much anecdotal error correlation going on for me to discount my theory.

Author Closing Comment

by:CnicNV
ID: 41894662
I think it's correct :-P