Solved

Possible Disk Errors on Server 2012 R2 Virtual Domain Controller?

Posted on 2016-11-07
Last Modified: 2016-11-20
Good afternoon,

I have a PDC running Windows Server 2012 R2 as a VM on ESXi, on a PERC 700 series RAID 1 array.  I am using Windows Server Backup to run a nightly system state backup of it.  It looks like the backup got jammed up on Friday and stayed jammed up until today.  I have restarted the server and I am hoping the backup will just run tonight without any issues.

However, I am noticing some more ominous issues.  I am seeing more exotic and malevolent errors in the Windows Event Viewer.

THESE ERRORS OCCURRED around 2am when the server backup runs.

1.  The backup seemed to fail and get stuck on the VSS writer "Dhcp Jet Writer".
2.  The backup operation that started at '2016-11-07T07:00:01.018843100Z' has failed because another backup or recovery operation is in progress. Please stop the conflicting operation, and then rerun the backup operation.
3.   The volume \\?\Volume*\ was not optimized because an error was encountered: Neither Slab Consolidation nor Slab Analysis will run if slabs are less than 8 MB. (0x8900002D)

THESE ERRORS OCCURRED RIGHT AFTER THE FIRST TIME I GRACEFULLY RESTARTED THE SERVER

4.  An error was detected on device \Device\Harddisk1\DR37 during a paging operation.
5.  The system failed to flush data to the transaction log. Corruption may occur in VolumeId: \\?\Volume*, DeviceName: \Device\HarddiskVolume76.
(A device which does not exist was specified.)
6.  The default transaction resource manager on volume \\?\Volume* encountered a non-retryable error and could not start.  The data contains the error code.
7.  Errors 4 to 6 repeat for a while until this last error occurs, which seems to shut them up.
Volume Shadow Copy Service error: Unexpected error calling routine RegOpenKeyExW(-2147483646,SYSTEM\CurrentControlSet\Services\VSS\Diag,...).  hr = 0x80070005, Access is denied.
8.  Some new errors started to occur which, from what I have googled, could indicate volume resizing errors (I didn't do this) or mouse pointer / touch display issues.  Event ID 265: A pointer device did not report a valid unit of angular measurement.
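
The two numbers in error 7 can be decoded by hand, which helps when searching for them. The sketch below is only an illustration in Python: the helper names `to_unsigned32` and `decode_hresult` are made up for this example. It relies on two well-known Windows facts: -2147483646 is the signed 32-bit form of 0x80000002, the predefined HKEY_LOCAL_MACHINE handle, and 0x80070005 is an HRESULT in the Win32 facility (7) wrapping error code 5, ERROR_ACCESS_DENIED.

```python
# Illustrative helpers (hypothetical names), not part of any Windows API.

def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


def decode_hresult(hr: int) -> dict:
    """Split an HRESULT into its failure bit, facility and error code."""
    hr = to_unsigned32(hr)
    return {
        "failure": bool(hr >> 31),        # bit 31: severity (1 = failure)
        "facility": (hr >> 16) & 0x7FF,   # bits 16-26: facility
        "code": hr & 0xFFFF,              # bits 0-15: error code
    }


HKEY_LOCAL_MACHINE = 0x80000002  # predefined registry handle value
FACILITY_WIN32 = 7
ERROR_ACCESS_DENIED = 5

# First argument of RegOpenKeyExW in the logged message
handle = to_unsigned32(-2147483646)
print(hex(handle), handle == HKEY_LOCAL_MACHINE)

# hr value from the same message
print(decode_hresult(0x80070005))
```

Read this way, error 7 says that VSS was denied access when opening HKLM\SYSTEM\CurrentControlSet\Services\VSS\Diag. That is a registry permission problem, not a disk read or write failure.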

I restarted the server a second time after this and these frightening errors did not occur again, with the exception of error 8, which settles down and ceases after the computer has finished booting up.

What's going on here?  I have not changed anything on this server, other than day-to-day updating of user accounts and DNS.  It is running on a VMDK which is on a hardware-backed RAID 1 array.  I checked the hardware status for that ESXi host in vCenter Server and everything is green.  I also ran a check disk on the C: drive and no problems were found.

My anecdotal theory is that all these disk errors are related to the hung Windows Server Backup job not being able to write to the Microsoft virtual hard disk file that it uses to do its backup, and that after the restart it gives up.  Or should I migrate the VM to another host and datastore?

Thanks for the info :-)
Question by:CnicNV
7 Comments
 
LVL 119
ID: 41877853
What application are you using to back up the VM?
 

Author Comment

by:CnicNV
ID: 41877863
Hi Andrew,

Currently I am not backing up the VM itself; rather, I am running Windows Server Backup from within the VM, which copies the system state over to a NAS nightly.  This is so I could in theory do an authoritative restore of the DC.  But I have another, non-PDC DC that is actively replicating with it, so if it fails I have that of course.  I could seize the FSMO roles on that one in the worst case, I suppose.

I heard that using VM-based backups for a DC is not the best idea, because if you restore from one, it can cause all sorts of issues with the other partner DCs.

I am just wondering what's going on.  If it's just a glitch or something more ominous.
 
LVL 119
ID: 41877880
It looks like the virtual machine disk, and the datastore it resides on, are having difficulties with performance.

Two disks in RAID 1 do not give you much performance in terms of IOPS.
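
As a rough illustration of that point, here is a back-of-the-envelope estimate in Python. It is only a sketch: the function name `raid1_iops`, the 150 IOPS per disk figure, and the 70% read mix are assumptions picked for the example, not measurements from this server. It uses the common rule of thumb that RAID 1 has a write penalty of 2, because every write has to land on both mirrors.

```python
def raid1_iops(per_disk_iops: float, disks: int = 2,
               read_fraction: float = 0.7) -> float:
    """Estimate usable IOPS for a RAID 1 set (rule-of-thumb model)."""
    raw = per_disk_iops * disks          # combined raw IOPS of all disks
    write_penalty = 2                    # each write goes to both mirrors
    write_fraction = 1 - read_fraction
    # Reads cost one back-end I/O, writes cost `write_penalty` I/Os.
    return raw / (read_fraction + write_penalty * write_fraction)


# Assumed figure: 150 IOPS per disk (hypothetical, not measured here)
print(round(raid1_iops(150, read_fraction=0.7)))   # mixed workload
print(round(raid1_iops(150, read_fraction=0.0)))   # pure writes
```

Under these assumptions the array tops out at a few hundred IOPS, and a write-heavy job such as a backup running inside the guest gets no more than a single disk's worth.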

It is far more beneficial to back up the VM at the host level and take advantage of block-level copy operations, rather than file and folder backup.

You could use Unitrends Backup for FREE!

Author Comment

by:CnicNV
ID: 41877887
Agreed, but note that this datastore is only running the one VM and, as you know, DCs on small networks don't have all that much overhead.

Do you know if there are any issues migrating a DC VM from one host (and datastore) to another?

Is there a way to check the health of the hard disks from within vCenter Server for that host?  Is going into the hardware status tab and looking at all the green check boxes for the hard drives good enough?  How would someone tell if the hard drives or RAID array are having issues?
 
LVL 119

Assisted Solution

by:Andrew Hancock (VMware vExpert / EE MVE^2)
Andrew Hancock (VMware vExpert / EE MVE^2) earned 500 total points
ID: 41877893
"Agreed, but note that this data store is only running the one VM and as you know, DCs on small networks don't have all that much overhead."

But if the datastore cannot keep up with the read operations, it could struggle.

"Do you know if there are any issues migrating a DC VM from one host (and data store) to another?"

None.

"Is there a way to check the health of the hard disks from within vCenter server for that host?  IE is going into the hardware status tab and looking at all the green check boxes for the hard drives good enough?  How would someone tell if the hard drives or RAID array is having issues?"

Check performance...

HOW TO:  Performance Monitor vSphere 4.x or 5.0
 

Accepted Solution

by:
CnicNV earned 0 total points
ID: 41888220
It turned out to be as I thought (probably).  

Something got jammed up while running the system state backup.  That is, the DHCP service was not responding for whatever reason (it does this maybe 4 times a year), and the backup failed while trying to back up this specific portion.  Subsequent backups were unable to run until I restarted the entire server.  I am guessing the disk errors were read errors on the backup disk's .vhdx file.  Since the restart, the backups have been able to run without issue and I have yet to receive any more disk errors on that server.  Do I know this conclusively?  No.  Could this be correlation rather than causation?  Sure.  But I strongly believe this is what was happening.

Thanks, Andrew, for your help as well.  The issue could still possibly be related to what you are saying, but I have too much anecdotal error correlation going on for me to discount my theory.
 

Author Closing Comment

by:CnicNV
ID: 41894662
I think it's correct :-P
