CnicNV
asked on
Possible Disk Errors on Server 2012 R2 Virtual Domain Controller?
Good afternoon,
I have a PDC running on Server 2012 R2, which runs on ESXi on a PERC 700 series RAID 1 array. I am using Windows Server Backup to run a nightly system state backup of it. It looks like it got jammed up on Friday and stayed jammed up until today. I have restarted the server and I am hoping the backup will just run tonight without any issues.
However... I am noticing some more ominous issues. I am seeing more exotic and malevolent errors in the Windows Event Viewer.
THESE ERRORS OCCURRED around 2am, when the server backup runs.
1. The backup seemed to fail and get stuck on the following..."Dhcp Jet Writer..." VSS
2. The backup operation that started at '2016-11-07T07:00:01.018843100Z' has failed because another backup or recovery operation is in progress. Please stop the conflicting operation, and then rerun the backup operation.
3. The volume \\?\Volume*\ was not optimized because an error was encountered: Neither Slab Consolidation nor Slab Analysis will run if slabs are less than 8 MB. (0x8900002D)
THESE ERRORS OCCURRED RIGHT AFTER THE FIRST TIME I GRACEFULLY RESTARTED THE SERVER
4. An error was detected on device \Device\Harddisk1\DR37 during a paging operation.
5. The system failed to flush data to the transaction log. Corruption may occur in VolumeId: \\?\Volume*, DeviceName: \Device\HarddiskVolume76.
(A device which does not exist was specified.)
6. The default transaction resource manager on volume \\?\Volume* encountered a non-retryable error and could not start. The data contains the error code.
7. Errors 4 to 6 rinse and repeat for a bit until this last error occurs, which seems to shut them up.
Volume Shadow Copy Service error: Unexpected error calling routine RegOpenKeyExW(-2147483646, SYSTEM\CurrentControlSet\Services\VSS\Diag,...). hr = 0x80070005, Access is denied.
8. Some new errors started to occur which, from what I have googled, could mean volume resizing errors (I didn't do this) or mouse pointer / touch display issues. Event ID 265: A pointer device did not report a valid unit of angular measurement.
I restarted the server a second time after this and these frightening errors did not occur again, with the exception of error 8, which settles down and ceases after the computer has finished booting up.
What's going on here? I have not changed anything on this server, other than day-to-day updating of user accounts and DNS. It is running on a VMDK which sits on a hardware-backed RAID 1 array. I checked the hardware status for that ESXi host in vCenter Server and everything is green. I also ran a check disk on the C: drive and no problems were found.
My anecdotal theory is that all these disk errors are related to the hung Server Backup job not being able to write to the Microsoft virtual hard disk file that it uses to do its backup. And then post-restart it gives up? Or should I migrate the VM to another host and datastore?
Thanks for the info :-)
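Two of the raw numbers in error 7 can be decoded mechanically, which helps when searching for the message. The sketch below is only an illustration: the helper names are made up for this example, and it relies on two standard Windows conventions (predefined registry hive handles and the HRESULT bit layout). It shows that -2147483646 is the signed form of the HKEY_LOCAL_MACHINE handle and that 0x80070005 is a Win32-facility failure with code 5, which matches the "Access is denied" text in the event.

```python
# Hypothetical helpers for reading the numbers in the VSS event text.

# Predefined registry hive handles as defined by the Windows API.
PREDEFINED_HIVES = {
    0x80000000: "HKEY_CLASSES_ROOT",
    0x80000001: "HKEY_CURRENT_USER",
    0x80000002: "HKEY_LOCAL_MACHINE",
    0x80000003: "HKEY_USERS",
}


def to_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


def decode_hresult(hr):
    """Split an HRESULT into its failure bit, facility, and code fields."""
    hr = to_unsigned32(hr)
    return {
        "failure": bool(hr >> 31),        # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,   # 7 is FACILITY_WIN32
        "code": hr & 0xFFFF,              # 5 is ERROR_ACCESS_DENIED
    }


# The first argument logged for RegOpenKeyExW in error 7.
hive = PREDEFINED_HIVES.get(to_unsigned32(-2147483646), "unknown")
# The hr value logged in error 7.
fields = decode_hresult(0x80070005)

print(hive)
print(fields)
```

So the event says VSS was denied access when opening SYSTEM\CurrentControlSet\Services\VSS\Diag under HKEY_LOCAL_MACHINE. That by itself does not say anything about the health of the disk.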
What application are you using to back up the VM?
ASKER
Hi Andrew,
Currently I am not backing up the VM itself; rather, I am running the Windows Server Backup application from within the VM, which copies the system state over to a NAS nightly. This is so I could, in theory, do an authoritative restore of the DC. But I have another (non-PDC) DC that is actively replicating with it, so if this one fails I have that, of course. I could seize the FSMO roles on that one in the worst case, I suppose.
I heard that using VM-based backups for a DC is not the best idea, because if you restore, it can cause all sorts of issues with the other partner DCs.
I am just wondering what's going on. If it's just a glitch or something more ominous.
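One way to check whether a VSS writer (such as the Dhcp Jet Writer from error 1) is still stuck is to look at the output of `vssadmin list writers` from an elevated prompt inside the VM. The following is a minimal sketch, assuming the usual "Writer name / State / Last error" layout of that output. The function names and the sample text are made up for illustration and are not output from this server.

```python
import re
import subprocess


def parse_vss_writers(text):
    """Parse `vssadmin list writers` output into a list of dicts."""
    writers = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = re.match(r"Writer name: '(.+)'", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy(writers):
    """Return writers that are not Stable or that report a last error."""
    return [
        w for w in writers
        if w["state"] is None
        or "Stable" not in w["state"]
        or w["last_error"] != "No error"
    ]


def list_writers_live():
    """Run vssadmin on the local Windows machine (needs an elevated prompt)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return parse_vss_writers(result.stdout)


# Illustrative sample only; the states below are invented for the example.
SAMPLE = """
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Dhcp Jet Writer'
   State: [7] Failed
   Last error: Timed out
"""

for writer in unhealthy(parse_vss_writers(SAMPLE)):
    print(writer["name"], "|", writer["state"], "|", writer["last_error"])
```

On the server itself, `unhealthy(list_writers_live())` would apply the same check to the real writer list. If every writer reports Stable with no error after the reboot, that would fit the "hung backup job" theory better than a failing disk, though it does not rule one out.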
It looks like the virtual machine disk, and the datastore it resides on, are having difficulties with performance.
Two disks in RAID 1 do not give you much performance in terms of IOPS.
It is far more beneficial to back up the VM at the host level and take advantage of block copy operations, rather than doing a file and folder backup.
You could use Unitrends Backup for FREE!
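To put a rough number on the RAID 1 point, here is a back-of-the-envelope sketch using the common rule of thumb that every write on a mirror costs two back-end I/Os (a write penalty of 2). The per-disk IOPS figure and the read/write mix are placeholder assumptions, not measurements from this array, and the function name is made up for the example.

```python
def raid1_effective_iops(per_disk_iops, read_fraction, disks=2, write_penalty=2):
    """Estimate front-end IOPS for a mirror using the write-penalty rule of thumb.

    per_disk_iops -- assumed random IOPS one disk can sustain (placeholder value)
    read_fraction -- share of the workload that is reads, between 0.0 and 1.0
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0.0 and 1.0")
    raw = per_disk_iops * disks
    reads = raw * read_fraction
    writes = raw * (1.0 - read_fraction) / write_penalty
    return reads + writes


# Assumed figures: 150 IOPS per spindle, 70% reads.
estimate = raid1_effective_iops(per_disk_iops=150, read_fraction=0.7)
print(round(estimate))
```

With those assumed inputs the estimate is in the low hundreds of IOPS, which a nightly backup running alongside normal DC activity could plausibly saturate for a while. Real numbers depend on the disks, the controller cache and the workload.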
ASKER
Agreed, but note that this datastore is only running the one VM and, as you know, DCs on small networks don't have all that much overhead.
Do you know if there are any issues migrating a DC VM from one host (and datastore) to another?
Is there a way to check the health of the hard disks from within vCenter Server for that host? I.e., is going into the hardware status tab and looking at all the green check boxes for the hard drives good enough? How would someone tell if the hard drives or the RAID array are having issues?
ASKER
I think it's correct :-P