Windows Server Backup Hard Locks a Hyper-V Host Server

Hello,

We are attempting to use Windows Server Backup to back up running VHDs without downtime. I've followed all documented procedures, but every so often, and inconsistently, WSB will hard lock a host server. Everything already in active memory continues running, but anything that accesses disk stops working. The only way to fix it is to hard reboot the server. Symptoms include, but are not limited to:

* unable to log in
* can ping all servers
* can access certain services and files if they were "active" before the lock
* cannot remote manage the host server
* on occasion, but not typically, we will have full access to virtual servers for a long time (12-15 hours) before they finally crash.

We've called Microsoft support, and they did nothing more than tell us to run some VSS hotfixes. The only consistent factor is that all of these networks also use Backup Exec, though its jobs are not scheduled to run at the same time.

Other information:

* Host systems are running 64-bit W2K8 Enterprise SP2.
* VMs are running a mix of W2K3 SP2 or higher, W2K3 R2, W2K8, and W2K8 R2.
* As far as I can tell, it doesn't matter whether the VHDs are fixed or dynamic; we have a mix of both.
* It doesn't seem to matter whether the VHDs are on local or SAN disks.
* It doesn't seem to matter whether the WSB job is scheduled or run from a script that backs up just one volume rather than the whole server.
* We have followed http://support.microsoft.com/kb/958662 as well as several other KBs.
* All Hyper-V integration services are installed and in use except for time sync.
* There are no App or System event log entries at all on the physical machines while the issue is occurring.
* Backing up the VMs manually always works, either offline or online.

I'll include more info below.  ANY input into a successful resolution would be appreciated.


Event log entries just prior:

8:30:10 DCOM  started the service wbengine with arguments "" in order to run the server:
{37734C4D-FFA8-4139-9AAC-60FBE55BF3DF}

8:30:10 DCOM  started the service vds with arguments "" in order to run the server:
{7D1933CB-86F6-4A98-8628-01BE94C9A575}

8:30:10 Virtual Disk Service Started.

8:30:10 The Block Level Backup Engine Service service entered the running state.

8:30:10 The Virtual Disk service entered the running state.

8:32:13 Driver Management has concluded the process to add Service storvsc for Device Instance ID VMBUS\{DE9369D0-ECDC-4676-B67E-6F1197C82A47}\1&3189FC23&0&{DE9369D0-ECDC-4676-B67E-6F1197C82A47} with the following status: 0.

8:32:13 Driver Management concluded the process to install driver FileRepository\wstorvsc.inf_9e707b08\wstorvsc.inf for Device Instance ID VMBUS\{DE9369D0-ECDC-4676-B67E-6F1197C82A47}\1&3189FC23&0&{DE9369D0-ECDC-4676-B67E-6F1197C82A47} with the following status: 0.

8:32:14 Driver Management has concluded the process to add Service disk for Device Instance ID SCSI\DISK&VEN_MSFT&PROD_VIRTUAL_DISK\2&52993DD&0&000000 with the following status: 0.

8:32:14 Driver Management concluded the process to install driver FileRepository\disk.inf_f14e87fb\disk.inf for Device Instance ID SCSI\DISK&VEN_MSFT&PROD_VIRTUAL_DISK\2&52993DD&0&000000 with the following status: 0.

8:32:17 Driver Management has concluded the process to add Service storvsc for Device Instance ID VMBUS\{9558598B-660B-4FDE-AD05-4442FBE28321}\1&3189FC23&0&{9558598B-660B-4FDE-AD05-4442FBE28321} with the following status: 0.

8:32:18 Driver Management concluded the process to install driver FileRepository\wstorvsc.inf_9e707b08\wstorvsc.inf for Device Instance ID VMBUS\{9558598B-660B-4FDE-AD05-4442FBE28321}\1&3189FC23&0&{9558598B-660B-4FDE-AD05-4442FBE28321} with the following status: 0.

8:32:19 Driver Management has concluded the process to add Service disk for Device Instance ID SCSI\DISK&VEN_MSFT&PROD_VIRTUAL_DISK\2&281F6F8D&0&000000 with the following status: 0.

8:32:19 Driver Management concluded the process to install driver FileRepository\disk.inf_f14e87fb\disk.inf for Device Instance ID SCSI\DISK&VEN_MSFT&PROD_VIRTUAL_DISK\2&281F6F8D&0&000000 with the following status: 0.

8:32:35 Volume \\?\Volume{85cd81e0-c2f9-11de-a87a-001517598fae} is being reverted to the state of a previous shadow copy.

8:32:37 The reverting of volume \\?\Volume{85cd81e0-c2f9-11de-a87a-001517598fae} to the state of a previous shadow copy is complete.

8:32:44 Status 0x00001069 determining that device interface \\?\VMBUS#{de9369d0-ecdc-4676-b67e-6f1197c82a47}#1&3189fc23&0&{de9369d0-ecdc-4676-b67e-6f1197c82a47}#{2accfe60-c130-11d2-b082-00a0c91efb8b} does not support iSCSI WMI interfaces. If this device is not an iSCSI HBA then this error can be ignored.  

All but the last entry are informational. The 8:32:44 entry is a warning and may be the root of the problem, though I can find no information on it.


"vmmadmin list writers" Output:

vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2005 Microsoft Corp.

Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {e549b9e0-7ae0-4fb5-a469-0183d24caaf4}
   State: [1] Stable
   Last error: No error

Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {c27bf9fd-ece8-4dc3-b3aa-2f439301a2ef}
   State: [1] Stable
   Last error: No error

Writer name: 'ASR Writer'
   Writer Id: {be000cbe-11fe-4426-9c58-531aa6355fc4}
   Writer Instance Id: {269f6f5a-8310-4273-aef5-230198bda8bd}
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   Writer Id: {a65faa63-5ea8-4ebc-9dbd-a0c4db26912a}
   Writer Instance Id: {79d90ad0-1393-48f8-b146-35a07f6cb988}
   State: [1] Stable
   Last error: No error

Writer name: 'Shadow Copy Optimization Writer'
   Writer Id: {4dc3bdd4-ab48-4d07-adb0-3bee2926fd7f}
   Writer Instance Id: {c0af2ba1-c497-4f5d-870a-dea270e2331a}
   State: [1] Stable
   Last error: No error

Writer name: 'Registry Writer'
   Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
   Writer Instance Id: {dc4f3c84-aea5-48f4-a1eb-e7e56a522b5d}
   State: [1] Stable
   Last error: No error

Writer name: 'BITS Writer'
   Writer Id: {4969d978-be47-48b0-b100-f328f07ac1e0}
   Writer Instance Id: {675cb50e-3ea4-4020-8881-a713e7505ed8}
   State: [1] Stable
   Last error: No error

Writer name: 'WMI Writer'
   Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
   Writer Instance Id: {be455104-c087-4567-b8c3-eb3bc6973a24}
   State: [1] Stable
   Last error: No error

Writer name: 'COM+ REGDB Writer'
   Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
   Writer Instance Id: {3e57442c-a568-4c8d-b9c8-c477b622dbf9}
   State: [1] Stable
   Last error: No error

Writer name: 'Cluster Database'
   Writer Id: {41e12264-35d8-479b-8e5c-9b23d1dad37e}
   Writer Instance Id: {ace5e130-cca4-4386-9980-77b0d7443942}
   State: [1] Stable
   Last error: No error
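
Checking a long writer list like this by eye is error-prone, so here is a minimal sketch of one way to flag any writer that is not in the "[1] Stable" state or that reports an error. It assumes the text layout shown above (`Writer name:`, `State:`, `Last error:`). The function names `parse_vss_writers` and `unhealthy_writers` are made up for illustration, and the third writer in the sample is hypothetical, included only so the check has something to report.

```python
import re

# Sample input: the first two writers are copied from the output above.
# The third is a made-up failing writer, added only to exercise the check.
SAMPLE = """\
Writer name: 'Microsoft Hyper-V VSS Writer'
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {e549b9e0-7ae0-4fb5-a469-0183d24caaf4}
   State: [1] Stable
   Last error: No error

Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {c27bf9fd-ece8-4dc3-b3aa-2f439301a2ef}
   State: [1] Stable
   Last error: No error

Writer name: 'Hypothetical Writer'
   State: [8] Failed
   Last error: Timed out
"""


def parse_vss_writers(text):
    """Turn `vssadmin list writers` text into a list of dicts."""
    writers = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = re.match(r"Writer name: '(.+)'$", line)
        if match:
            # A new writer block starts here.
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Return writers that are not stable or that report an error."""
    return [
        w for w in writers
        if w["state"] != "[1] Stable" or w["last_error"] != "No error"
    ]


if __name__ == "__main__":
    for writer in unhealthy_writers(parse_vss_writers(SAMPLE)):
        print(writer["name"], writer["state"], writer["last_error"])
```

Saving the output right after a hang and again after a clean backup, then running both through the same check, would at least show whether any writer changes state around the lock-up. In the output posted above, every writer is stable with no error.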


SafetyNet-TC Asked:

msmamji Commented:
I am pretty sure that's the output of "vssadmin list writers".

Are you using the iSCSI initiator on the host server?

Regards,
Shahid
SafetyNet-TC (Author) Commented:
Thanks for the follow-up.

Yes, vssadmin. Sorry. Too many VMs and VSSs.

Yes, we are using the iSCSI initiator on the systems having the issue. The host servers live on a SAN.

SafetyNet-TC (Author) Commented:
Anyone?  Bueller?

SafetyNet-TC (Author) Commented:
Re-bump.  Any new takers?
msmamji Commented:
Hi,
First of all, I would like to apologize for this situation. I kind of feel responsible. This is what happens here at EE: people don't even bother looking at a question once it has received even one response.
Secondly, if you have noticed anything else recently, beyond what you have already told us, it would help in carrying this forward.
Meanwhile, you should recheck:
* the iSCSI initiator service
* MPIO software/drivers
* NIC/HBA drivers
* anything else that is even remotely involved with VSS
Post anything that strikes you as odd.
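
As a starting point for that recheck, here is a rough sketch that asks `sc query` for the state of a few related services on the host. The service names `wbengine` and `vds` come from the event log earlier in this thread. `MSiSCSI`, `VSS` and `swprv` are assumed here to be the short names of the iSCSI initiator, Volume Shadow Copy and software shadow copy provider services, so confirm them on the host before relying on the result. The helper names `parse_sc_state` and `query_service` are made up for illustration. MPIO and NIC/HBA drivers are not covered and still need a manual check.

```python
import re
import subprocess
import sys

# wbengine and vds are taken from the event log in this thread.
# MSiSCSI, VSS and swprv are assumed service names; verify them on the host.
SERVICES = ["MSiSCSI", "VSS", "swprv", "vds", "wbengine"]


def parse_sc_state(sc_output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` text."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def query_service(name):
    """Run `sc query <name>` and return the reported state, or None."""
    result = subprocess.run(
        ["sc", "query", name], capture_output=True, text=True
    )
    return parse_sc_state(result.stdout)


# Only attempt the live query on Windows, where sc.exe exists.
if __name__ == "__main__" and sys.platform == "win32":
    for service in SERVICES:
        state = query_service(service)
        print(service, state or "no state reported")
```

A stopped state is not necessarily a fault: some of these services start on demand, as the DCOM entries for wbengine and vds in the log suggest. The more interesting comparison is between a host that has just completed a clean backup and one that is about to be backed up after a previous hang.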

If all else fails, request attention from the admins.

Regards,
Shahid
SafetyNet-TC (Author) Commented:
Well, I need to ask another question and it won't let me until I close this one or ask for attention.  That seems pretty ridiculous, but I'll play the game.
SafetyNet-TC (Author) Commented:
Closing with no resolution.
intellitechsolutions Commented:
Were you ever able to get this answered or find a solution yourself? We are having a similar issue on one of our servers. It uses a third-party backup utility, but that utility still relies on the Hyper-V snapshot function.