Windows 2003 low disk performance on machines with VSS problems

Hello,

We are having disk performance problems on several of our Windows 2003 virtual machines.
When performing a file copy (400 MB) on one of these machines, the average disk queue length goes to 100%, and even after the file is fully copied it stays at 100% for about 10 more seconds.
During this file copy, the entire virtual machine slows down.
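
For anyone wanting to put numbers on the symptom above, here is a minimal sketch, assuming the counter is logged to CSV with the built-in typeperf tool during the copy. The host name, timestamps and values in the sample are made up for illustration; the function only summarizes whatever samples it is given.

```python
import csv
import io

# Assumed way to capture the data on the guest during the file copy:
#   typeperf "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 1 -sc 60 -o queue.csv

def summarize_queue_length(typeperf_output):
    """Return (average, peak, sample_count) from typeperf CSV text."""
    samples = []
    for row in csv.reader(io.StringIO(typeperf_output)):
        # Data rows have exactly two fields: timestamp and counter value.
        if len(row) != 2:
            continue
        try:
            samples.append(float(row[1]))
        except ValueError:
            # Header row, status text, or a sample that could not be read.
            continue
    if not samples:
        raise ValueError("no numeric samples found")
    return sum(samples) / len(samples), max(samples), len(samples)

# Hypothetical sample in the PDH-CSV layout that typeperf writes.
SAMPLE = r'''"(PDH-CSV 4.0)","\\VM01\PhysicalDisk(_Total)\Avg. Disk Queue Length"
"01/01/2013 10:00:01.000","0.5"
"01/01/2013 10:00:02.000","3.5"
"01/01/2013 10:00:03.000","2.0"
'''

if __name__ == "__main__":
    avg, peak, count = summarize_queue_length(SAMPLE)
    print("samples={0} average={1:.2f} peak={2:.2f}".format(count, avg, peak))
```

Running the same capture on a healthy VM in the same environment gives a baseline to compare against.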

As said, we have this issue on several machines running Windows 2003 R2, both x86 and x64.
File copying is also noticeably slower on these machines than on other machines in the same environment (ESX 5.1).

Coincidence or not, all of these machines also have VSS problems: we are unable to perform backups using VSS, although the service is running.

Does anyone have a clue on how to solve this issue? Can anyone tell us if both problems can be related?

Please advise.
saphicoAsked:
sjepsonCommented:
This could be the Scalable Networking Pack (SNP) causing issues when copying files. Microsoft had lots of issues with this 2003 SP2 update causing slow performance on file copies or RDP connections.

Turn the SNP features off on one of the servers exhibiting the problem and repeat the problem transfer.

http://support.microsoft.com/kb/948496

Steve
0
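
A quick way to confirm the SNP state on each affected server is to read the registry values directly. The sketch below assumes the three value names (EnableTCPChimney, EnableRSS, EnableTCPA) and the Tcpip\Parameters key that the linked KB article describes, as far as I recall it; verify them against the article before relying on the result. The helper names are hypothetical, and winreg requires Python 3 on Windows.

```python
# Registry location and value names are assumptions taken from KB 948496.
SNP_KEY = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
SNP_VALUES = ("EnableTCPChimney", "EnableRSS", "EnableTCPA")

def classify_snp(values):
    """Map each SNP value name to 'enabled', 'disabled' or 'not set'."""
    result = {}
    for name in SNP_VALUES:
        raw = values.get(name)
        if raw is None:
            result[name] = "not set"
        elif raw == 0:
            result[name] = "disabled"
        else:
            result[name] = "enabled"
    return result

def read_snp_values():
    """Read the SNP values from the local registry (Windows only)."""
    import winreg  # imported here so classify_snp stays usable elsewhere
    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SNP_KEY) as key:
        for name in SNP_VALUES:
            try:
                values[name], _value_type = winreg.QueryValueEx(key, name)
            except OSError:
                # Value does not exist under the key.
                values[name] = None
    return values

if __name__ == "__main__":
    for name, state in classify_snp(read_snp_values()).items():
        print("{0}: {1}".format(name, state))
```

A value reported as "not set" falls back to the operating system default, so it is worth setting it explicitly to 0 rather than assuming it is off.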
saphicoAuthor Commented:
Hi Steve,

These settings were already disabled on all of our servers....

Kind regards,
Bert
0
sjepsonCommented:
Not that then. :-)

What is the underlying storage? Are all the badly performing machines on the same physical device, the same physical spindles and/or the same LUN?

Steve
0
saphicoAuthor Commented:
The underlying storage is a Nexenta, with HA activated.
The machines are spread out over several different LUNs and physical spindles.

We have other machines running on these same disks (connected via NFS) that are not having problems...
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
This can be caused by slow datastores.

So what technology is your datastore on, and how is it configured?

RAID type, SATA or SAS, speed of disks, etc.

Do any of the VMs suffering this issue have snapshots?
0
saphicoAuthor Commented:
The datastore is configured in RAID 10 with both SAS and SSD disks.

On none of these VMs am I able to perform a VSS snapshot. The VSS software simply hangs when performing the snapshot.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Do you mean that when you select Take Snapshot, it does not complete?

And are you also using the quiesce option, i.e. is it ticked?

This could be a VMware Tools or VSS writer issue.
0
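
To check the VSS writer side, the output of `vssadmin list writers` on the guest can be scanned for writers that are not in a clean state. This is a minimal sketch, assuming the usual "Writer name / State / Last error" layout of that command; the writer names and the failure text in the sample are invented for illustration.

```python
import re

def parse_writers(output):
    """Parse 'vssadmin list writers' text into a list of dicts."""
    writers = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        match = re.match(r"Writer name: '(.*)'$", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line[len("State:"):].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line[len("Last error:"):].strip()
    return writers

def unhealthy_writers(writers):
    """Return writers that are not 'Stable' or that report a last error."""
    return [
        w for w in writers
        if w["state"] is None
        or "Stable" not in w["state"]
        or w["last_error"] != "No error"
    ]

# Hypothetical output; real writer names and IDs will differ.
SAMPLE = """
Writer name: 'Writer A'
   State: [1] Stable
   Last error: No error

Writer name: 'Writer B'
   State: [9] Failed
   Last error: Timed out
"""

if __name__ == "__main__":
    for writer in unhealthy_writers(parse_writers(SAMPLE)):
        print("{0}: {1} ({2})".format(writer["name"], writer["state"], writer["last_error"]))
```

A writer can legitimately be in a non-stable state while a backup is in progress, so run the check when no snapshot or backup is active.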
saphicoAuthor Commented:
Take Snapshot is not completing, that is correct. The window just freezes...

This is not VMware related, because we have had this problem before as well (when this machine was running in a XenServer environment).
0
sjepsonCommented:
There is something like a 20-second window between VSS freezing the server and VMware taking a snapshot that can then be backed up. If the VMware snapshot doesn't happen within those 20 seconds, the VSS backup fails. This suggests that the VSS issue is related to the slow disk performance.

What monitoring tools do you have for your disk subsystem? esxtop will measure disk latency from the host's perspective.

Steve
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Any errors in the VMware event viewer, or in the VM's event log?
0
saphicoAuthor Commented:
As I said before, we don't think this is storage or ESX related, as other machines work perfectly and are running at full performance.

We really think this is a Windows 2003 issue.
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Re-install VMware Tools.
0
saphicoAuthor Commented:
Can you tell me why you think this is VMware Tools related?
As I said before, we also had this problem BEFORE we migrated the VMs from XenServer to ESX...
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
So these Windows 2003 VMs had the same issue under Xen?

If so, VSS is broken!

VMware Tools and the snapshot quiesce option instruct the Sync driver in VMware Tools to quiesce the VM, using VSS.
0
saphicoAuthor Commented:
Yes, I had the same problem in Xen. How can I repair VSS?
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
VSS is somewhat difficult to repair once it is broken.

Do you have any errors in the event log which relate to VSS?
0
saphicoAuthor Commented:
We've already tried to solve the VSS problem, but haven't found a solution.
Can a broken VSS cause system latency?
0
saphicoAuthor Commented:
Hi. To rule out storage as the cause, we've just moved this VM to ESX local storage.

We still have the same issue...

Anyone?
0
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Any events in the local event log of the Windows 2003 VM, at the time of VMware Snapshot?
0
saphicoAuthor Commented:
when i try to perform a shadow copy i get:

failed to create a shadow copy of volume c:\

Error 0x80042306: the shadow copy provider has an error. Please see the system and application event logs for more information


In the application logs i've found:
event id: 12310 on Source VSS
With description:
Volume Shadow Copy Service error: The shadow copy could not be committed - operation timed out. Error context: DeviceIoControl(\\?\Volume{e66f9797-8d77-11e0-aab0-806e6f6e6963} - 0000018C,0x0053c010,00037D00,0,00038D08,4096,[0]).


AND

event id: 12298 on source VSS
Volume Shadow Copy Service error: The I/O writes cannot be held during the shadow copy creation period on volume \\?\Volume{e66f9797-8d77-11e0-aab0-806e6f6e6963}\. The volume index in the shadow copy set is 0. Error details: Open[0x00000000], Flush[0x00000000], Release[0x80042314], OnRun[0x00000000].


In the system logs i've found

event id: 8 on source Volsnap
The flush and hold writes operation on volume C: timed out while waiting for a release writes command.




I hope this is helpful...
0
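
The two hex codes in the errors above (0x80042306 from the failed shadow copy and 0x80042314 from the Release[] field of event 12298) are standard HRESULT values. The sketch below splits an HRESULT into its severity, facility and code fields and attaches the symbolic names these two values carry in the VSS SDK headers, as far as I recall them; treat the name table as an assumption and extend it from the official header if needed.

```python
# Symbolic names are an assumption based on the VSS SDK headers.
KNOWN_VSS_CODES = {
    0x80042306: "VSS_E_PROVIDER_VETO",
    0x80042314: "VSS_E_HOLD_WRITES_TIMEOUT",
}

def decode_hresult(value):
    """Split a 32-bit HRESULT into its standard fields."""
    return {
        "failure": bool(value & 0x80000000),   # severity bit (bit 31)
        "facility": (value >> 16) & 0x7FF,     # 11-bit facility field
        "code": value & 0xFFFF,                # low 16 bits
        "name": KNOWN_VSS_CODES.get(value, "unknown"),
    }

if __name__ == "__main__":
    for hresult in (0x80042306, 0x80042314):
        info = decode_hresult(hresult)
        print("0x{0:08X}: facility={1} code=0x{2:04X} name={3}".format(
            hresult, info["facility"], info["code"], info["name"]))
```

Both codes fall in facility 4 (FACILITY_ITF), meaning they are defined by the VSS interfaces themselves rather than by a generic Windows subsystem. The second name matches the Volsnap event 8 text: writes were held, but the release did not arrive before the timeout.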
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Okay, this is what I was referring to when I said VSS was broken.

In the past we've copied and pasted VSS registry keys from another working system.

Sometimes this has been successful, and sometimes not.
0
saphicoAuthor Commented:
Which keys exactly are you referring to?
0

saphicoAuthor Commented:
We've simply switched to a Windows 2012 environment, which solved this issue.
0