Stephan Avanozian

VMware: Slow VM launch & performance, low disk i/o and no apparent resource bottlenecks

Symptoms:
VMs are slow to load and operate, e.g. 10 minutes to log in and get automatic services running. Launching Server Manager or Event Viewer takes 2-3 minutes, and app installs take 30+ minutes. During launch there are no apparent bottlenecks in CPU, memory, disk, or network, as shown below. This is true for any VM on this host. Please forgive the extensive post; I wanted to provide complete info.

I’ve applied every VMware performance tweak I’ve found (below), and all typical Performance Monitor counters seem normal while launching, yet the VM still crawls. I understand that this isn’t high-end lab hardware, but while launching, the disk is often 99% active with read/write at only 0-5MB/s, when I’d expect it to be very busy reading. I’ve measured raw file transfers as fast as 130MB/s, so why is disk transfer near zero when launching? It doesn’t seem to be a nesting issue; I see similar behavior when launching the parent ESXi VM in Workstation.


Environment: Nested VMware lab
Hardware: HP Pavilion, AMD FX six-core processor @ 3.3GHz, 32GB RAM, 3TB 7200RPM HDD (Seagate ST3000DM001)
Parent OS: Windows 10 Pro 64bit
Host: VMware Workstation Pro 14 host running ESXi 6.5
VM:   Win2016 1607 DC on 4 vCPU, 8GB RAM, 60GB storage


Performance Monitoring / Diags while launching VMs:
VMware tools: installed
VT: enabled
Host CPU:  5-20% with only momentary peaks to 90%
Host Memory: 14GB avail (4GB avail on VM)
Host Disk:
- Disk Activity: Between 10-99%
- Read: Mostly 0-5MB/s; seldom, brief peaks to 15MB/s
- Write: Mostly 0-5MB/s; seldom, brief peaks to 15MB/s
- Avg. Disk Queue Length: .33 (low)
- Avg. Disk sec/Transfer: .002 (low)
- Page Reads/sec: .87 (low)
- Page Faults/sec: 6202 avg; 35188 max (can we ignore these, since they aren’t hard faults?)
- DAVG per ESXTOP: 5-11ms (low)
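
One way to read the queue-length and latency counters above together is Little's law (average queue length = transfer rate × average time per transfer). The following is only a minimal sketch, assuming both counters were sampled over the same interval; the function name is illustrative and not part of any tool.

```python
def implied_transfers_per_sec(avg_queue_length: float, avg_sec_per_transfer: float) -> float:
    """Little's law rearranged: rate = queue length / time per transfer."""
    if avg_sec_per_transfer <= 0:
        raise ValueError("avg_sec_per_transfer must be positive")
    return avg_queue_length / avg_sec_per_transfer


# Host counter values quoted in the list above.
rate = implied_transfers_per_sec(0.33, 0.002)
print(f"Implied disk transfers/sec: {rate:.0f}")
```

If the transfers behind a rate like this are small, the resulting MB/s figure will be low even though the disk is kept busy.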


VM Config / Optimizations per https://www.vmware.com/pdf/WS6_Performance_Tuning_and_Benchmarking.pdf
(Running only 1 VM at a time while troubleshooting).
Memory: VM has 6GB free of 8GB total
Disk space: Host=800GB free; VM=17GB free
Disk cache: Enabled read-write cache on host and VM
Networking: N/A; not accessing network resources
Consolidated/Preallocated disks
Defragged both host and VM
Snapshots: Removed all
Disabled Windows Defender while troubleshooting
Edited vmx to optimize i/o per VMware kb1008885
   MemTrimRate = "0"
   mainMem.useNamedFile = "FALSE"
   sched.mem.pshare.enable = "FALSE"
   prefvmx.useRecommendedLockedMemSize = "TRUE"
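
To double-check that the four settings above actually made it into a .vmx file, a small script can compare the file text against the expected values. This is a rough sketch, assuming the usual key = "value" line layout and assuming keys can be compared case-insensitively; the helper names and the sample text are hypothetical.

```python
import re

# Settings and values copied from the list above.
EXPECTED = {
    "MemTrimRate": "0",
    "mainMem.useNamedFile": "FALSE",
    "sched.mem.pshare.enable": "FALSE",
    "prefvmx.useRecommendedLockedMemSize": "TRUE",
}

# Matches lines of the form: key = "value"
LINE_RE = re.compile(r'^\s*([^#\s=]+)\s*=\s*"(.*)"\s*$')


def parse_vmx(text: str) -> dict:
    """Collect key/value pairs; keys are lower-cased for comparison."""
    settings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings


def missing_or_different(text: str) -> dict:
    """Return expected settings that are absent (None) or set to another value."""
    found = parse_vmx(text)
    return {
        key: found.get(key.lower())
        for key, value in EXPECTED.items()
        if (found.get(key.lower()) or "").upper() != value.upper()
    }


# Hypothetical file contents for illustration only.
sample = '''
MemTrimRate = "0"
mainMem.useNamedFile = "FALSE"
sched.mem.pshare.enable = "TRUE"
'''
print(missing_or_different(sample))
```

An empty result means all four settings were found with the expected values.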


All that being said, is it normal for disk read/write to be near-zero while loading a VM, and is there anything else to look at?

Thank you in advance.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Performance is only going to be worse when using VMs in a nested environment.

But you comment that VMs installed directly in Workstation are also slow?

I've not used a spinning disk with VMware Workstation for a number of years. Have you considered using an SSD?

If you run CrystalDiskMark on your host, what performance does your single 7200RPM SATA disk give?
Stephan Avanozian

ASKER

Please see the attached JPG showing CrystalDiskMark and ATTO results, which show up to 250MB/s read. CrystalDiskMark's numbers are lower in comparison, but my understanding is that this is because CrystalDiskMark uses uncompressed data. My 7200RPM disk's performance appears to be normal, as it is similar to this review: http://www.hardwaresecrets.com/seagate-barracuda-3-tb-hard-drive-review/3/.

My results recap:

CrystalDisk (MB/s):
Sequential:  191 R / 182 W
512K:        55 R / 87 W
4K:          .6 R / 1.3 W
4KQD32:      2.8 R / .78 W

Atto (MB/s):
256KB:  150 R / 150 W
4MB:    250 R / 150 W


Understood that an SSD drive would improve performance, but I'm trying not to pump more $ into my small lab.

I'm also trying to improve my understanding of performance optimization in general. It would seem that if CPU, memory, disk, and network are all configured optimally, each resource should run at its maximum potential unless it's waiting on some other resource. I'm wondering why the disk read/write rate is near zero for extended periods while CPU, memory, and network activity are mostly low. Considering that loading an app is basically reading from slower disk into faster memory, shouldn't the disk be steadily reading at 150-250MB/s?
All I/O under a hypervisor is virtualised. A single SATA disk is going to give 40 IOPS at most, and this will be lower when virtualised.

Are those host figures?

And what are the figures on a VM under VMware Workstation, and then under ESXi?

You would be better running Hyper-V on your host!
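
The IOPS point in the reply above can be tied back to MB/s with simple arithmetic: throughput = IOPS × block size. The sketch below is only illustrative; it assumes 4K means 4096 bytes and MB means 10^6 bytes, and takes the 40 IOPS figure from the reply and the 0.6 MB/s figure from the asker's 4K read result. The function names are made up for this example.

```python
def throughput_mb_per_s(iops: float, block_bytes: int) -> float:
    """Throughput in MB/s (MB = 10**6 bytes) for a given IOPS and block size."""
    return iops * block_bytes / 1_000_000


def iops_from_throughput(mb_per_s: float, block_bytes: int) -> float:
    """IOPS implied by a throughput in MB/s at a given block size."""
    return mb_per_s * 1_000_000 / block_bytes


BLOCK_4K = 4096  # assumed block size in bytes

print(f"{throughput_mb_per_s(40, BLOCK_4K):.2f} MB/s at 40 IOPS, 4 KiB blocks")
print(f"{iops_from_throughput(0.6, BLOCK_4K):.0f} IOPS at 0.6 MB/s, 4 KiB blocks")
```

Either way, small random transfers on a single spinning disk land well under 1 MB/s, far below the 150-250MB/s seen in large sequential tests.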
The above tests are from the Win10 Pro host.  I will run CrystalDisk on the VMware Workstation and ESXi layers and post those as well. Thank you.
OK, I ran ATTO and CrystalDiskMark tests and here are the results (max throughput in MB/s). See also the attached chart with more detail.

               Read   Write   Read-sequential
Host:          260    150     191
Parent VM:     270    150     126
Child VM:      160    25      64

I was mistaken that the parent VM's performance was similar to the child VM's; the parent is comparable to the host, and the child is behaving very poorly by comparison, as you'd first mentioned.

Is this typical degradation for nested VMs, or do you think it's so severe that I might have something misconfigured?
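
To put the degradation in this post's benchmark table into relative terms, each layer can be expressed as a percentage of the host figure. A minimal sketch using only the numbers from the table; the dictionary keys and function name are illustrative.

```python
# Max throughput in MB/s, copied from the table in this post.
results = {
    "Host":      {"read": 260, "write": 150, "seq_read": 191},
    "Parent VM": {"read": 270, "write": 150, "seq_read": 126},
    "Child VM":  {"read": 160, "write": 25,  "seq_read": 64},
}


def percent_of_host(layer: str) -> dict:
    """Each metric for the given layer as a rounded percentage of the host value."""
    host = results["Host"]
    return {metric: round(100 * value / host[metric])
            for metric, value in results[layer].items()}


for layer in ("Parent VM", "Child VM"):
    print(layer, percent_of_host(layer))
```

On these figures the nested child VM retains roughly a sixth of the host's write throughput and about a third of its sequential read throughput.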
Benchmarks-for-nested-VM.jpg
ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

This solution is only available to members.
Many thanks for your advice. Your suggestion to compare disk results between host > VM > nested VM really helped. I'll move the nested VMs to the parent for now and will order an SSD.
Update:

Andrew, per your suggestion, I further migrated the VM to Hyper-V.  Major improvement!! Here are the results (all on HDD):

Time to launch VM:

5 min, 25 sec  --> Nested VMware VM (ESXi VM within Workstation)
3 min, 37 sec  --> 33% less time as a non-nested VMware VM (i.e. within Workstation)
2 min, 50 sec  --> 48% less time on Hyper-V
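
As a quick check on the comparison, the launch times can be converted to seconds and expressed as a reduction relative to the nested baseline. This is just a sketch with illustrative names. Note that reading "5 min, 25 sec" as the decimal 5.25 would give roughly 36% and 52% instead, which is why the conversion to seconds comes first.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds


# Launch times from the list above.
launch = {
    "Nested VMware VM": to_seconds(5, 25),
    "Workstation VM": to_seconds(3, 37),
    "Hyper-V VM": to_seconds(2, 50),
}
baseline = launch["Nested VMware VM"]


def reduction_pct(name: str) -> float:
    """Percentage reduction in launch time relative to the nested baseline."""
    return 100 * (baseline - launch[name]) / baseline


for name, secs in launch.items():
    print(f"{name}: {secs} s, {reduction_pct(name):.0f}% less time than nested")
```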


Nice! From what I've read, this is because Hyper-V is a type-1 (bare-metal) hypervisor, whereas Workstation is type-2.

Andrew, this is the performance boost I was looking for! Many thanks for guiding me towards Hyper-V.

Cheers,
Stephan