Armitage318

Bad performance on VMware ESXi cluster

Hi, I have a VMware ESXi 6.0 cluster with 7 nodes (5 x R710, 2 x R630).
For the past week or two, several customers have been complaining about very slow performance on their VPS (some Windows, some Linux, ...).
So I am wondering whether this could be an issue with my infrastructure.
I don't think this is a RAM or CPU resource bottleneck:

User generated image
For storage, I use 3 x Dell PowerVault MD3200 (SAS disks only, no SSD).
How can I further investigate?
Thank you!
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

"For the past week or two, several customers have been complaining about very slow performance on their VPS (some Windows, some Linux, ...)"

You have to be a little cautious here: what are they comparing their servers to, in terms of performance?

For example, if they are comparing them to a desktop computer, they are probably correct. Consolidation and virtualization are a compromise.

Your hosts don't look overcommitted on CPU or RAM, so it's time to start looking at the performance of the individual VMs: CPU, memory and disk.

Make sure ALL VMs have VMware Tools installed and are using the VMXNET3 interface.

see my EE Article

HOW TO:  Performance Monitor vSphere 4.x or 5.0
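
As a rough illustration of the VMware Tools / VMXNET3 check suggested above, here is a minimal sketch that audits a list of VM records. The record layout (`name`, `tools_status`, `nic_types`) and the VM names are hypothetical; they stand in for whatever inventory export you have and are not taken from this thread or from any VMware tool's output format.

```python
from typing import Dict, List


def vms_needing_attention(inventory: List[Dict]) -> Dict[str, List[str]]:
    """Return {vm_name: [reasons]} for VMs that fail the Tools/VMXNET3 check.

    Each record is assumed (hypothetically) to look like:
    {"name": str, "tools_status": str, "nic_types": [str, ...]}
    """
    findings: Dict[str, List[str]] = {}
    for vm in inventory:
        reasons = []
        # Anything other than a healthy Tools status is flagged.
        status = vm.get("tools_status")
        if status != "toolsOk":
            reasons.append(f"VMware Tools status is {status!r}")
        # Any adapter that is not VMXNET3 (e.g. e1000) is flagged.
        other_nics = [n for n in vm.get("nic_types", []) if n.lower() != "vmxnet3"]
        if other_nics:
            reasons.append("non-VMXNET3 NICs: " + ", ".join(other_nics))
        if reasons:
            findings[vm["name"]] = reasons
    return findings


# Hypothetical sample data, for illustration only.
inventory = [
    {"name": "vps-win-01", "tools_status": "toolsOk", "nic_types": ["vmxnet3"]},
    {"name": "vps-lin-02", "tools_status": "toolsNotInstalled", "nic_types": ["e1000"]},
]

for name, reasons in vms_needing_attention(inventory).items():
    print(name, "->", "; ".join(reasons))
```

The function only reports; deciding whether to install Tools or swap an adapter on a customer VM is still a manual call.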
noci

Did you patch for Spectre or Meltdown around the time your customers started complaining?
The patches might be a cause of serious performance degradation.
(Not applying the patches will leave you vulnerable to unprivileged processes actively reading protected memory, or to guests reading memory on the host or on other guests.)

His physical servers have not been fixed yet!!!

A BIOS update for the R710 is not available yet, and as for the risk..... well.... fetching pieces of memory out of the host!!!!
The OS might have received a patch, which, due to heavy cache cleaning, effectively slows down a system a lot.
The BIOS microcode fix is only part of the solution, and the OS can only fix part as well: operating systems need to use retpolines and explicit branch-prediction cache flushes.

It is most efficient to use both; the cache flushes could be lifted as an emergency measure.
@Armitage318: what build of ESXi are you using?
ASKER CERTIFIED SOLUTION
Ajay Chanana

This solution is only available to members.
What are you using for storage: local, SAN, etc.?
Armitage318

ASKER

@Andrew: I am using ESXi 6.0.0, build 7504637.
@ajay: I checked the values you suggested on my cluster (7 hosts), and sometimes I saw very high numbers (for just a few seconds):

User generated image
User generated image
 

@compdigit44: I am using SAN (3 x Dell PowerVault MD3200i, through iSCSI, 1 Gbit connections)

Thank you all
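
For context on the 1 Gbit iSCSI links mentioned above, here is a back-of-the-envelope sketch of the throughput ceiling of a single link. The 0.9 efficiency factor and the figure of 20 busy VMs are illustrative assumptions only, not measurements from this cluster.

```python
def link_ceiling_mb_per_s(gbit_per_s: float, efficiency: float = 1.0) -> float:
    """Convert a link speed in Gbit/s to decimal MB/s, scaled by an efficiency factor."""
    # 1 Gbit = 1000 Mbit, and 8 bits = 1 byte.
    return gbit_per_s * 1000 / 8 * efficiency


def per_vm_share_mb_per_s(link_mb_per_s: float, busy_vms: int) -> float:
    """Even split of one link's throughput across VMs doing I/O at the same time."""
    if busy_vms <= 0:
        raise ValueError("busy_vms must be positive")
    return link_mb_per_s / busy_vms


raw = link_ceiling_mb_per_s(1.0)                     # theoretical line rate: 125.0 MB/s
usable = link_ceiling_mb_per_s(1.0, efficiency=0.9)  # assumed protocol overhead
share = per_vm_share_mb_per_s(raw, busy_vms=20)      # hypothetical VM count

print(f"raw: {raw:.1f} MB/s, usable (assumed): {usable:.1f} MB/s, per-VM share: {share:.2f} MB/s")
```

Even at the theoretical line rate, one 1 Gbit path carries at most about 125 MB/s in total, so it is worth comparing the observed spikes against how many VMs share each path.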