• Status: Solved

ESXi disk write latency rises above 65 ms every morning between 6 and 7 AM?

Hi All,

I'd like to know what could cause, or what is indicated by, this warning that I get every morning between 6 AM and 7 AM:

Status: Warning (Yellow)
Alarm: Host disk write latency
Time: 6/11/2010 6:26:56 AM
Level of Disk write latency is above 65 Millisecond


ESXi 4.0 with local VMFS datastore of RAID-1
2x 1 TB WDC10EARS 64 MB buffer 7200 rpm SATA-II

It happens only in the morning, when there is no significant workload on the server (it is still outside business hours).

Any suggestions would be greatly appreciated.

3 Solutions
Check for automated antivirus scans, or for backups that are still running at that time.
jjoz (Author) Commented:
Thanks for the reply, man.

This is the model of my hard disk drive: http://www.wdc.com/en/products/products.asp?DriveID=866

I don't know why it happens so consistently within that time frame.
There HAS to be at least one job running between 6 and 7 AM to account for this, so you're just going to have to find it. As a starting point, I'd turn off the virtual machines one by one during this window to see which VM is the culprit; then you will know where to look.

Once you know which machine it is, use the native OS utilities to see which program is consuming the most CPU and I/O, and act accordingly.

jjoz (Author) Commented:
Thanks for the suggestion, man. This is a production web server, so it can't be turned off, but that's a good way to isolate the problem.
If this happened continuously during business hours, I could understand that something was wrong with the hard drive, but unfortunately it doesn't.
Maybe move the clock on each VM forward to 6 AM briefly to see if anything kicks off?
No way is it a hardware problem. Nothing in the way of automated maintenance inside an HDD would run on this sort of schedule. The HDD has no internal clock; it works by cumulative power-on hours. So if it WAS the HDD and you turned off the computer for 30 minutes, the slow window would move to 6:30 - 7:30.
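To make that arithmetic concrete, here is a minimal sketch. It assumes a hypothetical drive-internal task triggered by cumulative power-on time; the function name and the dates are illustrative only.

```python
from datetime import datetime, timedelta

def shifted_window(start, end, downtime):
    """Return the wall-clock window after the host was powered off for `downtime`.

    Assumes the slow period is triggered by cumulative power-on time,
    which stops advancing while the machine is off.
    """
    return start + downtime, end + downtime

# Illustrative values: the 6-7 AM window from the alarm, 30 minutes powered off.
start = datetime(2010, 6, 11, 6, 0)
end = datetime(2010, 6, 11, 7, 0)
new_start, new_end = shifted_window(start, end, timedelta(minutes=30))
print(new_start.strftime("%H:%M"), "-", new_end.strftime("%H:%M"))
```

A window that stays pinned to 6-7 AM across power cycles therefore points to something scheduled by wall-clock time, such as a job inside a VM.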
Also, just use perfmon on Windows, iostat on UNIX/Linux, etc. on each VM during the slowdown. Quite simply, it is slow because the HDD is doing more work. Find the rogue program.
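If iostat is not installed on a Linux guest, roughly the same numbers can be derived from two snapshots of /proc/diskstats. The following is a rough sketch, not a replacement for the native tools; the function names are made up for illustration, and it reports per-device write throughput and average time per write only.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats always counts 512-byte sectors

def parse_diskstats(text):
    """Map device name -> (writes_completed, sectors_written, ms_writing)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or malformed lines
        stats[fields[2]] = (int(fields[7]), int(fields[9]), int(fields[10]))
    return stats

def write_activity(before, after, seconds):
    """Per device: (write KB/s, average ms per write) between two snapshots."""
    result = {}
    for dev, (w1, s1, ms1) in after.items():
        if dev not in before:
            continue
        w0, s0, ms0 = before[dev]
        writes = w1 - w0
        kb_per_s = (s1 - s0) * SECTOR_BYTES / 1024 / seconds
        avg_ms = (ms1 - ms0) / writes if writes else 0.0
        result[dev] = (kb_per_s, avg_ms)
    return result

def sample(interval=5.0, path="/proc/diskstats"):
    """Take two snapshots `interval` seconds apart on a Linux guest."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return write_activity(before, after, interval)
```

Calling `sample()` repeatedly between 6 and 7 AM shows which virtual disk is busy; finding the responsible process still requires the guest's own per-process tools.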
jjoz (Author) Commented:
OK, I'll try that when I get into the office tomorrow morning.
Thanks for the info, man.
Two things: (1) Watch the performance tab on each VM to see what the disk read/write rate is during the problem period. Likely only one of your VMs is causing the issue. Once you identify the troublesome VM, you can dig deeper into what precisely it is doing during the problem period.

(2) Make sure you have a battery-backed write cache (BBWC) on your RAID controller, and that it is configured for "write-back" (as opposed to "write-through") caching. Many performance issues with disk writes can be traced back to RAID controller caching configuration and/or the lack of BBWC.

Hope this helps.
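For point (1), if the per-VM disk write rates are noted down or exported from the performance charts, a short script can rank the VMs for the problem window. This is only a sketch: the sample format, the VM names, and the readings below are invented for illustration and do not come from this host.

```python
from collections import defaultdict
from datetime import datetime, time

def top_writers(samples, start=time(6, 0), end=time(7, 0)):
    """Rank VMs by average write rate (KB/s) inside the daily window.

    `samples` is an iterable of (vm_name, timestamp, write_kbps) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])  # vm -> [sum of rates, sample count]
    for vm, ts, kbps in samples:
        if start <= ts.time() < end:
            totals[vm][0] += kbps
            totals[vm][1] += 1
    averages = {vm: total / count for vm, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)

# Made-up readings for two hypothetical VMs.
samples = [
    ("web01", datetime(2010, 6, 11, 6, 20), 150.0),
    ("web01", datetime(2010, 6, 11, 6, 40), 250.0),
    ("db01", datetime(2010, 6, 11, 6, 20), 4000.0),
    ("db01", datetime(2010, 6, 11, 6, 40), 6000.0),
    ("db01", datetime(2010, 6, 11, 9, 0), 100.0),  # outside the window, ignored
]
ranking = top_writers(samples)
print(ranking[0][0])
```

The VM at the top of the ranking is the one to inspect first with the guest's own tools.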
Michael Worsham (Infrastructure / Solutions Architect) Commented:
The model you listed (WD10EARS) is a WD Caviar Green drive. This is part of the main issue.

For VMware ESX/ESXi, Server, and other virtualized host environments, stick with the WD Caviar Black drives. The 'Black' models run at 7200 RPM and are geared to be faster, and thus more robust under duress. The 'Green' models run at 5400 RPM and have special power-saving features, including turning off the drive when not in use. Servers, especially VM hosts, are always active, so they cannot take advantage of these energy-saving features and instead cause heavy wear and tear on the drives that support them.

jjoz (Author) Commented:
Thanks, man!
jjoz (Author) Commented:
Here's an update on the problem.

I forgot to include the actual screenshot from my ESXi host. Here it is; it averages above 40 ms for both disk command latency and disk commands issued.

I know the performance is very slow, but I couldn't find anything peculiar in CPU or memory contention.

Is this normal, or is it not OK for a low-load web server?
