Solved

ESXi 4.1 post upgrade disk latency issue

Posted on 2011-03-22
Last Modified: 2012-06-27
Upgraded our production server from ESXi 4.0 to 4.1 over the weekend. Since the upgrade I have been seeing an issue where disk latency to our iSCSI SAN will intermittently spike dramatically for about 40 seconds. When this happens all other VM performance metrics flatline to 0.

[Image: VM datastore performance graph]

[Image: VM CPU performance graph]

As our SBS 2008 VM is one of those affected, this means the network just stops and then starts up again. At the moment the issue is just annoying, but I'd like to get it sorted before it escalates.

This issue did not exist while we were running ESXi 4.0 on this host. Any pointers on where to look to resolve it are much appreciated.
Question by:siht
 
LVL 16

Accepted Solution

by:
danm66 earned 300 total points
ID: 35195796
Start reviewing the logs and look for events at the times you are seeing the spikes. Also, if you can catch it in the middle of one of these periods, log into the console and run a vmkping against the iSCSI host. If those vmkpings fail, check your network switches for error messages too.
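
A minimal sketch of that log review, assuming the log lines carry the bracketed timestamp prefix seen in the vpxa excerpt quoted later in this thread. The function names, the spike times, and the 60-second window are illustrative assumptions, not part of any VMware tooling.

```python
import re
from datetime import datetime, timedelta

# Matches the prefix "[YYYY-MM-DD HH:MM:SS.mmm <thread> <level> '<source>'] <message>"
# as it appears in the vpxa log lines quoted in this thread.
LINE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ (\w+) '([^']*)'\] (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return ts, m.group(2), m.group(4)


def events_near(lines, spike_times, window_seconds=60):
    """Collect log events that fall within window_seconds of any spike time."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines in a different format
        ts, level, message = parsed
        if any(abs(ts - spike) <= window for spike in spike_times):
            hits.append((ts, level, message))
    return hits
```

Feed it the lines of a log file and a list of `datetime` values noted from the performance graphs. Anything it returns is a candidate for closer reading, not proof of cause.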
 
LVL 21

Assisted Solution

by:mastoo
mastoo earned 200 total points
ID: 35199009
You might work your way backwards from existing metrics. Look at your SAN management activity: is it getting hit by something and actually experiencing latency (doubtful)? Then look at the iSCSI NIC traffic, datastore traffic/latency, etc.
 
LVL 6

Author Comment

by:siht
ID: 35212544
I could vmkping the iSCSI IP from the console during the spikes, and there was nothing jumping out at me from the logs.
There was a bit of this:

[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 5 in the vmList
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 5 not found
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 8 in the vmList
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 8 not found
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 9 in the vmList
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 9 not found
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 5 in the vmList
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 5 not found
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 8 in the vmList
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 8 not found
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 9 in the vmList
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 9 not found


But it was not especially associated with the latency events.
I have migrated the SBS VM off of the iSCSI storage and restarted the storage. The errors above have gone away and while the latency on the iSCSI datastore still spikes it never hits higher than 12 ms, as opposed to the 15000 ms spikes I was seeing.
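
To tell the remaining small bumps apart from the original multi-second stalls, exported latency samples can be grouped into spikes. This is only a sketch: the `(seconds, latency_ms)` sample format, the 1000 ms threshold, and the sample values in the usage note are assumptions for illustration, not data from this host.

```python
def find_spikes(samples, threshold_ms=1000.0):
    """Group consecutive over-threshold samples into (start, end, peak_ms) tuples.

    samples is an iterable of (time_in_seconds, latency_ms) pairs in time order.
    """
    spikes = []
    current = None  # [start, end, peak] of the spike being built, if any
    for t, latency in samples:
        if latency >= threshold_ms:
            if current is None:
                current = [t, t, latency]
            else:
                current[1] = t
                current[2] = max(current[2], latency)
        elif current is not None:
            # First sample back under the threshold closes the spike.
            spikes.append(tuple(current))
            current = None
    if current is not None:
        spikes.append(tuple(current))
    return spikes
```

For example, with hypothetical samples taken every 20 seconds, `find_spikes([(0, 5), (20, 12), (40, 15000), (60, 14000), (80, 8)])` reports one spike from t=40 to t=60 with a 15000 ms peak, while the 12 ms reading is ignored.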

I'll migrate the VMs back to iSCSI slowly and keep an eye on it. This issue occurred under standard load; we'd been running those VMs on the iSCSI LUN for months. I'd like to have gotten to the bottom of it, but I'm glad it's gone away for now.

 
LVL 16

Expert Comment

by:danm66
ID: 35212579
-did you patch the VM recently?

-is it running on a snapshot?

-you might want to set up a syslog server to get better logging.  You can use the vMA appliance to do that job if you want.
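
For a quick test of remote logging before committing to an appliance, a bare UDP listener is enough to confirm that messages arrive. This is a rough sketch, not a replacement for a real syslog server: the function names and the unprivileged port 5514 are assumptions, and the host would still need to be pointed at the listener through its own syslog settings.

```python
import socket


def open_listener(host="0.0.0.0", port=5514):
    """Bind a UDP socket for incoming syslog datagrams (port is an assumption)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_messages(sock, count, timeout=5.0):
    """Read `count` datagrams and return them as (sender_ip, text) pairs.

    Raises socket.timeout if nothing arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    messages = []
    for _ in range(count):
        data, addr = sock.recvfrom(8192)
        text = data.decode("utf-8", errors="replace").rstrip()
        messages.append((addr[0], text))
    return messages
```

Standard syslog uses UDP port 514, which normally needs elevated privileges to bind; the higher port here just keeps the sketch runnable as an ordinary user.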
 
LVL 6

Author Comment

by:siht
ID: 35212595
-did you patch the VM recently?

No, and this was affecting all VMs on the iSCSI storage; the SBS VM was just the biggest issue. The issue was definitely triggered by upgrading the host to ESXi 4.1.

-is it running on a snapshot?

No snaps on production VMs.

-you might want to set up a syslog server to get better logging.  You can use the vMA appliance to do that job if you want.

Way ahead of you there :)
 
LVL 6

Author Closing Comment

by:siht
ID: 35212618
Issue was not resolved but answers were still helpful.