  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 2335

ESXi 4.1 post upgrade disk latency issue

Upgraded our production server from ESXi 4.0 to 4.1 over the weekend. Since the upgrade I have been seeing an issue where disk latency to our iSCSI SAN will intermittently spike dramatically for about 40 seconds. When this happens all other VM performance metrics flatline to 0.

VM datastore performance graph

VM CPU performance graph

As our SBS 2008 VM is one of those affected, this means the network just stops and then starts up again. At the moment the issue is just annoying, but I'd like to get it sorted before it escalates.

This issue did not exist while we were running ESXi 4.0 on this host. Any pointers on where to look to resolve it are much appreciated.
Asked by: siht
2 Solutions
 
Danny McDaniel (Clinical Systems Analyst) commented:
Start reviewing the logs and look for events at the times you are seeing the spikes.  Also, if you can catch it in the middle of one of these periods, log into the console and do a vmkping to the iSCSI host.  If those vmkpings fail, then you should check your network switches for error messages, too.
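As a rough illustration of "look for events at the times you are seeing the spikes", here is a minimal Python sketch that pulls out the log lines stamped within a window around a spike. It assumes log lines start with a bracketed timestamp such as `[2011-03-23 10:27:12.729 ...`, as in the excerpt posted later in this thread; the function name `lines_near` and the 60-second default margin are made up for the example, and other log files may use a different format.

```python
import re
from datetime import datetime, timedelta

# Matches the leading "[YYYY-MM-DD HH:MM:SS.mmm" of a log line.
# Assumption: the log uses this timestamp layout; adjust for other formats.
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def lines_near(log_lines, spike_time, margin_seconds=60):
    """Return the log lines stamped within margin_seconds of spike_time."""
    margin = timedelta(seconds=margin_seconds)
    hits = []
    for line in log_lines:
        match = STAMP.match(line)
        if not match:
            continue  # skip lines without a recognisable timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        if abs(stamp - spike_time) <= margin:
            hits.append(line)
    return hits
```

To use it, read the log into a list of lines and pass the time of a spike taken from the performance graph, for example `lines_near(lines, datetime(2011, 3, 23, 10, 27, 0))`. If the same messages show up just as often outside the spike windows, they are probably unrelated to the latency.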
 
mastoo commented:
You might work your way backwards from the existing metrics.  Look at your SAN management activity: is it getting hit by something and actually experiencing latency (doubtful)?  Then look at the iSCSI NIC traffic, datastore traffic/latency, etc.
 
siht (Author) commented:
I could vmkping the iSCSI IP from the console during the spikes, and nothing jumped out at me from the logs.
There was a bit of this:

[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 5 in the vmList
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 5 not found
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 8 in the vmList
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 8 not found
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 9 in the vmList
[2011-03-23 10:27:12.729 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 9 not found
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 5 in the vmList
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 5 not found
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 8 in the vmList
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 8 not found
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaMoVm::CheckMoVm] did not find a VM with ID 9 in the vmList
[2011-03-23 10:27:22.727 1BBEEB90 verbose 'App'] [VpxaAlarm] VM with vmid = 9 not found


But it was not especially associated with the latency events.
I have migrated the SBS VM off of the iSCSI storage and restarted the storage. The errors above have gone away and while the latency on the iSCSI datastore still spikes it never hits higher than 12 ms, as opposed to the 15000 ms spikes I was seeing.
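For keeping an eye on it afterwards, a small Python sketch like the one below can flag spike events in exported latency samples. It is only an illustration: the sample layout (seconds since start, latency in ms), the function name `find_spikes`, and the 1000 ms threshold are assumptions, not anything produced by vSphere itself.

```python
def find_spikes(samples, threshold_ms=1000):
    """Group consecutive samples above threshold_ms into spike events.

    samples: list of (seconds_since_start, latency_ms) tuples in time order.
    Returns a list of (start_seconds, end_seconds, peak_latency_ms) tuples.
    """
    events = []
    current = None  # [start, end, peak] of the spike being tracked
    for t, latency in samples:
        if latency > threshold_ms:
            if current is None:
                current = [t, t, latency]
            else:
                current[1] = t
                current[2] = max(current[2], latency)
        elif current is not None:
            # Latency dropped back under the threshold: close the event.
            events.append(tuple(current))
            current = None
    if current is not None:
        events.append(tuple(current))
    return events


# Hypothetical samples: a normal baseline with one large spike.
samples = [(0, 8), (20, 12), (40, 15000), (60, 14000), (80, 9), (100, 11)]
print(find_spikes(samples))
```

With the threshold set well above the normal 12 ms peaks, only the multi-second spikes are reported, which makes it easier to see whether they return as VMs are moved back.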

I'll migrate the VMs back to iSCSI slowly and keep an eye on it. This issue occurred under standard load; we'd been running those VMs on the iSCSI LUN for months. I'd like to have gotten to the bottom of it, but I'm glad it's gone away for now.
 
Danny McDaniel (Clinical Systems Analyst) commented:
-did you patch the VM recently?

-is it running on a snapshot?

-you might want to set up a syslog server to get better logging.  You can use the vMA appliance to do that job if you want.
 
siht (Author) commented:
-did you patch the VM recently?

No, and this was affecting all VMs on the iSCSI storage; the SBS VM was just the biggest issue. The issue was definitely triggered by upgrading the host to ESXi 4.1.

-is it running on a snapshot?

No snapshots on production VMs.

-you might want to set up a syslog server to get better logging.  You can use the vMA appliance to do that job if you want.

Way ahead of you there :)
 
siht (Author) commented:
Issue was not resolved but answers were still helpful.