Solved

VMware Degraded I/O with degraded virtual disk

Posted on 2013-10-31
Last Modified: 2016-11-23
We have ESXi 5.0 running on a Dell PowerEdge R510 host with a PERC 6/i RAID controller.  Eight physical disks are configured as two virtual disks: one RAID 10 and one RAID 5.  All are 500 GB 7.2K SAS drives.  ESXi itself is installed on a flash drive.

We had a drive go into predicted fail a couple of weeks ago.  Initially there was not much impact at all, but the drive seems to have deteriorated further and, although it still has not failed, disk I/O for the entire server has slowed to a crawl.  We had a web server VM hosting a small website with an instance of SQL Express, and the website would time out on most database connection attempts.  This VM was on the healthy RAID 10 VD.

The question is: why would VMs on the healthy RAID 10 virtual disk be affected by the degraded state of the other (RAID 5) virtual disk?

In doing a little research, I read that if a "predicted failure" drive has a significant number of bad blocks, I/O performance can degrade while those blocks are marked bad.  So, we "offlined" the disk in question (reluctantly, knowing the risks) thinking that the drive was pretty much failed anyway.

That was more than 12 hours ago, and I/O is still abysmal on the entire server.

We have a replacement drive scheduled for delivery today, but I'm wondering if we need to be prepared for further corrective action.  Is this expected behavior or does it indicate further issues?  

We have Dell OMSA installed within ESXi and no other trouble is reported by the system.
Question by:gatorIT
3 Comments
 
LVL 124

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 2000 total points
ID: 39613659
We've seen this when RAID controllers constantly try to read and write bad blocks.

This seems to take overall performance away from all the good storage.

Get the disk replaced ASAP. When we see a disk go into predicted fail, we escalate it to Dell/HP the very same day for a swap-out within 4 hours.
 
LVL 22

Expert Comment

by:Nick Rhode
ID: 39614099
Although they are different RAID configurations, both are operated by the same controller.  Regardless of the defective disk (get it replaced right away anyway), you will probably still see the I/O problem because, I believe, the PERC 6/i controller does not have a write cache, so it reads and writes at the same time.  This causes a performance issue that shows up inside the VMs as what seems like a short delay.  I recommend a PERC H710 or higher, which has a write cache, to resolve that issue.  After replacing the drive, give Dell a call; they will most likely recommend a better RAID controller to resolve those I/O issues.
 

Author Comment

by:gatorIT
ID: 39620228
The PERC 6/i has a write-back cache and battery.  It's the SAS 6/iR that does not.

The drive has been replaced. I think this will be the last RAID 5 array we ever use: 72 hours in, we're still only at 75% of the rebuild.  The extra bit of storage from RAID 5 just isn't worth the performance hit and the time it takes to rebuild a (relatively) small 1.5 TB array.
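
For anyone weighing the same trade-off, here is a rough back-of-the-envelope sketch of the numbers above. It assumes the RAID 5 array is four of the 500 GB drives (inferred from the 1.5 TB figure, not stated outright) and that the rebuild keeps going at a constant rate, which a busy controller may well not do. The function names are made up for illustration.

```python
def raid5_usable_gb(disks: int, disk_gb: float) -> float:
    """Usable capacity of RAID 5: one disk's worth of space goes to parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disks - 1) * disk_gb


def raid10_usable_gb(disks: int, disk_gb: float) -> float:
    """Usable capacity of RAID 10: half the disks hold mirror copies."""
    if disks < 4 or disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return (disks // 2) * disk_gb


def rebuild_hours_remaining(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation of rebuild time left (assumes a constant rate)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return total_hours - elapsed_hours


if __name__ == "__main__":
    # Assumed layout: 4 x 500 GB per virtual disk (hypothetical split of the 8 drives).
    print("RAID 5 usable GB: ", raid5_usable_gb(4, 500))
    print("RAID 10 usable GB:", raid10_usable_gb(4, 500))
    # 72 hours elapsed at 75% complete, as reported above.
    print("Rebuild hours left:", rebuild_hours_remaining(72, 0.75))
```

Under those assumptions, RAID 5 buys 500 GB over RAID 10 on the same four drives, and a rebuild that is 75% done after 72 hours would need roughly another day to finish.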
Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article outlines why you need to choose a backup solution that protects your entire environment – including your VMware ESXi and Microsoft Hyper-V virtualization hosts – not just your virtual machines.
Giving access to ESXi shell console is always an issue for IT departments to other Teams, or Projects. We need to find a way so that teams can use ESXTOP for their POCs, or tests without giving them the access to ESXi host shell console with a root …
Teach the user how to configure vSphere clusters to support the VMware FT feature Open vSphere Web Client: Verify vSphere HA is enabled: Verify netowrking for vMotion and FT Logging is in place or create it: Turn On FT for a virtual machine: Verify …
Video by: ITPro.TV
In this episode Don builds upon the troubleshooting techniques by demonstrating how to properly monitor a vSphere deployment to detect problems before they occur. He begins the show using tools found within the vSphere suite as ends the show demonst…

885 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question