Solved

VMware Degraded I/O with degraded virtual disk

Posted on 2013-10-31
Last Modified: 2016-11-23
We have ESXi 5.0 running on a Dell PowerEdge R510 host with PERC 6/i RAID controller.  Eight physical disks configured as two virtual disks - one RAID 10 and one RAID 5.  Incidentally all 500 GB SAS (7.2K) drives.  VMware is installed on a flash drive.

We had a drive go into predicted failure a couple of weeks ago.  Initially there was not much impact at all, but it seems the drive has deteriorated further and, although it still has not failed, disk I/O for the entire server has slowed to a crawl.  We had a web server VM hosting a small website with an instance of SQL Express, and the website would time out on most database connection attempts.  This VM was on the healthy RAID 10 VD.

The question is: why would VMs on the healthy RAID 10 virtual disk be impacted by the degraded state of the other (RAID 5) virtual disk?

In doing a little research, I read that if a "predicted failure" drive has a significant number of bad blocks, I/O performance can degrade while those blocks are marked bad.  So, we "offlined" the disk in question (reluctantly, knowing the risks) thinking that the drive was pretty much failed anyway.

That was more than 12 hours ago and still abysmal I/O on the entire server.

We have a replacement drive scheduled for delivery today, but I'm wondering if we need to be prepared for further corrective action.  Is this expected behavior or does it indicate further issues?  

We have Dell OMSA installed within ESXi and no other trouble is reported by the system.
Question by:gatorIT
3 Comments
 
LVL 123

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 2000 total points
ID: 39613659
We've seen this when RAID controllers constantly try to read and write bad blocks.

This seems to take overall performance away from all the good storage on the same controller.

Get the disk replaced ASAP. When we see a disk go into predicted failure, we escalate it to Dell/HP the very same day for a swap-out within 4 hours.
 
LVL 22

Expert Comment

by:Nick Rhode
ID: 39614099
Although they are different RAID configurations, both virtual disks are operated from the same controller.  Regardless of the defective disk (get it replaced right away anyway), you will probably still see the I/O problem because, I believe, the PERC 6/i controller does not have a write cache, so it reads and writes at the same time.  This causes a performance issue that shows up inside the VMs as what seems like a short delay.  I recommend the PERC H710 or higher, which has a write cache, to resolve that issue.  After replacing the drive, give Dell a call; they will most likely recommend a better RAID controller to resolve those I/O issues.
 

Author Comment

by:gatorIT
ID: 39620228
The PERC 6/i has write-back cache and a battery.  It's the SAS 6/iR that does not.

The drive has been replaced.  I think this will be the last RAID 5 array we ever use.  72 hours in, we're still only at 75% in the rebuilding process.  The extra bit of storage from RAID 5 just isn't worth the performance hit and the time to rebuild a (relatively) small 1.5 TB array.
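
For anyone weighing the same trade-off, here is a rough sketch of the arithmetic behind the comment above. It assumes the eight 500 GB disks are split four and four between the two virtual disks (consistent with the 1.5 TB RAID 5 figure, but not stated outright in the thread), that the rebuild progresses linearly, and that the percentage refers to the capacity of the single replaced disk. The function names are illustrative only and do not come from any Dell or VMware tool.

```python
# Figures taken from the thread: 500 GB disks, 72 hours elapsed, 75% rebuilt.
DISK_GB = 500
DISKS_PER_ARRAY = 4  # assumption: eight disks split evenly across two virtual disks


def raid5_usable_gb(disks: int, disk_gb: int) -> int:
    """RAID 5 gives up one disk's worth of capacity to parity."""
    return (disks - 1) * disk_gb


def raid10_usable_gb(disks: int, disk_gb: int) -> int:
    """RAID 10 mirrors every disk, so half the raw capacity is usable."""
    return (disks // 2) * disk_gb


def projected_rebuild_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Total rebuild time if progress stays linear (an assumption)."""
    return elapsed_hours / fraction_done


def rebuild_rate_mb_per_s(disk_gb: int, elapsed_hours: float, fraction_done: float) -> float:
    """Average rate at which the replaced disk is being rewritten (decimal MB)."""
    rebuilt_mb = disk_gb * 1000 * fraction_done
    return rebuilt_mb / (elapsed_hours * 3600)


if __name__ == "__main__":
    r5 = raid5_usable_gb(DISKS_PER_ARRAY, DISK_GB)
    r10 = raid10_usable_gb(DISKS_PER_ARRAY, DISK_GB)
    print(f"RAID 5 usable:  {r5} GB")
    print(f"RAID 10 usable: {r10} GB")
    print(f"Extra space from RAID 5: {r5 - r10} GB")
    print(f"Projected total rebuild: {projected_rebuild_hours(72, 0.75):.0f} hours")
    print(f"Average rebuild rate: {rebuild_rate_mb_per_s(DISK_GB, 72, 0.75):.2f} MB/s")
```

Under those assumptions, RAID 5 buys 500 GB over RAID 10 on the same four disks, the rebuild would take about 96 hours in total, and the replaced disk is being rewritten at roughly 1.45 MB/s on average. That is far below what a 7.2K drive can sustain, which suggests the rebuild is competing heavily with production I/O.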