Solved

Dell PowerEdge T310 + ESXi v4: Lost access to volume due to connectivity issues

Posted on 2010-09-06
Last Modified: 2012-10-17
We have a Dell PowerEdge T310 running ESXi v4.0 and two production VMs (one Windows Server 2003 and one Ubuntu Linux). All of the storage for the server is local, on two 1 TB SATA drives. It ran flawlessly for approximately 200 days (since it was installed) but, beginning yesterday, it has started to go offline at random. In the event log for the server, I see a series of messages: "Lost access to volume <long number> (datastore1) due to connectivity issues. Recovery attempt is in progress and outcome will be reported shortly." This error has suddenly started showing up at random intervals, every few minutes.
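To get a feel for how often the error fires, the event log text can be tallied with a short script. This is only a rough sketch: it assumes the events have been exported as plain text lines, and the volume ID format (hex digits and dashes) and the `count_lost_access` helper name are assumptions for illustration, not anything ESXi provides.

```python
import re
from collections import Counter

# Pattern built from the message text quoted above. The volume ID format
# (hex digits and dashes) is an assumption; adjust it to match the real log.
LOST_ACCESS = re.compile(
    r"Lost access to volume (?P<volume>[0-9a-fA-F-]+) "
    r"\((?P<datastore>[^)]+)\) due to connectivity issues"
)


def count_lost_access(lines):
    """Return a Counter mapping datastore name to the number of lost-access events."""
    counts = Counter()
    for line in lines:
        match = LOST_ACCESS.search(line)
        if match:
            counts[match.group("datastore")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; the volume ID below is made up.
    sample = [
        "Lost access to volume 4b8f2c1a-0d3e5f60-1a2b-001122334455 (datastore1) "
        "due to connectivity issues. Recovery attempt is in progress and outcome "
        "will be reported shortly.",
        "Some unrelated event",
    ]
    print(count_lost_access(sample))
```

If the counts climb steadily rather than in short bursts, that would fit an ongoing storage problem better than a one-off glitch.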
Question by:cybertechcafe
10 Comments
 

Author Comment

by:cybertechcafe
ID: 33611041
Googling now, but is there any way to get to the service console remotely (i.e., without having to have hands on the physical console)?
 
LVL 5

Expert Comment

by:MrN1c3
ID: 33611186
You can't hit the service console remotely if it's running ESXi. Do you have a DRAC card on your T310?

Author Comment

by:cybertechcafe
ID: 33611266
To be honest, I'm not sure. I'm not terribly familiar with the environment (yet) and am still feeling my way around. Looking at everything else, though, I suspect that the answer is no. If that's my only option, it looks like it's time for a site visit.
 

Author Comment

by:cybertechcafe
ID: 33611559
I believe that a site visit is going to be my best option here (there are obviously a few things that I need to discover about the site).  My plan at this point is the following:

- Check to make certain that the box has the latest BIOS
- Check to make certain that the firmware is up-to-date on the box
- Start it and see if we still see the errors (a lot of what I'm seeing seems to indicate that this is either a hardware issue or a firmware issue.  Since it has been working well for so long and, to my knowledge, there have been no changes, I fear that it's more likely hardware than firmware, but I am hoping)
- If the errors are still there, head down the road below
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009557
- My only concern with the link above is that it seems very specific to shared storage and Fibre Channel, but this is an onboard RAID controller (Dell PERC) and not a SAN or NAS device.
 

Author Comment

by:cybertechcafe
ID: 33611787
The drives on the server are mirrored. We have another ESXi server available that we can use as a stand-in while this one is down. I would like to copy the VMs from the semi-dead ESXi server to the stand-in server, but I am unable to do so from the datastore browser (I keep getting I/O errors). Is it possible for me to remove one of the drives and, using a USB drive cage or something similar, mount it in something like Linux and just copy the files to the other server? Will Linux be able to see the VMFS?

Removed reference to illegal CD.

rindi,
EE ZA Storage
 

Author Comment

by:cybertechcafe
ID: 33612175
One other thing that I just noticed is that, from Host -> Configuration -> Health and Status, there is a Warning and the status of the drive controller seems to be flapping (unknown / normal).
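One way to put a number on "flapping" is to sample the reported status at intervals and count how often it changes. The sketch below assumes the statuses have been written down by hand or collected somehow as a list of strings; the helper names and the threshold of 3 are hypothetical choices for illustration, not vSphere settings.

```python
def count_transitions(samples):
    """Count how many times consecutive status samples differ."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def is_flapping(samples, threshold=3):
    """Treat the sensor as flapping if it changes state at least `threshold` times.

    The threshold is an arbitrary assumption; pick one that suits the sampling rate.
    """
    return count_transitions(samples) >= threshold


if __name__ == "__main__":
    # Hypothetical readings of the drive controller status over time.
    readings = ["normal", "unknown", "normal", "normal", "unknown"]
    print(count_transitions(readings), is_flapping(readings))
```

A steady "unknown" would count zero transitions, which helps separate a sensor that is simply not reporting from one that keeps changing state.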
 

Author Comment

by:cybertechcafe
ID: 33612181
Also, the box has dual power supplies and the status of both is 'unknown'. I don't know whether they normally report 'normal' or whether 'unknown' is typical.
 

Author Comment

by:cybertechcafe
ID: 33613173
OK, just an update. We arrived on site to begin the [long, arduous] process of recovery and rebooted the server a couple of times in the process. On one of these reboots, we noted that the array was in a 'resyncing' state. We let ESXi boot and went to the Health Status and, this time, noted that the storage controller had a warning and one of the drives had the status 'rebuilding'. What's more, both of the VMs on the server had started and there were *no* errors. We have shut down the VMs and are using the datastore browser to download them to another workstation (something that wasn't possible before; we kept getting I/O errors), and we are getting good throughput and no errors. At this point, I have *no idea* what has changed on the box, but it's running very well at the moment and we are moving bits across the drive controller with no problems.
 

Accepted Solution

by:
cybertechcafe earned 0 total points
ID: 33614055
OK, the initial problem of not having connectivity to the hard drives seems to be behind us. At the end of the day, nothing was really done to *fix* the problem; it just started working again. We did find that something (still trying to find out what) caused the RAID array (mirror) to degrade, and I suspect that the degraded array was a big part of the problem (it was understandably very slow while it was attempting to rebuild).
