I/O Error on ESX server (linux)

Where I work, we have a series of virtual servers running on 4 ESX servers (v3.5). The servers were left unattended for 5 days, and when we (the server administrators) came back in today, we noticed that one of the servers was no longer in our data center. We can still access the 3 VMs on that server, but have shut 2 of them down to save resources while we figure out how we are going to migrate the remaining one onto our other ESX servers. The remaining one is our forest primary DC.

While we were investigating the issue, we found that there was a bad sector on sd(8,2). What we are trying to figure out is if sd(8,2) is one of the HDDs or if it is related to our RAID controller.
minthor11Asked:

kyleb84Commented:
A bad sector is physical damage to the hard disk: a part of it that can no longer be reliably used.

I suggest you replace it A.S.A.P before another drive fails.

kyleb84Commented:
This should explain better:

http://en.wikipedia.org/wiki/Bad_sector
minthor11Author Commented:
Well, our theory was that since we have it set up for RAID 1 mirroring, the other HDD should have taken over if one of the disks failed. It doesn't make sense that both HDDs failed at the same time on the same sector. Also, with the limited experience with Linux that I have, the drives are all listed as sda, sdb, sdc, etc., and the partitions are listed as a number following the drive name (i.e. sda2 or sdb1). I've never seen sd(8,2) before, and as such I don't know which HDD has gone bad.
Paul SolovyovskySenior IT AdvisorCommented:
Was the error on a virtual machine or the host? I would run the Dell OpenManage agents on the ESX server to get more data if possible.
kyleb84Commented:
sd(8,2) is a (LUN,Partition #)

(SCSI numbering)

kyleb84Commented:
At a guess I'd say sdb
kyleb84Commented:
Sorry, my bad: sda, since LUN 7 is usually the controller.
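
As a cross-check on that guess: Linux kernel messages of the form sd(8,2) normally print the block device's major and minor numbers rather than a LUN. Major 8 is the first SCSI disk (sd) major, and each disk gets 16 minors (0 for the whole disk, 1-15 for its partitions). Assuming that is the notation in this log, a minimal sketch of the mapping looks like this. The function name `sd_name` is made up for illustration and is not part of any ESX or Linux tool.

```python
import string

SD_MAJOR = 8          # first SCSI disk major number in the Linux device list
MINORS_PER_DISK = 16  # minor 0 = whole disk, minors 1-15 = its partitions


def sd_name(major, minor):
    """Translate a (major, minor) pair from a kernel message into an sd name.

    Hypothetical helper: only handles major 8, which covers sda through sdp.
    """
    if major != SD_MAJOR:
        raise ValueError("only major 8 (first SCSI disk major) is handled")
    if not 0 <= minor < 256:
        raise ValueError("minor number must be between 0 and 255")
    disk_index, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk_index]
    return name if partition == 0 else f"{name}{partition}"


print(sd_name(8, 2))   # sda2
print(sd_name(8, 17))  # sdb1
```

Under that assumption, sd(8,2) would be the second partition on sda, which matches the corrected guess above. Note that behind a hardware RAID controller, sda is typically the logical volume the controller presents, not an individual physical disk, so the controller's own management tools are still needed to tell which member drive is at fault.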