Virtual Machines disappeared after rebooting VMware host

[Image: vCenter]
I applied the LSI provider .vib file on an ESXi v5.5 host. At the end of the install, it said "Reboot: Required", so I rebooted the host. Now I don't see any of my VMs, and they appear to be inaccessible.
Have I lost all my VMs?
What can make a datastore inaccessible when I did not delete it?
Asked by sglee

sglee (Author) commented:
I just logged into the host 192.168.1.2 using PuTTY and ran some CLI commands, but got no useful results.

login as: root
Using keyboard-interactive authentication.
Password:

The ESXi Shell can be disabled by an administrative user. See the
vSphere Security documentation for more information.
~ # ls
altbootbank      lib64            sbin             var
bin              local.tgz        scratch          vmfs
bootbank         locker           store            vmimages
bootpart.gz      mbr              tardisks         vmupgrade
dev              opt              tardisks.noauto  vsantraces
etc              proc             tmp
lib              productLocker    usr
~ # ./MegaCli -AdpAllInfo -aALL
-sh: ./MegaCli: not found
~ # cd opt
/opt # cd lsi/MegaCLI
/opt/lsi/MegaCLI # ./MegaCli -AdpAllInfo -aALL


Exit Code: 0x01
/opt/lsi/MegaCLI # ./MegaCli -AdpAllInfo -aALL


Exit Code: 0x01
/opt/lsi/MegaCLI # ls
MegaCli         MegaSAS.log     libstorelib.so
/opt/lsi/MegaCLI # ./MegaCli I -CfgDsply -Aall
Invalid input at or near token I

Exit Code: 0x01

/opt/lsi/MegaCLI # ./MegaCli .PDList n-aAll
Invalid input at or near token .PDList                                          

Exit Code: 0x01
jmcg (Owner) commented:
It would help to know how your datastores were provisioned before the change. Local disk?
sglee (Author) commented:
It had one datastore on six 600 GB drives in RAID 10. I discovered over the weekend that one of the six HDs had gone bad, but I did not have time to replace it today. Since the VMware host did not come back up properly after the reboot, I am guessing that the RAID 10 is broken due to a second HD failure. I will find out when I get to my office in the morning.
I will report back.
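
To make the "second HD failure" guess concrete: a RAID 10 set is lost only when both disks of the same mirror pair fail. Below is a rough sketch of that rule for a six-disk array. The function name `raid10_survives` is made up for illustration, and the pairing (0,1), (2,3), (4,5) is an assumption; the controller decides the real pairing, so check its configuration before trusting this.

```shell
# Sketch: does a six-disk RAID 10 survive a given set of failed disks?
# ASSUMPTION: mirror pairs are (0,1), (2,3), (4,5). The real pairing is
# chosen by the RAID controller and may differ.
raid10_survives() {
    # Arguments: zero-based indexes of the failed disks, e.g. "4" or "4 5".
    if [ "$#" -eq 0 ]; then
        echo "array healthy"
        return 0
    fi
    seen_pairs=" "
    for disk in "$@"; do
        pair=$((disk / 2))      # disks 2k and 2k+1 share mirror pair k
        case "$seen_pairs" in
            *" $pair "*)
                # Both halves of one mirror are gone, so the data is gone.
                echo "array lost"
                return 1
                ;;
        esac
        seen_pairs="$seen_pairs$pair "
    done
    echo "array degraded but intact"
}
```

For example, `raid10_survives 4` reports a degraded but intact array, while `raid10_survives 4 5` reports a lost array under the assumed pairing.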
jmcg (Owner) commented:
A failed RAID would certainly give a result like this. Good luck!
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Three things here. What driver did you apply?

Does the host still see the storage controller?

Are the server and storage controller on the HCL?

If two disks have failed, the datastore may be gone: RAID 10 survives a second failure only if it is in a different mirror pair (RAID 6, by comparison, tolerates two disk failures but no more)...

OR...

I do not believe you have lost your VMs, but the host has lost access to the storage controller, because the support for the controller is missing.
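
One way to check the "does the host still see the storage controller" question from the ESXi shell is to look at which driver each HBA is bound to. The sketch below parses the output of `esxcli storage core adapter list`; the helper names are made up for illustration, the column layout (HBA name first, driver second, after two header lines) is assumed, and `megaraid_sas` in the usage note is only a guess at the driver name for this controller.

```shell
# Reads `esxcli storage core adapter list` output on stdin and prints
# "<hba> <driver>" for every adapter row.
# ASSUMPTION: two header lines, then rows with the HBA name in column 1
# and the driver in column 2.
list_adapter_drivers() {
    awk 'NR > 2 && NF >= 2 { print $1, $2 }'
}

# Succeeds (exit 0) if some adapter in that output uses the named driver.
adapter_has_driver() {
    driver="$1"
    list_adapter_drivers |
        awk -v d="$driver" '$2 == d { found = 1 } END { exit !found }'
}
```

On the host this could be used as `esxcli storage core adapter list | adapter_has_driver megaraid_sas && echo "controller visible"`. Running `esxcli software vib list | grep -i lsi` alongside it shows which LSI packages are actually installed.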
sglee (Author) commented:
[Image: RAID-Controller-Error.jpg]
Here is what I see this morning.
Should I go into the RAID configuration?
Is there a way to recover the RAID?


Last night I ran a CLI command to check the status of the HDs. The result is below:

*******************************************************************
Adapter #0

Slot Number: 0
Firmware state: Online, Spun Up
Connected Port Number: 2(path0)
Inquiry Data: SEAGATE ST3600057SS     00086SL0XTZX            

Slot Number: 1
Firmware state: Online, Spun Up
Connected Port Number: 1(path0)
Inquiry Data: SEAGATE ST3600057SS     00086SL1Y72B            

Slot Number: 2
Firmware state: Online, Spun Up
Connected Port Number: 0(path0)
Inquiry Data: SEAGATE ST3600057SS     00086SL0VXGL            


Slot Number: 4
Firmware state: Online, Spun Up
Connected Port Number: 3(path0)
Inquiry Data: SEAGATE ST3600057SS     00086SL1YAAM            


Slot Number: 5
Firmware state: Failed
Connected Port Number: 4(path0)
Inquiry Data: SEAGATE ST3600057SS     00086SL1XDW3            


Slot Number: 6
Firmware state: Online, Spun Up
Connected Port Number: 5(path0)
Inquiry Data: SEAGATE ST3600057SS     00086SL0EKMR            
*******************************************************************
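
For anyone who wants to pull the problem drives out of a listing like the one above without reading every entry, here is a small sketch. It assumes the output contains "Slot Number:" and "Firmware state:" lines as shown; the function name `failed_slots` is made up for illustration.

```shell
# Reads MegaCli physical-drive output on stdin and prints the slot number
# of every drive whose firmware state does not start with "Online".
# Note: this also flags states such as "Rebuild", not only "Failed".
failed_slots() {
    awk -F': *' '
        /^Slot Number:/    { slot = $2 }
        /^Firmware state:/ { if ($2 !~ /^Online/) print slot }
    '
}
```

A typical invocation would be `./MegaCli -PDList -aALL | failed_slots` from /opt/lsi/MegaCLI. Fed the listing above, it prints 5.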
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
With a single disk failure, your RAID should be intact, whether it is RAID 10 or RAID 6.

Continue.
sglee (Author) commented:
I replaced the failed hard drive and the RAID is rebuilding now.
I see the VMs loading one at a time.
Thank you.
sglee (Author) commented:
[Image: Hardware Status]
Here is the screenshot from the vSphere Hardware Status tab:

I assume that the reason I see the "Warning" is that the RAID is rebuilding after I replaced the failed HD.
(Slot Number: 5, Firmware state: Failed, per ID: 41180630)
Hopefully, once the RAID 10 is completely rebuilt, the warning message will go away.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Wait and check back when the RAID is rebuilt and stable.
sglee (Author) commented:
Status Update:
I replaced the bad hard drive and the RAID controller finished rebuilding the RAID. All good now.
The question is: when I rebooted the VMware host with a failed hard drive (and another HD with a SMART error) while the RAID 10 was still intact, why did it fail to recognize the existing datastore?

My guess is that when a hard drive fails, as long as the RAID is good, ESXi will recognize the existing datastore and continue to run VMs. However, once you reboot, it won't recognize the datastore until the bad hard drives are replaced.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
The clue will be in the state the RAID BIOS was in: waiting on a question!

When a disk in a RAID fails, you usually do not restart the host; you replace the disk ASAP!
sglee (Author) commented:
It all worked out. Thank you.
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
No problem!
Question has a verified solution.