ESX 6.0 suddenly restarted

Vladimir Buzalka
Hi Experts

I have a Dell T410 located in a data center, running ESX 6.0.0 build 5224934, with 64 GB of RAM and 3 disks (each physically consisting of 2 mirrored disks).
Suddenly, after running without any problem for over 2 years, ESX restarted itself yesterday, 12 NOV 2018, at about 21:15.

I have downloaded all the logs available and they are with me.

Can you advise what happened? How can I find the cause of the problem?

Many thanks

Vladimir
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017
Commented:
1. Local power issues? Is the host on a UPS?

2. Is the firmware up to date?

3. Has the hardware been checked for faults (CPU, fans, etc.)?

4. Are you on the latest build of ESXi? (Your build is old, and probably has bugs!)

5. Have you checked the logs?

6. Did you get a PSOD?

Commented:
If you have installed Dell OpenManage Server Administrator (OMSA), then check the hardware logs.

all the best

Author

Commented:
Dear Andrew

Thanks for the quick list of things to check. This computer is running in a data center with all the expected redundancy, including power. The firmware and ESXi build are still the same, i.e. they may be old, but considering that I made NO change over the last 2 years to any configuration of the hardware or the virtual machines, I believe this should not be an issue.

As for hardware: from what I can see in the vSphere client, all is OK.

I suspected that there could be a problem with a failed drive, so I downloaded the full logs from the PERC RAID card, but those look quite good as well.

As for logs: please advise what I should check.

PSOD? What is it, please?

Thanks

Vladimir

Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant

Commented:
PSOD = Purple Screen of Death.

You’ll be surprised what firmware and build updates fix, even after 2 years.

Can you reproduce the problem?

Otherwise it is a one-off, or a hardware issue!
Top Expert 2014

Commented:
You need the system event log as per above.
https://www.dell.com/support/article/no/en/rc1078549/sln292270/poweredge-server-error-messages-in-system-event-log-and-how-they-can-be-viewed?lang=en
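
Once the system event log has been exported as text, the power-related entries can be picked out with a few lines of Python. This is only a rough sketch: it assumes pipe-separated lines similar to what `ipmitool sel list` prints (id | date | time | sensor | description | state), and the keyword list, the function name `power_events` and the sample entries are all made up for illustration. An export from OMSA or the iDRAC may be laid out differently and would need the parsing adjusted.

```python
# Keywords that usually indicate a power-related SEL entry (an assumption;
# extend the list to match the wording your hardware actually uses).
POWER_KEYWORDS = ("power supply", "ac lost", "power unit", "voltage")


def power_events(sel_text):
    """Return (date, time, sensor, description) tuples for power-related entries.

    Assumes pipe-separated lines: id | date | time | sensor | description | state
    """
    events = []
    for line in sel_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        _, date, time, sensor, description = fields[:5]
        haystack = f"{sensor} {description}".lower()
        if any(keyword in haystack for keyword in POWER_KEYWORDS):
            events.append((date, time, sensor, description))
    return events


# Made-up sample data, for illustration only (not taken from a real server).
SAMPLE_SEL = """\
 1 | 03/04/2019 | 10:02:11 | Fan #0x30 | Lower Critical going low | Asserted
 2 | 01/01/2020 | 21:14:58 | Power Supply #0x63 | Power Supply AC lost | Asserted
 3 | 01/01/2020 | 22:31:40 | Power Supply #0x63 | Power Supply AC lost | Deasserted
"""

if __name__ == "__main__":
    for event in power_events(SAMPLE_SEL):
        print(" | ".join(event))
```

Entries that fall right around the time of the restart are the ones worth reading first.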

If it has only crashed once in two years, that's pretty good; I tend to put such one-off events down to cosmic rays.
Thanks all for your help

An official communication was issued by Coolhousing.net (my data center provider). Yesterday they simply had a power failure lasting more than 1 hour, exactly as visible in my logs. So it is good to know that it is indeed not a problem with my server :-)

Something happened with the power contactors and they were not able to connect to the UPS units, so it is now under investigation by the UPS provider.

Many thanks again; your ideas were very helpful anyway.

Vladimir
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant

Commented:
Yes, the first item in my post!

Logs will not show much!
Top Expert 2014

Commented:
The logs may show a power loss event for one PSU, although not for both, as there would be no electricity left to write the event to the BMC/iDRAC.
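
Even when no explicit event gets written, a total power loss still tends to leave a trace: the log simply goes silent until the host boots again. A minimal Python sketch for spotting such a silent period in a downloaded log file follows. It assumes each line starts with an ISO 8601 timestamp, which is how ESXi logs such as vmkernel.log are typically written; the function name `find_gaps`, the 10-minute threshold and the sample lines are hypothetical and only there for illustration.

```python
from datetime import datetime, timedelta
import re

# ISO 8601 timestamp at the start of a line, e.g. 2020-01-01T21:14:58.120Z
# (fractional seconds are ignored).
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def find_gaps(log_text, threshold=timedelta(minutes=10)):
    """Return (last_seen, first_seen_again, duration) for each silent period."""
    gaps = []
    previous = None
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line)
        if not match:
            continue  # lines without a leading timestamp are skipped
        current = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if previous is not None and current - previous > threshold:
            gaps.append((previous, current, current - previous))
        previous = current
    return gaps


# Made-up sample lines, for illustration only (not real ESXi output).
SAMPLE_LOG = """\
2020-01-01T21:10:03.512Z host: periodic status message
2020-01-01T21:14:58.120Z host: periodic status message
2020-01-01T22:31:40.004Z host: boot message
2020-01-01T22:31:45.900Z host: periodic status message
"""

if __name__ == "__main__":
    for start, end, duration in find_gaps(SAMPLE_LOG):
        print(f"No log entries between {start} and {end} ({duration})")
```

A gap like this shows when the host was down, but not why; an abrupt stop with no shutdown messages before it is consistent with a power cut, though a hardware fault can look the same.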
