Solved

Two ESX servers get rebooted randomly at the same time

Posted on 2010-11-30
Last Modified: 2012-06-27
Hello Experts

We have 3 ESXi 4.0 servers. Since last week, two of them have been rebooting at the same time. It happened once at 1:30 AM on Friday, and again today at 8:00 AM.

We had some cabling changes in the server room last week, so I thought it might be a power issue, but the ESX servers are connected to different outlets, and other servers are connected to the same outlets...

How can I investigate this issue? The log files are very complicated, and they get rolled over after an ESX reboot. I extracted all the zip files inside the report bundles but didn't find anything for the time of the reboot.
Question by:akhalighi
LVL 2

Expert Comment

by:frevere
ID: 34239919
Do you have the power management for the cluster turned on?
 
LVL 10

Author Comment

by:akhalighi
ID: 34240207
As far as I know, there is no cluster. There are 3 ESX servers under a datacenter.

I checked the power management option under Configuration; it's empty. Screenshot attached.
vmware.jpg
 
LVL 2

Expert Comment

by:frevere
ID: 34241340
Can you show a screenshot of the left pane, like the attached screenshot? Does it show the vCenter, datacenter, cluster, resource pool, VMs, etc.?
 
LVL 2

Expert Comment

by:frevere
ID: 34241484
The reason for my previous post is that DPM (Distributed Power Management) can be enabled on a cluster of ESX/ESXi hosts. If you do not have a cluster, as I highlighted above, then the other things to look for are: 1) a cron job running on the servers themselves, 2) roles and who has the ability to reboot the hosts. I will check my test environment further for anything else I can find.
 
LVL 10

Author Comment

by:akhalighi
ID: 34241566
Thanks frevere

There are no clusters, for sure.

As for roles, there is only one, the domain administrator, that can connect to vCenter. When I select the ESX servers in question I can see all events (like VMs powering up, etc.), but there is nothing about an ESX server reboot.

One important thing: I am not sure whether they really rebooted... we came in this morning and noticed that the VMs were off. I am checking now to confirm whether they really rebooted.
 
LVL 10

Author Comment

by:akhalighi
ID: 34241646
This was in the messages log; I think it's an indication of a reboot.

~~~~~~~~~~~~~
Nov 30 13:05:59 syslogd started: BusyBox v1.9.1-VMware-visor-654
Nov 30 13:05:59 vmklogger: Successfully daemonized
~~~~~~~~~~~~~

I received alerts from the monitoring system at around 8:04 AM EST, which is 13:04 GMT, so the syslog restart at 13:05:59 shows that the reboot happened right after the alerts.
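
The time comparison above can be double-checked with a short script. This is only a rough Python sketch: it assumes EST is a fixed UTC-5 offset, that the host logs in UTC, and that the year is 2010 (syslog lines carry no year). The helper names `est_to_utc` and `boot_markers` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EST is a fixed UTC-5 offset (no daylight saving in late November).
EST = timezone(timedelta(hours=-5), "EST")


def est_to_utc(local: datetime) -> datetime:
    """Attach the EST offset to a naive local time and convert it to UTC."""
    return local.replace(tzinfo=EST).astimezone(timezone.utc)


def boot_markers(log_lines, year):
    """Return UTC timestamps of lines that show syslogd starting up.

    Syslog timestamps carry no year, so the caller has to supply one.
    """
    found = []
    for line in log_lines:
        if "syslogd started" in line:
            stamp = " ".join(line.split()[:3])  # e.g. "Nov 30 13:05:59"
            parsed = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            found.append(parsed.replace(tzinfo=timezone.utc))
    return found


# The two lines quoted from the messages log above.
log = [
    "Nov 30 13:05:59 syslogd started: BusyBox v1.9.1-VMware-visor-654",
    "Nov 30 13:05:59 vmklogger: Successfully daemonized",
]

alert_utc = est_to_utc(datetime(2010, 11, 30, 8, 4))
boots = boot_markers(log, 2010)
print("alert (UTC):", alert_utc.strftime("%H:%M"))
print("syslogd restarts (UTC):", [b.strftime("%H:%M:%S") for b in boots])
```

If the restart timestamp is later than the alert time, the host was coming back up after the monitoring system noticed the outage, which fits a reboot.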
 
LVL 2

Expert Comment

by:frevere
ID: 34241840
It looks as though you have SSH enabled on your ESXi host. I would check the server for a cron job that is running.
 
LVL 28

Expert Comment

by:bgoering
ID: 34241890
How is the power in general in your datacenter? Could it be a surge or drop that the ESXi servers might be more sensitive to?
 
LVL 10

Author Comment

by:akhalighi
ID: 34241975
Thanks. Would you tell me how to check for a running cron job?
 
LVL 2

Expert Comment

by:frevere
ID: 34242040
Not positive (you may want to google this), but you might want to look in the /var/spool/cron folder. Also check rc.local in the /etc/ directory for any commands that would reschedule the job. To check whether the cron daemon itself is running, run the following, which prints the process ID recorded for it:  cat /var/run/crond.pid
 
LVL 10

Author Comment

by:akhalighi
ID: 34242097
Hmmm, okay... I'll give this a try. But this wasn't happening before and we didn't change anything. I can't imagine how a cron job would start running all of a sudden.
 
LVL 2

Expert Comment

by:frevere
ID: 34242144
See http://www.jules.fm/Logbook/files/add_cron_job_vmware.html for adding a cron job to ESXi. It also has a link for enabling SSH. If not, refer to http://www.vm-help.com/esx40i/ESXi_enable_SSH.php for instructions on enabling SSH. You know, I'm glad you asked this question. I didn't know about having to run the auto backup script.
 
LVL 2

Accepted Solution

by:
frevere earned 300 total points
ID: 34242290
You say that this never happened before and just started happening. Things to think about:

1) Has anyone (like VMware support) been on the hosts recently? If you check and find no cron job, check for and disable SSH. This will prevent anyone from logging in to the host remotely over SSH.

2) Check who is a member of the Domain Admins group in AD. Limit this as much as possible. vCenter (a Windows application, i.e. it runs on a Windows server) automatically uses the local computer\Administrators group and grants the Administrator role to this group. By default in AD, Domain Admins are automatically part of the local machine's Administrators group. In vCenter, specifically at the vCenter level in the tree, check the permissions and explicitly add "domain\administrator" with the Administrator role. Then change the permissions of the Administrators group (the local administrators group) to something low. If you or someone else needs to manage the VMware environment, explicitly define yourselves here.

3) As bgoering mentioned, even though the 2 hosts are plugged into different PDUs, are the PDUs using the same circuit? Could there be something else on that circuit, like an air handler, that trips the hosts when it starts up?
 
LVL 28

Assisted Solution

by:bgoering
bgoering earned 200 total points
ID: 34243387
To check cron jobs, log on to the console as root and enter:

crontab -e

It will open either an empty file (if no cron jobs are defined) or a file that lists the various jobs that have been defined.
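
If the job list turns out to be long, something like the following Python sketch could help pick out entries that might restart the host. It is only an illustration, run on a copy of the crontab text taken off the host: it assumes the standard five-field schedule format, the helper name `suspicious_cron_entries` and the list of suspect words are made up here, and the sample entries are invented rather than taken from a real ESXi host.

```python
# Hypothetical helper: flag crontab entries that run a reboot-style command.
SUSPECT_COMMANDS = ("reboot", "shutdown", "halt", "poweroff")


def suspicious_cron_entries(crontab_text):
    """Return (schedule, command) pairs whose command mentions a suspect word."""
    hits = []
    for raw in crontab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 5)  # five schedule fields, then the command
        if len(fields) < 6:
            continue  # not a standard five-field entry
        schedule, command = " ".join(fields[:5]), fields[5]
        if any(word in command for word in SUSPECT_COMMANDS):
            hits.append((schedule, command))
    return hits


# Invented sample crontab, for illustration only.
sample = """\
# min hour day month weekday command
1 1 * * * /opt/example/cleanup.sh
30 1 * * 5 /sbin/reboot
"""

for schedule, command in suspicious_cron_entries(sample):
    print(f"{schedule} -> {command}")
```

The match is a crude substring check, so it can flag harmless commands too; treat any hit as something to look at by hand, not as proof.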
 
LVL 10

Author Closing Comment

by:akhalighi
ID: 34293622
The issue hasn't happened for some time. I'll close this question for now. Thanks for your help.