Solved

ESX 4.0 random reboots

Posted on 2011-09-30
788 Views
Last Modified: 2012-08-14
I have an ESX 4 server which is rebooting daily, and I'm having a hard time figuring out what is going on with it. I've tried to export the logs so I can open a case with VMware, but the diagnostic bundle never generates. It always fails with a message saying log files are missing.

What I believe triggered all this is a bad hard drive. I have a RAID 1 set and one of the drives failed. I've since replaced the failed drive and monitored the array rebuilding, but now this server reboots every day.

I'm obviously not a VM expert so any help would be appreciated.  This is a BL460c G1 which is currently in maintenance mode.
Question by:PCVIC
15 Comments
 
LVL 30

Expert Comment

by:IanTh
ID: 36891896
Do you get a purple screen of death?
Why aren't you using 4.1?
 
LVL 120
ID: 36891902
ESX can reboot for many reasons, including bad memory, a faulty processor, or overheating.

Check the memory with http://www.memtest.org/
 
LVL 1

Expert Comment

by:d33m
ID: 36891980
You can also check the hardware/system logs on Integrated Lights-Out (iLO, if available) for this blade server.

You may find them useful.

Author Comment

by:PCVIC
ID: 36892040
No purple screen. For now, all of our ESX systems will stay at 4.0.

I've used iLO and reviewed the IML log; it has been clean since the recovery of the array.

I've run hardware diagnostics, which found nothing. I've gone as far as removing the blade and reseating all the hardware.
 
LVL 120
ID: 36892075
Check the memory with http://www.memtest.org/
 

Author Comment

by:PCVIC
ID: 36892131
I will try the memory test.
 

Author Comment

by:PCVIC
ID: 36892551
Looks like it's going to take forever, but I'll let it finish. It's been running for 52 minutes and is at 12%.
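
For a rough idea of how long a run like this might take, the figures above (52 minutes elapsed, 12% complete) can be extrapolated. This is only a sketch: it assumes progress is linear, which a memory tester does not guarantee because individual tests run at different speeds, and the function name is made up for illustration.

```python
def estimate_total_minutes(elapsed_minutes: float, percent_done: float) -> float:
    """Linearly extrapolate the total run time from the progress so far.

    Assumption: the percentage advances at a constant rate.
    """
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in the range (0, 100]")
    return elapsed_minutes * 100.0 / percent_done


# Figures reported in the comment above: 52 minutes elapsed, 12% done.
elapsed = 52
total = estimate_total_minutes(elapsed, 12)
remaining = total - elapsed

print(f"estimated total: {total / 60:.1f} h, remaining: {remaining / 60:.1f} h")
```

Under that assumption the run comes out at a little over 7 hours in total, so leaving it overnight or over a weekend is reasonable.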
 

Author Comment

by:PCVIC
ID: 36903550
I let it run over the weekend and it found no issues with the memory. The whole time it was booted from the memtest ISO, it never rebooted. I think we can safely rule out hardware as the issue.
 
LVL 120
ID: 36903580

Hardware is clearly okay, unless a VMware driver is causing an issue with the hardware, or there is a software fault in ESX 4. What build are you running?

Are you running the HP version of ESX, e.g. have you got HP Agents installed?

It might be worth stopping or uninstalling the HP Agents temporarily.

Is ESX 4 installed on the internal hard drives (RAID 1)?

Is there any SAN attached?
 

Author Comment

by:PCVIC
ID: 36903642
Running ESX 4.0.0, build 261974.

There are no hardware agents installed. ESX is installed on the internal drives, RAID 1.

It is attached to a SAN.
 
LVL 120

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 500 total points
ID: 36903662
Okay, so all VMs are on the SAN?

Is this the only ESX 4.0 server?

Does it reboot daily?

Personally, if the server reboots daily, I would do the following:

1. Install ESXi 4.0 U3 on a USB flash drive, plug it into the internal USB connector inside the blade, and temporarily remove the two disks. ESXi and ESX are compatible, so it will connect to your existing SAN and run the VMs with no issues and no changes.

This will show whether the cause is ESX 4.0 and/or the disk drives.
 
LVL 30

Expert Comment

by:IanTh
ID: 36903696
Does your VI Client get any logs?
 

Author Comment

by:PCVIC
ID: 36925594
I've rebuilt the system onto two new drives. So far the system has been up and running for 1 day and 20 hours.
 
LVL 120
ID: 36925607
Very good. Duff disks, maybe?
 

Author Comment

by:PCVIC
ID: 36926351
I have a feeling it had to do with the array, since a failed drive is what triggered the symptoms. There were no logs anywhere that could confirm it, though.

Anyhow, the issue appears to be resolved. I just hate the paperwork that comes along with re-introducing the host back into production.

Thanks hanccocka


Question has a verified solution.
