• Status: Solved

ESX 3.5 Locking up

Hi,

I installed two ESX 3.5 Update 2 servers at the same time a few months ago. Both are DL380 G5s with 14 GB RAM and 2 x quad-core processors, and both are attached to an MSA500.

ESX02, as I called it, locks up once a week. The guests are inaccessible and the Infrastructure Client cannot connect to the server. The only way I can get it back up is to connect to iLO and reboot it. The server is by no means stressed and has plenty of available resources.

I managed to get a look at what the console says when the above happens. It read:
"kernel 2.4.21-57.ELvmnix on an i686 you probably have a hardware problem with your RAM chips. Please consult hardware error logs"

I booted the server off a diagnostic CD and ran a memory test, and it gave the all clear, as usual!

I am looking for a way to get the server to send out logs, or to see what is happening in the background when this happens. I also have the HP Insight Management agents installed, and the iLO doesn't show any errors when the server is locked up.

Help much appreciated!

Thanks
Asked by: davewex
1 Solution
 
azjeep commented:
I was told by VMware tech support a while back that the HP Management Agents can sometimes cause problems.

I haven't always had good luck testing RAM with software apps like the diagnostic disks.  Try swapping it out for sure.  It's an easy fix if it is indeed the RAM.
 
larstr commented:
I second azjeep that this is probably a hardware issue, and that your memory is the likely culprit. Checking your memory DIMMs before going into production is very important.

Lars
 
davewex (author) commented:
I installed the agents after I started getting the issue. I am going to run memtest on it this evening, as I used HP diagnostics the last time. I guess you guys can't help and it's down to trial and error... I was just hoping for an easy fix.

thanks anyway
 
azjeep commented:
It doesn't get much easier than swapping out a couple of DIMMs ;)
 
davewex (author) commented:
There is 14 GB of RAM in this server, and I don't have that much spare. The issue also only arises once every few weeks.

Memtest found an error with the RAM in DIMM A, so I have replaced it and all seems well.

cheers
Question has a verified solution.
