VMware ESXi 4.0 not responding to vSphere client

I have an ESXi 4.0 server and it is not responding to the vSphere client.  The web interface is also not responding.

I restarted the Management Agents via the console with no effect.  I do have SSH enabled and am able to access the server.  I tried running "services.sh restart", still nothing.  I get the following output from services.sh:

~ # services.sh restart
Running sfcbd-watchdog stop
sh: cannot kill pid 36223521: No such process
Running wsman stop
Stopping openwsmand
Openwsmand is not running.
Running slpd stop
Stopping slpd
Running vobd stop
watchdog-vobd: Terminating watchdog with PID 36223336
Vobd stopped.
Running hostd stop
watchdog-hostd: PID file /var/run/vmware/watchdog-hostd.PID not found
watchdog-hostd: Unable to terminate watchdog: Can't find process
sh: cannot kill pid 36223585: No such process
Running ntpd stop
Stopping ntpd
Running ntpd restart
Starting ntpd
Running hostd restart
mount: mounting visorfs on /var/lib/vmware/hostd/stats failed: File exists
Running vobd restart
[36636846] Begin 'hostd ++min=0,swap,group=hostd /etc/vmware/hostd/config.xml', min-uptime = 60, max-quick-failures = 1, max-total-failures = 1000000
Vobd started.
Running slpd restart
Starting slpd
Running wsman restart
Starting openwsmand
Running sfcbd-watchdog restart
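
The "cannot kill pid ... No such process" and "PID file ... not found" lines above suggest that the recorded watchdog PIDs no longer match live processes. As a minimal sketch (not part of services.sh; the function name pid_file_alive is hypothetical), a PID file can be checked like this, assuming a POSIX-style shell and a root login so that kill -0 is permitted:

```shell
# Return 0 if the PID recorded in a PID file belongs to a live process.
# The PID file path is supplied by the caller; nothing here is ESXi-specific.
pid_file_alive() {
    pid_file="$1"
    [ -r "$pid_file" ] || return 1            # PID file missing or unreadable
    pid=$(cat "$pid_file")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;              # empty or non-numeric content
    esac
    kill -0 "$pid" 2>/dev/null                # signal 0 sends nothing, it only tests existence
}
```

For example, `pid_file_alive /var/run/vmware/watchdog-hostd.PID || echo "stale or missing"` would report the situation shown in the hostd stop step.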

If I try to run "vim-cmd vmsvc/getallvms", I get the following:

Failed to connect: Cannot open TCP socket: Cannot allocate memory

Looking at the hostd.log, I see what looks like a memory dump.

[2010-12-27 20:34:09.243 1D37EDC0 info 'HostsvcPlugin'] Plugin started
[2010-12-27 20:34:09.243 1D37EDC0 info 'HttpNfcSvc'] Http Service started: TCPServerSocket(ASYNC_ACCEPT, ipv4=TCP(fd=9 name=localhost.localdomain:12001), ipv6=TCP(null))
[2010-12-27 20:34:09.243 1D37EDC0 info 'HttpNfcSvc'] Plugin started
[2010-12-27 20:34:09.243 1D37EDC0 info 'InternalsvcPlugin'] Plugin started
[2010-12-27 20:34:09.245 1D37EDC0 info 'Libs'] vmware-serverd: Removing stale symlink /var/run/vmware/4fcf7431a2736d2b35514afd524a40d3
[2010-12-27 20:34:09.245 1D37EDC0 info 'Libs'] vmware-serverd: Setup symlink /var/run/vmware/4fcf7431a2736d2b35514afd524a40d3 -> /var/run/vmware/root_0/1293482049244074_36223585
[2010-12-27 20:34:09.246 1D37EDC0 info 'Libs'] vmware-serverd: Removing stale symlink /var/run/vmware/4fcf7431a2736d2b35514afd524a40d3
[2010-12-27 20:34:09.246 1D37EDC0 info 'Libs'] vmware-serverd: Setup symlink /var/run/vmware/4fcf7431a2736d2b35514afd524a40d3 -> /var/run/vmware/root_0/1293482049244074_36223585
[2010-12-27 20:34:09.247 1D37EDC0 info 'Nfc'] Plugin started
[2010-12-27 20:34:09.247 1D37EDC0 info 'Ovfmgrsvc'] Plugin started
[2010-12-27 20:34:09.247 1D37EDC0 info 'Partitionsvc'] Starting partitionsvc plugin
[2010-12-27 20:34:09.247 1D37EDC0 info 'Proxysvc'] vmacore/ssl/useSSLCtxPool: true
[2010-12-27 20:34:09.248 1D37EDC0 info 'Proxysvc'] vmacore/ssl/serializeServerHandshake: false
[2010-12-27 20:34:09.248 1D37EDC0 info 'Proxysvc'] Proxy: Starting new Proxy service
[2010-12-27 20:34:09.248 1D37EDC0 info 'Proxysvc'] Proxy Https service started
[2010-12-27 20:34:09.248 1D37EDC0 info 'Proxysvc'] Proxy: Starting new Proxy service
[2010-12-27 20:34:09.248 1D37EDC0 info 'Proxysvc'] Proxy Http service started
[2010-12-27 20:34:09.261 1D37EDC0 info 'Proxysvc'] Plugin started
[2010-12-27 20:34:09.261 1D37EDC0 info 'Snmpsvc'] ProcessConfig: SNMP Agent enabled: false, communities: 0, sinks 0
[2010-12-27 20:34:09.261 1D37EDC0 info 'Snmpsvc'] Start: Not configured to run
[2010-12-27 20:34:09.261 1D37EDC0 info 'Solo'] webServer/port: 8309
[2010-12-27 20:34:09.261 1D37EDC0 verbose 'WelcomePageCustomizer'] Returned following links: LocalizedLink(), LocalizedLink()
[2010-12-27 20:34:09.262 1D37EDC0 verbose 'WelcomePageCustomizer'] Created customizer with oem file '/etc/vmware/oem.xml' and dynDataUrl '/dyndata.js'.
[2010-12-27 20:34:09.262 1D37EDC0 info 'HTTP server'] HTTP server created with docroots /usr/lib/vmware/hostd/docroot/
[2010-12-27 20:34:09.262 1D37EDC0 verbose 'HTTP server /host'] HTTP accessible configuration files configuration: /etc/vmware/hostd/webAccessibleConfigFiles.xml
[2010-12-27 20:34:09.263 1D37EDC0 info 'Solo'] soapPort: 8307
[2010-12-27 20:34:09.263 1D37EDC0 panic 'App'] error: Cannot allocate memory
[2010-12-27 20:34:09.263 1D37EDC0 panic 'App'] backtrace:

[00] eip 0x12020965
[01] eip 0x11f0d5c0
[02] eip 0x11eb6e80
[03] eip 0x120221ce
[04] eip 0x12016ff3
[05] eip 0x120186bd
[06] eip 0xc0f5c26
[07] eip 0xc0f69a3
[08] eip 0xc106f20
[09] eip 0x11ed0da2
[10] eip 0x11ed5504
[11] eip 0x11eceb10
[12] eip 0xc7a4ff2
[13] eip 0xc79aad5
[14] eip 0xc7a921f
[15] eip 0x1326ff0c
[16] eip 0xbe4d961

[2010-12-27 20:34:09.266 1D37EDC0 verbose 'VMotionJournal'] Exit: Succeeded
[2010-12-27 20:34:09.267 1D37EDC0 verbose 'VmdbSvc'] Shutting down VMDB service...
[2010-12-27 20:34:09.267 1D37EDC0 verbose 'VmdbSvc'] Unregistering callback...
[2010-12-27 20:34:09.267 1D37EDC0 verbose 'VmdbSvc'] ...done
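
Since most of the log above is routine startup noise, it may help to pull out only the panic-level entries. A minimal sketch, assuming a shell with grep available; the function name show_panics is hypothetical, and the log path is passed in by the caller (on this host it is probably /var/log/vmware/hostd.log, but that location is an assumption):

```shell
# Print only the panic-level entries from a hostd-style log file.
# Lines look like: [date time thread level 'Component'] message
show_panics() {
    log_file="$1"
    # Fixed-string match on the level field followed by the quoted component name.
    grep -F " panic '" "$log_file"
}
```

Run against the excerpt above, this would keep the two 'App' lines ("error: Cannot allocate memory" and "backtrace:") and drop the info and verbose entries.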

I realize that rebooting the server may fix the problem.  I would like that to be a last resort.  Does anyone have any ideas on how to recover from this kind of situation without rebooting?

Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant, commented:
It looks like a reboot, I'm afraid. vMotion the VMs off if that's possible; if not, shut them down if you can.

Then upgrade to at least U1 or U2, or to 4.1, if you've got support.
Danny McDaniel, Clinical Systems Analyst, commented:
Did you have storage issues prior to this?  How about free disk space on the system?

Here's what my ESXi 4.0 test system looks like:

# df -h
Filesystem                Size      Used Available Use% Mounted on
visorfs                 215.9M    178.2M     37.7M  83% /
vmfs3                   945.0G    565.0M    944.4G   0% /vmfs/volumes/4b3ca5f5-e8fe1246-701e-000c29d0bd09
vfat                    249.7M     59.3M    190.4M  24% /vmfs/volumes/bf4b6c7d-35d47829-f46e-d99a33602a4f
vfat                      4.0G      2.1M      4.0G   0% /vmfs/volumes/4b3ca5f0-6f650843-1516-000c29d0bd09
vfat                    249.7M      4.0k    249.7M   0% /vmfs/volumes/e7a45a75-9510be53-4309-a7d260303cb1
vfat                    285.9M    230.2M     55.7M  81% /vmfs/volumes/efd8efe3-03bc1cbf-15e0-080efd9e7379
Bigchingan (Author) commented:

I've inherited this ESXi server as part of my new job.  No one around me has any knowledge of any history with this server.  In fact, no one knew this server existed until I pointed it out!  Yikes!

Here is what my "df" looks like:

/vmfs/volumes/4addce0c-aaf59780-c95d-0026b932868d # df -h
Filesystem                Size      Used Available Use% Mounted on
visorfs                 219.0M    184.3M     34.7M  84% /
vmfs3                   832.3G    349.0G    483.2G  42% /vmfs/volumes/4addce0e-7d40e760-c8c2-0026b932868d
vfat                    249.7M     60.6M    189.2M  24% /vmfs/volumes/acb3be78-804c35d4-aa53-6aab33772d47
vfat                      4.0G      2.1M      4.0G   0% /vmfs/volumes/4addce0c-aaf59780-c95d-0026b932868d
vfat                    249.7M     60.8M    188.9M  24% /vmfs/volumes/5933ae7a-e30c219a-b29a-7cbd51a29a1a
vfat                    285.9M    281.4M      4.5M  98% /vmfs/volumes/c2a427e4-2d317086-fef9-b5750d88536c

Some of my partitions look fuller than yours, but they are not completely full.
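
To make the comparison less of an eyeball exercise, the df output can be filtered for filesystems at or above a chosen usage level. This is only a sketch: the function name flag_full and the default threshold of 90 percent are illustrative assumptions, and it expects the six-column layout shown above.

```shell
# Read `df -h` style output on stdin and print mount points whose Use%
# is at or above a threshold (first argument, in percent).
flag_full() {
    threshold="${1:-90}"                      # 90 is an arbitrary illustrative default
    while read -r fs size used avail use mount; do
        pct="${use%"%"}"                      # strip the trailing percent sign
        case "$pct" in
            ''|*[!0-9]*) continue ;;          # skip the header and any malformed line
        esac
        if [ "$pct" -ge "$threshold" ]; then
            echo "$mount is $use full"
        fi
    done
}
```

Used as `df -h | flag_full 90`, this would single out only the 98% vfat volume in my listing; the 84% visorfs root stays under the threshold.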

Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant, commented:
Are the VMs still running correctly?
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant, commented:
You've not exhausted the memory on the physical host?

I did a very stupid thing on our test box the other day and powered up 10 workstations (VMware View auto-provisioned them!). The only problem was that the host didn't have 20GB.

It flatlined the box for hours, until I managed to RDP into the workstations and shut them down. The ESX host would not respond to SSH, the vSphere Client, or the console!

But the VMs seemed to be working okay!?

You'll have to use resxtop (esxtop) to check memory, CPU, disk, etc. from the vIMA appliance, but you probably won't be able to install this, so you'll need to use the Remote CLI for Windows.

Bigchingan (Author) commented:

Yes, the VMs are running fine.  I ran esxtop and I don't see any excessive CPU or memory consumption.

I took the plunge last night and attempted to reboot the ESX server.  It wouldn't reboot manually.  I pulled the plug and rebooted.  Thankfully, everything came back and none of the VMs were damaged.
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant, commented:
Monitor it closely in the future. If your support agreement allows, I would upgrade to at least ESX 4.0 U2, and maybe even ESX 4.1.