Solved

VMware Virtual Center Service constantly crashing?

Posted on 2011-02-18
Last Modified: 2012-05-11
Not sure which log file to check for possible causes...

Can anyone suggest which log file to check? Every morning when I come into the office I need to RDP onto our Virtual Center server and start the Windows service -> VMware Virtual Center.

Also, any ideas or other things to check for clues as to why this might be happening...?

Thanks in advance...
Question by:nhhgict
18 Comments
 
LVL 40

Expert Comment

by:coolsport00
ID: 34927384
Typically, though not solely, the VMware Virtual Center Service stops because it cannot find or connect to the database. Do you have a local SQLExpress DB or a remote one (SQL or Oracle)?

~coolsport00
 
LVL 28

Expert Comment

by:bgoering
ID: 34927391
Generally, the service crashing has to do with loss of backend database connectivity.
 
LVL 28

Expert Comment

by:bgoering
ID: 34927401
Probably the best thing to do would be to look at the Windows Event log. If it is the database, you should find an event indicating that. If it isn't the database, you should be able to find an event that will send us down a different path.
 

Author Comment

by:nhhgict
ID: 34927405
It's an all-in-one setup: the DB sits on the same box as the VC and is a SQL 2005 DB...
 
LVL 40

Expert Comment

by:coolsport00
ID: 34927419
There was a post on here several weeks back (if I recall) that referenced something other than the DB, I think...let me see if I can find it and what the resolution was...

~coolsport00

BTW...is your vCenter Server a VM or physical? (Not that it really matters too much, but just getting more info.) Does vCenter Server reboot each night?
 
LVL 28

Expert Comment

by:bgoering
ID: 34927458
Is it the full 2005 DB or the SQL Express that comes with vCenter? If it's the Express version, you could be bumping up against size restrictions, particularly if you are collecting a lot of performance metrics and/or event information.
 
LVL 40

Expert Comment

by:coolsport00
ID: 34927469
Are the credentials to start the service correct?...for both SQL Service and vCenter Server? If those got changed recently, that may be the issue.

~coolsport00
 
LVL 40

Expert Comment

by:coolsport00
ID: 34927507
Use this URL to get your vCenter logs and post them:
http://www.vmwarewolf.com/which-virtual-center-log-file/

~coolsport00
 

Author Comment

by:nhhgict
ID: 34927549
O.k...the box is physical and does not have any scheduled reboot tasks...

The SQL is full 2005 9.0.4035

The SQL and the VC services are all default "Logon as Local System"...

I just assumed it would be best to check a VMware log on the VC. However, checking the Windows Event logs, I've found this...it pretty much happens daily between 05:00 and 06:00...

Windows Event ID: 1024

The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users. Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions.
 
LVL 40

Expert Comment

by:coolsport00
ID: 34927572
Hmm...ok; but go ahead and post the VC log as well...may get even more info.

~coolsport00
 

Author Comment

by:nhhgict
ID: 34927785
Hi ~Coolsport00...

I can't see anything in particular that is related to SQL in this log...? If people leave a VI Client running on their desktops overnight i.e. lock their desktops and go home with a VI Client connection still running to the VC...is that bad practice? Should we make sure people are logged out of any VC client connections before leaving the office?

Looking at the vpxd-#.log has opened a new can of worms...it seems the SSL certs for our hosts are in a total mess and not in sync with the VC...the event posted below is in the vpxd-#.logs so many times that the log has cycled 8 times today already, rolling over each time it reaches its 5MB limit...

I'm not particularly sure this is related to the service dropping, but this definitely explains why one or two hosts in particular seem to drop in and out of VC all the time...

[2011-02-18 16:22:57.116 09548 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.

[2011-02-18 16:22:57.116 09548 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error

[2011-02-18 16:22:57.116 09548 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.

[2011-02-18 16:22:57.116 09548 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error

[2011-02-18 16:22:57.132 10004 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.

[2011-02-18 16:22:57.132 10004 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error

[2011-02-18 16:22:57.132 10004 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:
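
For anyone wanting to see which message is flooding a log like this and forcing it to cycle, a quick tally by message text can help. This is only a minimal Python sketch, not a VMware tool: the sample lines are copied from the excerpt above, while the function name `tally_messages` and the regular expression are assumptions based on the line layout seen in this thread.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt in this thread:
# [date time thread-id level 'source'] message
LINE_RE = re.compile(
    r"^\[(?P<ts>\S+ \S+) (?P<tid>\d+) (?P<level>\w+) '(?P<source>[^']+)'\] (?P<msg>.*)$"
)

def tally_messages(lines):
    """Count log entries per (level, source, message) triple."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # continuation lines such as "* A certificate ..." are skipped
            counts[(match["level"], match["source"], match["msg"])] += 1
    return counts

# Sample lines copied from the vpxd log excerpt above.
sample = [
    "[2011-02-18 16:22:57.116 09548 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:",
    "* A certificate in the host's chain is based on an untrusted root.",
    "[2011-02-18 16:22:57.116 09548 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error",
    "[2011-02-18 16:22:57.132 10004 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:",
    "[2011-02-18 16:22:57.132 10004 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error",
]

# Print the most frequent messages first.
for key, count in tally_messages(sample).most_common():
    print(count, key)
```

The function accepts any iterable of lines, so an open file handle for one of the vpxd-#.log files can be passed in place of `sample`.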
 

Author Comment

by:nhhgict
ID: 34927933
O.k so on some of the instances it does mention the SQL...?

I will take a look at some SQL logs to see if anything is reported there too...OMG...this is a mess...this is what you get for allowing a contractor to come in and upgrade your VC from to VSphere...

[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] SQL execution took too long: { call load_stats_proc(?, ?, ?, ?, ?, ?) }
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] Execution elapsed time: 3077 ms
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] Bind parameters:
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] datatype: 2, size: 4,arraySize: 1000
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] value = 0
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] datatype: 1, size: 4,arraySize: 1000
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] value = 0
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] datatype: 11, size: 0,arraySize: 1000
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] datatype: 1, size: 4,arraySize: 1000
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] value = 0
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] datatype: 3, size: 21,arraySize: 1000
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] value = -7601390775825144485
[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] datatype: 11, size: 0,arraySize: 1000
[2011-02-18 12:50:07.811 01032 info 'App'] [VpxLRO] -- BEGIN task-internal-257400 -- host-26659 -- VpxdInvtHostSyncHostLRO.Synchronize --
[2011-02-18 12:50:07.826 01032 info 'App'] [VpxdHostSync] Synchronizing host: vh14.nhill.root.nhhg (172.26.0.124)
[2011-02-18 12:50:07.826 06248 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.
[2011-02-18 12:50:07.826 06248 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
[2011-02-18 12:50:07.826 06248 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.
[2011-02-18 12:50:07.826 06248 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
[2011-02-18 12:50:07.842 08200 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.
[2011-02-18 12:50:07.842 08200 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
[2011-02-18 12:50:07.842 08200 warning 'Libs'] SSLVerifyCertAgainstSystemStore: The remote host certificate has these problems:

* A certificate in the host's chain is based on an untrusted root.
[2011-02-18 12:50:07.842 08200 warning 'Libs'] SSLVerifyCertAgainstSystemStore: Certificate verification is disabled, so connection will proceed despite the error
[2011-02-18 12:50:07.983 01384 info 'App'] [VpxdHostSync] Retrieved host update to 131266
[2011-02-18 12:50:08.045 07092 info 'App'] [VpxdHostSync] Retrieved host update to 163393
[2011-02-18 12:50:08.139 07092 info 'App'] [VpxdHostSync] Completed host synchronization
[2011-02-18 12:50:08.139 07092 info 'App'] [VpxLRO] -- FINISH task-internal-257399 -- host-60225 -- VpxdInvtHostSyncHostLRO.Synchronize --
[2011-02-18 12:50:08.154 01384 info 'App'] [VpxdHostSync] Completed host synchronization
[2011-02-18 12:50:08.154 01384 info 'App'] [VpxLRO] -- FINISH task-internal-257398 -- host-26036 -- VpxdInvtHostSyncHostLRO.Synchronize --
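
To pick the slow SQL entries out from all the SSL noise, something like the following could be used. It is a rough Python sketch under the assumption that every "SQL execution took too long" line is followed by an "Execution elapsed time" line, as in the excerpt above; the function name `slow_statements` is made up for illustration.

```python
import re

# Patterns based on the [VdbStatement] lines quoted in this thread.
SLOW_RE = re.compile(r"\[VdbStatement\] SQL execution took too long: (?P<sql>.*)$")
ELAPSED_RE = re.compile(r"\[VdbStatement\] Execution elapsed time: (?P<ms>\d+) ms")

def slow_statements(lines):
    """Pair each 'took too long' statement with the elapsed time that follows it."""
    results = []
    pending = None  # statement still waiting for its elapsed-time line
    for line in lines:
        slow = SLOW_RE.search(line)
        if slow:
            pending = slow["sql"].strip()
            continue
        elapsed = ELAPSED_RE.search(line)
        if elapsed and pending is not None:
            results.append((pending, int(elapsed["ms"])))
            pending = None
    return results

# Sample lines copied from the excerpt above.
sample = [
    "[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] SQL execution took too long: { call load_stats_proc(?, ?, ?, ?, ?, ?) }",
    "[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] Execution elapsed time: 3077 ms",
    "[2011-02-18 12:50:07.764 08608 warning 'App'] [VdbStatement] Bind parameters:",
    "[2011-02-18 12:50:07.811 01032 info 'App'] [VpxLRO] -- BEGIN task-internal-257400 -- host-26659 -- VpxdInvtHostSyncHostLRO.Synchronize --",
]

for sql, ms in slow_statements(sample):
    print(f"{ms} ms  {sql}")
```

Sorting the result by elapsed time across a whole day's logs would show which statements are slowest and at what times they cluster.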
 
LVL 16

Expert Comment

by:danm66
ID: 34928022
Ignore the SSL errors; they are just pointing out that you don't have a custom SSL cert, so it is going to use what it has. The problem you need to key in on is the SQL execution messages and why the statements are taking so long. What is the size of your DB and how many hosts do you have?
 
LVL 28

Expert Comment

by:bgoering
ID: 34928296
Make sure you have a maintenance plan in SQL Server for your vCenter database. A good maintenance plan will "tune up" the database somewhat. It is interesting that it seems to happen at night, though, when (presumably) load would be light. It may also be that you have an extensive maintenance plan that would require you to take vCenter Server down, run maintenance, and then bring it back up.
 

Author Comment

by:nhhgict
ID: 34941198
Hi everyone, thanks for your assistance...

danm66 - the DB is 9861.38 MB with 2220.28 MB free. We have 25 ESX hosts.
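
As a quick back-of-the-envelope check on those figures, the used space and free percentage can be worked out as below. This is just an illustrative Python sketch; it assumes the first number is the allocated database size and the second is unused space inside it, and the helper name `db_space_summary` is made up. The per-host figure is a plain average, not a VMware sizing rule.

```python
def db_space_summary(total_mb, free_mb, host_count):
    """Derive used space, free percentage, and average used space per host."""
    used_mb = total_mb - free_mb
    return {
        "used_mb": round(used_mb, 2),
        "free_pct": round(100 * free_mb / total_mb, 1),
        "used_mb_per_host": round(used_mb / host_count, 1),
    }

# Figures quoted in this thread: 9861.38 MB total, 2220.28 MB free, 25 ESX hosts.
summary = db_space_summary(9861.38, 2220.28, 25)
print(summary)
```

That works out to roughly 7.6 GB in use with about 22.5% free.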

bgoering - This problem definitely seems related to the state of the SQL database, as suggested by you and so many of the replies above. This morning I have a new event log error on the VC server.

I am going to speak to our SQL DBA to get some assistance with cleaning this VCDB up and getting the maintenance plan on the VC box to complete successfully. However, any ideas or assistance with this event log error would be appreciated...

Event Type:      Error
Event Source:      VMware VirtualCenter Server
Event Category:      None
Event ID:      1000
Date:            21/02/2011
Time:            06:00:27
User:            N/A
Computer:      VM_MANAGEMENT2
Description:

The description for Event ID ( 1000 ) in Source ( VMware VirtualCenter Server ) cannot be found.

The local computer may not have the necessary registry information or message DLL files to display messages from a remote computer. You may be able to use the /AUXSOURCE= flag to retrieve this description; see Help and Support for details.

The following information is part of the event: An unrecoverable problem has occurred, stopping the VMware VirtualCenter service. Error: Error[VdbODBCError] (-1) "ODBC error: (HY000) - [Microsoft][SQL Native Client][SQL Server]The instance of the SQL Server Database Engine cannot obtain a LOCK resource at this time. Rerun your statement when there are fewer active users.

Ask the database administrator to check the lock and memory configuration for this instance, or to check for long-running transactions." is returned when executing SQL statement "UPDATE VPX_HOST WITH (ROWLOCK) SET DATACENTER_ID = ? , DNS_NAME = ? , HOST_KEY = ? , IP_ADDRESS = ? , EXPECTED_SSL_THUMBPRINT = ? , HOST_SSL_THUMBPRINT = ? , USER_NAME = ? , PASSWORD = ? , PASSWORD_LAST_UPD_DT = ? , SOAP_PORT = ? , AUTHD_PORT = ? , ENABLED = ? , VPXA_ID = ? , MASTER_GEN = ? , MASTER_SPEC_GEN = ? , VMOTION_ENABLED = ? , FAULTTOLERANCE_ENABLED_FLG = ? , LICENSED_EDITION = ? , LOCAL_IP_ADDRESS = ? , ADMIN_DISABLED_FLG = ? , MANAGEMENT_IP = ? WHERE ID = ?".
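
When several of these events pile up, it can help to pull out just the SQLSTATE code and the table the failing statement was touching. Below is a small Python sketch of that idea; the regular expressions are assumptions based on the wording of this one event, the function name `summarize_odbc_error` is made up, and the sample text is abridged from the description above.

```python
import re

def summarize_odbc_error(description):
    """Pull the SQLSTATE code and the target of the failing statement out of the event text."""
    sqlstate = re.search(r"ODBC error: \((?P<state>[0-9A-Z]{5})\)", description)
    statement = re.search(r'executing SQL statement "(?P<verb>\w+) (?P<table>\w+)', description)
    return {
        "sqlstate": sqlstate["state"] if sqlstate else None,
        "verb": statement["verb"] if statement else None,
        "table": statement["table"] if statement else None,
        "lock_exhaustion": "cannot obtain a LOCK resource" in description,
    }

# Abridged from the Event ID 1000 description quoted above.
sample_event = (
    'An unrecoverable problem has occurred, stopping the VMware VirtualCenter service. '
    'Error: Error[VdbODBCError] (-1) "ODBC error: (HY000) - [Microsoft][SQL Native Client]'
    '[SQL Server]The instance of the SQL Server Database Engine cannot obtain a LOCK '
    'resource at this time." is returned when executing SQL statement '
    '"UPDATE VPX_HOST WITH (ROWLOCK) SET DATACENTER_ID = ? WHERE ID = ?".'
)

print(summarize_odbc_error(sample_event))
```

For this event, the summary points at an UPDATE on VPX_HOST failing with the lock-resource error, which is useful detail to hand to the DBA.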

 
LVL 16

Accepted Solution

by:
danm66 earned 500 total points
ID: 34941354
http://kb.vmware.com/kb/1003979 Investigating the health of the VC Database Server.

This KB has some info regarding the transaction logs and how to purge data from the database. Pay attention to the part about the rollup jobs. Check the jobs to see if they've been completing without errors! That's usually what causes a DB to get big and unwieldy.
 
LVL 28

Expert Comment

by:bgoering
ID: 34942924
Yes, I would check with the DBA regarding the configuration of SQL server. I haven't personally experienced this (most of my outages occur when someone bounces the database and doesn't restart vCenter).

Now, the mention of long-running transactions is interesting. I don't know of any vCenter activity that would lock an entire table for something long-running, but a DBCC command that is part of the maintenance surely can. Have your DBA check on that also.
 

Author Closing Comment

by:nhhgict
ID: 35028423
The SQL database was in a mess and the maintenance schedule needed to be totally thought out again to allow better timing for tasks to complete...