VMware ESXi 5.5 cluster error

We have been running a VMware cluster using dual IBM servers connected to an IBM DAS. Whenever I connect to vCenter, it shows all the VMs as orphaned and disconnects after about 40 seconds. When I log on to each ESXi server, both show the same error message (Cannot synchronize host. Cannot connect the specified. the host may not be available.). Please see the attached screenshot. Is there a way to fix this without shutting down the VMs or the hosts, as I cannot migrate them?
jvillareal78 Asked:

Uni Kitty Commented:
You didn't attach the screenshot, but based on the description I'm guessing that the management agents are running into issues: the processes are hanging or dying, so the hosts aren't communicating with vCenter (hence the orphaned VMs).

You can try to restart the management agents. You can do this without having to shut down the VMs.

This KB lists the different ways you can restart the agents.

kb.vmware.com/kb/1003490

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Check your SQL database. Is it full?

SQL Express databases are limited to 4 GB or 10 GB, depending on the version.

It's only the management server that is "dying", so don't panic!

This will not affect the running VMs, which is the important thing.

Also try restarting the vCenter Server service, if vCenter is installed on Windows.

You can do this either via RDP or via the VM console (by connecting directly to the ESXi host).

jvillareal78 (Author) Commented:
I looked at the article and will attempt to restart the management agents. These hosts are in a data center, so I will have to see if I can log on and turn on SSH. I am attempting to upload the screenshot again to verify the issue.
Untitled.jpg

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
The lack of CPU and memory details in the summary indicates an agent issue.

Just check the SQL database.

Uni Kitty Commented:
I'm seeing three hosts not communicating with vCenter. If restarting the agents doesn't help, I'd begin to investigate the storage next or maybe even there's a network communication issue that's causing this. Keep us posted.

Keith Pratola, Senior Systems Architect, Commented:
I kind of doubt there is an issue with the management agents on all hosts. Do you have the root password or an account that can log directly into each ESXi host from vSphere? At least this way you can verify if there is an issue with the host. The easiest way to restart management agents is to SSH into the host, if SSH is enabled, and run "services.sh restart".

Open up the vSphere Client and log into 192.168.5.215 or another host directly instead of going through vCenter. Based on your first comment it doesn't sound like you are actually logging directly into the host, but rather just selecting it from within vCenter. There is a big difference.

Do you have vCenter installed on a Windows Server or are you using a vCenter Server Appliance? Have you tried restarting vCenter? This has no effect on the hosts or the running VMs.

Also, can you connect to any of the servers that are running as VMs? Such as RDP to a Windows Server or SSH to a Linux server? If all the servers are running and fine, then there is likely no storage issue either.

jvillareal78 (Author) Commented:
All the VMs are running and I am able to connect to them using Remote Desktop. I can log in to the hosts themselves and no errors show on either. The error only appears when logging in through vCenter.

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Did you restart the agents?

Did you restart the vCenter Server service?

Did you check the SQL DB size?

It would also be worth checking that the hosts can ping the vCenter Server by IP address and by hostname (and vice versa).

Check that there are no communication issues, and no firewalls that have gone up.

Keith Pratola, Senior Systems Architect, Commented:
Which is what I expected. You need to troubleshoot vCenter and the connectivity from vCenter to the ESXi hosts, for example by making sure that no blocked ports are causing a communication issue between vCenter and the ESXi hosts.

Again, is vCenter running on Windows or is it an appliance? Have you restarted the vCenter server? Are all services running? If it is running on Windows, check the event logs. You can check the database too, but generally speaking, if the DB is full, the vCenter service will stop running and you won't even be able to log in to vCenter.

Uni Kitty Commented:
Totally agree, looks like you've narrowed it down to the vCenter side, KapsZ28 and Andy have good ideas on what to check. I'll sit by as moral support now. :)

jvillareal78 (Author) Commented:
I did restart the agents using /etc/init.d/hostd restart and /etc/init.d/vpxa restart on both hosts, with no effect. I have rebooted the vCenter server several times, with no effect. The SQL database size looks to be about 11 GB. vCenter is running as a VM on the cluster.

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Are you using SQL Express or full SQL Server?

That is, did you use the SQL Server that is included with the installation of vCenter Server 5.5 for Windows?

SQL Express 2012 is limited to a maximum database size of 10 GB.

See here:

http://sqlmag.com/sql-server-2012/sql-server-2012-express-editions

You may need to shrink your database if this is SQL Express (the free version that the vCenter Server 5.5 installation comes with).
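
Whether a size cap applies at all depends on the edition and version of the instance. One way to confirm both, sketched here on the assumption that you can open a query window against the SQL instance that vCenter uses, is to ask the server directly with the built-in SERVERPROPERTY function:

```sql
-- Report the edition (Express vs. a full edition) and version of the connected instance.
SELECT SERVERPROPERTY('Edition')        AS Edition,
       SERVERPROPERTY('ProductVersion') AS ProductVersion,
       SERVERPROPERTY('ProductLevel')   AS ProductLevel;
```

An Edition value containing "Express" means the database size cap applies. A ProductVersion starting with 10.50 corresponds to SQL Server 2008 R2.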

jvillareal78 (Author) Commented:
I used the SQL Server that was included with vCenter Server for Windows. It is 2008 R2.

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Okay, you need to shrink your DB!

The problem is that 5.5 is so noisy these days, with so many tables and events, that it does not take many hosts and VMs before, after a few months, your tables are full and you've exceeded the database size limit.

Are you a SQL DBA?

I've got SQL scripts you can run to check the sizes of the tables and to reduce them.

The scripts are here:

http://www.experts-exchange.com/Software/VMWare/Q_28647805.html

but I'll also list them below.

Check the following tables in the database VIM_VCDB.

There are two scripts here. The first one shows the row count and total space used for each table.

-- Row count and reserved space (MB) per table, largest first.
SELECT a.[Table Name],
       (SELECT s.rows
        FROM sysindexes s
        WHERE s.indid < 2 AND s.id = OBJECT_ID(a.[Table Name])) AS [Row count],
       a.[Total space used (MB)]
FROM (
    SELECT QUOTENAME(USER_NAME(o.uid)) + '.' + QUOTENAME(OBJECT_NAME(i.id)) AS [Table Name],
           -- reserved pages * page size in bytes (spt_values type 'E'), converted to MB
           CONVERT(numeric(15,2),
                   (((CONVERT(numeric(15,2), SUM(i.reserved))
                      * (SELECT low FROM master.dbo.spt_values (NOLOCK)
                         WHERE number = 1 AND type = 'E')) / 1024.) / 1024.)) AS [Total space used (MB)]
    FROM sysindexes i (NOLOCK)
    INNER JOIN sysobjects o (NOLOCK)
        ON i.id = o.id
       AND o.type IN ('U', 'S')
       AND OBJECTPROPERTY(i.id, 'IsMSShipped') = 0
    WHERE i.indid IN (0, 1, 255)
    GROUP BY QUOTENAME(USER_NAME(o.uid)) + '.' + QUOTENAME(OBJECT_NAME(i.id))
) AS a
ORDER BY [Total space used (MB)] DESC
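
If only the task and event tables are of interest (the four tables that the cleanup script in this thread truncates), a shorter check can be sketched with the built-in sp_spaceused procedure. This assumes the database is named VIM_VCDB, as stated above, and that the tables resolve in your login's default schema:

```sql
USE VIM_VCDB;  -- database name as given in the thread

-- sp_spaceused reports rows, reserved, data and index size for one table per call.
EXEC sp_spaceused N'VPX_TASK';
EXEC sp_spaceused N'VPX_EVENT';
EXEC sp_spaceused N'VPX_EVENT_ARG';
EXEC sp_spaceused N'VPX_ENTITY_LAST_EVENT';
```

If these tables account for most of the reserved space, they are the likely cause of the size problem.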



The second script removes the tasks and events from the DB.

-- Drop the foreign keys so that the task and event tables can be truncated.
alter table VPX_EVENT_ARG drop constraint FK_VPX_EVENT_ARG_REF_EVENT, FK_VPX_EVENT_ARG_REF_ENTITY
alter table VPX_ENTITY_LAST_EVENT drop constraint FK_VPX_LAST_EVENT_EVENT

-- Remove all rows from the task and event tables.
truncate table VPX_TASK
truncate table VPX_ENTITY_LAST_EVENT
truncate table VPX_EVENT
truncate table VPX_EVENT_ARG

-- Re-create the foreign keys that were dropped above.
alter table VPX_EVENT_ARG add
constraint FK_VPX_EVENT_ARG_REF_EVENT foreign key(EVENT_ID) references VPX_EVENT (EVENT_ID) on delete cascade,
constraint FK_VPX_EVENT_ARG_REF_ENTITY foreign key (OBJ_TYPE) references VPX_OBJECT_TYPE (ID)

alter table VPX_ENTITY_LAST_EVENT add
constraint FK_VPX_LAST_EVENT_EVENT foreign key(LAST_EVENT_ID) references VPX_EVENT (EVENT_ID) on delete cascade



You run these scripts "as is" and at your own risk.

I would run the first script first and check whether the task and event tables are large, and whether these are the tables causing the excessive space usage.
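
Because the truncate script deletes data irreversibly, a cautious sequence might look like the sketch below. Two things here are assumptions and not part of the thread: that the vCenter Server service is stopped while the statements run, and the backup path, which is a hypothetical example. The thread does not show which shrink command was eventually used, so DBCC SHRINKDATABASE is only one possible choice.

```sql
-- Assumption: the vCenter Server service is stopped while this runs.
-- The backup path is a hypothetical example; use a folder that exists on the server.
BACKUP DATABASE VIM_VCDB
    TO DISK = N'C:\Backup\VIM_VCDB_before_cleanup.bak'
    WITH INIT;

-- ... run the truncate script from this thread here ...

-- Optionally release the space freed by the truncate back to the file system.
USE VIM_VCDB;
DBCC SHRINKDATABASE (VIM_VCDB);
```

Truncating the tables already frees space inside the data file, so the shrink step mainly matters if you want the .MDF file itself to become smaller on disk.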

jvillareal78 (Author) Commented:
I am not a SQL anything, hehe. But if you let me know what to run, I am more than happy to try.

Keith Pratola, Senior Systems Architect, Commented:
You should at least look at the database file before trying to shrink it. What is the current size of the .MDF?
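
Besides looking at the file in Explorer, the size can also be read from inside SQL Server. The following is a minimal sketch, assuming the database name VIM_VCDB mentioned earlier in the thread; it uses the sys.database_files catalog view and the FILEPROPERTY function:

```sql
USE VIM_VCDB;

-- size and SpaceUsed are counted in 8 KB pages, so * 8 / 1024 converts them to MB.
SELECT name,
       type_desc,
       physical_name,
       CAST(size AS bigint) * 8 / 1024                            AS allocated_mb,
       CAST(FILEPROPERTY(name, 'SpaceUsed') AS bigint) * 8 / 1024 AS used_mb
FROM sys.database_files;
```

The row with type_desc = ROWS is the data file (.MDF), which is the one that counts toward the SQL Express limit; the LOG row does not.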

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Get me a quick screen grab of the data folder for SQL, and the version. Just before I head off to bed here in the UK (00:38): you may need to download and install SQL Server Management Studio to look at and manage your DBs.

Be back in 5 hrs!

jvillareal78 (Author) Commented:
Here is a screenshot of the database folder. SQL Management Studio was not installed, so I'm not sure how to run those scripts.
Untitled.jpg

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Yes, the database is too large.

You will need SQL Server Management Studio, which you can download here:

https://www.microsoft.com/en-gb/download/details.aspx?id=29062

Install it on a workstation or on the vCenter Server, open the SQL instance, and execute the code I've posted here.

jvillareal78 (Author) Commented:
After running the script to shrink the database and restarting the agents, vCenter was working 100%. Thank you, everyone.

Uni Kitty Commented:
Yay!

Question has a verified solution.
