Solved

VMware hosts keep disconnecting and reconnecting to Virtual Center after moving VC server to new host

Posted on 2011-02-11
Last Modified: 2012-05-11
I have 4 ESX hosts running vSphere 4.1.  I started to replace the Dell 1950 hosts with Dell R610 hosts.  So far I have replaced 2 of the hosts and placed them in a separate cluster.  My Virtual Center server is a VM.  Today, I moved my Virtual Center server to the new cluster.  (No EVC; I powered off the VC server, disconnected it from the host in the old cluster, then attached it to a host in the new cluster.)  Everything is running fine, except that about once an hour I get email alerts that all 4 hosts are not responding.  I'm sure there is a setting on the old hosts that I did not mimic on the new hosts.

Hoping that someone here has seen this and has the answer.

Thanks
Question by:drawlin
31 Comments
 
LVL 28

Expert Comment

by:bgoering
ID: 34876651
What do you mean by disconnected/attached? Did you remove it from inventory and add it to inventory?

Is this vCenter managing both clusters?

Often disconnect/reconnect can boil down to DNS problems - all hosts should be able to ping and vmkping one another by short name, FQDN, and IP Address.
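The name-resolution part of that check can be scripted. The following is a minimal sketch, not a VMware tool: it only confirms that a short name and its FQDN resolve to the expected IPv4 address from the machine where it runs, and it does not replace ping or vmkping. The function names and the `resolve` parameter are illustrative assumptions.

```python
import socket

def check_name(name):
    """Return the sorted IPv4 addresses that `name` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})

def check_host(short_name, domain, expected_ip, resolve=check_name):
    """Compare short-name and FQDN resolution against the expected IP.

    Returns {name: (matches_expected, resolved_addresses)} for both names.
    `resolve` can be swapped for a stub so the logic is testable offline.
    """
    fqdn = f"{short_name}.{domain}"
    results = {}
    for name in (short_name, fqdn):
        addresses = resolve(name)
        results[name] = (addresses == [expected_ip], addresses)
    return results
```

For example, `check_host("vh02", "aegislabs1.local", "192.0.2.12")` would report whether both `vh02` and `vh02.aegislabs1.local` resolve to that address; the IP here is a placeholder, not a value from this thread. Running it from the vCenter server and from a management workstation helps show whether the two see the same answers.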

Another thing you might try is putting the hosts in maintenance mode (one at a time) then disconnecting them from vCenter, then adding them back in.

Did the IP address change for vCenter as part of its move? If so, you definitely need to do the disconnect/reconnect, because the vpxa config on the host needs to have the name and IP address of the vCenter server.
0
 
LVL 5

Author Comment

by:drawlin
ID: 34876680
Yes, I removed the VC server from inventory and added its vmx file to inventory.  The new hosts have the same DNS names and IPs as the hosts I pulled.  I have had the new hosts up and running in their own cluster for a couple of weeks.  I have been powering down VMs and moving them to hosts in the new cluster.  Everything pings just fine and the VMs are always up and running without interruption.

It's just since I moved the VC server to a host in the new cluster that I get email alerts about every 70 minutes that read:

Target: vh02.aegislabs1.local
Previous Status: Green
New Status: Red
 
Alarm Definition:
([Red state Is equal to notResponding] AND [Red state Not equal to standBy])
 
Current values for metric/state:
 State = Not responding AND State = Unknown
 
Description:
Alarm 'Host connection and power state' on vh02.aegislabs1.local changed from Green to Red
0
 
LVL 5

Author Comment

by:drawlin
ID: 34876685
I meant to add that I get an email alert for all 4 hosts, the 2 in the old cluster and the 2 in the new cluster.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34876729
If I understand correctly at one time your new hosts had different IP addresses, then when you completed your migration you gave the new hosts the addresses that were on the old hosts.

If that is the case it is same as last scenario above except it is host ip that changed instead of vCenter ip.

Take a look at http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1006768 and follow the instructions there.

Let me know...
0
 
LVL 5

Author Comment

by:drawlin
ID: 34876775
The way I went about this was:  I vMotioned VMs off one of my old hosts and shut it down.  Then I installed ESX on a new server and gave it exactly the same name and IP as the host I shut down.  I placed the new host in a new cluster because it has a newer CPU.  I powered off some VMs, migrated them to the new host, and powered them on.  Everything worked fine.  A week later I moved VMs off another host and repeated the same process, adding the new host to the new cluster and moving some more VMs.  Everything worked fine.  Today, I turned off the VC server, logged into the old host using the vSphere client, and removed the VC server from inventory.  Then I logged into a new host using the vSphere client and added the VC server to inventory.

Since moving the VC server to a new host, I get these "host not responding" alarms every hour.  When I look at the event log, the outage happens and restores in less than 1 second.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34876802
Sounds good, assuming you removed the old hosts from vCenter (to clear out the record of the old host) and then properly added the new host to vCenter. vCenter Server retains some record of the hosts in its database, as do the various ESX hosts in vpxa.cfg.

So moving the vCenter from one host to another is causing an issue. I suppose at this point we should start looking at the networking on the new host where it is residing. Make sure there is no speed/duplex mismatch on the physical side between the vmnic and the physical switch. Make sure there are no loose or bad cables. Is networking redundant? Is the disconnect occurring at regular intervals, like something timed out? Check the physical switch config - if trunking, double-check it is correct, etc.

Probably my last post tonight (past bedtime) but will check on issue in the morning...
0
 
LVL 22

Expert Comment

by:Luciano Patrão
ID: 34876919
Hi

Is the old vCenter still connected or working? I have seen this issue many times when a second vCenter is on the network.

Uninstall the old vCenter from the network, and remove all the hosts from the old vCenter (by adding them to the new vCenter). If none of the hosts are using the old vCenter, this issue doesn't happen.

Hope this can help

Jail

0
 
LVL 5

Author Comment

by:drawlin
ID: 34879183
Only one VC server and it is a VM.  I moved it from a host in the old cluster to a new host in the new cluster.  After that I get these alarms every hour.
0
 
LVL 5

Author Comment

by:drawlin
ID: 34879189
I also moved the VC server from one of the new hosts to the other new host in the new cluster and waited for a while, and I got the same alarms.
0
 
LVL 5

Author Comment

by:drawlin
ID: 34879259
Network config didn't change.  I used the same two fibers that were plugged into the old hosts for the new hosts.  Same NIC teaming settings on the new hosts as the old hosts.  Same IP's as well.
0
 
LVL 22

Expert Comment

by:Luciano Patrão
ID: 34879315
Hi

To understand the environment: how many hosts do you have? How many hosts are in the new VC?
How many hosts are in the old VC?

Are the new and old vCenters the same version? What versions are these vCenters?

Did you create the new vCenter with a new IP, or are you using the same IP as the old vCenter?

Are these vCenters physical servers or VMs?

Jail
0
 
LVL 5

Author Comment

by:drawlin
ID: 34879508
From my original post:

(I have 4 ESX hosts running vSphere 4.1.  I started to replace the Dell 1950 hosts with Dell R610 hosts.  So far I have replaced 2 of the hosts and placed them in a separate cluster.  My Virtual Center server is a VM.)

I only have one VC server and it is a VM.  I did not create a new VC server.  I migrated the VC server from an old host to a new host, and that's when the alarms started to occur every hour.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34879631
As a test - if you migrate VC back to the old host do the alarms disappear? I would like to establish for certain that it is the change in host that caused the issue rather than coincidentally some other problem cropping up at the same time. That will better let us focus our efforts in resolving your issue.
0
 
LVL 22

Expert Comment

by:Luciano Patrão
ID: 34879644
Hi

For a new vCenter, just install a new vCenter and add the hosts into that vCenter.

You should migrate only if you have special network configurations, some app, etc. If not, just create a new server, install the new vCenter, and add the hosts to that new vCenter.

Jail
0
 
LVL 5

Author Comment

by:drawlin
ID: 34881087
I thought about moving the VC server back to an old host last night.  I'm doing it now.  I'll post in an hour to say if it generated an alarm or not.
0
 
LVL 5

Author Comment

by:drawlin
ID: 34881121
I moved the VC server to an old host.  Looking back at the logs, I have been getting alarms exactly every 1 hour and 14 minutes.
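A fixed gap like this can be confirmed directly from the alarm timestamps, which helps separate a timer-driven cause from random network drops. The sketch below is illustrative only: the timestamps are made up to match the 74-minute spacing described here, and the 30-second tolerance is an arbitrary assumption.

```python
from datetime import datetime, timedelta

def alarm_intervals(timestamps):
    """Return the gaps between consecutive alarm times, oldest first."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def is_periodic(timestamps, tolerance=timedelta(seconds=30)):
    """True when there are at least two gaps and all lie within `tolerance` of the first."""
    gaps = alarm_intervals(timestamps)
    if len(gaps) < 2:
        return False
    return all(abs(gap - gaps[0]) <= tolerance for gap in gaps)

# Hypothetical alarm times spaced 74 minutes apart (not copied from real logs).
start = datetime(2011, 2, 11, 8, 0, 0)
alarms = [start + i * timedelta(minutes=74) for i in range(4)]
print(alarm_intervals(alarms)[0])   # 1:14:00
print(is_periodic(alarms))          # True
```

In practice the timestamps would come from the alert emails or the vCenter event log for each host.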
0
 
LVL 22

Expert Comment

by:Luciano Patrão
ID: 34881177
Hi

Have you tried what I proposed? Create a new one; do not move or import anything from the old vCenter.

Create a new VM, or use that VM: uninstall the vCenter, clean that server, reinstall the vCenter, and add the hosts to it.

One question: is SQL on that server and installed through the vCenter install? Or are you using an external SQL Server or another database server (like Oracle)?

Jail
0
 
LVL 5

Author Comment

by:drawlin
ID: 34881273
Bestway:

If the alarms persist after moving the VC server to an old host, I will consider creating a new VC server.  If they stop, then it must be related to the new hosts.

When I first built my VMware infrastructure a couple years ago, I had a similar problem with alarms and a VMware consultant changed a setting in the software/advanced section and they went away.  I just don't know what he did.  Something to do with best practices for the SAN I'm using.
0
 
LVL 5

Author Comment

by:drawlin
ID: 34881415
It's been 2 hours since I moved the VC server from a new host to an old host, and no alarms.  I can't leave the VC server on an old host, as my plan is to install new hosts and move the old hosts to my DR site.

Any ideas?
0
 
LVL 5

Author Comment

by:drawlin
ID: 34883094
It's been 12 hours and no alarms.  Looks like the problem is related to the two new hosts.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34883195
Guess we might as well move it back then - it's got to go there sometime. Look in /etc/opt/vmware/vpxa/vpxa.cfg on each failing host and make sure the ServerIP field is correctly set to the IP of the vCenter server. Also, is it only the new hosts, a single host, or all hosts that you are getting the error on?
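If copies of vpxa.cfg are pulled off each host, the comparison can be automated. This is a rough sketch under two assumptions: that the file is XML-like, and that the vCenter address sits in an element named `serverIp` (matched case-insensitively, since the field is written "ServerIP" above). The sample fragment and addresses are placeholders, not values from this environment.

```python
import re

# Assumption: the vCenter address is stored as <serverIp>...</serverIp>.
SERVER_IP_PATTERN = re.compile(r"<serverIp>\s*([^<\s]+)\s*</serverIp>", re.IGNORECASE)

def vcenter_ip_in_config(config_text):
    """Return the server IP recorded in the config text, or None if absent."""
    match = SERVER_IP_PATTERN.search(config_text)
    return match.group(1) if match else None

def config_matches(config_text, expected_ip):
    """True when the recorded server IP equals the expected vCenter IP."""
    return vcenter_ip_in_config(config_text) == expected_ip

# Hypothetical fragment for illustration only.
sample = """
<config>
  <vpxa>
    <hostIp>192.0.2.12</hostIp>
    <serverIp>192.0.2.50</serverIp>
  </vpxa>
</config>
"""
print(vcenter_ip_in_config(sample))          # 192.0.2.50
print(config_matches(sample, "192.0.2.51"))  # False
```

A mismatch on any host would point back to the disconnect/re-add procedure rather than to editing the file by hand.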
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34883198
Also go through http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&externalId=1003409 and follow the knowledge base article - it will walk you through various things to look for. If this doesn't do it, then posting some log files is next.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34883225
Another thing to look at is in vCenter:

Home->Administration->vCenter Server Settings->Runtime Settings

Make sure the IP address of the vCenter Server is there - especially important if the server has more than one IP
0
 
LVL 28

Expert Comment

by:bgoering
ID: 34883240
This connect/disconnect usually boils down to an IP address misconfigured somewhere, wrong DNS, or, if you are using hosts files, a wrong entry. I would also double-check that all of the vmnics are good on the host you are moving it to. If you are trunking VLANs or have EtherChannel set up, make sure the physical switch config matches that of the ESX host and there isn't a wrong VLAN tag number set up somewhere.
0
 
LVL 7

Expert Comment

by:cdjc
ID: 34889089
I haven't seen you state anywhere that you removed your old hosts from vCenter Server before you shut them down.

If you didn't remove them prior to shutting them down, then it sounds like you have ARP-related issues.  You started with 4 old ESX/ESXi hosts.  You shut two of them down (without first removing them from vCenter Server) and then installed ESX/ESXi on two new servers, using the same IPs and hostnames as the two shut-down old hosts ... but the two new hosts have different MAC addresses than the two old hosts.  This could cause your VMware environment to become "confused", and is one possible explanation for the behavior that you're seeing.

You should have put the two old hosts into maintenance mode, then removed them from vCenter Server prior to shutting them down.

You could try clearing the arp caches on your vCenter Server as well as all 4 of your hosts.  That might fix things up.

But, following that, you will most likely have to remove all four of your hosts from the vCenter Server inventory, then re-add them, in order to make sure that everything gets cleaned up properly.
0
 
LVL 5

Author Comment

by:drawlin
ID: 34893061
I put the old hosts in maintenance mode and shut them down, but didn't remove them from VC.  You think a stale ARP cache entry can last 30-plus days?  You talk about the ARP cache on the VC server.  I have shut down the VC server twice to move it from one cluster to another, and rebooting a server should clear the ARP cache.  I can definitely try removing all the hosts from VC and re-adding them, but it's odd that the VC server doesn't alarm when on an old host but does when on a new host.  Also, all the VMs work fine and VC works fine; I just get a less-than-one-second disconnect on all hosts every 74 minutes.

I did open a ticket with VMware support and they are looking at logs at this time.  

Thank you all for your assistance.  I'll let you know what I find out.
0
 
LVL 22

Expert Comment

by:Luciano Patrão
ID: 34893112
Hi

I have migrated hosts from one vCenter to another just by adding them to the new vCenter (you get a warning that the host belongs to another vCenter; just say yes to move it), and I saw no issues.
I have also added hosts to a new vCenter by disconnecting them from the old vCenter and then adding them to the new one, without any issues whatsoever.

But yes I can understand that sometimes, or in some environments this can create an issue.

Jail
0
 
LVL 5

Accepted Solution

by:
drawlin earned 0 total points
ID: 35153526
After 4 weeks with VMware support and escalation to tier 3, I stumbled on the problem.  I am posting this for those who may be interested in the resolution I found.

The two new hosts did not have NTP servers configured and were several hours different in time than the old hosts.  I configured NTP settings and updated the time on the new hosts and the Virtual center disconnect errors stopped.
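For anyone who wants to check for the same condition, here is a minimal sketch of the comparison. It assumes each host's clock has been read as seconds since the epoch in UTC at roughly the same moment (how the readings are collected is left open), and the host names other than vh02, the readings, and the 5-second tolerance are all made up for illustration.

```python
def clock_offsets(host_epochs, reference_epoch):
    """Return each host's clock offset in seconds relative to the reference."""
    return {host: epoch - reference_epoch for host, epoch in host_epochs.items()}

def skewed_hosts(host_epochs, reference_epoch, tolerance_s=5):
    """Return the hosts whose clocks differ from the reference by more than tolerance_s."""
    offsets = clock_offsets(host_epochs, reference_epoch)
    return sorted(host for host, offset in offsets.items() if abs(offset) > tolerance_s)

# Hypothetical readings; the reference would be a trusted, NTP-synced clock.
reference = 1_300_000_000
readings = {
    "vh01": reference + 1,
    "vh02": reference - 3 * 3600,   # three hours behind
    "vh03": reference - 2,
    "vh04": reference + 4 * 3600,   # four hours ahead
}
print(skewed_hosts(readings, reference))  # ['vh02', 'vh04']
```

Any host listed by `skewed_hosts` is a candidate for missing or broken NTP configuration.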

Thank you all for your suggestions.
0
 
LVL 28

Expert Comment

by:bgoering
ID: 35156130
Great - thanks for posting back with the solution. Definitely going into my notes :)
0
 
LVL 22

Expert Comment

by:Luciano Patrão
ID: 35156616
Hi

@drawlin thank you.

Never seen this before.

Jail
0
 
LVL 5

Author Closing Comment

by:drawlin
ID: 37820841
This was what fixed the problem
0
