Backup Hyper V virtual machines for DHCP and DFS boot fine but won't do their jobs

Last Modified: 2018-11-05
I had a Hyper V host that wouldn't boot, but I had some backups of the VMs on another machine that I fired up. Two of the VMs are DCs; one is the DHCP server and the other the DFS root. Once the backup VMs booted, I was able to ping both of them. Things seemed normal until I realized that users were not getting DHCP leases. I tried clearing the ARP cache on our switches, thinking that those machines couldn't find the DHCP server, but that didn't work. I even reset one workstation's NICs, but that didn't work either. Nothing I tried would cause the machine to get an IP assigned. Only after I assigned a manual IP did that workstation seem to be back to normal.

I also noticed that some of our DFS mapped drives didn't work either, even though the DFS root was up and running and I could ping it.

In the meantime, I was able to get the original Hyper V host up and running again, so I shut down the backup VMs and started up the original VMs. DHCP and DFS started working immediately! Can someone help me understand why this happened: why the backup VM wouldn't answer DHCP requests, and why the DFS root didn't seem to work either? It's kind of useless to have backup VMs if they won't do the jobs they're supposed to...

I am fully up and running with the original Hyper V host and VMs once more, so the fire has been put out for now. But I'm worried that the next time I actually need those backup VMs to work, I'll be in the same situation once more. Help?

Paul MacDonald, Director, Information Systems
CERTIFIED EXPERT

Commented:
These "backup VMs" are different machines than the "primary VMs", or are they backups of the primaries?  

In either case, it may be that the "backup VMs" weren't up-to-date with the state of the network when you started them. DHCP servers in a domain have to be authorized to operate and it may be that your "backup" DHCP servers were not. The DFS issue may have a similar root cause, if the "backup" DCs didn't have the current DFS topology.
David Williamson, IT Director

Author

Commented:
They are backups of the primaries, so as far as the network is concerned, they are the same machines (or should be, at least). The backups happen 2-3 times daily, so they should have been quite current. I'm using Quest Rapid Recovery, which is an image-based backup system. RR automatically keeps the backup VMs updated throughout the day (they call it "virtual standby"). The backup VMs are off, but RR keeps their VHDs updated each time a new backup of the live machine happens.

Could it have something to do with the MAC address of the backup VM? That is why I cleared the ARP cache on my switch, thinking that it may have had a stale record, but that made no difference. I was able to ping them anyway, so that doesn't seem to have been it.
Paul MacDonald, Director, Information Systems
CERTIFIED EXPERT

Commented:
"Could it have some thing to do with the MAC address of the backup VM?"
It's possible, but given the "backup VM" is identical to the live VM, I would expect them to have the same MAC address.

How positive are you these "backup VMs" are (or were) current?  Even being a couple days old might cause problems, though I would expect the problems to go away pretty quickly.  Another question is, were the date and time on the "backup VMs" current once they booted?

It's probably more trouble than it's worth at this point, but I'd like to see a Wireshark capture of the D.O.R.A. (Discover, Offer, Request, Acknowledge) packets between a client and the backup DHCP server. This is an unusual problem and you may run into it again some day.
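
For anyone wanting to see which step of the exchange goes missing, here is a rough Python sketch (not a replacement for Wireshark) that reads the DHCP message type (option 53) out of a raw UDP payload and lists the D.O.R.A. steps that never showed up. The function names are made up for illustration, and the sample packet is synthetic rather than taken from this network.

```python
import struct

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"
MESSAGE_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def build_dhcp_payload(message_type, xid, mac):
    """Build a minimal synthetic BOOTP/DHCP payload (UDP body only)."""
    op = 2 if message_type in (2, 5, 6) else 1  # 1 = request, 2 = reply
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        op, 1, 6, 0,          # op, htype (Ethernet), hlen, hops
        xid, 0, 0x8000,       # transaction id, secs, broadcast flag
        b"", b"", b"", b"",   # ciaddr, yiaddr, siaddr, giaddr (zero-padded)
        mac, b"", b"",        # chaddr, sname, file (zero-padded)
    )
    options = bytes([53, 1, message_type, 255])  # message type, then end
    return header + DHCP_MAGIC_COOKIE + options


def dhcp_message_type(payload):
    """Return the DHCP message type name, or None if it cannot be parsed."""
    if len(payload) < 240 or payload[236:240] != DHCP_MAGIC_COOKIE:
        return None
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == 255:       # end option
            break
        if code == 0:         # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break
        length = payload[i + 1]
        if code == 53 and length == 1 and i + 2 < len(payload):
            return MESSAGE_TYPES.get(payload[i + 2])
        i += 2 + length
    return None


def missing_dora_steps(payloads):
    """List the D.O.R.A. steps not present in the captured payloads."""
    seen = {dhcp_message_type(p) for p in payloads}
    return [s for s in ("DISCOVER", "OFFER", "REQUEST", "ACK") if s not in seen]
```

If a capture only ever contains DISCOVER packets, the server is either not receiving the broadcasts or not answering them, which narrows things down considerably.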
David Williamson, IT Director

Author

Commented:
I did not happen to check the date/time, but I could boot them back up without connecting them to the network to see what they say. I'd have to wait until after hours to get them back on the network to do a packet capture.
David Williamson, IT Director

Author

Commented:
As far as the MAC address goes, does that come from the VM or from the actual hardware that the VM is using (which is different, of course)? I suppose I could make sure the backup has the same one via the network adapter settings in Hyper V.
Paul MacDonald, Director, Information Systems
CERTIFIED EXPERT

Commented:
Hyper-V MAC addresses come from a pool on the host.  You can assign a fixed MAC address, but this shouldn't be necessary.
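
As a quick way to compare what the original and the copy ended up with, here is a small Python sketch. It assumes the host is still using Hyper-V's default dynamic pool, which starts with Microsoft's 00-15-5D prefix; the range can be changed on the host, so a prefix match is only a hint. The helper names are hypothetical.

```python
HYPERV_OUI = "00155D"  # default prefix of Hyper-V's dynamic MAC pool


def normalize_mac(mac):
    """Strip separators and upper-case a MAC address such as 00-15-5d-01-02-03."""
    hex_digits = "".join(ch for ch in mac if ch not in "-:.").upper()
    if len(hex_digits) != 12 or any(c not in "0123456789ABCDEF" for c in hex_digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return hex_digits


def is_hyperv_dynamic_prefix(mac):
    """True if the MAC falls under the default Hyper-V dynamic prefix."""
    return normalize_mac(mac).startswith(HYPERV_OUI)


def same_mac(a, b):
    """Compare two MAC addresses regardless of separator style or case."""
    return normalize_mac(a) == normalize_mac(b)
```

If the copy shows a different address than the original, it was probably handed a new dynamic address by its own host.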
David Williamson, IT Director

Author

Commented:
Still trying to work through this. I tried moving another machine to a new Hyper V host, but was also having network issues. I made sure the copy VM had the same MAC address, disconnected the original's network, then connected the copy's network. The copy seems to be working fine from the inside; I can ping everywhere, do nslookups against the domain controller, etc., but I cannot ping the copy from other machines. I'm not even sure how that works! From the copy I can browse the internet, the file server, everything. But I can't ping it. The switches somehow don't know where that IP is even though it has the same MAC address as the original. We have two Cisco Catalyst switches and I cleared the ARP table on both of them, still nothing.

This makes having VM copies sort of useless if they can't be reached :-)
Paul MacDonald, Director, Information Systems
CERTIFIED EXPERT

Commented:
"The switches somehow don't know where that IP..."
Switches use the MAC address to forward frames, and if you can get from the VM to a web page, then packets are travelling in both directions.
It's possible a firewall rule exists on the copy that suppresses ICMP responses.  That would explain this behavior.
When you ping the VM's hostname, it resolves to the correct IP address?  Does it make a difference if you ping the IP address instead?

"This makes having VM copies sort of useless if they can't be reached :-)"
I agree!
David Williamson, IT Director

Author

Commented:
It does resolve the host name, but there's no difference between pinging the name or the IP. There are some other services on that server (website and printers) and those are not accessible when the copy VM is online.

As far as a firewall rule, there shouldn't be anything because the copy should be just that, a copy. The original doesn't block ICMP, so the copy shouldn't. I will double-check, but I'm confident the domain firewall is disabled.

One of the things I didn't mention is that the copy VM is on a different host.  The original VM is on Server 2008 R2 and the copy is on Server 2016.  I've just started using 2016, so could there be something I'm missing about how the newer version of Hyper V works?  Maybe the virtual switches behave differently, or have VLANs predefined or something odd like that?
David Williamson, IT Director

Author

Commented:
Ok, some progress! I disconnected the original VM and fired up the copy. Still no network access to it from the outside, while on the machine itself, everything appeared normal: web browsing, file server browsing, pinging anything, etc. So, I tried physically plugging and unplugging cables, I tried disabling virtual adapters, etc., whatever I could think of, with no effect. Next, I decided to run netsh int ip reset and netsh winsock reset on the copy VM, then rebooted. Voilà! I could now ping the machine normally, the sites it hosts came up, everything went to 100% functional.
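
In case it helps someone script this for a standby VM, here is a minimal Python sketch wrapping the same two commands. The function name and the dry-run default are my own additions. It assumes an elevated prompt on Windows, a reboot still has to follow, and since the reset may clear static IP settings it is worth writing those down first.

```python
import platform
import subprocess

# The two commands that fixed the copy VM, in the order they were run.
RESET_COMMANDS = [
    ["netsh", "int", "ip", "reset"],
    ["netsh", "winsock", "reset"],
]


def reset_network_stack(dry_run=True):
    """Return the reset commands; execute them only when dry_run is False."""
    if not dry_run:
        if platform.system() != "Windows":
            raise RuntimeError("netsh is only available on Windows")
        for cmd in RESET_COMMANDS:
            # check=True raises CalledProcessError if a command fails
            subprocess.run(cmd, check=True)
    return [" ".join(cmd) for cmd in RESET_COMMANDS]


if __name__ == "__main__":
    # Dry run by default: just show what would be executed.
    for line in reset_network_stack(dry_run=True):
        print(line)
```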

Nearest I can figure is that when the copy VM came online and installed a new virtual NIC (because it was on different physical hardware), something got jacked up in the TCP/IP stack which allowed it to do everything normally, but nothing could get in to it from the outside (except what it initiated itself first). Doing the reset seems to have cleared it, bringing it back to normal.
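
Since ping alone can be misleading (ICMP may be filtered while everything else works, or the other way around), a check run from a second machine against the services the VM hosts is a more direct test of inbound traffic. A minimal Python sketch, with made-up helper names; the host name, IP and ports you feed it are whatever applies to your VM:

```python
import socket


def resolves_to(hostname, expected_ip):
    """True if the host name resolves to the IP address you expect."""
    try:
        return socket.gethostbyname(hostname) == expected_ip
    except socket.gaierror:
        return False


def tcp_port_open(host, port, timeout=2.0):
    """True if an inbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running tcp_port_open against the web site port from a workstation, before and after bringing the copy online, would show whether inbound connections are being dropped even when name resolution is correct.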

That's one for the record books, never encountered it before...