Solved

Hyper-V Replica failing

Posted on 2013-02-04
Last Modified: 2015-05-19
Hello All,

I am attempting to configure Hyper-V Replica and seem to have hit a dead end. The following error message appears when I try to finish the configuration:

Hyper-V failed to enable replication.
Hyper-V failed to establish a connection with the Replica server.
Hyper-V Failed to enable replication for the virtual machines "TestVM": The connection with the server was terminated abnormally (0x00002EFE)
Hyper-V failed to establish a connection with the Replica server 'replica.company.com' on port '443.' Error: The connection with the server was terminated abnormally (0x00002EFE).


No logs on the destination Replica server show any connection attempt.

Logs on the Source Replica are as follows:
Within Hyper-V-VMMS Admin:
Error - Hyper-V-VMMS - 29310 - Hyper-V failed to establish a connection with the Replica server 'replica.company.com' on port '443.' Error: The connection with the server was terminated abnormally (0x00002EFE).
Error - Hyper-V-VMMS - 32000 - Hyper-V failed to enable replication for virtual machine 'TestVM': The connection with the server was terminated abnormally (0x00002EFE).

Within System Logs:
Error - DistributedCOM - 10028 - DCOM was unable to communicate with the computer replica.company.com using any of the configured protocols; requested by PID      e3c (C:\Windows\system32\mmc.exe).

These servers are on different domains and different sites, and there is no trust relationship, so I assumed I needed to use certificate-based authentication to make it work. Following the link below, I used self-signed certificates:
http://technet.microsoft.com/en-us/library/jj134153.aspx#BKMK_1_5
On the destination server's certificate I put two FQDN CNs in case the external address was expected, so it has both replica.company.com and servername.company.com.
On the source server's certificate I put the proper FQDN.

Configured proper ports on our ASA on the destination servers' side for 443 (I also completely opened it on the local Windows firewall and in our ASA for troubleshooting).
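
For anyone retracing this step: before looking at certificates, it can help to confirm that TCP 443 on the Replica server is reachable from the primary side at all. The following is only a minimal Python sketch, not something taken from this thread. The function name `can_connect` is made up for illustration, and a successful connect says nothing about certificate trust or about traffic in the reverse direction.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refusals, resets and timeouts alike.
        return False


# Example call (placeholder name from the question; substitute the real server):
#   can_connect("replica.company.com", 443)
```

Running the same check from the Replica side back toward the primary would show whether the path is open in both directions.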

Am I missing something here? I've seen a couple of forums where people are having the same issue, but either there was no resolution or the solution did not work for me.
http://community.spiceworks.com/topic/259848-server-2012-replication-certificate-internal-domain
http://blog.greypuddles.net/?p=179
http://social.technet.microsoft.com/Forums/en-US/winservercore/thread/1979ecbb-3efd-47bc-9322-b509c369a0ed/
http://teety4.rssing.com/browser.php?indx=3899442&item=701
Question by:justin-jurgens
 
LVL 36

Expert Comment

by:ArneLovius
I'm guessing from your description that both Hyper-V servers are behind NAT.

You need to use static NAT at both sites to allow the communication to be two way.

You can of course restrict the source address to be only from the remote exit IP address.

I would usually use dynamic policy NAT to keep the outbound traffic on the same exit IP address as the static NAT.

The certificates need to be configured on both servers.
0
 

Author Comment

by:justin-jurgens
I configured self-signed certificates on both servers and copied the CA certificate to each others' trust root location.

Both servers are behind NAT, yes. By static NAT, do you mean a 1-to-1 NAT to an external IP? I have that on the destination side, just not the source side. What requires a static NAT entry on the source location's firewall, besides failing back the VMs? I'm currently having an issue just getting replication to go to the destination.
0
 
LVL 36

Assisted Solution

by:ArneLovius
ArneLovius earned 500 total points
Yes, a 1 to 1 NAT (just for port 443) on both sides

Communication is required in both directions. Until you have bi-directional communication, you will not be able to create the replica.

The alternative to static NAT on both ends would be to have a VPN connection between the sites and just use the internal addresses.
0
 

Author Comment

by:justin-jurgens
Ok I will complete that tonight. I was unaware of that requirement, didn't see that in anything I read with Microsoft whitepapers nor online forums/articles.

I will update with my progress later.
0
 
LVL 36

Expert Comment

by:ArneLovius
It's not as explicit as it should be.

From http://technet.microsoft.com/en-us/library/jj134240.aspx:

If you use network address translation (NAT), ensure that the inbound and outbound ports are configured to use the same port number. Replica only listens on one port.
0
 

Author Comment

by:justin-jurgens
So I created a static NAT on the source server as well. I also, just for giggles, completely opened the firewall to both the source and destination servers. Nothing changed, same error message.

However, as a fluke maybe, I did get one new error on the destination server under Hyper-V-VMMS but it was only one time in the couple hours I was messing around last night:

Error - Hyper-V-VMMS - 29218
Hyper-V received a digital certificate that is not valid from primary server 'source.server.local'. Error: A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider. (0x800B0109).

It was only one time, so not sure if that will tell us much. I followed the Microsoft instructions for the self-signed certs pretty well and I even deleted the ones I had previously made and went through the instructions again.
0
 
LVL 9

Expert Comment

by:dipersp
Any progress?  I have the same basic problem here, with one twist - no firewall.

Here's what we did for testing.  First, proof of concept internally.  Both servers are on the same subnet, firewall is turned off on the machines.  No hardware firewall.

Used self-signed certs with the internal (.local) name for the servers.  No problem, worked fine.  I then built new certs with the external (.com) name for the servers (Again, remember, same subnet at this point - so I just made A records in DNS for now.)  No go - Hyper-V wouldn't take the certs since they didn't have the FQDN for the internal name.

Third time I created the certs, I used two CN names - internal and external.  Took that cert no problem.  Went to enable replication on one of our guests and got the same error you have - right when I hit finish.

So I have to think the problem is more cert related versus firewall/routing related.  Unfortunately, I haven't been able to find anything on my end either with lots of Googling, etc.  I'm seeing people that are creating host files, but I have to assume these are more workgroup machines.  Our machines are on the domain, and creating host files to redirect .local addresses scares me a little.  One of those things that will come and bite you three months from now!
0
 
LVL 9

Expert Comment

by:dipersp
I'm heading down the path of UCC/SAN certificates.  Unfortunately, it does not appear we can do this with makecert.  And I'm not finding anywhere I can get a free trial for a UCC cert.  I'm not really interested in dishing out money only to find it wasn't the issue, though. . .
0
 
LVL 36

Expert Comment

by:ArneLovius
The certificate has to match the local server name.

If you have your own internal CA, you could create internal SAN certificates.
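
As a rough pre-flight check of the "certificate has to match" point, the names placed in a certificate can be compared against the names the server is actually addressed by. This is a hedged Python sketch only: the helper `names_missing_from_cert` and the host names in the example are hypothetical, it does exact case-insensitive matching, and it does not model wildcards or whatever matching rules Hyper-V itself applies.

```python
def names_missing_from_cert(cert_names, required_names):
    """Return the required names that no certificate name matches (case-insensitive)."""
    have = {name.strip().lower() for name in cert_names}
    return [name for name in required_names if name.strip().lower() not in have]


# Hypothetical example: a cert issued only for the external name does not
# cover the server's internal FQDN.
missing = names_missing_from_cert(
    ["replica.company.com"],
    ["replica.company.com", "servername.company.local"],
)
print(missing)
```

An empty list would mean every name you intend to use appears in the certificate.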
0
 
LVL 9

Expert Comment

by:dipersp
Thanks Arne - I know the cert has to match.  I have no problem getting the cert in, it's a matter of pointing to the .COM instead of the .LOCAL address.

So I spent some time standing up an internal CA for testing.  I created a SAN cert with all of the names I needed -

source.domain.local
source.domain.com
destination.domain.local
destination.domain.com

Put the CA cert in the trusted root, put the SAN cert in the personal store of each Hyper-V host.  Went in fine, was able to turn on replication on each host fine.  Can always get past this step.

Then I went to enable replication on a guest machine; pointed it to destination.domain.com for the replica server, pointed it to the cert, all good there.  Hit finish on the last step, and error.  

What gives?!
ErrorCapture.jpg
0
 
LVL 36

Expert Comment

by:ArneLovius
I would have two SAN Certs, one for just the source and one for just the destination. Having the source and destination might cause a different problem.
0
 
LVL 9

Expert Comment

by:dipersp
Well, I actually just eliminated the certificate issue altogether.  Currently the two (test) machines are on the same domain and same subnet.  We were testing certs to make sure this would work once they're out over the Internet and not on the same subnet.

So for the time being, I tested with Kerberos.  It also fails if I use the .com FQDN.  If I use the internal/.LOCAL FQDN - works perfectly fine, just like we experienced with the certs.

I'm at a loss.  Not sure what else to try/test.  We can ping both machines using the .com FQDN (Set this up in a HOST file this time to be sure we weren't having DNS issues.)

I have to assume the machines are rejecting this because they aren't known to be .com and they're seeing it as a security thing maybe?  Is this a case for an alternative UPN suffix maybe?
0
 
LVL 36

Expert Comment

by:ArneLovius
Possibly you should open a new question for your issues.
0
 
LVL 9

Expert Comment

by:dipersp
If my troubleshooting and methods help the original author, who had the same issues. . .
0
 
LVL 36

Expert Comment

by:ArneLovius
If you had answers rather than questions...
0
 
LVL 9

Expert Comment

by:dipersp
Netminder,

My apologies.  I felt the troubleshooting steps I was also going through on the same problem might be helpful to others in narrowing down the issue.  Since no one has found a solution yet, figured it wouldn't hurt to give additional ideas out and see if something worked for someone else.
0
 

Author Comment

by:justin-jurgens
I am still having the same issue unfortunately. Loads of research and attempts but still not getting past the error...
0
 
LVL 36

Assisted Solution

by:ArneLovius
ArneLovius earned 500 total points
Have you tried with a VPN between the two sites, using SAN certificates that have just the internal short name and internal FQDN?
0
 

Author Comment

by:justin-jurgens
I will attempt creating a site-to-site VPN and use a SAN certificate with servername and FQDN to see if that does anything.

Will update tomorrow.
0
 

Author Comment

by:justin-jurgens
Wow...so ya that worked. Site-to-site VPN with the servername & FQDN SAN certificate.

What gives? I may not be able to do this with every client, so what do you think a better solution will be? Is there something I did wrong?
0
 
LVL 9

Expert Comment

by:dipersp
I assume you used the internal FQDN?
0
 

Author Comment

by:justin-jurgens
Previously, on the destination server certificate, I used multiple CNs:
destinationserver
destinationserver.destinationcompany.local
replica.destinationcompany.com

On the source server certificate:
sourceserver
sourceserver.sourcecompany.local
replica.sourcecompany.com

Nothing worked.

As soon as I put a site-to-site VPN and placed manual entries into the corresponding HOSTS files it worked perfectly.
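
For reference, the manual HOSTS entries are plain "address, then name" lines in C:\Windows\System32\drivers\etc\hosts. The Python sketch below only formats such lines and writes nothing to disk. The helper `hosts_entries` and the 192.0.2.20 address are placeholders for illustration, not values from my setup; the names are the ones listed above.

```python
def hosts_entries(ip, names):
    """Format one HOSTS line per name: the address, a tab, then the name."""
    return [f"{ip}\t{name}" for name in names]


# On the source server: map the destination's internal names to the address
# it is reachable at across the VPN (placeholder address).
for line in hosts_entries(
    "192.0.2.20",
    ["destinationserver.destinationcompany.local", "destinationserver"],
):
    print(line)
```

The destination server would get the mirror-image entries for the source server's short name and internal FQDN.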
0
 
LVL 36

Expert Comment

by:ArneLovius
Now you know that it works, you can try to "break" it :-)

I would take down the VPN and use the hosts file on each server, pointing at the public address that is NATted to the other server. I would use the internal short name and internal FQDN.

When creating the NAT rules, make sure that you are allowing all outbound traffic and using dynamic policy NAT to ensure that the outbound traffic comes "from" the same address that the inbound traffic is "to".
0
 
LVL 36

Expert Comment

by:ArneLovius
This post suggests setting a certificate authentication port, which I read as being separate from the replication traffic port:

http://blogs.technet.com/b/virtualization/archive/2012/07/16/hyper-v-replica-certificate-based-authentication-in-windows-server-2012-rc.aspx

I would be tempted to set up a packet capture somewhere along the path between the two servers (with the traffic going over the VPN) and leave it capturing everything that includes both host addresses, but excludes port 443, and see if any other ports are being used.
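
The capture described above maps onto a standard BPF capture filter of the kind tcpdump and Wireshark accept. A minimal Python sketch that just builds the filter string follows; the function name `capture_filter` and the two documentation-range addresses are assumptions for illustration, so substitute the addresses the hosts use over the VPN.

```python
def capture_filter(host_a: str, host_b: str, excluded_port: int = 443) -> str:
    """Build a BPF capture filter for traffic between two hosts, minus one port."""
    return f"host {host_a} and host {host_b} and not port {excluded_port}"


# Stand-in addresses for the two Hyper-V hosts.
print(capture_filter("192.0.2.10", "198.51.100.20"))
```

Anything the capture records with this filter in place is traffic between the two servers on a port other than 443.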
0
 

Author Comment

by:justin-jurgens
Afterward, for testing purposes, I tried disabling the VPN and editing the HOSTS files for the internal short name and FQDN. It didn't work. So I know it's definitely not a certificate issue but a network issue somewhere along the line.

As for that second part, I'm not sure what you mean by "allow all outbound traffic on the NAT entry." Could you explain that? I am using ASA 5505's and 5510's.

I'll set up a packet capture to look into your second post. But all I've read is that Replica only listens/replies on a single port. That definitely doesn't seem to be true, though, for some reason...
0
 
LVL 36

Expert Comment

by:ArneLovius
Use a dynamic NAT policy for traffic from the Hyper-V server so that the egress IP is the same as the ingress IP, rather than just using the interface address.

If you are just using PAT for port 443, I might also try a full NAT for the address, using an ACL to restrict traffic to just the remote side.
0
 

Author Comment

by:justin-jurgens
Ya I just have static NAT entries (1 to 1) for the Replica servers with an ACL to allow TCP 443.

I'll attempt that here today. Thanks for sticking with me here...
0
 
LVL 36

Accepted Solution

by:ArneLovius
ArneLovius earned 500 total points
Just to clarify, I'd change the ACL so that instead of just allowing 443, you allow all traffic, but only from the "remote" site.
0
 

Author Comment

by:justin-jurgens
Sorry for the lack of updates, been incredibly busy. So some success and some failure:

Editing the ACL to say "From this address -> Pass all IP" without the VPN activated works great. But only with servername.company.com and not replica.company.com which honestly doesn't matter.

I can say this is resolved. Thank you sir!! If I could give you more points I would!!
0
 
LVL 36

Expert Comment

by:ArneLovius
I presume that servername.company.com is the internal FQDN, not the public address?

If you want to lock it down more, you could run a packet sniffer to grab traffic between the two hosts excluding port 443, and create a more restrictive ACL.

I would be tempted to do this anyway, just to make sure that nothing is being sent in plaintext between the two boxes.
0
 

Author Comment

by:justin-jurgens
Yes, servername.company.com is the internal FQDN. Replica.company.com was what I was going to use externally. Even with a SAN certificate containing both names and the ACL you recommended, using the external FQDN would fail with the original message.

But as soon as I changed it to the internal FQDN with the same process it worked fine.

I'll probably lock it down, but I'm just happy I at least know how to get it to work and slowly lock it down from there.
0
 
LVL 4

Expert Comment

by:HostOne
Error - Hyper-V-VMMS - 29218
Hyper-V received a digital certificate that is not valid from primary server 'source.server.local'. Error: A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider. (0x800B0109).

Guys, I just had the same problem. Banged my head against a wall for about an hour making new self-signed certs over and over and always getting the same error.

I was using the command-line "certutil -addstore -f Root FirstRootCA.cer" command (which I always use and it always works fine), but in desperation I used the GUI on the primary server to export the cert to a PFX and then imported it manually on the other end. The issue's now solved.

Weird but I wanted to mention it in case it helps someone else.
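
One way to check that the root certificate in the other server's store really is the same one that was exported is to compare thumbprints, which Windows computes as the SHA-1 hash of the DER-encoded certificate. The sketch below is Python and hedged accordingly: the helper name `thumbprint` is made up, and it does not explain why the certutil import misbehaved. It only computes the value to compare against the thumbprint shown in the certificate's properties.

```python
import hashlib
import ssl


def thumbprint(cert_bytes: bytes) -> str:
    """SHA-1 thumbprint (uppercase hex) of a certificate given as DER or PEM bytes."""
    if cert_bytes.lstrip().startswith(b"-----BEGIN CERTIFICATE-----"):
        # Base-64 (PEM) export: decode to the underlying DER bytes first.
        cert_bytes = ssl.PEM_cert_to_DER_cert(cert_bytes.decode("ascii").strip())
    return hashlib.sha1(cert_bytes).hexdigest().upper()


# Example use with the file name from the command above:
#   with open("FirstRootCA.cer", "rb") as f:
#       print(thumbprint(f.read()))
```

If the value differs from what the Trusted Root store on the Replica server shows, the two servers are not trusting the same root.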
0
