Dcdiag fails connectivity test

Both Windows 2003 SP1 DCs are getting this error:

Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests
   Testing server: Default-First-Site-Name\EX2
      Starting test: Connectivity
         The host 184b1f1a-f748-4e27-afa0-8f5ce1dffe49._msdcs.domain.net could not be resolved to an
         IP address.  Check the DNS server, DHCP, server name, etc
         Although the Guid DNS name
         (184b1f1a-f748-4e27-afa0-8f5ce1dffe49._msdcs.domain.net) couldn't be
         resolved, the server name (server.domain.net) resolved to the IP address
         ( and was pingable.  Check that the IP address is
         registered correctly with the DNS server.
         ......................... EX2 failed test Connectivity

I tried deleting the _msdcs entries in the DNS console and restarted the Netlogon and DNS services. Everything else passes.
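
When comparing runs from both DCs, it can help to pull the failed tests out of saved dcdiag output instead of reading the whole log. The following is only a minimal sketch: it assumes the summary lines look like the "......................... EX2 failed test Connectivity" line above, and the function name `failed_tests` is made up for illustration.

```python
import re

# Matches dcdiag summary lines such as:
#   ......................... EX2 failed test Connectivity
RESULT_LINE = re.compile(r"\.{3,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)")


def failed_tests(dcdiag_output):
    """Return (server, test) pairs for every 'failed test' summary line."""
    return [
        (m.group(1), m.group(3))
        for m in RESULT_LINE.finditer(dcdiag_output)
        if m.group(2) == "failed"
    ]
```

Feeding it the text of a saved dcdiag run gives a short list that is easy to compare between the two servers.
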
You need to check whether you can ping that name from that server. Each domain controller has one of those GUID-based names.

So from each domain controller, try to ping those names. You can find them in DNS by navigating to
Forward Lookup Zones/domain name/_msdcs

This is the area that holds the DNS aliases for the DCs.

You can also see which alias belongs to which server in AD Sites and Services:
Sites/site name/Servers/server name/NTDS Settings
You can do this for each server.

You should at least be able to ping the server you are logged into; if not, there is a DNS issue. In the network settings, are you using local DNS servers rather than Internet DNS servers? You should be using local DNS servers.
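
As a rough illustration of the check described above, here is a small Python sketch that builds the GUID-based alias name and tries to resolve it. It is not what dcdiag itself does; the helper names `dsa_guid_name` and `try_resolve` are hypothetical, and the GUID and domain are simply the ones quoted in the error.

```python
import socket


def dsa_guid_name(guid, forest_root):
    """Build the <guid>._msdcs.<forest root> alias name that dcdiag reports."""
    root = forest_root.strip().strip(".").lower()
    return "{}._msdcs.{}".format(guid.strip().lower(), root)


def try_resolve(name):
    """Return the sorted IP addresses for name, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(name, None)
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError: the name did not resolve.
        return []
    return sorted({info[4][0] for info in infos})


# Example (run on each DC, against the DNS server that DC actually uses):
#   name = dsa_guid_name("184b1f1a-f748-4e27-afa0-8f5ce1dffe49", "domain.net")
#   print(name, try_resolve(name) or "NOT RESOLVED")
```

An empty list from `try_resolve` on a DC corresponds to the "could not be resolved to an IP address" message in the dcdiag output.
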
mcsween (Sr. Network Administrator) commented:
I would make sure both servers are using the same primary (internal) DNS server, then run:

ipconfig /registerdns

Try again and see what happens.
What it's telling you is that the GUID-based DNS name for the server, 184b1f1a-f748-4e27-afa0-8f5ce1dffe49._msdcs.domain.net, could not be resolved.

Check here to see if you can find it:


or inside forestrootdomain.com in the _msdcs container.

If it's not there, right click the zone and make sure Allow Dynamic Updates is set.  Also check the server's NIC to be sure that you only point it to your DNS server and it is set to Register in DNS.  If either of those things needed fixing then restart the Netlogon service on the DC.

raindave (Author) commented:
The servers can ping each other. They each use the same primary DNS server.

I enabled the Register in DNS setting for the NIC and restarted the Netlogon service.

The _msdcs container is not there in the DNS console, even though it wasn't deleted; I only removed the entries. I also got this Netlogon (5781) event:
Dynamic registration or deletion of one or more DNS records associated with DNS domain 'callihq.net.' failed.  These records are used by other computers to locate this server as a domain controller (if the specified domain is an Active Directory domain) or as an LDAP server (if the specified domain is an application partition).  
So the _msdcs container is not under Forward Lookup Zones or inside the callihq.net zone?

If not then this is a simple job to fix.

Let me know if ALL your DNS servers are 2003 before we begin.

raindave (Author) commented:
Yes, they are all 2003. We do have a secondary zone on a 2000 server in another domain, but that shouldn't matter.
No, the container is not there. Thanks.
raindave (Author) commented:
Also, dynamic updates on the zone are set to secure only.

On the root DC (the original DNS server for the domain) do the following:

Right-click Forward lookup zone and select New>Zone
Name it _msdcs.callihq.net    (it is critical that the part of the zone name after "_msdcs." is identical to the actual DNS suffix of the DCs in your domain)
The zone is Standard Primary.
Your replication scope is to All DNS servers in the Forest.

When it's done, right click the new zone and make sure Secure Dynamic Updates is selected and the zone is AD Integrated.

Restart Netlogon service on all DCs.  Make sure they point to this DNS server first in their NIC properties.  NO ISP DNS server IP addresses anywhere on any NIC inside your LAN.

Replication should place this new zone on all other 2003 DNS servers in the forest.

Let me know.

raindave (Author) commented:
Same error: the connectivity test failed. Further down the log I have:
Testing server: Default-First-Site-Name\EX2
      Skipping all tests, because server EX2 is
      not responding to directory service requests

I get this on both DCs in the forest.
Also, we now have the _msdcs container in the zone; it's "grey" but populated.

Maybe I need to reboot, just installed the Tuesday patches as well.
What is grey?

You should have:

Forward Lookup Zone

If there is an _msdcs container in callihq.net, then that container should be grey, but the top-level one should not be.

raindave (Author) commented:
The top level one is not, but shouldn't there be sub containers under it like: dc, gc, domains, and pdc?

Right-click My Computer on one of your DCs.
Select Properties.
One of the tabs (Computer Name) shows the domain.

Make sure this domain is exactly the same as the _msdcs.domain zone.

If the servers are set to register in DNS then they should create folders within _msdcs.domain.net.

If they are not, then something in the Event Logs on the DCs will tell us what's going on.
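
The name comparison above is easy to get wrong by eye (a stray trailing dot or a different suffix), so here is a minimal sketch of the check in Python. It assumes a single-domain forest like the one in this thread, where the AD domain name is also the forest root; the function names are hypothetical.

```python
def normalize(name):
    """Lower-case a DNS name and drop surrounding spaces and any trailing dot."""
    return name.strip().rstrip(".").lower()


def expected_msdcs_zone(ad_domain):
    """Zone name the DCs register under, e.g. _msdcs.callihq.net."""
    return "_msdcs." + normalize(ad_domain)


def zone_matches_domain(zone_name, ad_domain):
    """True when the DNS zone name lines up with the domain shown on the DC."""
    return normalize(zone_name) == expected_msdcs_zone(ad_domain)
```

For example, `zone_matches_domain("_msdcs.callihq.net", "callihq.net")` is true, while a zone created as `_msdcs.callihq.com` would not match.
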

raindave (Author) commented:
Rebooted the 2nd DC, no help. The only event on both DCs is Netlogon 5781 in the System log. The 2nd DC seems to have a correctly populated _msdcs container. If I made the 1st DC a secondary, would the zone transfer over from the 2nd, since right now they are both Standard Primary?
Make them both AD Integrated.  They will replicate properly then.

raindave (Author) commented:
They already are.
It sounds as if one DC is set up properly and the other one isn't.

The one with the correct _msdcs zone, does it have any of the FSMO roles on it?  Is it a GC?
raindave (Author) commented:
It is a GC but has no roles on it. Everything was working, but I think someone made DNS changes. Then I noticed dcdiag failing, and a previous EE question with the same issue suggested removing the entries in that container.
You could do that, but if replication isn't working then you'll have one empty server.

Can you run REPLMON and see if replication for all DCs is successful?  We should at least confirm this much before moving forward.

raindave (Author) commented:
Replmon shows failures (RPC server unavailable):

Below are the replication failures detected on Domain Controllers for this domain:

Domain Controller Name:                   COLOBACKUP
              Directory Partition:        DC=callihq,DC=net
              Replication Partner:        Default-First-Site-Name\EX2
              Failure Code:                1722
              Failure Reason:             The RPC server is unavailable.
Most RPC failures are caused by DNS or communication blockage between controllers (like a firewall or router).

Let's do this:

1)  On the GOOD DNS server, make sure the zones are AD Integrated and accept Dynamic Updates.
2)  Make sure everything in your LAN is pointing only to your GOOD DNS server.  No ISP DNS entries on any NIC inside your LAN.
3)  Forwarding on the GOOD DNS server is to be set to the ISP DNS.
4)  Restart the Netlogon service on ALL DCs and check to make sure they are showing up in the GOOD DNS server.
5)  Uninstall DNS from the server that has incomplete zone information.
6)  Reboot it.
7)  Reinstall DNS on this server and restart the Netlogon service on it when complete.
8)  Wait.  If replication is functioning, then all the DNS zones will create themselves and populate from the original, good DNS server.  If this happens, then you can point this newly installed DNS server to itself for DNS.
9)  Add this server as a secondary DNS server to all clients and servers in the network.

Let me know.
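
To rule out the "communication blockage" cause of error 1722 mentioned above, one quick check is whether the replication partner accepts TCP connections on port 135, which is where the RPC endpoint mapper listens. This is only a sketch under that assumption: the function name `can_connect` is made up, and a successful connection to 135 does not prove that the dynamically assigned RPC ports used afterwards are also open.

```python
import socket

# TCP port of the RPC endpoint mapper on Windows servers.
RPC_ENDPOINT_MAPPER_PORT = 135


def can_connect(host, port=RPC_ENDPOINT_MAPPER_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and names that do not resolve.
        return False


# Example, run from COLOBACKUP against its replication partner:
#   print(can_connect("EX2"))
```

If this returns False when using the server name but True when using its IP address, the problem is more likely DNS than a firewall or router.
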

raindave (Author) commented:
So I pointed the (good) 2nd DC's DNS setting at itself, and before I knew it, it had received the incomplete _msdcs container from the 1st DC. Why?

Also, we have an A record for an IANA blackhole server. I don't know how that got in there; I understand they are used for reverse lookups on private address ranges, e.g. 172.x, 192.x, 10.x.
Delete that entry.

Interesting that you are having so much fun with this - I can't begin to imagine why.  My guess is that the second DNS server was configured manually rather than allowing replication to populate it and now there are version differences.

Follow the steps above and remove DNS from one of the servers.  Get a single DNS server fully operational before reinstalling DNS on the second one.

If you need another set of eyes, let me know.  

raindave (Author) commented:
Yeah, this is real fun. Anyway, I deleted that record and restarted Netlogon on both DCs, and _msdcs is now populated correctly. I deleted a test AD account and the deletion shows on both. Dcdiag still fails the connectivity test, though.

Sure would like to know how that record got in there?

Will probably delete and redo the 2nd  dc, the hardware is getting replaced anyway.  
Not sure about the record.

Can you find this record now?

raindave (Author) commented:
Yes, it's in _msdcs as an Alias (CNAME) record. It's also in AD Sites and Services. Replmon shows no errors, but dcdiag still fails the connectivity test.
