Hi, I have a situation with three domain controllers that I'm perplexed about.
First, we have two sites. We'll call the domain controllers North1, North2, and South1. North1 and South1 run Server 2003, and North2 runs Server 2008 R2.
North1 and South1 are the DCs for two different sites. North1 is old, so we're replacing it. I installed North2 and promoted it to a domain controller a couple of weeks ago, and everything has been going well. As it stands now, there are AD links between South1 and North1, and between North1 and North2, but not between North2 and South1. To test whether the KCC would do its thing, I shut down North1 over the weekend. I waited a couple of hours and checked on it: no new links had been created, and I was presented with several repeating Event log messages.
About 1 hour after shutting it down, I got event ID 1404, "This directory service is now the intersite topology generator and has assumed responsibility for generating and maintaining intersite replication topologies for this site."
30 minutes after that, event ID 1308 came up, saying basically that it could not contact North1, and that it was going to create a temporary connection that would be removed once North1 was back up.
From there, it repeated event IDs 1566, 1311, and 1865 over and over for a couple hours until I powered North1 on again.
1566: All directory servers in the following site that can replicate the directory partition over this transport are currently unavailable.
1865: The Knowledge Consistency Checker (KCC) was unable to form a complete spanning tree network topology. As a result, the following list of sites cannot be reached from the local site.
1311: There is insufficient site connectivity information for the KCC to create a spanning tree replication topology. Or, one or more directory servers with this directory partition are unable to replicate the directory partition information. This is probably due to inaccessible directory servers.
Perform one of the following actions:
- Publish sufficient site connectivity information so that the KCC can determine a route by which this directory partition can reach this site. This is the preferred option.
- Add a Connection object to a directory service that contains the directory partition in this site from a directory service that contains the same directory partition in another site.
If neither of the tasks correct this condition, see previous events logged by the KCC that identify the inaccessible directory servers.
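To make the layout concrete, here is a minimal sketch of the existing links as I described them above, checking which DCs can still reach each other when one is powered off. It only models the links that exist today and treats them as two-way for simplicity; it is not what the KCC actually runs, and the `reachable` helper is just a name made up for illustration.

```python
from collections import deque

# Existing AD links as described in this post (treated as undirected; an assumption).
LINKS = [("South1", "North1"), ("North1", "North2")]


def reachable(start, links, down=frozenset()):
    """Return the set of DCs reachable from `start`, ignoring any DC listed in `down`."""
    if start in down:
        return set()
    # Build an adjacency map, skipping every link that touches a powered-off DC.
    adjacency = {}
    for a, b in links:
        if a in down or b in down:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    # Plain breadth-first search over the remaining links.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


print(sorted(reachable("South1", LINKS)))                   # all three DCs
print(sorted(reachable("South1", LINKS, down={"North1"})))  # only South1 is left
```

With North1 off, South1 has no existing path to North2, so replication between the two sites depends entirely on a new connection being created.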
DCDIAG checks out okay, at least right now (I didn't test much while North1 was down). I've been reading up on using repadmin to troubleshoot, and from what I can see all replication is fine. It's just that when I took North1 down, the KCC couldn't do what it needed to do, and I don't understand from the errors what might be preventing it from working.
Additionally, the inter-site transports are set up under IP. There are two site links (I don't know why; someone else set this up long before I was involved). One looks like the default one, and the other looks user-created. The default one has a cost of 100 and replicates every 180 minutes; the user-created one has a cost of 200 and replicates every 60 minutes (I assume that one isn't being used). I don't know if this could affect things at all, but I thought I'd mention it.
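My assumption about the second link can be written out as a small sketch. It assumes both links connect the same two sites and that the lower-cost link is the one that gets used, which is my understanding rather than something I have confirmed. The link names and the `preferred_link` helper are hypothetical; only the cost and interval values come from my setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteLink:
    name: str              # hypothetical label, not the real object name
    cost: int              # site link cost as shown in the console
    interval_minutes: int  # replication interval in minutes


SITE_LINKS = [
    SiteLink("default-link", cost=100, interval_minutes=180),
    SiteLink("user-created-link", cost=200, interval_minutes=60),
]


def preferred_link(links):
    """Pick the link with the lowest cost (assumed selection rule)."""
    return min(links, key=lambda link: link.cost)


chosen = preferred_link(SITE_LINKS)
print(chosen.name, chosen.cost, chosen.interval_minutes)
```

If that assumption holds, the default link wins, so the 180-minute interval would be the one in effect and the 60-minute link would sit unused.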