KCC Not Creating New Active Directory Domain Services Connection

Hi, I have a situation with three domain controllers that I'm perplexed about.

Firstly, we have two sites.  We'll call the domain controllers North1, North2, and South1.  North1 and South1 run Server 2003, North2 runs Server 2008 R2.

North1 and South1 are the DCs for two different sites.  North1 is old, so we're replacing it.  I installed North2 and promoted it to a domain controller a couple weeks ago, and everything has been going well.  As it stands now, there are AD links between South1 and North1, and North1 and North2, but not North2 and South1.  To test if KCC would do its thing, I shut down North1 over the weekend.  I waited a couple hours and checked on it, and no links were created, and I was presented with several repeating Event log messages.  

About 1 hour after shutting it down, I got event ID 1404, "This directory service is now the intersite topology generator and has assumed responsibility for generating and maintaining intersite replication topologies for this site."

30 minutes after that, event ID 1308 came up, saying basically that it could not contact North1, and that it was going to create a temporary connection that would be removed once North1 was back up.  

From there, it repeated event IDs 1566, 1311, and 1865 over and over for a couple hours until I powered North1 on again.  

1566: All directory servers in the following site that can replicate the directory partition over this transport are currently unavailable.

1865: The Knowledge Consistency Checker (KCC) was unable to form a complete spanning tree network topology. As a result, the following list of sites cannot be reached from the local site.

1311: There is insufficient site connectivity information for the KCC to create a spanning tree replication topology. Or, one or more directory servers with this directory partition are unable to replicate the directory partition information. This is probably due to inaccessible directory servers.
User Action
Perform one of the following actions:
- Publish sufficient site connectivity information so that the KCC can determine a route by which this directory partition can reach this site. This is the preferred option.
- Add a Connection object to a directory service that contains the directory partition in this site from a directory service that contains the same directory partition in another site.
If neither of the tasks correct this condition, see previous events logged by the KCC that identify the inaccessible directory servers.

DCDIAG checks out okay (at least right now, I didn't test much while North1 was down).  I've been reading up on using repadmin to troubleshoot, and from what I can see all replication is fine, it's just when I took North1 down, it couldn't do what it needed to do, and I'm not really understanding from the errors what might be preventing it from working.  

Additionally, the inter-site transports are set up under IP.  There are two site links (I don't know why, someone else set this up long before I was involved).  One looks like the default one, one looks user-created.  The default one has a cost of 100 and replicates every 180 minutes, the user-created one has a cost of 200 and replicates every 60 minutes (I assume that one isn't being used).  Don't know if this could affect things at all, but thought I'd mention it.
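On the cost question: my understanding is that the KCC favors the lowest-cost route between sites, so with two links covering the same pair of sites the cost-100 link should be the one that matters. Here is a toy Python sketch of that rule only. The link names are assumptions (the post does not give them), and this is not the real KCC algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteLink:
    name: str               # assumed names, not taken from the post
    cost: int               # lower cost is preferred
    interval_minutes: int   # replication interval on the link


def preferred_link(links):
    """Return the lowest-cost link; ties go to the first one listed."""
    return min(links, key=lambda link: link.cost)


links = [
    SiteLink("DefaultLink", cost=100, interval_minutes=180),
    SiteLink("UserCreatedLink", cost=200, interval_minutes=60),
]

chosen = preferred_link(links)
print(chosen.name, chosen.interval_minutes)
```

If that holds, replication between the sites would follow the 180-minute schedule of the cheaper link, not the 60-minute one.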
netperf (Author) commented:
I believe I figured out my problem.  North1 was configured as the Bridgehead Server.  Once I removed that, and made North2 a Bridgehead Server in Sites & Services, I forced it to check topology again and it built the links that I was hoping to see.  
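For anyone hitting the same thing, here is a rough Python model of why the preferred bridgehead setting mattered. As I understand it, once any DC in a site is marked as a preferred bridgehead for a transport, only the marked DCs are candidates, so with North1 off there was nothing left to pick. The function name and its parameters are made up for illustration; this is a sketch of the behavior, not how the KCC is actually implemented.

```python
def pick_bridgehead(site_dcs, preferred, online):
    """Toy model: choose a bridgehead for one site and one transport.

    site_dcs  - all DCs in the site, in order
    preferred - DCs manually marked as preferred bridgehead servers
    online    - DCs that are currently reachable
    """
    # If any preferred bridgeheads are designated, only they are candidates;
    # otherwise every DC in the site is a candidate.
    if preferred:
        candidates = [dc for dc in site_dcs if dc in preferred]
    else:
        candidates = list(site_dcs)
    available = [dc for dc in candidates if dc in online]
    return available[0] if available else None


north = ["North1", "North2"]

# North1 is powered off in all three cases.
before_fix = pick_bridgehead(north, preferred={"North1"}, online={"North2"})
after_fix = pick_bridgehead(north, preferred={"North2"}, online={"North2"})
no_preference = pick_bridgehead(north, preferred=set(), online={"North2"})

print(before_fix, after_fix, no_preference)
```

The first case returns nothing, which lines up with event 1566 ("All directory servers in the following site ... are currently unavailable"). In this model, clearing the preferred list entirely would also have let North2 be chosen.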
When you bring North1 down, the KCC (on North2, I assume) reports that it is now the intersite topology generator and that temporary new links (with South1) were created while North1 (the bridgehead server for almost all NCs, I assume) was down.
So to me that looks good.
You can check that each site has the proper subnets configured, and you can delete all NTDS connections and force the KCC to recreate them using:
repadmin /kcc
netperf (Author) commented:
I have the DC North1 down again (for almost 24 hours now, and I'll be keeping it off for the weekend).  I haven't looked at it much yet, but replication definitely isn't occurring (I modified the login script on North2 4 hours ago, and it hasn't changed on South1 yet).  I'll be looking into it and will have more information in a little while.
netperf (Author) commented:
Not sure what information I can post that would be helpful.  I ran repadmin /kcc and that didn't seem to change anything.  Is it safe to delete the NTDS connections in AD S&S and then run it again?  All I got for output when I ran it without changing anything was:

Repadmin: running command /kcc against full DC localhost
Current Site Options: (none)
Consistency check on localhost successful.

DCDIAG fails on replications and kcc tests (which isn't surprising).  

REPADMIN /showreps gives errors like:

DSA object GUID: fcb888b3-df20-4a64-97b9-2b6945231fed
Last attempt @ 2011-03-12 16:56:57 failed, result 1722 (0x6ba):
    The RPC server is unavailable.
19 consecutive failure(s).
Last success @ 2011-03-11 22:13:06.
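
If there are a lot of these blocks, a small Python script can pull out the useful fields. This is only a sketch that assumes the output has exactly the layout pasted above; real repadmin output may vary between versions, so treat the pattern as a starting point. Result 1722 just means the partner could not be reached, which is what you would expect if this GUID belongs to the DC that is switched off (that mapping is my guess, the output does not name the server).

```python
import re

# Sample copied from the output above.
SAMPLE = """\
DSA object GUID: fcb888b3-df20-4a64-97b9-2b6945231fed
Last attempt @ 2011-03-12 16:56:57 failed, result 1722 (0x6ba):
    The RPC server is unavailable.
19 consecutive failure(s).
Last success @ 2011-03-11 22:13:06.
"""

# One failure block: GUID, last attempt, result code, message,
# consecutive failure count, and last success time.
FAILURE = re.compile(
    r"DSA object GUID: (?P<guid>[0-9a-f-]+)\s+"
    r"Last attempt @ (?P<attempt>[\d-]+ [\d:]+) failed, "
    r"result (?P<code>\d+) \(0x[0-9a-f]+\):\s+"
    r"(?P<message>.+?)\s+"
    r"(?P<failures>\d+) consecutive failure\(s\)\.\s+"
    r"Last success @ (?P<success>[\d-]+ [\d:]+)\."
)


def parse_failures(text):
    """Return one dict of string fields per failure block found in text."""
    return [match.groupdict() for match in FAILURE.finditer(text)]


failures = parse_failures(SAMPLE)
for item in failures:
    print(item["guid"], item["code"], item["failures"], item["success"])
```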

Any idea what I should do from here?  I'm a little hesitant to start mucking around in AD S&S too much, but if someone can tell me exactly what they think I should do I'd be willing to give it a try.  Thanks!
netperf (Author) commented:
Figured out my issue myself.
Question has a verified solution.
