Replication problem between 2 Windows 2003 domain controllers


We have a problem where 2 DC's will replicate and communicate fine with other DC's but not with each other.  Nothing has changed on either server since this problem appeared, and in total across all sites there are about 25 DC's.

I will try and explain further, but will only talk about 2 sites.

Site A has 2 DC's which are as follows:  dca.blah.local  and also dca2.sub.blah.local (second dc in a child domain)

Site B has 2 DC's which are as follows:  dcb.blah.local and also dcb2.sub.blah.local (again second dc in child domain)

dca2 can communicate and replicate info with dca and dcb and all other dc's, but not dcb2.  dcb2 can also replicate and communicate fine with dca and dcb and all other dc's, but will not replicate with dca2.

On the DNS side of things we are able to ping dca2 from dcb2 and vice versa.  When running replmon.exe on either dca2 or dcb2 we can connect to other dc's but not each other.  When trying to connect using replmon from dca2 to dcb2 or the other way around we get the error RPC Server Unavailable, yet we can connect from either of these to other dc's fine.

It is possible to telnet to port 135 from dca2 or dcb2 to any other dc, but not each other.
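The telnet test above can be repeated from each DC against a list of peers with a short script. This is only a sketch: the function names `tcp_port_open` and `check_peers` are made up for illustration, and the host names are the placeholder ones used in this question. Port 135 is the RPC endpoint mapper; replication also uses dynamically assigned RPC ports, so an open port 135 is necessary but not sufficient.

```python
import socket


def tcp_port_open(host, port=135, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Rough equivalent of "telnet host 135": it only proves that the port
    accepts connections, not that RPC or replication actually works.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_peers(peers, port=135, timeout=3.0):
    """Map each peer host name to True (reachable) or False (not reachable)."""
    return {host: tcp_port_open(host, port, timeout) for host in peers}
```

Run on dca2, for example, `check_peers(["dca.blah.local", "dcb.blah.local", "dcb2.sub.blah.local"])` would be expected to show only dcb2 as unreachable if the symptom described here is present.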

dca2 is running Windows 2003 Server SP2 and dcb2 is running Windows 2003 Server SP1 - this is how these servers have been for a long time, and no patches have been applied around when this problem started happening.

Any help would be appreciated.

biggles70 (Author) commented:
I ran a dcdiag on the affected DC's and double checked that something added in the child domain was appearing in the parent domain. All in all it was actually replicating around, and appeared to be working.

The main reason I found for the problem was that users in a group were not able to access a SharePoint site on the parent domain.  The security group in question was created as a global group, which meant the visibility was only in the child domain.  Once I changed this to a universal group and it replicated around, the visibility became forest wide and users were able to access the site.

As for the machine with SP2 on it, that is the only one on the network and as such was unauthorised. I guess we'll have to see what happens there.

There are still some replication errors showing up, and until all machines are on the same SP level it will be hard to find out what is happening.
Trackhappy commented:
I would suggest that you apply SP2 to the other DC as a first step. It is not good practice to keep DC's at different service pack levels. If you were to go as far as logging this with Microsoft, the first thing they would tell you is to install the latest service packs on both machines.
biggles70 (Author) commented:
I knew you were going to say that, as it was my first thought as well.  Because the SP1 machine is looked after by a different group I figured that it would take a bit longer to get done, and was hoping for some other things to try in the meantime. Given the known RPC problems with an SP1 machine I will get the team to upgrade and see how that goes. I will post the results when I know.
Sorry.. ;)
biggles70 (Author) commented:
Changing the group from a global group to a universal one allowed security access in the parent domain.  Replication errors were a bit of a red herring.
Question has a verified solution.
