Site not replicating with any other sites - Win2003 SP1

Hello guys and as always thanks for your time and expertise.  Here's my problem:
I'm trying to help one of my fellow techs.  He has a site (we have approximately 20 sites in our network) and this site is not replicating with any of the other sites.  Additionally, no intrasite replication is occurring either.  I ran DCDIAG and NETDIAG and there were more errors than you could shake a stick at.
Consequently, I was hoping I could demote one of these domain controllers, which I did.  I had to do a forced demotion and clean up the metadata, which I have done.  However, the DC still wouldn't replicate with any other DCs within the site or outside of it.
Things I've noticed and maybe there's no problem with these settings:
Let's say his domain controller has an IP of 10.50.5.2 and a subnet mask of 255.255.255.0.  His gateway is 10.50.10.1, so you get errors that the gateway is not on the same network segment as defined by the IP/subnet.  So that's one thing.  I have our switch guy looking into what the gateway should be if a change is required.
Additionally, I don't see any of these DCs in either the Domain DNS zones or the Forest DNS zones.
Do you guys have any ideas or recommendations?  Do you think the gateway is the problem?  Thanks.
Asked by: pendal1 (IT Manager)
 
ChiefIT commented:
First off, download Service Pack 2 for the server (to the desktop), then install it. SP1 has a coding issue with the MTU channels and can cause intermittent communications with the server without many visible errors.

Then, you have to remove the metadata for this server. That includes the Sites and Services metadata as well as the DNS metadata. This article covers removing all metadata:

http://www.petri.co.il/delete_failed_dcs_from_ad.htm

Here is an explanation of phantoms, tombstones and Active Directory. It describes the four stages of an improperly deleted AD object.  This is a worthwhile read for someone in your shoes:

http://support.microsoft.com/kb/248047

Once done, you will probably see FRS errors in the event logs in the 13000s, like 13508 and 13565, that allude to journal wrap. Journal wrap is a partial replication of data and will not permit you to replicate until it is resolved. The best way to resolve this is to use the BurFlags method to rebuild the replication set. I can provide that information if you do see these errors in the event logs.

If you don't see FRS event errors, run dcdiag in verbose mode (dcdiag /v) and netdiag. Copy and paste the errors so we can view them. Check the event logs as well so we can see any other errors we may need to correct.
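
When the dcdiag output is long, it can help to pull out only the failed tests before pasting. The following is a minimal Python sketch, assuming the usual dcdiag summary lines of the form "......................... <server> failed test <name>". The function name `failed_tests` and the sample text are hypothetical and for illustration only; they are not output from the servers in this thread.

```python
import re

# Matches dcdiag summary lines such as
# "......................... DC1 failed test Replications"
RESULT_LINE = re.compile(r"^\s*\.+\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def failed_tests(dcdiag_output):
    """Return (target, test_name) pairs for every 'failed test' summary line."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


# Illustrative sample only (hypothetical server and site names).
SAMPLE = """\
   Testing server: SiteA\\DC1
      Starting test: Connectivity
         ......................... DC1 passed test Connectivity
      Starting test: Replications
         ......................... DC1 failed test Replications
      Starting test: frssysvol
         ......................... DC1 failed test frssysvol
"""

if __name__ == "__main__":
    for target, test in failed_tests(SAMPLE):
        print(f"{target}: {test}")
```

In practice you would redirect the dcdiag output to a text file and pass its contents to `failed_tests` instead of `SAMPLE`.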
 
Americom commented:
Is the DNS zone Active Directory-integrated? If so, when you restart the Netlogon service on your root DCs, do the Domain and Forest zones appear?

Looking at the IP and mask, the gateway should start with 10.50.5.x. Maybe that should be fixed first.  But before you make any change you may want to verify the following:
1) Was the IP always configured like that?
2) Can you currently ping other DCs in remote sites with the gateway as is? If not, you should try to fix it ASAP. Otherwise, check with your network guys first, or verify whether this DC has multiple NICs or multiple gateways, etc.
 
aces4all2008 commented:
Unless the DC contains persistent static routes for each of the DCs holding the FSMO roles (and your domain's DNS servers), a bad default gateway is a huge problem.  The default gateway must be on the same subnet as the machine.  As things stand now, it can only initiate communication with other devices on the same subnet; everything else will fail.  Other devices that are properly configured will be able to initiate communication with it, however, so if it's been a short enough time it may still be receiving updates but not sending them.

Are there still records for the problem DC in ADUC and DNS (from your good DCs)?

If yes, correct the default gateway and reboot the problem DC.
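
The same-subnet rule can be checked quickly with Python's standard `ipaddress` module. This is only a sketch using the addresses quoted in the question; the helper name `gateway_on_link` is made up for the example.

```python
import ipaddress


def gateway_on_link(host_ip, netmask, gateway):
    """Return True if the gateway lies inside the host's own subnet."""
    # ip_interface accepts "address/netmask" and derives the network from it.
    network = ipaddress.ip_interface(f"{host_ip}/{netmask}").network
    return ipaddress.ip_address(gateway) in network


if __name__ == "__main__":
    # Values from the question: host 10.50.5.2, mask 255.255.255.0.
    print(gateway_on_link("10.50.5.2", "255.255.255.0", "10.50.10.1"))
    # A gateway inside 10.50.5.0/24 passes the check.
    print(gateway_on_link("10.50.5.2", "255.255.255.0", "10.50.5.1"))
```

With a 255.255.255.0 mask the host's network is 10.50.5.0/24, so 10.50.10.1 falls outside it. The same gateway would only be on-link with a wider mask such as 255.255.0.0, which is worth ruling out with the network team before changing anything.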

 
pendal1 (IT Manager, author) commented:
Thanks for the reply.  Yes, the DCs have DNS configured as AD-integrated zones and the Domain and Forest zones are present.  What I meant in relation to these zones is that when I demoted/promoted one of the domain controllers, there were no records for the new DC in either of these zones.  The records were present in all of the other areas of DNS.
- Yes, I can ping other DCs located in remote sites with the gateway as is.  He has the NICs on the server I'm working on teamed, but it's configured correctly.  I'm still waiting for word from the switch guy.  I hate waiting.
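
For checking which DC records are missing, here is a small Python sketch that builds the SRV names a DC normally registers through Netlogon, including the ones under the DomainDnsZones and ForestDnsZones partitions. The domain, forest and site values below are placeholders, not the names used in this network, and the helper `expected_srv_names` is hypothetical.

```python
def expected_srv_names(domain, forest_root, site):
    """Build common SRV record names a DC registers for locator lookups."""
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.DomainDnsZones.{domain}",
        f"_ldap._tcp.ForestDnsZones.{forest_root}",
    ]


if __name__ == "__main__":
    # Placeholder names for illustration only.
    for name in expected_srv_names("example.local", "example.local", "SiteA"):
        # Each name can be queried by hand with: nslookup -type=SRV <name>
        print(name)
```

Querying each name against a known-good DNS server and then against the problem DC's DNS server shows whether the new DC's records are missing everywhere or only inside the broken site.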
 
pendal1 (IT Manager, author) commented:
aces4all2008 - The gateway was the first thing I noticed that was totally out of whack here.  I've been thrust into this situation, so I'm still trying to figure out what happened with this config.  I'm waiting on our switch guy to tell me what the gateway should be.  I can guarantee there are no static routes.  Right now, at this site, there's no intrasite or intersite replication going on.  Nothing is being replicated.  Now, my forced demotion is reflected on DCs in other sites, and I cleaned up all the metadata, etc. via ntdsutil and adsiedit.  However, the DCs in this site don't show any of these changes.  Just to be clear, the forced demotion and cleanup are not reflected on the DCs in the same site as this DC.  How's that for messed up?
 
pendal1 (IT Manager, author) commented:
Thanks ChiefIT - I'm downloading SP2 as I write this and will install it.
I did have to forcefully demote the DC I'm working on, and I did follow the procedures you listed to remove the orphaned metadata.  In other words, I used ntdsutil and also adsiedit.msc to clean up the metadata.  I also cleaned up DNS.  These changes are reflected on DCs in other sites, but not in this site.  None of the changes from the demotion and cleanup process are reflected within this site.  Moreover, from within Sites and Services, you also cannot force replication to any DCs within this site or in different sites.
As I mentioned above, I'm waiting on our switch guy to let me know what the correct default gateway should be.  I will probably have to follow up with you guys tomorrow. I hope you all will stay tuned and follow up because your expertise is greatly appreciated.  
 
Americom commented:
If you can ping other DCs in the remote site, your gateway should be fine. You can probably get a better explanation from your network guy, as it seems they are doing more than just the usual TCP/IP subnetting, such as routing multiple subnets through a single gateway, which I'm not an expert in...

Before you promote the same box to a DC again, I suggest you start from scratch and clean up all the metadata as described in the link provided by ChiefIT. You may also want to make sure the firewall is completely off.
 
pendal1 (IT Manager, author) commented:
Americom, I'm with you on the routing part of it.  I need the switch guy for confirmation on that.  However, I have cleaned up all of the metadata for this DC.  Those changes (the demotion and cleanup process) are reflected on DCs in other sites but not within this site.  The DCs in this site are not talking to each other at all, although I can ping.  And per all of your recommendations, I'm not going to promote this server again until I have confirmation the gateway is correct.  Right now I'm installing SP2, and hopefully I'll have more info for you guys tomorrow.
 
ChiefIT commented:
Any event log errors in the 13000s that allude to journal wrap?
 
ChiefIT commented:
Oh, they would be in the FRS event logs
 
pendal1 (IT Manager, author) commented:
ChiefIT - I have errors in the 13000s in the event log, but nothing yet that alludes to journal wrap.  Just so you know, the switch guy updated the gateway, so now the DC and the gateway are on the same segment.  However, I'm still getting 13508 errors in the FRS log indicating this DC cannot replicate with another DC in the same site.  My worry is that right now the other DC in this site is also jacked.
I also get errors in the DNS log that this DC's DNS server is not enlisted in the replication scopes for the Domain DNS zones or the Forest DNS zones.
Maybe I'm being impatient.  However, when I promoted this server again, the log did indicate AD was installed, but there were some errors.
One thing that may have happened is that this other tech disjoined this DC from the domain before demoting it, which I know is a no-no.  I'm getting disgusted.
 
pendal1 (IT Manager, author) commented:
Problem resolved, guys.  The gateway situation was fixed and I made sure to clean up all the metadata and DNS records.  I also installed SP2 and a host of other Windows updates.  The servers in this site are back on speaking terms with each other and with the servers in other sites.  Thank you very much for your time and expertise.  Greatly appreciated.
Question has a verified solution.