• Status: Solved

Windows 2003 Active Directory Group Policy And Site Replication Problems

Hi Experts,

I'd appreciate all the help I can get from you guys. I'm taking over IT for this company, Median. Their previous IT guy was fired and I have to take over now. There seem to be a lot of things on the network that are wrong or not configured properly. I know how to set these things up from scratch, but I need help identifying what to change in the existing network.

The network looks like this: 15 sites, 4 of which have an AD domain controller (2003), a file server (2003) and an Exchange server (2003). All other sites have VPN links to any of the main sites.

The first problem I noticed is that group policy is not working on any site other than the one I am based in. If I add a group policy, apply it and then check, only the local site receives it, although when I check on any other AD server, the group policy does show up...

Errors I found so far:

Event-ID: 13508: The File Replication Service is having trouble enabling replication from MEDFRAD to MEDDEAD for c:\windows\sysvol\domain using the DNS name medfradad.median.local. FRS will keep retrying.

Event ID: 4004: The DNS server was unable to complete directory service enumeration of zone 142.168.192.in-addr.arpa.  This DNS server is configured to use information obtained from Active Directory for this zone and is unable to load the zone without it.  Check that the Active Directory is functioning properly and repeat enumeration of the zone. The extended error debug information (which may be empty) is "". The event data contains the error.

Event-ID: 4515 : The zone _msdcs.median.local was previously loaded from the directory partition MicrosoftDNS but another copy of the zone has been found in directory partition DomainDnsZones.median.local. The DNS Server will ignore this new copy of the zone. Please resolve this conflict as soon as possible.

Event-ID: 1925: The attempt to establish a replication link for the following writable directory partition failed.
 
Directory partition:
DC=median,DC=local
Source domain controller:
CN=NTDS Settings,CN=SNOMSAD,CN=Servers,CN=Norway-Oslo,CN=Sites,CN=Configuration,DC=median,DC=local
Source domain controller address:
9f738e0c-4224-4d9d-a380-6b6a91390710._msdcs.median.local
Intersite transport (if any):
CN=IP,CN=Inter-Site

Event-ID: 1311: The Knowledge Consistency Checker (KCC) has detected problems with the following directory partition.
 
Directory partition:
CN=Configuration,DC=median,DC=local
 
There is insufficient site connectivity information in Active Directory Sites and Services for the KCC to create a spanning tree replication topology. Or, one or more domain controllers with this directory partition are unable to replicate the directory partition information. This is probably due to inaccessible domain controllers.

Kind regards
Robby
 
Michael Pfister commented:
- Check the site structure in Active Directory Sites and Services (are all IP addresses there?)

- Verify the main site is OK with dcdiag. DCDiag has many options; use /v to get a detailed analysis. Restrict the test to your main DC.
- Verify other sites are ok with dcdiag.

- Verify connection between sites with AD controllers with PortQueryUI (in both directions!):
http://support.microsoft.com/kb/832919/en-us


PortQueryUI Tool http://download.microsoft.com/download/3/f/4/3f4c6a54-65f0-4164-bdec-a3411ba24d3a/PortQryUI.exe
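For a quick scripted first pass before running PortQryUI, the following is a minimal Python sketch that only tests whether TCP connections succeed. It is not a replacement for PortQryUI: it does not test UDP or query the RPC endpoint mapper. The port list is an assumed selection of ports commonly used between domain controllers, not the list from the KB article, and `check_tcp_ports` is a hypothetical helper name.

```python
import socket

# Assumed selection of TCP ports commonly used between domain controllers.
AD_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB (SYSVOL share)",
    3268: "Global catalog",
}


def check_tcp_ports(host, ports, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout or DNS failure
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Usage (run it on each DC against the other DC, i.e. in both directions):
#   for port, ok in check_tcp_ports("medfradad.median.local", AD_TCP_PORTS).items():
#       print(port, AD_TCP_PORTS[port], "open" if ok else "closed or filtered")
```

A `False` result does not distinguish between a closed port and a firewall or VPN tunnel dropping the traffic, so treat it as a pointer for further testing rather than a diagnosis.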

 
ulensr (Author) commented:
Here you go:

1) Site structure: all main sites have a site defined, and all IP ranges of these sites are linked accordingly. Do I need to create the subnets for the sites that do not have a DC as well? If so, which site do I link them to? Also, I looked and there is one site link which contains all sites. Each site has one DC and they are all assigned as global catalog servers. Finally, the NTDS settings have a link to each of the other servers per site (overkill?).

2) I checked each site and they all look fine except:

              Latency information for 3 entries in the vector were ignored.
                 3 were retired Invocations.  0 were either: read-only replicas
and are not verifiably latent, or dc's no longer replicating this nc.  0 had no
latency information (Win2K DC).
           DC=DomainDnsZones,DC=median,DC=local
              Latency information for 3 entries in the vector were ignored.
                 3 were retired Invocations.  0 were either: read-only replicas
and are not verifiably latent, or dc's no longer replicating this nc.  0 had no
latency information (Win2K DC).
           CN=Schema,CN=Configuration,DC=median,DC=local
              Latency information for 6 entries in the vector were ignored.
                 6 were retired Invocations.  0 were either: read-only replicas
and are not verifiably latent, or dc's no longer replicating this nc.  0 had no
latency information (Win2K DC).
           CN=Configuration,DC=median,DC=local
              Latency information for 6 entries in the vector were ignored.
                 6 were retired Invocations.  0 were either: read-only replicas
and are not verifiably latent, or dc's no longer replicating this nc.  0 had no
latency information (Win2K DC).
           DC=median,DC=local
              Latency information for 4 entries in the vector were ignored.
                 4 were retired Invocations.  0 were either: read-only replicas
and are not verifiably latent, or dc's no longer replicating this nc.  0 had no
latency information (Win2K DC).

Is this something to worry about or look further into ?

Also I noticed this ...

       Skipping site France-Lille, this site is outside the scope provided by
       the command line arguments provided.
       Skipping site Belgium-Brussels, this site is outside the scope
       provided by the command line arguments provided.
       Skipping site Germany-HiHo, this site is outside the scope provided by
       the command line arguments provided.
       Skipping site Germany-Heppenheim, this site is outside the scope
       provided by the command line arguments provided.
       Skipping site Spain-Barcelona, this site is outside the scope provided
       by the command line arguments provided.
       Skipping site Sweden-Gothenburg, this site is outside the scope
       provided by the command line arguments provided.
       Skipping site Norway-Oslo, this site is outside the scope provided by
       the command line arguments provided.

What is that about ?

All other tests passed or display correct.

3) All ports are open between these sites, since they have VPN tunnels between them that come up on demand.

Thanks
 
DeanC30 commented:
Event-ID: 4515 : The zone _msdcs.median.local was previously loaded from the directory partition MicrosoftDNS but another copy of the zone has been found in directory partition DomainDnsZones.median.local. The DNS Server will ignore this new copy of the zone. Please resolve this conflict as soon as possible.

This would indicate an issue with your AD-integrated DNS zones. This could also account for a lot of the other issues (replication, outside scope, etc.). Try to resolve this one first and you may find it resolves the other issues. This article may help:

http://support.microsoft.com/kb/867464
 
Michael Pfister commented:
1) The IP ranges that do not have their own DC should be assigned to the nearest site with a DC
2) add the /a switch to dcdiag to test all DCs
3) Please run PortQueryUI. I had very similar symptoms in a Windows 2003 network with VPN tunnels which PortQueryUI finally helped to solve.
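To illustrate point 1: a client is matched to a site through the subnet objects defined in Active Directory Sites and Services, so a branch office range with no subnet object cannot be tied to a site. The sketch below models that lookup in Python as a longest-prefix match. The site names come from this thread, but the address ranges and the `site_for_ip` helper are invented for illustration only.

```python
import ipaddress

# Hypothetical subnet-to-site table; the ranges are invented for this example.
SUBNET_TO_SITE = {
    "10.1.0.0/16": "Belgium-Brussels",
    "10.2.0.0/16": "France-Lille",
    "10.2.50.0/24": "Germany-Heppenheim",  # branch range linked to the nearest DC site
}


def site_for_ip(ip, table=SUBNET_TO_SITE):
    """Return the site of the most specific subnet containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for cidr, site in table.items():
        net = ipaddress.ip_network(cidr)
        # Prefer the subnet with the longest prefix (most specific match)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, site)
    return best[1] if best else None


print(site_for_ip("10.2.50.7"))    # falls in both 10.2.0.0/16 and 10.2.50.0/24
print(site_for_ip("192.168.9.9"))  # no subnet defined for this address
```

The second call returns `None`, which corresponds to a branch office whose IP range was never added as a subnet.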
 
ulensr (Author) commented:
@mpfister

1) I added those

2) I ran the tests and this error comes up ...

* The File Replication Service Event log test
There are warning or error events within the last 24 hours after the
SYSVOL has been shared.  Failing SYSVOL replication problems may cause
Group Policy problems.
An Warning Event occured.  EventID: 0x800034FA
   Time Generated: 10/08/2007   14:30:30
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C4
   Time Generated: 10/08/2007   14:32:32
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C4
   Time Generated: 10/08/2007   14:32:33
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C4
   Time Generated: 10/08/2007   14:32:34
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C4
   Time Generated: 10/08/2007   14:32:35
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C4
   Time Generated: 10/08/2007   14:32:42
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C4
   Time Generated: 10/08/2007   14:32:47
   (Event String could not be retrieved)
An Warning Event occured.  EventID: 0x800034C5
   Time Generated: 10/08/2007   15:53:04
   (Event String could not be retrieved)

3) Only port 42 TCP says NOT LISTENING ...
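Because dcdiag could not retrieve the event strings, the hexadecimal EventID values in the output above have to be decoded by hand. A minimal Python sketch, assuming the standard 32-bit event identifier layout (top two bits are the severity, low 16 bits are the event code); `decode_event_identifier` is a hypothetical helper name:

```python
SEVERITY = {0: "Success", 1: "Informational", 2: "Warning", 3: "Error"}


def decode_event_identifier(value):
    """Split a 32-bit event identifier into (severity, event_id)."""
    severity = SEVERITY[(value >> 30) & 0x3]  # bits 31-30
    event_id = value & 0xFFFF                 # bits 15-0
    return severity, event_id


for raw in (0x800034FA, 0x800034C4, 0x800034C5):
    severity, event_id = decode_event_identifier(raw)
    print(f"0x{raw:08X} -> {severity} {event_id}")
```

This prints Warning 13562, Warning 13508 and Warning 13509. 13508 is the same FRS warning quoted in the original question; 13509 is, as far as I know, the message FRS logs when replication is enabled again after repeated retries.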
 
ulensr (Author) commented:
@DeanC30

On what server do I run this procedure ?
 
Michael Pfister commented:
3) Did you try in both directions? Meaning running PortQueryUI on your main DC and pointing it to a site's DC, and after that running it on the site's DC pointing to your main DC? I really mean run it on the DCs (via RDP), not remotely on any workstation.

"port 42 TCP says NOT LISTENING" means nothing is listening on TCP port 42 on the target system. Note that TCP 42 is the WINS replication port; DNS itself uses port 53, so this result alone does not show that DNS is down.


Add'l questions:
- Do all DC's host a DNS server?
- Where is the primary DNS in TCP/IP config on each DC pointing to? (ipconfig /all for all DC's)
- Where are the clients DNS pointing to?
- Is name resolution working at all from a remote site? Run nslookup on your remote DCs and one remote workstation  and query it for your central DC(s).
 
DeanC30 commented:
ulensr: Run the procedure on the DNS server where you get the error message. Basically what it is saying is "I have this zone file, but there's another one here, which I will ignore." The copy located in the DomainDnsZones.median.local directory partition is the one you need to KEEP.
 
ulensr (Author) commented:
@mpfister

3) Yes I tried in both directions from the DC's themselves querying each other site using RDP.

Hmm strange that this concerns DNS since all DC's are DNS servers also ...

answers :

1) Yes
2) Its own IP address
3) The site's DC, which is also the DNS server
4) This works, also from the site which says port 42 is NOT listening
 
DeanC30 commented:
ulensr:  If you get the DNS issue resolved, the rest will probably become more obvious.  Without a sound DNS 'backbone'  you can kiss AD goodnight!!
 
ulensr (Author) commented:
@DeanC30

Well your solution did solve the errors in the DNS eventlog .. all is smooth in there now ...

I am going to replicate the DC's and see what errors I end up with after this change

 
ulensr (Author) commented:
After fixing the duplicate partition according to DeanC30's procedure I still have these errors:

1) Event-ID: 13508: The File Replication Service is having trouble enabling replication from MEDFRAD to MEDDEAD for c:\windows\sysvol\domain using the DNS name medfradad.median.local. FRS will keep retrying.

2) Event-ID: 4013 (Only when starting the server) The DNS server was unable to open the Active Directory.  This DNS server is configured to use directory service information and can not operate without access to the directory.  The DNS server will wait for the directory to start.  If the DNS server is started but the appropriate event has not been logged, then the DNS server is still waiting for the directory to start.

3) Event-ID: 1865: The Knowledge Consistency Checker (KCC) was unable to form a complete spanning tree network topology. As a result, the following list of sites cannot be reached from the local site.
 
Sites:
CN=Sweden-Gothenburg,CN=Sites,CN=Configuration,DC=median,DC=local

4) Event-ID: 1311: The Knowledge Consistency Checker (KCC) has detected problems with the following directory partition.
 
Directory partition:
CN=Configuration,DC=median,DC=local
 
There is insufficient site connectivity information in Active Directory Sites and Services for the KCC to create a spanning tree replication topology. Or, one or more domain controllers with this directory partition are unable to replicate the directory partition information. This is probably due to inaccessible domain controllers.

5) Event-ID: 5781: Dynamic registration or deletion of one or more DNS records associated with DNS domain 'DomainDnsZones.median.local.' failed.  These records are used by other computers to locate this server as a domain controller (if the specified domain is an Active Directory domain) or as an LDAP server (if the specified domain is an application partition).  

6) Event-ID: 40960: The Security System detected an authentication error for the server cifs/MEDBEAD.  The failure code from authentication protocol Kerberos was "There are currently no logon servers available to service the logon request.
 (0xc000005e)".

7) Event-ID: 1030: Windows cannot query for the list of Group Policy objects. Check the event log for possible messages previously logged by the policy engine that describes the reason for this.


It got worse, or there are more problems than appeared at first sight.

Any ideas ?
 
ulensr (Author) commented:
Last night I did some work after being pointed in the right direction by mpfister and DeanC30 (who will both receive part of the points).

mpfister drew my attention to the fact that I needed to make sure each site was properly connected, with all ports open. I noticed that 2 out of 6 had VPN tunnels that never came up.

DeanC30 described how to resolve the DNS errors.

1) On all DNS servers which still had errors after removing the duplicate partition, I reinstalled DNS (removed it and installed it again), then rebooted.
2) Replication then occurred on all DCs with fewer errors, but the GPOs still failed. I installed the GPO editor from Microsoft, removed any links / policies that no longer existed, and replicated.
3) On one DC this still left a corrupt sysvol/domain/policies directory, so I copied the contents to that location manually.
4) Cleared the DFS cache with dfsutil /purgemupcache
5) Rebooted the entire domain, and now replication is perfect and there are no more GPO errors on the DCs ...

Now some of my client servers and machines seem to have the GPOs working properly as well, but a lot of them still do not get them:

- tried reboot
- gpupdate /force

but still receive this error ...

Event-ID 1058: Windows cannot access the file gpt.ini for GPO cn={E4440129-4B59-4B53-A13A-27FCF6B1DBDE},cn=policies,cn=system,DC=median,DC=local. The file must be present at the location <\\median.local\SysVol\median.local\Policies\{E4440129-4B59-4B53-A13A-27FCF6B1DBDE}\gpt.ini>. (The system cannot find the path specified. ). Group Policy processing aborted.

The policy it refers to doesn't exist in the domain anymore, so how do I make it stop looking for it?

DFSUTIL is for DC's I believe ...
 
DeanC30 commented:
Event-ID: 13508 - Check the reverse lookup zones in DNS and ensure there is only one PTR record for each DNS server.

Event-ID: 4013 - This is usual when restarting a DNS server, as the primary DNS is set to itself but the DNS Server service does not start immediately. It can safely be ignored if it only happens on startup.

Event-ID: 1865 - Check that all DC / DNS servers have an RR in DNS. (netdiag /test:dns /fix should help; running it on all DCs should identify whether this is an issue.)

Event-ID: 1311 -   http://support.microsoft.com/kb/307593

Event-ID: 5781
http://www.eventid.net/display.asp?eventid=5781&eventno=167&source=NETLOGON&phase=1 

Event ID 40960 -
1. Stop the Kerberos Key Distribution Center (KDC) service.
2. Set the KDC service to Disabled.
3. Restart the server (this forces the DC to get a Kerberos ticket from one of the other DCs).
5. Set the KDC service to Automatic.
6. Start the KDC service.
7. Restart the domain controller.

Event ID: 1030 - Once the above have been sorted this should resolve itself.

HTH


 
Michael Pfister commented:
Event-ID 1058: On your DC, open the share SYSVOL. Navigate to
SYSVOL\<your domain>\Policies

You may have to add "Administrators" or "Domain Admins" to the folder permissions  to get access.

Check if a directory named {E4440129-4B59-4B53-A13A-27FCF6B1DBDE} exists.
It should be empty or contain just a gpt.ini file.
If so, delete it; otherwise move it to another folder, such as temp.
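With several DCs, checking each Policies folder by hand gets tedious. The following is a minimal Python sketch, assuming read access to each DC's SYSVOL share, that lists the GPO folders under a Policies path and reports whether each one contains a gpt.ini file. `audit_policies_folder` is a hypothetical helper, and the share path in the usage comment is only an example pattern.

```python
import os
import re

# GPO folders are named after the GPO's GUID, in braces.
GUID_DIR = re.compile(r"^\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}$")


def audit_policies_folder(policies_path):
    """Return {guid_folder: has_gpt_ini} for each GPO folder under policies_path."""
    report = {}
    for name in sorted(os.listdir(policies_path)):
        full = os.path.join(policies_path, name)
        if os.path.isdir(full) and GUID_DIR.match(name):
            # Compare case-insensitively, since the file is often stored as GPT.INI
            files = [f.lower() for f in os.listdir(full)]
            report[name] = "gpt.ini" in files
    return report


# Usage, once per DC (replace <dc-name> with the server to check):
#   report = audit_policies_folder(r"\\<dc-name>\SYSVOL\median.local\Policies")
#   print("{E4440129-4B59-4B53-A13A-27FCF6B1DBDE}" in report)
```

A GUID that is missing from the report on every DC but still shows up in event 1058 suggests the clients are being pointed at a GPO that no longer has any SYSVOL content.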
 
ulensr (Author) commented:
@mpfister

The problem is that this folder does not exist on any SYSVOL\median.local\policies share on any DC.

These machines are looking for a policy that no longer exists, and I need to find out how I can force them to stop looking for it ...

 
Michael Pfister commented:
The Windows 2003 Resource Kit contains a tool called gpotool.exe
(http://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&displaylang=en)

Please run it on a workstation having the problem.
 
Michael Pfister commented:
Please run

gpotool /verbose /checkacl
 
ulensr (Author) commented:
@mpfister

I ran the tool and it produced a heap of errors, especially "SYSVOL mismatch". I solved those with http://support.microsoft.com/default.aspx?scid=kb;en-us;828760

Then there were a few "Version mismatch" errors, so I rebuilt the SYSVOL enterprise-wide (http://support.microsoft.com/kb/315457)

Now it seems that replication is taking place without errors: no DNS errors, no File Replication errors and no directory service errors.

Smooth ...

Then I ran GPUPDATE /FORCE on all machines that were still not taking the GPO, and that did the trick!
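For anyone hitting the same "Version mismatch" errors: as far as I understand it, the check compares the GPO version stored in Active Directory with the Version value in the gpt.ini file in SYSVOL. That value packs the user-settings version into the high 16 bits and the computer-settings version into the low 16 bits. A minimal Python sketch for reading it from a gpt.ini file follows; the helper names and the sample file contents are made up for illustration.

```python
import re


def parse_gpt_ini_version(text):
    """Extract the Version value from the contents of a gpt.ini file."""
    match = re.search(r"^\s*Version\s*=\s*(\d+)\s*$", text,
                      re.IGNORECASE | re.MULTILINE)
    if match is None:
        raise ValueError("no Version= line found")
    return int(match.group(1))


def split_gpo_version(version):
    """Return (user_version, computer_version) from the packed 32-bit value."""
    return version >> 16, version & 0xFFFF


# Made-up sample contents; a real file is read from the GPO's folder in SYSVOL.
sample = "[General]\r\nVersion=65539\r\n"
print(split_gpo_version(parse_gpt_ini_version(sample)))
```

For the sample, 65539 = 1 * 65536 + 3, so the sketch prints `(1, 3)`: user version 1, computer version 3. Running it against the same GPO's gpt.ini on each DC shows which copies are behind.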
 
Michael Pfister commented:
Glad it helped...

