PWyatt1

Netlogon and Other DC Role holder problems

Member Servers cannot connect to domain and DC FSMO role holder has all kinds of errors.
I am getting all kinds of netdiag and dcdiag errors on the primary DC (has all the FSMO roles + GC).

Netdiag /fix Errors

Domain membership test . . . . . . : Failed
    [WARNING] The system volume has not been completely replicated to the local machine. This machine is not working properly as a DC.
DC discovery test. . . . . . . . . : Failed
        [FATAL] Cannot find DC in domain 'MCOL'. [ERROR_NO_SUCH_DOMAIN]


DC list test . . . . . . . . . . . : Failed
        'MCOL': Cannot find DC to get DC list from [test skipped].
LDAP test. . . . . . . . . . . . . : Failed
    Cannot find DC to run LDAP tests on. The error occurred was: The specified domain either does not exist or could not be contacted.

        [WARNING] Cannot find DC in domain 'MCOL'. [ERROR_NO_SUCH_DOMAIN]



DCDIAG /fix Errors
Starting test: NetLogons
   Unable to connect to the NETLOGON share!
 Starting test: Advertising
    Fatal Error:DsGetDcName (MCOLLANMGR) call failed, error 1355
    The Locator could not find the server.
Starting test: frsevent
   There are warning or error events within the last 24 hours after the
   SYSVOL has been shared.  Failing SYSVOL replication problems may caus
   Group Policy problems.
   ......................... MCOLLANMGR failed test frsevent
Starting test: kccevent
   An Warning Event occured.  EventID: 0x80000785
      Time Generated: 11/12/2008   10:35:57
      Event String: The attempt to establish a replication link for
   ......................... MCOLLANMGR failed test kccevent
Starting test: systemlog
   An Error Event occured.  EventID: 0xC25A002E
      Time Generated: 11/12/2008   10:36:01
      (Event String could not be retrieved)
Starting test: FsmoCheck
   Warning: DcGetDcName(GC_SERVER_REQUIRED) call failed, error 1355
   A Global Catalog Server could not be located - All GC's are down.
   Warning: DcGetDcName(TIME_SERVER) call failed, error 1355
   A Time Server could not be located.
   The server holding the PDC role is down.
   Warning: DcGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 135

   A Good Time Server could not be located.
   Warning: DcGetDcName(KDC_REQUIRED) call failed, error 1355
   A KDC could not be located - All the KDCs are down.
   ......................... MCOL failed test FsmoCheck
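
For anyone reading the hex EventIDs in the dcdiag output above: a Windows event identifier packs the severity into the top two bits and the actual event code into the low 16 bits. The sketch below is only an illustration of that standard bit layout (the function name is made up for this example); it shows that 0x80000785 is warning event 1925 and 0xC25A002E is error event 46, which are the numbers to look for in Event Viewer.

```python
# Split a 32-bit Windows event identifier into its standard bit fields:
# bits 31-30 severity, bit 29 customer flag, bits 27-16 facility, bits 15-0 code.
SEVERITIES = {0: "Success", 1: "Informational", 2: "Warning", 3: "Error"}


def decode_event_id(raw: int) -> dict:
    """Return the severity, customer flag, facility and event code of `raw`."""
    return {
        "severity": SEVERITIES[(raw >> 30) & 0x3],
        "customer": bool((raw >> 29) & 0x1),
        "facility": (raw >> 16) & 0xFFF,
        "code": raw & 0xFFFF,
    }


# The two identifiers reported by dcdiag in this thread.
for raw in (0x80000785, 0xC25A002E):
    print(hex(raw), decode_event_id(raw))
```

The decoded severities match what dcdiag printed ("An Warning Event" and "An Error Event").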

*******
I tried the following before the above netdiag and dcdiags were run:
ipconfig /flushdns
ipconfig /registerdns
net stop netlogon
net start netlogon
     Netlogon failed
I did a netdom query fsmo and none of the roles were listed
So I ran ntdsutil and seized the FSMO roles from the DC itself. (Kind of weird, but it worked.)
I ran netdom query fsmo again and the roles are now listed for the DC.

DNS looks OK, both the primary zones and the reverse dns, and all the srv records are in the right places in the _msdcs folder.

So here I am lost as to what I do next. Help would be appreciated.
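
As a rough aid for checking the SRV records under _msdcs mentioned above, the following sketch lists the locator records that correspond to the failing dcdiag lookups (DC, PDC, KDC, GC) and prints an nslookup command for each. The DNS domain name `mcol.example` is a placeholder, since the thread only gives the NetBIOS name MCOL, and the function name is invented for this example.

```python
from typing import List, Optional


def dc_locator_srv_names(domain: str, forest: Optional[str] = None) -> List[str]:
    """Build the SRV record names clients query to locate a DC, PDC, KDC and GC.

    `domain` is the DNS name of the AD domain; `forest` is the forest root
    domain (assumed to be the same as `domain` in a single-domain forest).
    """
    forest = forest or domain
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",      # any domain controller
        f"_ldap._tcp.pdc._msdcs.{domain}",     # PDC emulator
        f"_kerberos._tcp.dc._msdcs.{domain}",  # KDC
        f"_ldap._tcp.gc._msdcs.{forest}",      # global catalog
    ]


# "mcol.example" is a hypothetical DNS name; substitute the real one.
for name in dc_locator_srv_names("mcol.example"):
    print(f"nslookup -type=SRV {name}")
```

If any of these lookups fails against the DC's own DNS server, restarting netlogon to re-register the records is the usual first step.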
Avatar of PWyatt1
PWyatt1

ASKER

More info. I did the following test:

:\>nltest /sc_query:mcol
_NetLogonControl failed: Status = 1355 0x54b ERROR_NO_SUCH_DOMAIN
Avatar of Darius Ghassem
Try the burflag method. Make sure you only have internal DNS servers listed in the TCP/IP properties on all servers. If you have more than one NIC, disable the unused NIC.

http://www.petri.co.il/delete_failed_dcs_from_ad.htm
Avatar of PWyatt1

ASKER

Thanks Dariusg:
Your instructions were a little vague. I have two DCs. The secondary DC is not replicating properly even though the SYSVOL folders are present and I tried the ipconfig /flushdns /registerdns, and net stop netlogon and net start netlogon with it also.

Essentially I have two DCs that are not acting correctly. I wanted to get the FSMO role holder DC working first before I tackle the secondary one. Per your instructions, I did set the burflags on the primary DC FSMO Role holder to D4, but because it is not working properly, I really don't know what that will do.
Thanks
What errors are you getting in the Event Logs? Which server was the netdiag run on? I attached the wrong link, sorry about that. Do a netdiag /v, then post the results. You need to find out which DC is the one that isn't having trouble, that is, which DC you want to make authoritative for the domain. Then you put the D4 burflag on this one then you put the D2 which will then force the replication on D2.

http://www.petri.co.il/delete_failed_dcs_from_ad.htm
Correction on this sentence:

"then you put the D2 which will then force the replication on D2"

Then you need to put the D2 flag on the other DC that isn't functioning, which will then force the replication from the DC that has the D4 burflag.
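
To make the D4/D2 rule concrete, here is a small sanity check, written as a sketch rather than anything official: exactly one DC (the known-good one) should carry D4 (authoritative), and every other DC should carry D2 (non-authoritative). The DC name `SDC` is a placeholder for the second DC, and the function name is made up for this example.

```python
AUTHORITATIVE = 0xD4      # the DC whose SYSVOL becomes the source of truth
NON_AUTHORITATIVE = 0xD2  # DCs that re-sync their SYSVOL from a partner


def validate_burflags(flags: dict) -> list:
    """Return a list of problems with a planned {dc_name: BurFlags} assignment."""
    problems = []
    authoritative = [dc for dc, value in flags.items() if value == AUTHORITATIVE]
    if len(authoritative) != 1:
        problems.append(
            f"expected exactly one D4 (authoritative) DC, found {len(authoritative)}"
        )
    for dc, value in flags.items():
        if value not in (AUTHORITATIVE, NON_AUTHORITATIVE):
            problems.append(f"{dc}: unexpected BurFlags value 0x{value:X}")
    return problems


# "SDC" is a placeholder name for the second DC.
print(validate_burflags({"MCOLLANMGR": 0xD4, "SDC": 0xD2}))  # no problems
print(validate_burflags({"MCOLLANMGR": 0xD2, "SDC": 0xD2}))  # nobody is authoritative
```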
Avatar of PWyatt1

ASKER

Thanks dariusg:
I didn't set the burflag on the secondary server so I'll do that now.
FYI. I thought the GPOs were corrupted so I went through MMC and replaced the GPO security file in windows/system32/security with the default file using the Security and Analysis program. That seemed to work for part of the problem as my Exchange server (member server) is now delivering emails and other member servers are getting onto the domain in the normal amount of time.
Thanks
PWyatt:
These errors didn't occur when having your reverse lookup problems, did they?

https://www.experts-exchange.com/Networking/Protocols/DNS/Q-23896798-Reverse-DNS-Zones-Missing.html

@Dariusq:
Howdy, a little FYI for you.
Thanks Chief!
Avatar of PWyatt1

ASKER

Hi ChiefIT:
Gee I can't hide from you, can I?
All of these problems occurred after I rebuilt the reverse dns zones.

OK, here's a rundown on the error messages:
No DNS errors

I am working on the primary FSMO role holder DC now

In the App log, I'm getting a boatload of 1030 and 1058 errors from Userenv.
In Directory Service, I'm getting a lot of NTDS KCC errors. These are caused from the sysvol problem on the secondary server. I'd like to leave this problem 'till we've fixed the UserEnv and frs problems, then I can reset the kcc on that server.
In File replication log, I am getting SYSVOL errors (Frs -13566) for the SYSVOL on the primary FSMO Role holder DC - the one I am on.

The 1030 and 1058 errors can have a couple of causes. You might try running this command: dfsutil /purgemupcache. The burflag should fix the FRS error.

https://www.experts-exchange.com/questions/21681998/Clients-unable-to-login-NETLOGON-and-SYSVOL-shares-not-present.html
Avatar of PWyatt1

ASKER

OK. I purged the cache. What do I do now to test/get rid of these errors? I just did a flush/registerdns and a stop/start netlogon.

SYSVOL looks fine on net share
Save the Event log then clear it out to see if the errors come back.
Avatar of PWyatt1

ASKER

OK. On the FSMO Role holder, netdiag came up clean.
However DCDIAG came up with a Netlogon error - unable to connect to the \\ThisServer\netlogon. An LSAPolicy op failed w/ error 1203.
However net share shows the SYSVOL share
Avatar of PWyatt1

ASKER

Thanks dariusg:
Great article, but there is too much description and not enough of "What if such and such is not there"
i.e. right now I have a FSMO DC with empty SYSVOL folders. How do I get the correct info back into them? I need an article that steps me through one step at a time.
Thanks
Avatar of PWyatt1

ASKER

I'm reading through the following article, so give me the rest of the day (I have to leave at 5:00pm CST).
I'll update you in the morning.
Thanks for sticking with me.

http://searchwindowsserver.techtarget.com/tip/0,289483,sid68_gci1320325,00.html#
Your SYSVOL that you are working on is empty? Then this is the problem. Is the other server's SYSVOL empty?
@dariusq:
You are providing such good info that I dare not join in. The burflag method is exactly what I would have done. I have seen 1030 and 1058 caused by a partial replication set. You will also see errors in the 13000s, like 13508 and 13565, in the FRS event logs alluding to journal wrap.

I find that OFTEN these 1030 and 1058 errors are caused by an incomplete replication and that replication is the result of a DNS problem.

That is exactly what you are looking at. So, you hit it.

True Chief.

@PWyatt1

Does your DC point to itself for DNS only?
Avatar of PWyatt1

ASKER

Hi Chief:
Yes, the FSMO DC points to itself. I have disabled the other NICs.
I had a backup of the SYSVOL share and just replaced the empty SYSVOL folder.  All folders and policies are there.

There is no SYSVOL share on the secondary DC.

I am getting periodic 13566 errors.

My last Userenv post after I copied the backup SYSVOL folder in was a valid SceCli 1704 event "Security policy in the Group policy objects has been applied successfully."

The 1030 and 1058 errors appeared to have stopped.





Avatar of PWyatt1

ASKER

I get the following errors on the PDC

C:\>ntfrsutl ds mcollanmgr
ERROR - Cannot bind w/authentication to computer, mcollanmgr; 000006d9 (1753)
ERROR - Cannot bind w/o authentication to computer, mcollanmgr; 000006d9 (1753)
ERROR - Cannot RPC to computer, mcollanmgr; 000006d9 (1753)
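
For reference, the numeric codes that have come up so far in this thread map to the following Win32 error names. The sketch below is just a lookup table covering those three codes plus the decimal/hex conversion; it is not a general error decoder.

```python
# Win32 error codes seen in this thread and their symbolic names.
WIN32_ERRORS = {
    1203: "ERROR_NO_NET_OR_BAD_PATH",  # dcdiag: LSAPolicy op failed
    1355: "ERROR_NO_SUCH_DOMAIN",      # netdiag, dcdiag, nltest
    1753: "EPT_S_NOT_REGISTERED",      # ntfrsutl: RPC endpoint mapper has no endpoint
}


def describe(code: int) -> str:
    """Format a code the way the tools print it: decimal, hex, and name."""
    name = WIN32_ERRORS.get(code, "unknown (not in this small table)")
    return f"{code} = 0x{code:08x} {name}"


for code in (1355, 1203, 1753):
    print(describe(code))
```

The 1753 result is consistent with the FRS service not running (or not reachable over RPC) on the server being queried, so checking that ntfrs is started is a reasonable next step. On a Windows machine, `net helpmsg 1753` prints the full message text for a code.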
Ok, then you need to do the burflag with the other DC to force replication. Or you can demote then re-promote. Can you do a netdiag then post.
Avatar of PWyatt1

ASKER

Thanks dariusg:
If the secondary DC does not have any SYSVOL folders, would a forced replication add the SYSVOL folders as part of the replication? I always thought the SYSVOL folders had to be present before a replication could take place.
Thanks
Well, that isn't always true. I have gotten the SYSVOL to re-create with the burflag. I think the better option would be to fully remove the DC by demoting it and then re-promoting it.
Avatar of PWyatt1

ASKER

Thanks. If I demoted it, I would have to do a /forceremoval because the SYSVOL share is not present. According to Microsoft documentation, if a DC is demoted with /forceremoval, it should not be promoted back into the domain. What experience have you had with re-promoting such a demoted DC?

Thanks
Dariusq is right, use the burflag method to recreate the sysvol and netlogon shares. You will have to do this twice, one at a time.

ON DC1, (role holder), you perform an authoritative restore. (D2 burflag method)
ON DC2 you perform a non-authoritative restore (D4 Burflag)

If this doesn't work, you have a tombstoned DC and you should dcpromo it to remove it (or, failing that, force-remove it), perform a metadata cleanup on DC1, and repromote it. This has worked many times in the past. So, I don't anticipate any problems.
If you do a forceremoval, then doing a metadata cleanup will make AD think that the DC is a new DC. Make sure you delete any DNS records once you do the forceremoval. As Chief says, I have done this multiple times and you shouldn't have an issue.
Avatar of PWyatt1

ASKER

Hi ChiefIT:
Did you work late last night? I thought authoritative restore was D4 and non-authoritative restore was D2?
Phil :)
You are right Phil. You can do the burflag but I'm kind of leaning towards the force removal
Avatar of PWyatt1

ASKER

OK dariusg: I'll do the forceremoval and metadata cleanup etc. Give me 45 mins.
Thanks.
Ahh, You caught me:

6:30 in the AM, it's 9:30 now
Avatar of PWyatt1

ASKER

OK Dariusg:
Demoted w/ forceremoval, did metadata cleanup, removed from DNS, removed from _msdcs folders, removed from Sites and Services, removed from SYSVOL FRS (adsiedit), renamed, and rejoined the domain.

However, when i tried to promote, I got a "Could not find domain" error. No errors showed up in the PDC. The soon-to-be-dc does show up in the ADUC computers, and there are entries in dns zones (forward and reverse). Any suggestions? Thanks
Did you have DNS pointing to the PDC only?
Avatar of PWyatt1

ASKER

You were right!. Changed the DNS to point to the PDC.
I dcpromo'd and it looked like it set up the local SYSVOL folders, but then I got the following error:
"The operation failed because an LDAP connection could not be established with the domain controller <PDC> "The specified server cannot perform the specified operation""

Any ideas?
Thanks
Check to make sure firewall is disabled. Check DNS once more to make sure it didn't put the 127.0.0.1 address in for DNS. Do a netdiag /fix
Avatar of PWyatt1

ASKER

Hi dariusg:
I had one old DNS record in the reversedns on the pdc which I got rid of.
I re-executed dcpromo and it went through to completion!
I just rebooted. Let me have it sit overnight and replicate a few times, then if everything is OK, I'll close this case out and give you the points.

Thanks for all the help.
Excellent job Dariusq! You were all over that. It was fun watching you work.

The last report sounds very promising.
It does sound good. This has been a long one like most of the posts I get involved with.
Avatar of PWyatt1

ASKER

Hi Darius:
Good Morning! Well we're not quite there yet. The Secondary DC (SDC) that I just repromoted with a new name does not have the SYSVOL folders. I am getting KCC errors on the PDC, so on the SDC I stopped the Kerberos service (manual), rebooted and started again.

I then stopped the FRS on both DCs and did the D4/D2 burflag routine. Still no SYSVOL folders.

Any suggestions?
You can demote again then have the correct DNS settings in there then it should work with no problem.
Avatar of PWyatt1

ASKER

Ouch! Sigh-h-h-h.......will do.
I think there are two things to watch out for.

1) You just brought a DC on line. You may have to register the SRV records, (since they were removed in a metadata cleanup). To do so, register them by restarting the netlogon service.
2) Then each DC has DNS. So, they have to point to themselves as the preferred DNS server. Then, as the secondary preferred DNS server, they should point to the other DC.

After the SRV records are in place and the preferred DNS servers are correct, you should be able to replicate.
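
The DNS client rules discussed in this thread (each DC lists itself first and the other DC second, no external DNS servers, no 127.0.0.1) can be written down as a quick audit. This is only a sketch of those rules: the function name and the IP addresses are invented for illustration. Note that during the initial promotion the advice above was different (point the new server at the existing DC only).

```python
import ipaddress


def audit_dns_clients(dcs: dict, dns_settings: dict) -> list:
    """Check each DC's DNS client settings against the rules from the thread.

    dcs:          {dc_name: own_ip}
    dns_settings: {dc_name: [preferred_dns_ip, alternate_dns_ip, ...]}
    Returns a list of human-readable problems (empty if everything matches).
    """
    dc_ips = {ipaddress.ip_address(ip) for ip in dcs.values()}
    problems = []
    for name, own_ip in dcs.items():
        servers = [ipaddress.ip_address(s) for s in dns_settings.get(name, [])]
        if not servers:
            problems.append(f"{name}: no DNS servers configured")
            continue
        if servers[0] != ipaddress.ip_address(own_ip):
            problems.append(
                f"{name}: preferred DNS {servers[0]} is not its own address {own_ip}"
            )
        for server in servers:
            if server.is_loopback:
                problems.append(f"{name}: loopback address {server} listed")
            elif server not in dc_ips:
                problems.append(f"{name}: {server} is not one of the domain's DNS servers")
    return problems


# Hypothetical addresses for the two DCs.
dcs = {"DC1": "192.168.1.10", "DC2": "192.168.1.11"}
settings = {
    "DC1": ["192.168.1.10", "192.168.1.11"],
    "DC2": ["192.168.1.11", "192.168.1.10"],
}
print(audit_dns_clients(dcs, settings))
```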
True, Chief. I think I skipped a step. Try a netdiag /fix on the system, then do a netdiag and post the results.
Avatar of PWyatt1

ASKER

Hi Everyone:
My SDC server sat overnight but when I came in on Saturday morning, no contents and especially no GPOs in the SYSVOL share on the SDC.

I did a non-authoritative restore w/ the D2 burflag (stopping and starting ntfrs) but nothing happened. I think the problem may be elsewhere and the culprit is the PDC. The reason I think this is that after the promotion, I noticed the IUSR and IWAM accounts were not listed in ADUC. I checked the default permission IIS folders and the IIS_WPG users were also missing. I added the IIS_WPG users to all appropriate folders and then uninstalled and reinstalled IIS on the SDC. Still there are no IUSR or IWAM accounts listed in the ADUC. This means that the PDC has some problems.

Can anyone tell me if there is any correlation between a PDC not registering IWAM and IUSR accounts, and the inability of a newly promoted DC to have the PDC not copy the appropriate contents to the new DC's SYSVOL folders?

Thanks

I'm bumping the points up to 400 as this is getting a little complicated now.
Avatar of PWyatt1

ASKER

The IUSR and IWAM accounts were OK on the PDC. I was referring to the SDC.
Did you put the D4 on the other server? You must have both registry hacks in for the burflag method to work. When you make a server a DC, it removes all local accounts. When you demote a DC that is running IIS, it will remove all domain IIS accounts and put the local accounts back in, which is crazy, but it does this. Now, since you had a problem during promotion, you might need to demote again, but make sure you are doing the burflag method correctly.
Avatar of PWyatt1

ASKER

Hi Darius!
I am NOT going to demote and promote again. This had gone on too long!
OK. Let me give you my understanding of the sequence for burflags:
Stop frs on both machines and switch to manual
reset burflag on pdc to D4
reset burflag on sdc to D2
Reboot PDC and switch on frs
Reboot SDC and switch on frs.
Please confirm or change this sequence for me.
Thanks

That is correct. You can also try to copy over the SYSVOL from the PDC over to the other DC then run the burflag.
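
The sequence confirmed above can be sketched as an ordered command list. This is an outline under assumptions, not a tested procedure: the registry path is the usual NtFrs "Process at Startup" key where BurFlags lives, the service name is ntfrs, the DC name `SDC` is a placeholder, and the reboots and the switch to manual startup from the list above are left out. The script only prints the commands; it does not run them.

```python
REG_PATH = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs"
    r"\Parameters\Backup/Restore\Process at Startup"
)


def frs_restore_steps(authoritative_dc: str, other_dc: str) -> list:
    """Return (dc_name, command) pairs in the order they should be run."""

    def set_flag(value: int) -> str:
        # reg.exe takes the REG_DWORD data as a number; 0xD4 == 212, 0xD2 == 210.
        return f'reg add "{REG_PATH}" /v BurFlags /t REG_DWORD /d {value} /f'

    return [
        (authoritative_dc, "net stop ntfrs"),
        (other_dc, "net stop ntfrs"),
        (authoritative_dc, set_flag(0xD4)),  # authoritative restore
        (other_dc, set_flag(0xD2)),          # non-authoritative restore
        (authoritative_dc, "net start ntfrs"),
        (other_dc, "net start ntfrs"),
    ]


# "SDC" is a placeholder for the second DC's name.
for dc, command in frs_restore_steps("MCOLLANMGR", "SDC"):
    print(f"[{dc}] {command}")
```

Starting FRS on the D4 machine first gives it time to rebuild and share SYSVOL before the D2 machine asks for a copy.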
Avatar of PWyatt1

ASKER

Hello Darius:
Sorry for not getting back to you guys. I demoted the SDC and tried promoting it again (after going through the usual deleting through metadata cleanup, adsiedit, DNS etc.). Changed the DNS afterwards to point to itself. Same problem. No SYSVOL folders. I went through the D2/D4 burflag routine. Nothing. I then demoted it for good.

I then took a server with a clean Windows Server Enterprise install, assigned a different name, joined the domain, then promoted with the usual DNS changes. Same thing: no SYSVOL folders.

I now know for a fact that there is something wrong with the PDC and its inability to replicate.  I'm not going to waste any more time on this and will close this case unless you want to help me debug the replication problems on the PDC.

I have over 6 DCs on other domains I am managing, so it's not like I don't know what I am doing.  I have promoted and demoted many servers since Windows NT 4.0.

Please let me know if you have some suggestions, otherwise, as I said, I'll close this case.

Thanks
FRS relies heavily upon DNS. Did you register the SRV records in DNS, by restarting the netlogon service, Prior to trying to replicate?

Otherwise, I would think you can find the problems within AD Sites and Services. Has the replication partnership been set up for this newly promoted DC?
Avatar of PWyatt1

ASKER

Hi Chief IT.
Yes, I did all that I should have. I went into ADSIedit and deleted all references to the SDC under microsoftdns\xxx.xxx.xxx.in-addr.arpa.
Yes, I restarted netlogon.
There was no entry in the NTDS settings for the SDC under Sites and Service.

As I said, there is something wrong with the PDC. It is now running on its own with no errors. Barring help on the PDC I will rebuild the domain this weekend with a new PDC. Then, hopefully, I will be able to promote a second server to the SDC role.
Regards.
SOLUTION
Avatar of Darius Ghassem
Darius Ghassem
Avatar of PWyatt1

ASKER

PDC netdiag is clean as a whistle.
SOLUTION
Avatar of PWyatt1

ASKER

Hi ChiefIT. In the PDC? that's all I have right now.
Netlogon and Sysvol are file shares that are shared out by DFS.

DFS is not shared out as many might think. It uses SMB and netbios broadcasts simultaneously to share out these files.

Currently DFS shares use SMB sharing over port 445/TCP (direct hosting) and the NetBIOS session service on port 139/TCP. This is the primary bind to DFS shares.

The older method was to use the NetBIOS/WINS name service on port 137/UDP, the NetBIOS datagram service on port 138/UDP, and the NetBIOS session service on port 139/TCP. The server still uses this simultaneously with SMB sharing.

You can do a port inquiry to see if the ports are listening or blocked.

The syntax of a port query is:

      Portqry -n xxx.xxx.xxx.xxx -o 137,138,139,445 -p both

where xxx.xxx.xxx.xxx is the IP of your domain PDCe.

If the ports are open, you should see the server listening on TCP 139 and 445 and also on UDP 137 and 138.

Portqry is a part of the 2003 server support tools and is also compatible with XP pro.

Lets see if you have a port blockage preventing you from sharing out your DFS shares.
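
If portqry is not at hand, a rough TCP-only equivalent can be sketched with a plain socket connect. This is an approximation, not a replacement for portqry: it covers only the TCP ports (139 and 445), because a UDP port such as 137 or 138 cannot be confirmed open with a simple connect, and the function names are made up for this example.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unreachable.
        return False


def check_smb_ports(host: str, ports=(139, 445), timeout: float = 2.0) -> dict:
    """Check the TCP ports used for SMB file sharing on `host`."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Replace 127.0.0.1 with the IP of the DC you want to test.
    print(check_smb_ports("127.0.0.1", timeout=1.0))
```

A result of False on 445 from a member server, while the DC itself reports True locally, would point to filtering somewhere between the two.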
Make sure DFS service is running.
ASKER CERTIFIED SOLUTION
Before you take Siesta, I would like to point out that it appears you have a bit of TCP/IP filtering going on. That could knock down DFS.
Let us know if you need any help.