ziceman asked:

Win2012 Single PDC health check before Migration

I am checking the health of a single Win2012 PDC before migrating to Win2019. dcdiag is showing two errors:

      Starting test: DFSREvent
        There are warning or error events within the last 24 hours after the
        SYSVOL has been shared. Failing SYSVOL replication problems may cause
        Group Policy problems.

      Starting test: SystemLog
        An error event occurred. EventID: 0x00009017
            Time Generated: 10/15/2023 13:12:48
            Event String:
            A fatal alert was received from the remote endpoint. The TLS protocol defined fatal alert code is 70.

I can find reference to SCHANNEL in the Event Log that perhaps points to the SystemLog error.
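A note on matching the dcdiag output to Event Viewer: dcdiag prints the full event identifier in hex, while Event Viewer shows the decimal event code held in the low 16 bits. The sketch below is only an illustration of that conversion (the helper name `dcdiag_event_id` is made up for this example). For the error above it gives 36887, and alert code 70 is `protocol_version` in the TLS specification, so the Schannel entries in the System log are a reasonable place to look.

```python
def dcdiag_event_id(hex_id: str) -> int:
    """Return the decimal event code for a dcdiag-style hex EventID.

    Hypothetical helper for illustration: the low 16 bits of the identifier
    are the event code; the upper bits carry severity and facility flags.
    """
    return int(hex_id, 16) & 0xFFFF


# EventID values copied from the dcdiag output quoted in this thread
for raw in ("0x00009017", "0xC0001B58", "0x00000422"):
    print(raw, "->", dcdiag_event_id(raw))
```

Filtering the System or Directory Service log by the decimal value is usually quicker than searching for the hex string.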

As for the DFSREvent test, nothing shows up in Event Viewer.

The customer has not changed anything with Group Policies in quite a while, and everything seems to be operating as expected. That being said, I got different results running gpresult /v from a couple of workstations.

On one, the results looked normal. On another, it came back with - info: the user does not have rsop data.

I would like to do additional diagnostics on this Win2012 machine before adding (and migrating to) the new Win2019 machine.

What are the best next steps?

Avatar of ziceman
ziceman

ASKER

I promoted the new 2019 DC. While the process completed successfully, the new DC is not healthy. Most notably the SYSVOL and Netlogon shares are missing.

Here is the DCDIAG:


PS C:\Windows\system32> dcdiag


Directory Server Diagnosis


Performing initial setup:

   Trying to find home server...

   Home Server = REDACTED-DC

   * Identified AD Forest.

   Done gathering initial info.


Doing initial required tests


   Testing server: Default-First-Site-Name\REDACTED-DC

      Starting test: Connectivity

         ......................... REDACTED-DC passed test Connectivity


Doing primary tests


   Testing server: Default-First-Site-Name\REDACTED-DC

      Starting test: Advertising

         Warning: DsGetDcName returned information for \\sha-dc-01.REDACTED.local, when we were trying to reach

         REDACTED-DC.

         SERVER IS NOT RESPONDING or IS NOT CONSIDERED SUITABLE.

         ......................... REDACTED-DC failed test Advertising

      Starting test: FrsEvent

         ......................... REDACTED-DC passed test FrsEvent

      Starting test: DFSREvent

         There are warning or error events within the last 24 hours after the SYSVOL has been shared.  Failing SYSVOL

         replication problems may cause Group Policy problems.

         ......................... REDACTED-DC passed test DFSREvent

      Starting test: SysVolCheck

         ......................... REDACTED-DC passed test SysVolCheck

      Starting test: KccEvent

         A warning event occurred.  EventID: 0x80000BEB

            Time Generated: 10/15/2023   19:31:39

            Event String:

            The directory has been configured to not enforce per-attribute authorization during LDAP add operations. Warning events will be logged, but no requests will be blocked.

         A warning event occurred.  EventID: 0x80000BEE

            Time Generated: 10/15/2023   19:31:39

            Event String:

            The directory has been configured to allow implicit owner privileges when initially setting or modifying the nTSecurityDescriptor attribute during LDAP add and modify operations. Warning events will be logged, but no requests will be blocked.

         A warning event occurred.  EventID: 0x80000B46

            Time Generated: 10/15/2023   19:31:49

            Event String:

            The security of this directory server can be significantly enhanced by configuring the server to reject SASL (Negotiate, Kerberos, NTLM, or Digest) LDAP binds that do not request signing (integrity verification) and LDAP simple binds that are performed on a clear text (non-SSL/TLS-encrypted) connection.  Even if no clients are using such binds, configuring the server to reject them will improve the security of this server.

         A warning event occurred.  EventID: 0x80000BE1

            Time Generated: 10/15/2023   19:31:49

            Event String:

            The security of this directory server can be significantly enhanced by configuring the server to enforce  validation of Channel Binding Tokens received in LDAP bind requests sent over LDAPS connections. Even if  no clients are issuing LDAP bind requests over LDAPS, configuring the server to validate Channel Binding  Tokens will improve the security of this server.

         ......................... REDACTED-DC passed test KccEvent

      Starting test: KnowsOfRoleHolders

         ......................... REDACTED-DC passed test KnowsOfRoleHolders

      Starting test: MachineAccount

         ......................... REDACTED-DC passed test MachineAccount

      Starting test: NCSecDesc

         ......................... REDACTED-DC passed test NCSecDesc

      Starting test: NetLogons

         Unable to connect to the NETLOGON share! (\\REDACTED-DC\netlogon)

         [REDACTED-DC] An net use or LsaPolicy operation failed with error 67, The network name cannot be found..

         ......................... REDACTED-DC failed test NetLogons

      Starting test: ObjectsReplicated

         ......................... REDACTED-DC passed test ObjectsReplicated

      Starting test: Replications

         ......................... REDACTED-DC passed test Replications

      Starting test: RidManager

         ......................... REDACTED-DC passed test RidManager

      Starting test: Services

         ......................... REDACTED-DC passed test Services

      Starting test: SystemLog

         An error event occurred.  EventID: 0xC0001B58

            Time Generated: 10/15/2023   18:36:39

            Event String: The silsvc service failed to start due to the following error:

         An error event occurred.  EventID: 0xC0001B58

            Time Generated: 10/15/2023   18:44:41

            Event String: The silsvc service failed to start due to the following error:

         An error event occurred.  EventID: 0xC0001B58

            Time Generated: 10/15/2023   18:48:28

            Event String: The silsvc service failed to start due to the following error:

         A warning event occurred.  EventID: 0x000727A5

            Time Generated: 10/15/2023   19:31:05

            Event String: The WinRM service is not listening for WS-Management requests.

         An error event occurred.  EventID: 0xC0001B58

            Time Generated: 10/15/2023   19:31:54

            Event String: The silsvc service failed to start due to the following error:

         ......................... REDACTED-DC failed test SystemLog

      Starting test: VerifyReferences

         ......................... REDACTED-DC passed test VerifyReferences



   Running partition tests on : ForestDnsZones

      Starting test: CheckSDRefDom

         ......................... ForestDnsZones passed test CheckSDRefDom

      Starting test: CrossRefValidation

         ......................... ForestDnsZones passed test CrossRefValidation


   Running partition tests on : DomainDnsZones

      Starting test: CheckSDRefDom

         ......................... DomainDnsZones passed test CheckSDRefDom

      Starting test: CrossRefValidation

         ......................... DomainDnsZones passed test CrossRefValidation


   Running partition tests on : Schema

      Starting test: CheckSDRefDom

         ......................... Schema passed test CheckSDRefDom

      Starting test: CrossRefValidation

         ......................... Schema passed test CrossRefValidation


   Running partition tests on : Configuration

      Starting test: CheckSDRefDom

         ......................... Configuration passed test CheckSDRefDom

      Starting test: CrossRefValidation

         ......................... Configuration passed test CrossRefValidation


   Running partition tests on : REDACTED

      Starting test: CheckSDRefDom

         ......................... REDACTED passed test CheckSDRefDom

      Starting test: CrossRefValidation

         ......................... REDACTED passed test CrossRefValidation


   Running enterprise tests on : REDACTED.local

      Starting test: LocatorCheck

         ......................... REDACTED.local passed test LocatorCheck

      Starting test: Intersite

         ......................... REDACTED.local passed test Intersite

PS C:\Windows\system32>
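
With output this long, it can help to pull out only the failing tests. The following is a minimal sketch, assuming the dcdiag output has been saved as text; the regular expression and the function name `failed_tests` are illustrative assumptions, not part of dcdiag itself.

```python
import re

# Matches dcdiag summary lines such as
# "......................... REDACTED-DC failed test Advertising"
FAILED = re.compile(r"\.{5,}\s+(\S+) failed test (\S+)")


def failed_tests(dcdiag_text: str) -> list[tuple[str, str]]:
    """Return (target, test name) pairs for every failed dcdiag test."""
    return FAILED.findall(dcdiag_text)


# Summary lines copied from the output above
sample = """\
         ......................... REDACTED-DC passed test Connectivity
         ......................... REDACTED-DC failed test Advertising
         ......................... REDACTED-DC passed test SysVolCheck
         ......................... REDACTED-DC failed test NetLogons
         ......................... REDACTED-DC failed test SystemLog
"""
for target, test in failed_tests(sample):
    print(f"{target}: {test}")
```

For the run above this leaves Advertising, NetLogons and SystemLog, which fits the missing SYSVOL and Netlogon shares.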


So you've set up a new 2019 DC to migrate AD from 2012?


And then, after joining the 2019 machine to the domain, there are issues in DCdiag?


Without going through the logs, the xx70 error caught my eye. Check the DNS settings on the NIC of both DCs to see whether each points to the other server first, then to its own or the loopback address second.

Avatar of ziceman

ASKER

Thanks for the reply, Adam.  I do not see the xx70 error you reference, but I will check the NIC settings on both machines. 


Could this prevent the creation of the Sysvol and Netlogon shares for the new 2019 PDC?

Avatar of ziceman

ASKER

Not sure if I should now demote the new 2019 PDC and look further into the health of the Win2012 machine at this point. I don't want to carry unwanted baggage forward.

Avatar of ziceman

ASKER

I have been following this guide - https://www.experts-exchange.com/articles/37174/Introducing-and-Migrating-to-a-Windows-Server-2019-Domain-Controller.html


And I am at step # 19 (immediately before transfer of FSMO). 


repadmin /showrepl and repadmin /replsummary  do not show any errors when run from either machine. 

Sorry, it's 0x00009017. Without reading the logs in detail: yes, it could cause this kind of issue in some situations. Each config is different to a degree, and troubleshooting can be a process of elimination guided by experience. Whether to fix or attempt a rollback depends on which errors are still present after any change to the NIC DNS settings. Resync AD, check EV, and rerun DCdiag on both boxes.
Repadmin can look normal while DCdiag and EV still show signs of trouble.

Were you using FRS or DFSR (or both!?) on the 2012 box?
Avatar of ziceman

ASKER

The weird thing about the 0x00009017 error - it was generated from DCdiag on the existing 2012 DC - before the new 2019 DC was installed and promoted. 


... but I cannot find any indication that there was another secondary DC in the environment. I am not seeing anything in the AD settings that point to any other DC. 


Before adding the new DC, I had received the result below when running dfsrmig /GetGlobalState:


All Domain Controllers have migrated successfully to Global state (‘Eliminated’). Migration has reached a consistent state on all Domain Controllers. Succeeded.



But I am not seeing a SYSVOL_DFSR share on the old server - only SYSVOL and Netlogon.

I'm on mobile so I haven't been able to look in detail, but I wonder if it was due to an FRS or DFSR issue, especially if there's a conflict between the two. EV might provide clues, but my gut feeling is definitely to resolve this and any other significant errors in EV before migrating.
Avatar of ziceman

ASKER

I had assumed that the 'Eliminated' state indicated the conversion had already been done. Do I need to manually reset the state with dfsrmig /setglobalstate 1 and execute the FRS to DFSR conversion?
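
For reference, dfsrmig uses four stable global states, numbered 0 to 3, and 'Eliminated' is the last one. The sketch below just encodes that numbering; the `can_roll_back` helper is an illustrative simplification (an assumption for this example, not a dfsrmig feature) of the documented rule that migration cannot be rolled back once the Eliminated state is reached.

```python
# Stable global states as numbered by dfsrmig /setglobalstate
DFSRMIG_STATES = {
    0: "Start",
    1: "Prepared",
    2: "Redirected",
    3: "Eliminated",
}


def can_roll_back(current_state: int) -> bool:
    """Simplified rule: rollback is possible from any state before Eliminated."""
    if current_state not in DFSRMIG_STATES:
        raise ValueError(f"unknown dfsrmig state: {current_state}")
    return DFSRMIG_STATES[current_state] != "Eliminated"


for number, name in DFSRMIG_STATES.items():
    print(number, name, "rollback possible" if can_roll_back(number) else "final")
```

On that reading, a domain that already reports 'Eliminated' would not be taken back to state 1 (Prepared).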

Again, this can be the case even without higher-level errors while an underlying issue remains, possibly going back to the original OS config, any corruption since, etc. Clues would be in the logs, and endpoint connection errors are one such possibility. I'm only picking out the obvious checks or issues.

You could have a look at this; the prerequisites section is relevant:

https://www.experts-exchange.com/articles/11397/SYSVOL-Migration-to-DFSR.html

Keep in mind, if everything checks out, look lower down in the service hierarchy until you find it, then decide how to resolve it. Sorry I can't be more specific, I just can't get the detail on a small screen. Finally, if you can't resolve it and are out of time, it may be worth getting an AD engineer to pick it apart, then running the migration to 2019 again.
Avatar of ziceman

ASKER

Well - according to a Spiceworks topic - if it was a 2008 domain to begin with, it would have started with DFSR and not FRS. So you would not have a SYSVOL_DFSR folder, just SYSVOL, and it would also show the Eliminated state:


https://community.spiceworks.com/topic/2225009-frs-to-dfsr-issue-missing-folder


Just when I think the problem is identified, there is a new wrinkle. So, I am still not sure whether it is safe to proceed.

Going back to the original posts, there is evidence of FRS activity and LDAP issues in the DCdiag logs before and after migration. I would definitely want to eliminate the possible causes before proceeding with the migration, to avoid further complexity.

More info on the environment is needed to troubleshoot each error. If it traces back to FRS, ensure that it is no longer running, that there is no SYSVOL or GP corruption, that the FFL is raised to 2012, that there are no AD schema issues, etc.
Avatar of ziceman

ASKER

The SYSVOL and Netlogon are there on orig 2012 DC, but they were not showing up on newly promoted 2019 DC.


I used this method to make the folders appear - https://thesysadminchannel.com/solved-sysvol-and-netlogon-shares-missing-2016-2019-domain-controller/


Now the folders / shares exist on both the old and the new DC, and *most* of DCDIAG comes back OK.


The remaining problem seems to be with the new DC reading the pathing or hierarchy of the old SYSVOL folder.

We are getting several of these:


An error event occurred. EventID: 0x00000422

Time Generated: 10/16/2023 19:45:23

Event String:

The processing of Group Policy failed. Windows attempted to read the file \\redacted.local\sysvol\redacted.local\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}\gpt.ini from a domain controller and was not successful. Group Policy settings may not be applied until this event is resolved.


So, if I go to \\redacted.local\sysvol\redacted.local from the new DC, I see this: https://monosnap.com/file/5pruOUTqNFgMyhWBIF1q7dZDzTBymP


But it seems that all the GPOs are in:


C:\Windows\SYSVOL\sysvol\redacted.local\Policies


So why is it going to the wrong place?
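
One way to compare the two locations is to check which GPO folders under a given Policies directory actually contain a gpt.ini file, once against the local path and once against the domain path. This is a rough sketch only: the function name `gpos_missing_gpt_ini` is made up, and the demo runs against a throwaway folder with one hypothetical all-zero GUID rather than a real SYSVOL.

```python
import tempfile
from pathlib import Path


def gpos_missing_gpt_ini(policies_dir: Path) -> list[str]:
    """Return names of GPO folders under policies_dir that lack a gpt.ini."""
    missing = []
    for entry in sorted(policies_dir.iterdir()):
        if not entry.is_dir():
            continue
        # Compare case-insensitively: the file may appear as gpt.ini or GPT.INI
        names = {child.name.lower() for child in entry.iterdir()}
        if "gpt.ini" not in names:
            missing.append(entry.name)
    return missing


# Demo on a temporary folder standing in for a Policies directory
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    good = root / "{31B2F340-016D-11D2-945F-00C04FB984F9}"
    good.mkdir()
    (good / "GPT.INI").write_text("[General]\nVersion=1\n")
    # Hypothetical GUID, used only to show a folder without gpt.ini
    (root / "{00000000-0000-0000-0000-000000000000}").mkdir()
    print(gpos_missing_gpt_ini(root))
```

If the local Policies folder on one DC is complete while the same check against the domain path comes back with missing entries, that would point at SYSVOL replication rather than at the GPOs themselves.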


Also - I found this in the event log of the orig DC:


The DFS Replication service stopped replication on volume C:. This occurs when a DFSR JET database is not shut down cleanly and Auto Recovery is disabled. To resolve this issue, back up the files in the affected replicated folders, and then use the ResumeReplication WMI method to resume replication.


Does that mean the DFSR is simply broken right now and DCDIAG is not as relevant?

I think this may be the crux of the question: push ahead on 2019 or fall back.

If DFSR is misconfigured, AD has issues, and/or parts of GP will work but processing fails.

DCdiag originally hinted at an issue replicating GP, which may well sit at a lower level.

Now EV points to DFSR as the next layer to troubleshoot. Assuming FRS is not still running, you could troubleshoot the original config issue that presented in the 2012 GP error, or disregard the original DCdiag.

The way it's going, pushing ahead on 2019 without going back to the root of it could add complexity. You may be able to go back a step, fix the replication issue, then dive back into 2019.

I'd suggest trying to resume replication. If it works, fine; if not, go back to basics (e.g. how the 2012 server was set up, corruption, and permissions) before trying much more to push for a fix on 2019.

You might get lucky by carrying on, but the more it keeps coming back to this so far unknown root cause, the further you may be moving from a resolution. Or you may get a workaround for this and then other services may break.
Avatar of ziceman

ASKER

So should I demote the new 2019 DC again, and is that safe when they are talking to each other to some extent?


And is it safe to run the command  wmic /namespace:\\root\microsoftdfs path dfsrVolumeConfig where volumeGuid=" on the original production DC?

I would back up the system state and try to resume replication outside business hours, when no users are online, if possible.

As long as AD/DNS is in good order, there should not be any major change. If it works, that could be a case to go on; if not, I would still want to know more about the original 2012 DC issue.
Avatar of ziceman

ASKER

OK. I ran the wmic to recover the Jet db on the old 2012 DC, and it returned as successful. 


I also see these two items in the event log:


2218 - The DFS Replication service is in the second step of replication database consistency checks after an unexpected shutdown. The database will be rebuilt if it cannot be recovered. No user action is required. 


2214 - The DFS Replication service successfully recovered from an unexpected shutdown on volume C:.This can occur if the service terminated abnormally (due to a power loss, for example) or an error occurred on the volume. No user action is required. 

 

DCDIAG still shows the Group Policy replication errors, but this could take some time?

 

ASKER CERTIFIED SOLUTION
Avatar of Adam Bell
Adam Bell

Avatar of ziceman

ASKER

After a little while, the replication started working. 


Could the corrupted Jet DB have previously prevented the SYSVOL and Netlogon folders from being created on the new server?


Or would this fix still have been required - https://thesysadminchannel.com/solved-sysvol-and-netlogon-shares-missing-2016-2019-domain-controller/  ?

I think it could, and the notion of "corruption" could apply to both the Jet DB issue and the registry fix you posted last.

Without having full info on the environment, I would say both match issues I've resolved in similar ways, when dcdiag reported replication issues but the "mainstream" resolutions turned out not to apply.

Pleased you were able to resolve the initial questions, and thank you for the writeup...

Avatar of ziceman

ASKER

The next step would be to get rid of the Win 2012 DC. In this particular case, the organization is relatively small (about 30 users), so the previous IT company had them running with a single PDC. We can add another 2019 DC before demoting the Win2012 machine, as this would be best practice. Primarily, I just want to safely remove the old server.


Not sure if I should start a new topic for this?

Adding a second 2019 DC may be safest if they cannot manage downtime should the first DC fail.

Then make sure the master roles are on a 2019 DC and demote the 2012 DC.

Then, when all is well, consider raising the Forest Functional Level, step by step, if it's not 2012 right now.

Properly, this should be another question to maintain focus, but the key is that you include any significant EV/DCdiag issues in the post.