Solved

PDC is down

Posted on 2011-02-16
Last Modified: 2012-05-11
Hello,

We run an environment with about 200 users and one domain controller. We used to have a secondary controller, but it was removed some months ago. We started seeing problems about four days ago, coinciding with an Exchange database corruption event. First, Active Directory clients would fail to connect to the domain controller, and subsequently our external email, POP, and IMAP would go down. In each instance I saw SAM event IDs 16645 and 16651 just before this occurred.

We did some troubleshooting and found that the problem was triggered when we connected to the server via RDP. Avoiding RDP increased the time between occurrences, but the problem did not go away. Restarting the Netlogon service was also a quick fix, and we set up a batch file to do this every two hours. However, the problem still occurs about every eight hours, and we then have to restart the server and the mail services to get the system running again.

Naturally this causes some consternation, as we have been unable to find the root cause. When I look at the operations masters, the PDC tab gives me an error. Below is my latest dcdiag dump. Any assistance would be highly appreciated.


Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests
   
   Testing server: Default-First-Site-Name\DC
      Starting test: Connectivity
         ......................... DC passed test Connectivity

Doing primary tests
   
   Testing server: Default-First-Site-Name\DC
      Starting test: Replications
         [Replications Check,DC] A recent replication attempt failed:
            From BAANDEV to DC
            Naming Context: DC=ForestDnsZones,DC=etc,DC=local
            The replication generated an error (1256):
            The remote system is not available. For information about network troubleshooting, see Windows Help.
            The failure occurred at 2011-02-16 13:48:36.
            The last success occurred at 2009-11-07 13:55:52.
            11157 failures have occurred since the last success.
         [BAANDEV] DsBindWithSpnEx() failed with error 1753,
         There are no more endpoints available from the endpoint mapper..
         [Replications Check,DC] A recent replication attempt failed:
            From ADC to DC
            Naming Context: DC=ForestDnsZones,DC=etc,DC=local
            The replication generated an error (1256):
            The remote system is not available. For information about network troubleshooting, see Windows Help.
            The failure occurred at 2011-02-16 13:48:43.
            The last success occurred at 2008-10-22 22:53:27.
            20143 failures have occurred since the last success.
         [ADC] DsBindWithSpnEx() failed with error 1722,
         The RPC server is unavailable..
         [Replications Check,DC] A recent replication attempt failed:
            From BAANDEV to DC
            Naming Context: DC=DomainDnsZones,DC=etc,DC=local
            The replication generated an error (1256):
            The remote system is not available. For information about network troubleshooting, see Windows Help.
            The failure occurred at 2011-02-16 13:48:36.
            The last success occurred at 2009-11-07 13:55:52.
            11157 failures have occurred since the last success.
         [Replications Check,DC] A recent replication attempt failed:
            From ADC to DC
            Naming Context: DC=DomainDnsZones,DC=etc,DC=local
            The replication generated an error (1256):
            The remote system is not available. For information about network troubleshooting, see Windows Help.
            The failure occurred at 2011-02-16 13:48:43.
            The last success occurred at 2008-10-22 22:53:27.
            20143 failures have occurred since the last success.
         [Replications Check,DC] A recent replication attempt failed:
            From ADC to DC
            Naming Context: CN=Schema,CN=Configuration,DC=etc,DC=local
            The replication generated an error (8524):
            The DSA operation is unable to proceed because of a DNS lookup failure.
            The failure occurred at 2011-02-16 13:48:54.
            The last success occurred at 2008-10-22 22:53:27.
            20143 failures have occurred since the last success.
            The guid-based DNS name bd34eb2e-09b6-46fb-b64a-0aa9a56a6c48._msdcs.etc.local
            is not registered on one or more DNS servers.
         [Replications Check,DC] A recent replication attempt failed:
            From BAANDEV to DC
            Naming Context: CN=Schema,CN=Configuration,DC=etc,DC=local
            The replication generated an error (1753):
            There are no more endpoints available from the endpoint mapper.
            The failure occurred at 2011-02-16 13:48:54.
            The last success occurred at 2009-11-07 13:55:52.
            11157 failures have occurred since the last success.
            The directory on BAANDEV is in the process.
            of starting up or shutting down, and is not available.
            Verify machine is not hung during boot.
         [Replications Check,DC] A recent replication attempt failed:
            From ADC to DC
            Naming Context: CN=Configuration,DC=etc,DC=local
            The replication generated an error (8524):
            The DSA operation is unable to proceed because of a DNS lookup failure.
            The failure occurred at 2011-02-16 13:48:43.
            The last success occurred at 2008-10-22 23:43:54.
            20143 failures have occurred since the last success.
            The guid-based DNS name bd34eb2e-09b6-46fb-b64a-0aa9a56a6c48._msdcs.etc.local
            is not registered on one or more DNS servers.
         [Replications Check,DC] A recent replication attempt failed:
            From BAANDEV to DC
            Naming Context: CN=Configuration,DC=etc,DC=local
            The replication generated an error (1753):
            There are no more endpoints available from the endpoint mapper.
            The failure occurred at 2011-02-16 13:48:43.
            The last success occurred at 2009-11-07 13:55:41.
            11157 failures have occurred since the last success.
            The directory on BAANDEV is in the process.
            of starting up or shutting down, and is not available.
            Verify machine is not hung during boot.
         [Replications Check,DC] A recent replication attempt failed:
            From BAANDEV to DC
            Naming Context: DC=etc,DC=local
            The replication generated an error (1753):
            There are no more endpoints available from the endpoint mapper.
            The failure occurred at 2011-02-16 13:48:36.
            The last success occurred at 2009-11-07 13:55:52.
            11156 failures have occurred since the last success.
            The directory on BAANDEV is in the process.
            of starting up or shutting down, and is not available.
            Verify machine is not hung during boot.
         [Replications Check,DC] A recent replication attempt failed:
            From ADC to DC
            Naming Context: DC=etc,DC=local
            The replication generated an error (8524):
            The DSA operation is unable to proceed because of a DNS lookup failure.
            The failure occurred at 2011-02-16 13:49:01.
            The last success occurred at 2008-10-22 23:51:57.
            20143 failures have occurred since the last success.
            The guid-based DNS name bd34eb2e-09b6-46fb-b64a-0aa9a56a6c48._msdcs.etc.local
            is not registered on one or more DNS servers.
         REPLICATION-RECEIVED LATENCY WARNING
         DC:  Current time is 2011-02-16 14:22:40.
            DC=ForestDnsZones,DC=etc,DC=local
               Last replication recieved from BAANDEV at 2009-11-07 14:05:56.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
               Last replication recieved from ADC at 2008-10-22 22:53:27.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
            DC=DomainDnsZones,DC=etc,DC=local
               Last replication recieved from BAANDEV at 2009-11-07 14:05:56.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
               Last replication recieved from ADC at 2008-10-22 22:53:27.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
            CN=Schema,CN=Configuration,DC=etc,DC=local
               Last replication recieved from BAANDEV at 2009-11-07 14:05:56.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
               Last replication recieved from ADC at 2008-10-22 22:53:27.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
            CN=Configuration,DC=etc,DC=local
               Last replication recieved from BAANDEV at 2009-11-07 14:05:45.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
               Last replication recieved from ADC at 2008-10-22 23:43:54.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
            DC=etc,DC=local
               Last replication recieved from BAANDEV at 2009-11-07 14:05:56.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
               Last replication recieved from ADC at 2008-10-22 23:51:57.
               WARNING:  This latency is over the Tombstone Lifetime of 180 days!
         ......................... DC passed test Replications
      Starting test: NCSecDesc
         ......................... DC passed test NCSecDesc
      Starting test: NetLogons
         ......................... DC passed test NetLogons
      Starting test: Advertising
         Warning: DC is not advertising as a time server.
         ......................... DC failed test Advertising
      Starting test: KnowsOfRoleHolders
         Warning: ADC is the PDC Owner, but is not responding to DS RPC Bind.
         [ADC] LDAP search failed with error 58,
         The specified server cannot perform the requested operation..
         Warning: ADC is the PDC Owner, but is not responding to LDAP Bind.
         ......................... DC failed test KnowsOfRoleHolders
      Starting test: RidManager
         ......................... DC passed test RidManager
      Starting test: MachineAccount
         ......................... DC passed test MachineAccount
      Starting test: Services
         ......................... DC passed test Services
      Starting test: ObjectsReplicated
         ......................... DC passed test ObjectsReplicated
      Starting test: frssysvol
         ......................... DC passed test frssysvol
      Starting test: frsevent
         ......................... DC passed test frsevent
      Starting test: kccevent
         ......................... DC passed test kccevent
      Starting test: systemlog
         An Error Event occured.  EventID: 0x40000005
            Time Generated: 02/16/2011   13:29:57
            Event String: The kerberos client received a KRB_AP_ERR_TKT_NYV

         An Error Event occured.  EventID: 0x0000165B
            Time Generated: 02/16/2011   14:18:26
            Event String: The session setup from computer 'VERAMANI' failed

         An Error Event occured.  EventID: 0x000016AD
            Time Generated: 02/16/2011   14:20:27
            Event String: The session setup from the computer VERAMANI

         ......................... DC failed test systemlog
      Starting test: VerifyReferences
         ......................... DC passed test VerifyReferences
   
   Running partition tests on : ForestDnsZones
      Starting test: CrossRefValidation
         ......................... ForestDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... ForestDnsZones passed test CheckSDRefDom
   
   Running partition tests on : DomainDnsZones
      Starting test: CrossRefValidation
         ......................... DomainDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... DomainDnsZones passed test CheckSDRefDom
   
   Running partition tests on : Schema
      Starting test: CrossRefValidation
         ......................... Schema passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Schema passed test CheckSDRefDom
   
   Running partition tests on : Configuration
      Starting test: CrossRefValidation
         ......................... Configuration passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Configuration passed test CheckSDRefDom
   
   Running partition tests on : etc
      Starting test: CrossRefValidation
         ......................... etc passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... etc passed test CheckSDRefDom
   
   Running enterprise tests on : etc.local
      Starting test: Intersite
         ......................... etc.local passed test Intersite
      Starting test: FsmoCheck
         Warning: DcGetDcName(PDC_REQUIRED) call failed, error 1355
         A Primary Domain Controller could not be located.
         The server holding the PDC role is down.
         Warning: DcGetDcName(TIME_SERVER) call failed, error 1355
         A Time Server could not be located.
         The server holding the PDC role is down.
         Warning: DcGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 1355
         A Good Time Server could not be located.
         ......................... etc.local failed test FsmoCheck
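
The latency warnings in the Replications test can be checked by hand. The following is a minimal Python sketch that uses only the timestamps printed in the dump above for the DC=etc,DC=local partition and the 180-day tombstone lifetime that dcdiag reports; the variable names are illustrative and not part of dcdiag.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Values copied from the dcdiag output above (DC=etc,DC=local partition).
now = datetime.strptime("2011-02-16 14:22:40", FMT)
tombstone_lifetime_days = 180  # as stated in the latency warning

last_success = {
    "BAANDEV": datetime.strptime("2009-11-07 14:05:56", FMT),
    "ADC": datetime.strptime("2008-10-22 23:51:57", FMT),
}

# Whole days elapsed since each partner last replicated successfully.
days_since = {name: (now - ts).days for name, ts in last_success.items()}

for name, days in days_since.items():
    status = "exceeds" if days > tombstone_lifetime_days else "is within"
    print(f"{name}: {days} days since last replication, "
          f"{status} the {tombstone_lifetime_days}-day tombstone lifetime")
```

Both partners come out well past 180 days, which suggests they are stale references to machines that no longer replicate rather than a temporary network fault.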
Question by:amit_krishnan
12 Comments
 
LVL 2

Expert Comment

by:storkyIV
ID: 34905348
Hello,

So the log states that the DC is trying to replicate. As you've said you no longer have a secondary DC, I would turn this off for starters!
 
LVL 11

Expert Comment

by:Snibborg
ID: 34905390
When you shut down the other DC, are you certain it was not the first DC in the domain? Did you check where the FSMO roles were located to confirm this?

Snibborg
 
LVL 2

Expert Comment

by:danny1875
ID: 34905403
From what I can see, it seems as if the secondary DC was holding the PDC Emulator FSMO role. If the secondary DC is down and will not be reintroduced into the organisation, you could seize that role on your current working DC. Before you do anything, you should check where the roles are currently held and take things from there.
 
LVL 11

Expert Comment

by:Snibborg
ID: 34905454
Here are some articles to assist you in this:

http://support.microsoft.com/kb/255690
http://support.microsoft.com/kb/324801

Snibborg
 

Author Comment

by:amit_krishnan
ID: 34905537
Thank you for your quick replies. The secondary domain controller was removed about 8 months ago, before I joined the organization, and I am unsure what procedure was followed when this change was made. However, we experienced no problems until the 9th of February, when all this started happening.

@snibborg, I appreciate the links. However, in the graphical interface everything checks out except the PDC emulator tab, which shows an error and will not allow me to change the master because it says the current role holder is offline.

I think there are multiple issues in our setup; however, the current problems started only recently, while the replication errors have been occurring for some time. The main thing worrying me is this line from the final FSMO test in dcdiag:

      Starting test: FsmoCheck
         Warning: DcGetDcName(PDC_REQUIRED) call failed, error 1355
         A Primary Domain Controller could not be located.

I am pretty new at this so all help is highly appreciated. Thanks.
 
LVL 2

Assisted Solution

by:danny1875
danny1875 earned 1000 total points
ID: 34905580
Amit,

Check where the DC thinks the PDC Emulator role is held. You can do this via ntdsutil. As I mentioned in my previous post, you may well need to seize that role on your working domain controller.
 
LVL 27

Accepted Solution

by:KenMcF
KenMcF earned 1000 total points
ID: 34905640
You are going to have to run a metadata cleanup for the old DC that was removed and seize the FSMO roles. It looks like the DC was not removed properly. Here are links to the steps to run both.


http://support.microsoft.com/kb/216498
http://support.microsoft.com/kb/216498
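
For reference, ntdsutil accepts its menu commands as command-line arguments, so the seizure of the PDC emulator role can be written as a single line. The Python sketch below only builds and prints that command line and does not run it. The server name `DC` is taken from the dcdiag output earlier in the thread, and the helper names `build_seize_pdc_command` and `as_command_line` are hypothetical. A seized role should not be handed back to the old holder, so treat this as an illustration of the command sequence, not a script to run unattended.

```python
from typing import List


def build_seize_pdc_command(target_dc: str) -> List[str]:
    """Return the ntdsutil argument list that seizes the PDC emulator role.

    target_dc is the surviving domain controller that should take the role.
    """
    return [
        "ntdsutil",
        "roles",                            # enter the FSMO maintenance menu
        "connections",                      # enter the server connections menu
        f"connect to server {target_dc}",   # bind to the surviving DC
        "quit",                             # back to the FSMO maintenance menu
        "seize pdc",                        # seize the PDC emulator role
        "quit",                             # leave FSMO maintenance
        "quit",                             # leave ntdsutil
    ]


def as_command_line(args: List[str]) -> str:
    """Quote multi-word menu commands so each stays a single argument."""
    return " ".join(f'"{a}"' if " " in a else a for a in args)


# Dry run: show the command instead of executing it.
command = build_seize_pdc_command("DC")
print(as_command_line(command))
```

Printing the command first gives a chance to review it, and to confirm the current role holders, before anything is changed on the domain controller.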
 

Author Comment

by:amit_krishnan
ID: 34905898
Thank you for all the advice. Please find below the latest dcdiag dump. It looks a lot cleaner, but I am getting a time server error, which I am working to fix now. I guess it is a matter of waiting to see if the problem recurs.

Domain Controller Diagnosis

Performing initial setup:
   Done gathering initial info.

Doing initial required tests
   
   Testing server: Default-First-Site-Name\DC
      Starting test: Connectivity
         ......................... DC passed test Connectivity

Doing primary tests
   
   Testing server: Default-First-Site-Name\DC
      Starting test: Replications
         ......................... DC passed test Replications
      Starting test: NCSecDesc
         ......................... DC passed test NCSecDesc
      Starting test: NetLogons
         ......................... DC passed test NetLogons
      Starting test: Advertising
         Warning: DC is not advertising as a time server.
         ......................... DC failed test Advertising
      Starting test: KnowsOfRoleHolders
         ......................... DC passed test KnowsOfRoleHolders
      Starting test: RidManager
         ......................... DC passed test RidManager
      Starting test: MachineAccount
         ......................... DC passed test MachineAccount
      Starting test: Services
         ......................... DC passed test Services
      Starting test: ObjectsReplicated
         ......................... DC passed test ObjectsReplicated
      Starting test: frssysvol
         ......................... DC passed test frssysvol
      Starting test: frsevent
         ......................... DC passed test frsevent
      Starting test: kccevent
         An Warning Event occured.  EventID: 0x8000072D
            Time Generated: 02/16/2011   15:51:45
            (Event String could not be retrieved)
         ......................... DC failed test kccevent
      Starting test: systemlog
         An Error Event occured.  EventID: 0xC0001B77
            Time Generated: 02/16/2011   15:51:45
            (Event String could not be retrieved)
         An Error Event occured.  EventID: 0x00004105
            Time Generated: 02/16/2011   16:02:16
            Event String: The maximum account identifier allocated to this

         ......................... DC failed test systemlog
      Starting test: VerifyReferences
         ......................... DC passed test VerifyReferences
   
   Running partition tests on : ForestDnsZones
      Starting test: CrossRefValidation
         ......................... ForestDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... ForestDnsZones passed test CheckSDRefDom
   
   Running partition tests on : DomainDnsZones
      Starting test: CrossRefValidation
         ......................... DomainDnsZones passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... DomainDnsZones passed test CheckSDRefDom
   
   Running partition tests on : Schema
      Starting test: CrossRefValidation
         ......................... Schema passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Schema passed test CheckSDRefDom
   
   Running partition tests on : Configuration
      Starting test: CrossRefValidation
         ......................... Configuration passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... Configuration passed test CheckSDRefDom
   
   Running partition tests on : etc
      Starting test: CrossRefValidation
         ......................... etc passed test CrossRefValidation
      Starting test: CheckSDRefDom
         ......................... etc passed test CheckSDRefDom
   
   Running enterprise tests on : etc.local
      Starting test: Intersite
         ......................... etc.local passed test Intersite
      Starting test: FsmoCheck
         Warning: DcGetDcName(TIME_SERVER) call failed, error 1355
         A Time Server could not be located.
         The server holding the PDC role is down.
         Warning: DcGetDcName(GOOD_TIME_SERVER_PREFERRED) call failed, error 1355
         A Good Time Server could not be located.
         ......................... etc.local failed test FsmoCheck
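
When comparing successive dcdiag runs like the two in this thread, it can help to reduce each dump to its list of failed tests. Below is a minimal Python sketch, assuming the summary lines keep the `<name> passed test <test>` / `<name> failed test <test>` form shown above; the function name `failed_tests` is illustrative, and the sample text is an excerpt of the second dump.

```python
import re
from typing import List, Tuple

# Matches dcdiag summary lines such as:
#   ......................... DC failed test Advertising
SUMMARY = re.compile(
    r"^[ \t]*\.{3,}[ \t]+(\S+)[ \t]+(passed|failed) test (\S+)",
    re.MULTILINE,
)


def failed_tests(dcdiag_output: str) -> List[Tuple[str, str]]:
    """Return (target, test name) pairs for every failed test in the dump."""
    return [
        (m.group(1), m.group(3))
        for m in SUMMARY.finditer(dcdiag_output)
        if m.group(2) == "failed"
    ]


sample = """\
      Starting test: Replications
         ......................... DC passed test Replications
      Starting test: Advertising
         Warning: DC is not advertising as a time server.
         ......................... DC failed test Advertising
      Starting test: kccevent
         ......................... DC failed test kccevent
      Starting test: systemlog
         ......................... DC failed test systemlog
      Starting test: FsmoCheck
         ......................... etc.local failed test FsmoCheck
"""

for target, test in failed_tests(sample):
    print(f"{target}: {test}")
```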
 
LVL 2

Expert Comment

by:danny1875
ID: 34905989
Hi Amit,

You definitely need to get that PDC Emulator role seized onto your working DC; that will be the reason for the time server error.
 

Author Comment

by:amit_krishnan
ID: 34906209
I have followed the procedure to clean up the metadata, seized the PDC emulator role onto the DC, and restarted the time service, after which all tests except systemlog pass. All seems to be working fine now. I am able to create new users without hanging the system. Thank you all for the assistance.

Will monitor for some time and try to resolve the systemlog issue as well. Any idea if this indicates a critical problem?

     Starting test: systemlog
         An Error Event occured.  EventID: 0xC0001B77
            Time Generated: 02/16/2011   15:51:45
            (Event String could not be retrieved)
         An Error Event occured.  EventID: 0x00004105
            Time Generated: 02/16/2011   16:02:16
            Event String: The maximum account identifier allocated to this

         ......................... DC failed test systemlog
 
LVL 2

Expert Comment

by:danny1875
ID: 34906271
Amit,

Basically, what that is saying is that you have errors in your event logs. Try saving them and clearing them. Re-run dcdiag and you should be all set.
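
Before clearing the logs, it may be worth noting which events dcdiag is reporting. dcdiag prints event IDs as 32-bit hexadecimal values whose upper bits carry severity and facility flags; the decimal ID shown in Event Viewer is the low 16 bits. A small Python sketch that converts the IDs quoted in this thread (the function name `event_code` is illustrative):

```python
def event_code(raw_id: str) -> int:
    """Return the decimal event ID held in the low 16 bits of a dcdiag EventID."""
    return int(raw_id, 16) & 0xFFFF


# EventID values copied from the dcdiag dumps in this thread.
reported = [
    "0x40000005",
    "0x0000165B",
    "0x000016AD",
    "0x8000072D",
    "0xC0001B77",
    "0x00004105",
]

for raw in reported:
    print(f"{raw} -> event ID {event_code(raw)}")
```

Note that `0x00004105` decodes to 16645, the same SAM event ID mentioned in the original question, so that entry may relate to the original problem and not just be leftover log noise.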
 

Author Comment

by:amit_krishnan
ID: 34906463
All seems to be clean and working now. Very much appreciate the assistance. Thanks.