Half network can't authenticate/join Network

Posted on 2009-04-13
Medium Priority
Last Modified: 2012-05-06
Win Srv 2003

I have a SERIOUS issue I can't track down.

Today half the people in the company couldn't join the network. It seems to be mostly the people who shut down their PCs over the weekend who can't sign in, but a few who did turn off their PCs still signed in.

I initially tracked it to Certificate Services on my PDC. The service was crashing, and after hours of work it came down to reinstalling Certificate Services. Once it was reinstalled, my dcdiag came up clean and the errors in Event Viewer stopped appearing. After I rebooted, I ran dcdiag one more time and the SystemLog test reported a failure. The PDC is also not replicating to the 2nd DC. I've seen errors pointing to DNS not being able to resolve and something about the KDC.

KDC Error:
The currently selected KDC certificate was once valid, but now is invalid and no suitable replacement was found.  Smartcard logon may not function correctly if this problem is not remedied.  Have the system administrator check on the state of the domain's public key infrastructure.  The chain status is in the error data.

I was getting this error before reinstalling Certificate Services on the PDC (not after):
 Automatic certificate enrollment for local system failed to renew one Domain Controller certificate (0x800706ba).  The RPC server is unavailable

The PDC also controls the printers, and no print jobs from anyone are going through.

This is the warning I get on the PDC regarding replication to the other DC:

This server is the owner of the following FSMO role, but does not consider it valid. For the partition which contains the FSMO, this server has not replicated successfully with any of its partners since this server has been restarted. Replication errors are preventing validation of this role.
Operations which require contacting a FSMO operation master will fail until this condition is corrected.
FSMO Role: CN=Schema,CN=Configuration,DC=DOMAINNAME,DC=WP

When running DCDIAG, these errors occur:
[Replications Check,GRANT] A recent replication attempt failed:
   From DC2 to GRANT
   Naming Context: DC=DomainDnsZones,DC=DOMAINNAME,DC=WP
   The replication generated an error (1908):
   Could not find the domain controller for this domain.
   The failure occurred at 2009-04-13 18:56:25.
   The last success occurred at 2009-04-13 18:50:52.
   1 failures have occurred since the last success.
   Kerberos Error.
   A KDC was not found to authenticate the call.
   Check that sufficient domain controllers are available.

 Starting test: kccevent
    An Warning Event occured.  EventID: 0x80250828
       Time Generated: 04/13/2009   18:56:02
       (Event String could not be retrieved)
    ......................... GRANT failed test kccevent

Running dcdiag /test:dns  results in:
Testing server: Default-First-Site-Name\GRANT
   Starting test: Connectivity
      The host 21ce9160-7378-4a3c-b3d1-b0713fdd3391._msdcs.DOMAINNAME.WP could not be resolved to an
      IP address.  Check the DNS server, DHCP, server name, etc
      Although the Guid DNS name (21ce9160-7378-4a3c-b3d1-b0713fdd3391._msdcs.DOMAINNAME.WP) couldn't be
      resolved, the server name (grant.DOMAINNAME.WP) resolved to the IP address ( and was
      pingable.  Check that the IP address is registered correctly with the DNS server.
      ......................... GRANT failed test Connectivity
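
When dcdiag output gets long, it can help to pull out just the summary lines. The following is a minimal sketch, assuming the output has been saved as text and that summary lines follow the dotted "SERVER failed test NAME" shape shown above; the function name `failed_tests` and the `sample` string are illustrative, not part of dcdiag.

```python
import re
from typing import List, Tuple

# Matches dcdiag summary lines such as:
# "......................... GRANT failed test kccevent"
RESULT_LINE = re.compile(r"^\s*\.{5,}\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def failed_tests(dcdiag_output: str) -> List[Tuple[str, str]]:
    """Return (server, test) pairs for every summary line reporting a failure."""
    failures = []
    for line in dcdiag_output.splitlines():
        match = RESULT_LINE.match(line)
        if match and match.group(2) == "failed":
            failures.append((match.group(1), match.group(3)))
    return failures


# Sample text trimmed from the output quoted in this question.
sample = """\
 Starting test: kccevent
    An Warning Event occured.  EventID: 0x80250828
    ......................... GRANT failed test kccevent
   Starting test: Connectivity
      ......................... GRANT failed test Connectivity
"""

print(failed_tests(sample))
```

For the two excerpts above this lists kccevent and Connectivity on GRANT, which is a quick way to see which tests to chase first.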

In AD Sites and Services, if I right-click on DC2 and check topology, I get the error "The RPC server is unavailable."

I can ping the 2nd DC by name.

I AM AT A POINT WHERE REMOVING THE SECOND DC IS AN OPTION, BUT I'M NOT SURE HOW! I can't have another day of half the company not being able to work. I don't understand why some PCs won't even find the domain.

What's odd is that I have a laptop that works; however, if I plug it into another working port, it no longer sees the network. Then I plug it back into the old port and it starts working again.

Odd thing #2: I updated with all patches from MS. However, the .NET Framework 3.5 SP1 will not install. Halfway through, it comes back with the error "Extraction Failed: File is Corrupt", but it's not. I can take the exact same file and run it on other machines. I even tried redownloading it; every time, same thing.
Question by:MushroomStamp

Author Comment

ID: 24134742
I have since removed the 2nd DC via the Manage Server wizard.


Author Comment

ID: 24134802
After removing the 2nd DC, I get these errors on the PDC:

Certificate Services could not process request 16 due to an error: The revocation function was unable to check revocation because the revocation server was offline. 0x80092013 (-2146885613).  The request was for CN=Dyno1Aux.Sturman.WoodlandPark.  Additional information: Error Verifying Request Signature or Signing Certificate
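
The error above shows the same status twice, once in hex (0x80092013) and once as a signed decimal (-2146885613). A minimal sketch of how the two relate, and of how an HRESULT splits into fields, follows; the helper names `to_signed32` and `decode_hresult` are made up for illustration.

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def decode_hresult(hresult: int) -> dict:
    """Split an HRESULT into its severity bit, facility, and code fields."""
    hresult &= 0xFFFFFFFF
    return {
        "severity": hresult >> 31,            # 1 means failure
        "facility": (hresult >> 16) & 0x7FF,  # facility field
        "code": hresult & 0xFFFF,             # low 16 bits
    }


# Hex value from the Certificate Services error quoted above.
print(to_signed32(0x80092013))     # -2146885613

# Hex value from the earlier autoenrollment error.
print(decode_hresult(0x800706BA))  # {'severity': 1, 'facility': 7, 'code': 1722}
```

Facility 7 is the Win32 facility, and Win32 error 1722 is "The RPC server is unavailable", which matches the text of the earlier autoenrollment error.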

Author Comment

ID: 24135064
Now going on hour #12 straight. I'm tired.

If I remove a machine from the domain, I cannot add it back. It says no domain controller found, no DNS. WTF!!!! I am getting 0 errors in any log on the PDC, and half the machines on the network are still working.

Expert Comment

ID: 24135675
How did you setup the DNS?
Where are the FSMO roles hosted?
Are you able to do a forward- or reverse-lookup for any host from your network and for your DC(s)?
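
One way to run the forward and reverse lookup check on a client is sketched below. This is only an illustration using Python's standard `socket` module; the function name `check_lookup` and its injectable `forward` and `reverse` parameters are assumptions added so the logic can be tried without a live DNS server.

```python
import socket


def check_lookup(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, then resolve that IP back to a name.

    Returns a dict with the forward result ("ip"), the reverse result
    ("ptr"), and an "error" string if either lookup failed.
    """
    result = {"host": hostname, "ip": None, "ptr": None, "error": None}
    try:
        result["ip"] = forward(hostname)
        # gethostbyaddr returns (primary_name, alias_list, address_list)
        result["ptr"] = reverse(result["ip"])[0]
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        result["error"] = str(exc)
    return result
```

Calling `check_lookup("grant.DOMAINNAME.WP")` on a working client and on a failing one, with the placeholder replaced by the real DC name, would show whether the difference is in name resolution at all.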

Author Comment

ID: 24137617
This was an inherited network from a predecessor. I assume DNS was set up when the domain was created. The FSMO roles are all hosted on the PDC, and I now have only the one DC. Some machines are able to do lookups; others are not. There really is no significant difference between the ones that can and the ones that can't. I'm thinking it has something to do with certificates, and ones that have timed out.

Since I'm using DHCP and all computers get the same info, I'm totally baffled as to why some work and some don't. However, NONE of them can print (the print server is on the PDC).

Accepted Solution

MushroomStamp earned 0 total points
ID: 24137721
I am now seeing the following error after I ran netdiag /test:dns /fix:

Event provider attempted to register query "select * from SnmpExtendedNotification" whose target class "SnmpExtendedNotification" does not exist.

Also, as employees are rolling in and logging in, this error appears for each user in the logs on the PDC:

Certificate Services could not process request 40 due to an error: The revocation function was unable to check revocation because the revocation server was offline. 0x80092013 (-2146885613).  The request was for CN=edrummon-lt.DOMAINNAME.WP.  Additional information: Error Verifying Request Signature or Signing Certificate

Expert Comment

ID: 24137800
Run these, in order:
certutil -dcinfo deletebad
certutil -pulse
gpupdate /force

Author Comment

ID: 24137916
OK, did that. It said a reboot was required for some of the policies, so I rebooted. I see no difference. Is there a particular policy setting that might affect this? I reviewed the policies (4) and nothing jumps out at me.


Author Comment

ID: 24138244
Interesting note: 4 out of 15 printers still work. The 11 that don't all show an offline status (no, they are not really offline). If people already had one of the 4 working printers added, they are able to print. However, I cannot add one of the working printers for anyone who doesn't already have it. It comes back asking for credentials. When I enter my username/password, it says there are existing credentials and asks if I want to replace them. However, when I try to replace them, it says the existing credentials cannot be overwritten.

Author Comment

ID: 24138678
UNBELIEVABLE. I have found the problem, and it was twofold: there were the issues listed above, plus a switch going bad. Thanks, everyone!

Expert Comment

ID: 24149118
"Is there a particular policy setting that might affect this?"
-- No, it is the autoenrollment that is advertised through AD; this is one way to pull it.

Glad you got it all straightened out!
