Solved

Half network can't authenticate/join Network

Posted on 2009-04-13
Last Modified: 2012-05-06
Win Srv 2003

I have a SERIOUS issue I can't track down.

Today half the people in the company couldn't join the network. It seems to be mostly the people who shut down their PCs over the weekend who can't sign in, but there are a few who did turn off their PCs and still signed in.

I initially tracked it to Certificate Services on my PDC. The service was crashing, and after hours it came down to reinstalling Certificate Services. Once it was reinstalled, dcdiag came up clean and the errors in the Event Viewer stopped appearing. After I rebooted, I ran dcdiag one more time and the SystemLog test failed. The PDC is also not replicating to the 2nd DC. I've seen errors pointing to DNS not resolving and something to do with the KDC.

KDC Error:
The currently selected KDC certificate was once valid, but now is invalid and no suitable replacement was found.  Smartcard logon may not function correctly if this problem is not remedied.  Have the system administrator check on the state of the domain's public key infrastructure.  The chain status is in the error data.

I was getting this error before reinstalling Certificate Services on the PDC (not after):
 Automatic certificate enrollment for local system failed to renew one Domain Controller certificate (0x800706ba).  The RPC server is unavailable

The PDC also controls the printers, and no print jobs from anyone are going through.

***This is the warning I get on the PDC regarding replication to the other DC:

This server is the owner of the following FSMO role, but does not consider it valid. For the partition which contains the FSMO, this server has not replicated successfully with any of its partners since this server has been restarted. Replication errors are preventing validation of this role.
 
Operations which require contacting a FSMO operation master will fail until this condition is corrected.
 
FSMO Role: CN=Schema,CN=Configuration,DC=DOMAINNAME,DC=WP
***

When running DCDIAG, these errors occur:
[Replications Check,GRANT] A recent replication attempt failed:
   From DC2 to GRANT
   Naming Context: DC=DomainDnsZones,DC=DOMAINNAME,DC=WP
   The replication generated an error (1908):
   Could not find the domain controller for this domain.
   The failure occurred at 2009-04-13 18:56:25.
   The last success occurred at 2009-04-13 18:50:52.
   1 failures have occurred since the last success.
   Kerberos Error.
   A KDC was not found to authenticate the call.
   Check that sufficient domain controllers are available.

 Starting test: kccevent
    An Warning Event occured.  EventID: 0x80250828
       Time Generated: 04/13/2009   18:56:02
       (Event String could not be retrieved)
    ......................... GRANT failed test kccevent



Running dcdiag /test:dns  results in:
Testing server: Default-First-Site-Name\GRANT
   Starting test: Connectivity
      The host 21ce9160-7378-4a3c-b3d1-b0713fdd3391._msdcs.DOMAINNAME.WP could not be resolved to an
      IP address.  Check the DNS server, DHCP, server name, etc
      Although the Guid DNS name (21ce9160-7378-4a3c-b3d1-b0713fdd3391._msdcs.DOMAINNAME.WP) couldn't be
      resolved, the server name (grant.DOMAINNAME.WP) resolved to the IP address (10.10.1.125) and was
      pingable.  Check that the IP address is registered correctly with the DNS server.
      ......................... GRANT failed test Connectivity
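
The GUID name in that output is the DC's alias (CNAME) record under _msdcs, which replication partners use to locate it. As a rough sketch only (Python, assuming a single-domain forest so the forest root is the same as the domain; the function name `expected_dc_records` is made up for illustration), these are the record names worth checking on the DNS server:

```python
def expected_dc_records(domain, dsa_guid):
    """Return (record_type, name) pairs a domain controller normally registers.

    Assumption: single-domain forest, so the GUID alias lives under the
    same DNS domain as the SRV records.
    """
    domain = domain.rstrip(".").lower()
    return [
        # Alias used by replication partners to find this DC
        ("CNAME", f"{dsa_guid.lower()}._msdcs.{domain}"),
        # Locator records clients use to find a DC and a KDC
        ("SRV", f"_ldap._tcp.dc._msdcs.{domain}"),
        ("SRV", f"_kerberos._tcp.dc._msdcs.{domain}"),
        # Locator record for the PDC emulator
        ("SRV", f"_ldap._tcp.pdc._msdcs.{domain}"),
    ]


if __name__ == "__main__":
    guid = "21ce9160-7378-4a3c-b3d1-b0713fdd3391"
    for record_type, name in expected_dc_records("DOMAINNAME.WP", guid):
        print(record_type, name)
```

Each name can then be queried by hand, for example with `nslookup -type=SRV _ldap._tcp.dc._msdcs.DOMAINNAME.WP`. If the records are missing, restarting the Netlogon service on the DC normally re-registers them.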

In AD Sites and Services, if I right-click on DC2 and check topology, I get the error "The RPC server is unavailable."


I can ping the 2nd DC by name.

I AM AT A POINT WHERE REMOVING THE SECOND DC IS AN OPTION, BUT I'M NOT SURE HOW! I can't have another day of half the company not being able to work. I don't understand why some PCs won't even find the domain.


What's odd is I have a laptop that works; however, if I plug it into another working port, it does not see the network anymore. Then I plug it back into the old port and it starts working again.

Odd thing #2 - I updated with all patches from MS. However, the .NET Framework 3.5 SP1 will not install. Halfway through it comes back with the error "Extraction Failed: File is Corrupt", but it's not: I can take the exact file and run it on other machines. I even tried redownloading. Every time, same thing.
Question by:MushroomStamp
11 Comments
 

Author Comment

by:MushroomStamp
ID: 24134742
I have since removed the 2nd DC via the Manage Server wizard.

 

Author Comment

by:MushroomStamp
ID: 24134802
After removing the 2nd DC, I get these errors on the PDC:

Certificate Services could not process request 16 due to an error: The revocation function was unable to check revocation because the revocation server was offline. 0x80092013 (-2146885613).  The request was for CN=Dyno1Aux.Sturman.WoodlandPark.  Additional information: Error Verifying Request Signature or Signing Certificate
 

Author Comment

by:MushroomStamp
ID: 24135064
Now going on hour #12 straight.. I'm tired..

If I remove a machine from the domain, I can not add it back. It says no domain controller found .. no dns..  WTF!!!! I am getting 0 errors in any log on the PDC and half the machines on the network are still working..
 
LVL 6

Expert Comment

by:meugen
ID: 24135675
How did you set up the DNS?
Where are the FSMO roles hosted?
Are you able to do a forward- or reverse-lookup for any host from your network and for your DC(s)?
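
One way to run that forward and reverse check from an affected workstation, as a minimal sketch in Python using only the standard `socket` module. The helper name `check_lookup` and its injectable `forward`/`reverse` parameters are illustrative assumptions, not part of any tool mentioned in this thread:

```python
import socket


def check_lookup(host, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve host to an IPv4 address, then resolve that address back to a name.

    Returns a dict with "ip" and "ptr" left as None for any step that fails.
    The resolver functions are parameters so they can be swapped out.
    """
    result = {"host": host, "ip": None, "ptr": None}
    try:
        # Forward lookup: name -> address
        result["ip"] = forward(host)
    except OSError:
        # socket.gaierror is a subclass of OSError
        return result
    try:
        # Reverse lookup: address -> (hostname, aliases, addresses)
        result["ptr"] = reverse(result["ip"])[0]
    except OSError:
        # socket.herror is a subclass of OSError; no PTR record found
        pass
    return result
```

Calling `check_lookup("grant.DOMAINNAME.WP")` on a machine that works and on one that does not, and comparing the two results, would show whether the failing machines cannot resolve the DC at all or only lack the reverse record.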
 

Author Comment

by:MushroomStamp
ID: 24137617
This was an inherited network from a predecessor. I assume the DNS was created at the time of the domain. FSMO roles are all hosted on the PDC.. I now only have the one DC.  Some machines are able to lookup.. others are not. There really is no significant difference between the ones that can and can't. I'm thinking it has something to do with certificates and ones that have timed out.  

Since I'm using DHCP and all computers get the same info, I'm totally baffled as to why some work and some don't. Although, NONE of them can print. (The print server is on the PDC.)

 

Accepted Solution

by:MushroomStamp
ID: 24137721
I am now seeing the following error after I ran netdiag /test:dns /fix

Event provider attempted to register query "select * from SnmpExtendedNotification" whose target class "SnmpExtendedNotification" does not exist.

Also, as employees are rolling in and logging in, this error appears for each user in the logs on the PDC.

Certificate Services could not process request 40 due to an error: The revocation function was unable to check revocation because the revocation server was offline. 0x80092013 (-2146885613).  The request was for CN=edrummon-lt.DOMAINNAME.WP.  Additional information: Error Verifying Request Signature or Signing Certificate
 
LVL 31

Expert Comment

by:Paranormastic
ID: 24137800
Run these, in order:
certutil -dcinfo deletebad
certutil -pulse
gpupdate /force
 

Author Comment

by:MushroomStamp
ID: 24137916
Ok.. did that.. it said it required a reboot for some of the policies, so I did. I see no difference. Is there a particular policy setting that might affect this? I reviewed the policies (4)... nothing jumps out at me.

 

Author Comment

by:MushroomStamp
ID: 24138244
Interesting note... There are 4 out of 15 printers that still work. The 11 that don't all show an offline status (no, they are not really offline). If people had one of the 4 working printers already added, they are able to print. However, I cannot add one of the working printers for anyone who doesn't already have it. It comes back asking for credentials. When I enter my username/password, it says there are existing credentials and asks if I want to replace them. However, when I try to replace them, it says the existing credentials cannot be overwritten.
 

Author Comment

by:MushroomStamp
ID: 24138678
UNBELIEVABLE.. I have found the problem.. it was two-fold. There were the issues listed above and a switch going bad. Thanks everyone!
 
LVL 31

Expert Comment

by:Paranormastic
ID: 24149118
"Is there a particular policy setting that might affect this?"
-- No, it is the autoenrollment that is advertised through AD; those commands are one way to pull it.

Glad you got it all straightened out!