Windows Server 2003
I have a SERIOUS issue I can't track down.
Today half the people in the company couldn't log on to the network. It seems to be mostly the people who shut down their PCs over the weekend who can't sign in, but a few who did turn off their PCs still signed in.
I initially tracked it to Certificate Services on my PDC. The service was crashing, and after hours of troubleshooting it came down to reinstalling Certificate Services. Once it was reinstalled, dcdiag came up clean and the errors in Event Viewer stopped appearing. After I rebooted, I ran dcdiag one more time and the SystemLog test failed. The PDC is also not replicating to the second DC. I've seen errors saying that DNS can't resolve names, and something about the KDC:
The currently selected KDC certificate was once valid, but now is invalid and no suitable replacement was found. Smartcard logon may not function correctly if this problem is not remedied. Have the system administrator check on the state of the domain's public key infrastructure. The chain status is in the error data.
I was getting the following error before reinstalling Certificate Services on the PDC (but not after):
Automatic certificate enrollment for local system failed to renew one Domain Controller certificate (0x800706ba). The RPC server is unavailable
The PDC also controls the printers, and no print jobs from anyone are going through.
***This is the warning I get on the PDC regarding replication to the other DC:
This server is the owner of the following FSMO role, but does not consider it valid. For the partition which contains the FSMO, this server has not replicated successfully with any of its partners since this server has been restarted. Replication errors are preventing validation of this role.
Operations which require contacting a FSMO operation master will fail until this condition is corrected.
FSMO Role: CN=Schema,CN=Configuration,DC=DOMAINNAME,DC=WP
When I run dcdiag, these errors occur:
[Replications Check,GRANT] A recent replication attempt failed:
From DC2 to GRANT
Naming Context: DC=DomainDnsZones,DC=DOMAINNAME,DC=WP
The replication generated an error (1908):
Could not find the domain controller for this domain.
The failure occurred at 2009-04-13 18:56:25.
The last success occurred at 2009-04-13 18:50:52.
1 failures have occurred since the last success.
A KDC was not found to authenticate the call.
Check that sufficient domain controllers are available.
Starting test: kccevent
An Warning Event occured. EventID: 0x80250828
Time Generated: 04/13/2009 18:56:02
(Event String could not be retrieved)
......................... GRANT failed test kccevent
Running dcdiag /test:dns results in:
Testing server: Default-First-Site-Name\GRANT
Starting test: Connectivity
The host 21ce9160-7378-4a3c-b3d1-b0713fdd3391._msdcs.DOMAINNAME.WP could not be resolved to an
IP address. Check the DNS server, DHCP, server name, etc
Although the Guid DNS name (21ce9160-7378-4a3c-b3d1-b0713fdd3391._msdcs.DOMAINNAME.WP) couldn't be
resolved, the server name (grant.DOMAINNAME.WP) resolved to the IP address (10.10.1.125) and was
pingable. Check that the IP address is registered correctly with the DNS server.
......................... GRANT failed test Connectivity
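To cross-check the DNS result above outside of dcdiag, here is a minimal Python sketch that tries to resolve both the GUID-based `_msdcs` name and the plain server name. It assumes Python is available on a machine that uses the same DNS server as the domain members; the helper names `dsa_guid_name` and `check_names` are made up for illustration, and the GUID and host name are simply the ones from the output above.

```python
import socket

def dsa_guid_name(dsa_guid, forest_dns_name):
    """Build the GUID-based alias that dcdiag reported as unresolvable."""
    return f"{dsa_guid}._msdcs.{forest_dns_name}"

def check_names(names, resolve=socket.gethostbyname):
    """Return {name: ip_or_None}; None means the name did not resolve.

    `resolve` is injectable so the logic can be exercised without a live DNS server.
    """
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[name] = None
    return results

if __name__ == "__main__":
    # Values copied from the dcdiag output; substitute the real domain name.
    guid = "21ce9160-7378-4a3c-b3d1-b0713fdd3391"
    domain = "DOMAINNAME.WP"
    names = [dsa_guid_name(guid, domain), f"grant.{domain}"]
    for name, ip in check_names(names).items():
        print(f"{name}: {ip or 'NOT RESOLVED'}")
```

If the GUID name fails here as well while the host name resolves, that would be consistent with the dcdiag message about the `_msdcs` record not being registered, rather than with a general network outage.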
In AD Sites and Services, if I right-click on DC2 and check the topology, I get the error "The RPC server is unavailable."
I can ping the second DC by name.
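A successful ping only shows that ICMP gets through; it does not show that the RPC endpoint mapper on TCP port 135 is reachable. As a rough check, the sketch below attempts a plain TCP connection to that port. The function name `tcp_port_open` is hypothetical, `DC2` is the host name used above, and an open port 135 is necessary but not sufficient for RPC to work, since RPC also uses dynamically assigned ports.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False

if __name__ == "__main__":
    # 135 is the RPC endpoint mapper port; "DC2" is the second DC from this post.
    host = "DC2"
    state = "open" if tcp_port_open(host, 135) else "NOT reachable"
    print(f"{host}:135 is {state}")
```

Running it from the PDC against DC2, and from DC2 against the PDC, would show whether the failure is one-directional.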
I AM AT A POINT WHERE REMOVING THE SECOND DC IS AN OPTION, BUT I'M NOT SURE HOW! I can't have another day of half the company not being able to work. I don't understand why some PCs won't even find the domain.
What's odd is that I have a laptop that works; however, if I plug it into another working port, it no longer sees the network. When I plug it back into the old port, it starts working again.
Odd thing #2: I updated the server with all patches from MS. However, the .NET Framework 3.5 SP1 will not install. Halfway through, it comes back with the error "Extraction Failed: File is Corrupt", but the file is not corrupt: I can take the exact same file and run it on other machines. I even tried redownloading it, and every time it's the same thing.