craig

AD Problems - Time Sync, Repl errors, Domain Membership Issues

I had this question after viewing Windows 2012 DC replication issue.

-I recently took over a new customer. They have a single DC on Server 2012 R2. Joining a Windows client to the domain seems normal, but after a few hours to a few days the client can no longer authenticate as a domain member, even with cached information. (Sorry, I don't have the specific error message.)

-The client still looks like a domain member, but in Local Users and Groups the computer appears to have fallen off the domain: only the SIDs are shown instead of domain usernames\groups.

-The server event log looks like a disaster. It appears that the server was originally named Temp and then renamed DC2. However, DNS and AD still point at Temp.

Initial Errors:  
1925 The attempt to establish a replication link for the following writable directory partition failed.
4 The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server dc2$.
47 Time Provider NtpClient: No valid response has been received from manually configured peer pool.ntp.org after 8 attempts to contact it.

Ran Repadmin; results attached (Repadmin.rtf).
replsummary output attached (Repadmin-replsummary.rtf).

The server CoastTemp does not exist.  However, the IP is listed in DNS.

Your expertise is greatly appreciated. I am thinking a domain rebuild is in order, but there are 40 clients and the customer will not want to pay for a rebuild because they don't understand that they have problems. Some of you will tell me to drop the customer :)

Best,
Craig
noci

Task one: show the customer the problem.
I.e., link the events they experienced to specific log messages, then show other log messages that might predict similar problems with other workstations.

If that doesn't convince them, then advise doing nothing (there is no problem...) until there is a problem.
Try to make your predictions as accurate as possible, based on the evidence you can find.

I can't say whether this is a customer you want to spend your time on; OTOH there are customers that want the kingdom for a penny.
Avatar of craig

ASKER

Noci-
Other than a rebuild, do you have any ideas on repairing AD?

Thanks,
Craig
No, I have no experience with Windows systems, so I have no advice on that part of the issue.
The customer needs to find the value in your service, with you getting rewarded for your effort; both parties need to be happy about it.

Based on what you told me, I can give some advice...
All modern systems need a consistent time source (especially when timing is part of authentication and authorisation).
AD uses Kerberos, and Kerberos has requirements about consistent timing, so getting that straightened out is a priority.
Most effective: sync the DC to a stable clock (SNTP/NTP based, either through hardware (GSM) or through a time service on the internet; check http://support.ntp.org/bin/view/Main/WebHome).
Then let all workstations sync with the DC. That should get a major disruptor out of the way.
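
To make the time-sync point concrete, here is a minimal Python sketch (not something posted in the thread) that sends one SNTP request and computes the local clock offset with the standard NTP formula. The server name pool.ntp.org is the peer from the event 47 message above. The function names are made up for illustration, and the 300-second threshold assumes the default Active Directory Kerberos clock-skew tolerance of 5 minutes, which a domain policy can change.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800      # seconds between 1900-01-01 and 1970-01-01
KERBEROS_MAX_SKEW_SECONDS = 300    # assumption: AD default tolerance of 5 minutes


def ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix time."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(packet, t_sent, t_received):
    """Return the offset in seconds (positive means the local clock is behind).

    Uses the standard formula ((T2 - T1) + (T3 - T4)) / 2, where T2/T3 are the
    server's receive/transmit timestamps from the reply packet.
    """
    if len(packet) < 48:
        raise ValueError("NTP reply too short")
    words = struct.unpack("!12I", packet[:48])
    t2 = ntp_to_unix(words[8], words[9])     # server receive timestamp
    t3 = ntp_to_unix(words[10], words[11])   # server transmit timestamp
    return ((t2 - t_sent) + (t3 - t_received)) / 2


def exceeds_kerberos_skew(offset_seconds):
    """True if the offset is larger than the assumed Kerberos tolerance."""
    return abs(offset_seconds) > KERBEROS_MAX_SKEW_SECONDS


def query_offset(server="pool.ntp.org", timeout=5.0):
    """Send one SNTP client request over UDP port 123 and return the offset."""
    request = b"\x1b" + 47 * b"\0"   # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t_sent = time.time()
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
        t_received = time.time()
    return clock_offset(packet, t_sent, t_received)
```

Running `query_offset()` on the DC and on a failing client, then passing each result to `exceeds_kerberos_skew`, gives a quick indication of whether clock drift alone could explain the authentication failures. If the request times out, that matches the "no valid response" symptom in event 47 and points at a firewall or DNS problem instead.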

It might prove your additional value to the customer without being too expensive to investigate.
At the least you get logs where events can be related to each other, which gives a clue for the next step.
Check that the NETLOGON service is running. This happens when NETLOGON is paused or stopped.
Avatar of craig

ASKER

Shaun-
Thank you...but Netlogon is running.  The problem looks deeper than this.  Kerberos and Timesync look like a mess.
Can you check on the clients and DCs whether any public DNS servers (such as Google DNS 8.8.8.8) are set as preferred or alternate DNS servers? If so, you will most probably face issues.
OR
On the client machines, have you enforced a DNS suffix search list, manually or through GPO, where the primary domain is not first in the list? You can run ipconfig /all on a client machine to check that the correct DNS domain is the first entry.
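
The two checks above can be sketched as a small Python helper. This is an illustration only: the function name, the sample addresses, and the `corp.example` / `vendor.example` domains are hypothetical, and the values are assumed to have been copied by hand from `ipconfig /all` output rather than parsed automatically.

```python
import ipaddress


def check_client_dns(dns_servers, suffix_search_list, ad_domain):
    """Return a list of warnings for a client's DNS settings.

    dns_servers        -- configured DNS server IPs, as strings
    suffix_search_list -- DNS suffix search list, in configured order
    ad_domain          -- DNS name of the AD domain the client belongs to
    """
    warnings = []

    # Any non-private resolver (e.g. 8.8.8.8) cannot answer AD lookups.
    for server in dns_servers:
        if not ipaddress.ip_address(server).is_private:
            warnings.append(f"public DNS server configured: {server}")

    # The AD domain should be the first suffix in the search list.
    def normalize(name):
        return name.lower().rstrip(".")

    if suffix_search_list and normalize(suffix_search_list[0]) != normalize(ad_domain):
        warnings.append(
            f"first search suffix is {suffix_search_list[0]!r}, expected {ad_domain!r}"
        )
    return warnings


# Hypothetical values for illustration.
for warning in check_client_dns(
    ["192.168.1.10", "8.8.8.8"],
    ["vendor.example", "corp.example"],
    "corp.example",
):
    print(warning)
```

An empty result does not prove DNS is healthy; it only rules out the two misconfigurations described above.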
Avatar of craig

ASKER

Most definitely, they are forcing a different suffix. They have a Nuspire appliance.

However, even with a static IP, same problems.
It does not matter whether you have a static or dynamic IP entry.
As long as DNS search suffix entries are present, the first entry should point to the DNS domain the client is a member of; otherwise the client is likely to fail to resolve and authenticate with the DCs, and unexpected results may occur.
ASKER CERTIFIED SOLUTION
DrDave242
Avatar of craig

ASKER

Dr Dave... Metadata did the trick. Now to clean up DNS. They have a managed SonicWall that points to external DNS. I believe it's causing issues but can't prove it.

Odd thing: Carbonite was seeing multiple servers being backed up and shutting down, because the client is designed to back up a single server. Had to delete the job and recreate it.

Next step is to bring a test client into the environment and join it to the domain to see if it stays on.
There are a few authentication events that say weak security.
Thank you!
Let me know if that client is able to join the domain and stay authenticated. I think I know the "weak security" events you're talking about; you may want to discuss with your customer whether to implement the change mentioned in those events (if they really are the ones I'm thinking of).
Avatar of craig

ASKER

Cleared out the metadata associated with the nonexistent machine, as well as its DNS records.