AD Problems - Time Sync, Repl errors, Domain Membership Issues

I had this question after viewing Windows 2012 DC replication issue.

-I recently took over a new customer.  They have a single DC on Server 2012 R2.  Joining a Windows client to the domain seems normal, but after a few hours to a few days the client can no longer authenticate as a domain member, even with cached information. (Sorry, I don't have the specific error message.)

-The client looks like a domain member, but in Local Users and Groups the computer appears to have fallen off the domain, with only the SIDs showing instead of domain usernames\groups.

-The server event log looks like a disaster.  It appears that the server was originally named Temp and then renamed DC2.  However, DNS and AD still point at Temp.

Initial Errors:  
1925 The attempt to establish a replication link for the following writable directory partition failed.
4 The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server dc2$.
47 Time Provider NtpClient: No valid response has been received from manually configured peer after 8 attempts to contact it.

Ran Repadmin with results attached: Repadmin.rtf
replsummary: Repadmin-replsummary.rtf

The server CoastTemp does not exist.  However, the IP is listed in DNS.

Your expertise is greatly appreciated.  I am thinking a domain rebuild is in order.  But there are 40 clients, and the customer will not want to pay for a rebuild because they don't understand that they have problems.  Some of you will tell me to drop the customer :)

noci (Software Engineer) commented:
Task one: show the customer their problem.
That is, link the events they experienced to specific log messages, then show other log messages that might predict similar problems with other workstations.

If that doesn't convince them, then advise doing nothing (there is no problem...) until there is a problem.
Try to make your predictions as accurate as possible, based on the evidence you can find.

I can't say whether this is a customer you want to spend your time on; OTOH, there are customers that want the kingdom for a penny.
craig (Author) commented:
Other than a rebuild, do you have any ideas on repairing AD?

noci (Software Engineer) commented:
No, I have no experience with Windows systems, so I have no advice on that part of the issue.
The customer needs to find the value in your service, with you getting rewarded for your effort; both parties need to be happy about it.

Based on what you've told us, I can give some advice...
At the very least, all modern systems need a consistent time source (especially when timing is part of authentication and authorisation).
AD uses Kerberos, and Kerberos has requirements about consistent timing, so getting that straightened out is a priority.
(Most effective: sync the DC to a stable clock (SNTP/NTP based), either through hardware (GSM) or through a time service on the internet
(Check ).
Then let all workstations sync with the DC... that should get a major disruptor out of the way.)

It might prove your additional value to your customer without being too expensive to investigate.
At the least, you get logs where events can be related to each other, giving a clue for the next step.
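
To make the timing requirement concrete, here is a minimal sketch in Python of the comparison Kerberos effectively cares about. It assumes the Windows default Kerberos policy of 5 minutes of tolerated clock skew; the function names and the sample timestamps are hypothetical and only for illustration. It does not query any server, it just compares two timestamps you have already collected from the DC and a client.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the default "maximum tolerance for computer clock
# synchronization" in Windows Kerberos policy, which is 5 minutes.
MAX_KERBEROS_SKEW = timedelta(minutes=5)


def clock_skew(dc_time, client_time):
    """Return the absolute difference between two timezone-aware timestamps."""
    return abs(dc_time - client_time)


def within_kerberos_tolerance(dc_time, client_time, max_skew=MAX_KERBEROS_SKEW):
    """True if the client clock is close enough to the DC clock for Kerberos."""
    return clock_skew(dc_time, client_time) <= max_skew


if __name__ == "__main__":
    # Hypothetical readings: the client is 7 minutes ahead of the DC.
    dc = datetime(2018, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    client = dc + timedelta(minutes=7)
    print("skew:", clock_skew(dc, client))
    print("within tolerance:", within_kerberos_tolerance(dc, client))
```

If the skew on a failing workstation turns out to be larger than the tolerance, that is evidence you can show the customer before touching anything else.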
Shaun Vermaak (Technical Specialist) commented:
Check that the NETLOGON service is running. This happens when NETLOGON is paused or stopped.
craig (Author) commented:
Thank you, but Netlogon is running.  The problem looks deeper than this.  Kerberos and time sync look like a mess.
Can you check on the clients and DCs whether any public DNS servers (such as Google DNS) are set as preferred or alternate DNS servers? If so, you will most probably face issues.
On the client machines, have you enforced a DNS suffix search list, manually or through GPO, where the primary domain is not first in the list? You can run ipconfig /all on a client machine to check that it lists the correct DNS domain as the first entry.
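
The first check above (public DNS servers configured on domain members) can be sketched roughly like this in Python. The function name and the sample addresses are assumptions for illustration; the only library used is the standard `ipaddress` module, and the list of servers is something you would copy by hand from each machine's adapter settings.

```python
import ipaddress


def public_dns_servers(dns_servers):
    """Return the configured DNS servers that are globally routable addresses.

    Domain members should normally point only at internal DNS (the DC),
    so anything returned here deserves a closer look.
    """
    flagged = []
    for server in dns_servers:
        if ipaddress.ip_address(server).is_global:
            flagged.append(server)
    return flagged


if __name__ == "__main__":
    # Hypothetical client configuration: internal DC first, Google DNS second.
    configured = ["192.168.1.10", "8.8.8.8"]
    print("public DNS entries:", public_dns_servers(configured))
```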
craig (Author) commented:
Most definitely, they are forcing a different suffix. They have a Nuspire appliance.

However, even with a static IP, same problems.
It does not matter whether you have a static or dynamic IP entry.
As long as DNS search suffix entries are present, the first entry should point to the DNS domain the client is a member of; otherwise the client will likely fail to resolve and authenticate with the DCs, and unexpected results may occur.
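
As a rough illustration of the suffix-order check, the sketch below pulls the "DNS Suffix Search List" out of saved `ipconfig /all` text and tests whether the AD domain comes first. It assumes English-language output in the usual layout (the first suffix on the labelled line, further suffixes on indented continuation lines); the function names and the domain names in the sample are hypothetical.

```python
def dns_suffix_search_list(ipconfig_output):
    """Extract the DNS suffix search list from `ipconfig /all` text."""
    suffixes = []
    collecting = False
    for line in ipconfig_output.splitlines():
        if "DNS Suffix Search List" in line:
            collecting = True
            value = line.split(":", 1)[1].strip() if ":" in line else ""
            if value:
                suffixes.append(value)
        elif collecting:
            stripped = line.strip()
            # Continuation lines hold a bare suffix. A blank line or a new
            # "Label . . . :" line ends the list.
            if stripped and ":" not in stripped:
                suffixes.append(stripped)
            else:
                collecting = False
    return suffixes


def primary_domain_is_first(ipconfig_output, ad_domain):
    """True if the first search suffix is the AD DNS domain (case-insensitive)."""
    suffixes = dns_suffix_search_list(ipconfig_output)
    return bool(suffixes) and suffixes[0].lower() == ad_domain.lower()


if __name__ == "__main__":
    # Hypothetical excerpt where a vendor suffix is forced ahead of the domain.
    sample = (
        "   DNS Suffix Search List. . . . . . : vendor.example.net\n"
        "                                       corp.example.local\n"
        "\n"
        "Ethernet adapter Ethernet:\n"
    )
    print(dns_suffix_search_list(sample))
    print(primary_domain_is_first(sample, "corp.example.local"))
```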
DrDave242 (Senior Support Engineer) commented:
They have a single DC on Server 2012 R2.
1925 The attempt to establish a replication link for the following writable directory partition failed.

If they have a single DC, there's nothing for it to replicate with, but this DC apparently believes there's another DC in the domain. (This is backed up by the repadmin output.) Most likely, the other DC was never demoted before being taken offline.

First, run netdom query fsmo from an elevated command prompt on the existing DC and confirm whether it holds all of the FSMO roles. If it doesn't (i.e., if any roles list the nonexistent DC as their holder), you'll need to seize those roles on the existing DC. There are a couple of ways to do this, but the simplest way is via the Move-ADDirectoryServerOperationMasterRole PowerShell cmdlet with the -Force switch, since it allows you to seize multiple roles with a single command. The older way to do this is via the Ntdsutil interface, which still works but is a little more time-consuming. Instructions for using this method are here, if you're interested.
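
If you save the `netdom query fsmo` output to a text file, a small script can point out which roles still list some other machine as their holder. This is only a sketch under assumptions: it expects each role line to be a role name, then a run of spaces (or a tab), then the holder's host name; the host names in the sample are hypothetical; and it only reads text, it does not seize or move anything.

```python
import re


def parse_fsmo_holders(netdom_output):
    """Map each FSMO role label to the host name listed as its holder."""
    holders = {}
    for line in netdom_output.splitlines():
        # Assumed layout: "<role name><two or more spaces or a tab><holder>".
        parts = re.split(r"\s{2,}|\t", line.strip())
        if len(parts) == 2:
            holders[parts[0]] = parts[1]
    return holders


def roles_to_seize(netdom_output, surviving_dc):
    """Return the roles whose listed holder is not the surviving DC."""
    surviving = surviving_dc.lower()
    return [
        role
        for role, holder in parse_fsmo_holders(netdom_output).items()
        if holder.lower() != surviving
    ]


if __name__ == "__main__":
    # Hypothetical output in which two roles still point at the defunct DC.
    sample = (
        "Schema master               Temp.example.local\n"
        "Domain naming master        Temp.example.local\n"
        "PDC                         DC2.example.local\n"
        "RID pool manager            DC2.example.local\n"
        "Infrastructure master       DC2.example.local\n"
        "The command completed successfully.\n"
    )
    print(roles_to_seize(sample, "DC2.example.local"))
```

An empty result means the existing DC already holds every role and you can go straight to the metadata cleanup.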

Once the roles are all on the existing DC, you'll need to perform a metadata cleanup to remove the defunct DC from AD. This used to require Ntdsutil as well, but the process is simpler nowadays. Instructions for the new (GUI) and old (Ntdsutil) methods of performing a metadata cleanup are here.
craig (Author) commented:
Dr Dave... the metadata cleanup did the trick. Now to clean up DNS. They have a managed SonicWall that points to external DNS.  I believe it's causing issues but can't prove it.

Odd thing: Carbonite was seeing multiple servers being backed up and shutting down, because the client is designed to back up only one server. I had to delete the job and recreate it.

Next step is to bring a test client into the environment and join it to the domain to see if it stays on.
There are a few authentication events that mention weak security.
Thank you!
DrDave242 (Senior Support Engineer) commented:
Let me know if that client is able to join the domain and stay authenticated. I think I know the "weak security" events you're talking about; you may want to discuss with your customer whether to implement the change mentioned in those events (if they really are the ones I'm thinking of).
craig (Author) commented:
Cleared out the metadata associated with the nonexistent machine, as well as its DNS records.