Josh Rogalski asked:

How to diagnose and troubleshoot a suspected Active Directory problem

We have an Active Directory forest with a primary domain and a subdomain. This week we started experiencing issues with certain products that connect to the domain. Either the product fails outright, or connections fail for some accounts that should have permission. This includes built-in actions like joining a PC to the domain and, in some cases, Remote Desktop. Oddly, not all products are affected, and some user accounts are affected while others aren't. Any ideas on how to troubleshoot this would be spectacular. We are not aware of any recent changes or modifications.


Server 2019 DCs

Functional level of domain - 2012R2

Tried rebooting domain controllers


User generated image

Tags: Active Directory, Desktops, Domain Controller, LDAP, Windows OS

Scott Silva

AD can be a balancing act...
Time on all systems needs to be within 5 minutes of the DCs, and systems in different time zones need their time zone set correctly.
Systems need full connectivity to the DCs, or they will fall back to cached credentials, which sometimes fail.
Start with the basics: run dcdiag and check the firewalls.
Also look at the logs to see if any errors show up there.
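
The 5-minute rule can be checked with a few lines of script. The following is a minimal sketch, not a definitive tool: it assumes you have already collected a timestamp from the client and from a DC by some other means, and the function names and sample times are hypothetical. It uses timezone-aware values, which is why a correctly set time zone matters: two clocks showing different wall-clock times can still agree on the actual instant.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 5 minutes, the tolerance mentioned above.
MAX_SKEW = timedelta(minutes=5)


def clock_skew(client_time: datetime, dc_time: datetime) -> timedelta:
    """Absolute difference between two timezone-aware timestamps."""
    return abs(client_time - dc_time)


def within_tolerance(client_time: datetime, dc_time: datetime,
                     max_skew: timedelta = MAX_SKEW) -> bool:
    """True if the two clocks differ by no more than max_skew."""
    return clock_skew(client_time, dc_time) <= max_skew


# Hypothetical sample values: a DC reporting UTC and two clients at UTC-5.
utc_minus_5 = timezone(timedelta(hours=-5))
dc = datetime(2024, 1, 15, 17, 0, 0, tzinfo=timezone.utc)
client_ok = datetime(2024, 1, 15, 12, 3, 0, tzinfo=utc_minus_5)   # 3 minutes ahead
client_bad = datetime(2024, 1, 15, 12, 9, 0, tzinfo=utc_minus_5)  # 9 minutes ahead

print(within_tolerance(client_ok, dc))
print(within_tolerance(client_bad, dc))
```

Naive timestamps (without tzinfo) would make this comparison misleading, so the sketch deliberately uses aware values only.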

Hello There

"This includes built in actions like joining a pc to the domain or in some cases remote desktop."
This sounds like a DNS problem.
Can you run the dcdiag /test:dns command? Also, do your DCs point to each other for DNS? I mean this:

DC1
Primary DNS: DC2
Secondary DNS: DC1

DC2
Primary DNS: DC1
Secondary DNS: DC2
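
As a rough illustration of the layout above, the sketch below checks a hand-written description of each DC's DNS settings. The dictionary format and the function name are assumptions made for this example; it does not read anything from a live system.

```python
def check_dc_dns_order(dns_config):
    """Return a list of problems found in a {dc_name: [dns servers in order]} map."""
    problems = []
    known_dcs = set(dns_config)
    for dc, servers in dns_config.items():
        if not servers:
            problems.append(f"{dc}: no DNS servers configured")
            continue
        primary = servers[0]
        if primary == dc:
            problems.append(f"{dc}: primary DNS points to itself")
        elif primary not in known_dcs:
            problems.append(f"{dc}: primary DNS {primary} is not a known DC")
        if dc not in servers[1:]:
            problems.append(f"{dc}: does not list itself after the primary")
    return problems


# The layout described above: each DC points to its partner first, then itself.
layout = {
    "DC1": ["DC2", "DC1"],
    "DC2": ["DC1", "DC2"],
}
print(check_dc_dns_order(layout))  # prints []
```

An empty list means every DC follows the partner-first, self-second pattern.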
arnold

While the computer is disconnected from the network, log in, then reconnect it to the network.

Run:
nslookup -q=SRV _ldap._tcp.dc._msdcs.youraddomainname.com

You can repeat the same for the subdomains to confirm you get results.
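
If you have several domains to check, the query names can be generated rather than typed. This is a small sketch under stated assumptions: the helper names are made up for illustration, and example.com and sub.example.com are placeholders for your own domain and subdomain. It only builds the command lines; it does not run nslookup.

```python
def dc_locator_srv_name(domain: str) -> str:
    """Build the SRV record name used above for a given AD domain."""
    return "_ldap._tcp.dc._msdcs." + domain.strip().strip(".").lower()


def nslookup_commands(domains):
    """Return one nslookup command line per domain."""
    return [f"nslookup -q=SRV {dc_locator_srv_name(d)}" for d in domains]


# Placeholder domain names; substitute your own.
for command in nslookup_commands(["example.com", "sub.example.com"]):
    print(command)
```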

dcdiag /v
How many DCs does the environment have?

Confirm the trust relationships.

You mentioned that you have subdomains, but the error you posted lacks context. Is the user login attempt coming from a subdomain?
Josh Rogalski (ASKER)

Our domain controllers use DNS just the way you mentioned, by pointing to one another. For some context, we have a primary domain and a subdomain; for security purposes I will redact the private information and call them the "staff.lan" domain and the "student.staff.lan" subdomain. We have 4 domain controllers in the primary domain and 4 in the subdomain (at two different sites). The problem happens in both the primary domain and the subdomain, but not to every account attempting to access a given box. On some boxes it doesn't happen at all, and some non-Windows products (like appliances) that connect to AD for authentication are also affected. I also verified that the trust still shows in "Active Directory Domains and Trusts". I'm not sure whether that is a good indicator of health, but it is there in the settings.

The readout we get from the "dcdiag /test:dns" test is as follows:

C:\Windows\system32>dcdiag /test:dns

Directory Server Diagnosis

Performing initial setup:
   Trying to find home server...
   Home Server = DC1
   * Identified AD Forest.
   Done gathering initial info.

Doing initial required tests

   Testing server: Auburn-Administrative\DC1
      Starting test: Connectivity
         ......................... DC1 passed test Connectivity

Doing primary tests

   Testing server: Auburn-Administrative\DC1

      Starting test: DNS

         DNS Tests are running and not hung. Please wait a few minutes...
         ......................... DC1 passed test DNS

   Running partition tests on : DomainDnsZones

   Running partition tests on : ForestDnsZones

   Running partition tests on : Schema

   Running partition tests on : Configuration

   Running partition tests on : staff

   Running enterprise tests on : staff.lan
      Starting test: DNS
         Test results for domain controllers:

            DC: DC1.staff.lan
            Domain: staff.lan


               TEST: Basic (Basc)
                  Warning: Adapter 78:2B:CB:66:C9:3D has dynamic IP address (can be a misconfiguration)

               DC1 PASS WARN PASS PASS PASS PASS n/a
         ......................... staff.lan passed test DNS
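
For longer dcdiag runs, it can help to pull out just the pass/fail lines and the warnings. The sketch below assumes the output has been saved as text and that result lines follow the dotted "<name> passed test <test>" pattern seen above. The "failed test" wording and the function name are assumptions, so adjust the pattern to whatever your own output shows.

```python
import re

# Matches lines such as "......................... DC1 passed test DNS".
RESULT_RE = re.compile(r"^\.+\s+(\S+) (passed|failed) test (\S+)\s*$")


def summarize_dcdiag(text):
    """Return (results, warnings) extracted from saved dcdiag output.

    results is a list of (name, test, passed) tuples;
    warnings is a list of the stripped "Warning: ..." lines.
    """
    results, warnings = [], []
    for line in text.splitlines():
        stripped = line.strip()
        match = RESULT_RE.match(stripped)
        if match:
            name, outcome, test = match.groups()
            results.append((name, test, outcome == "passed"))
        elif stripped.startswith("Warning:"):
            warnings.append(stripped)
    return results, warnings


# A few lines copied from the output above.
sample = """
         ......................... DC1 passed test Connectivity
         ......................... DC1 passed test DNS
                  Warning: Adapter 78:2B:CB:66:C9:3D has dynamic IP address (can be a misconfiguration)
         ......................... staff.lan passed test DNS
"""
results, warnings = summarize_dcdiag(sample)
for name, test, passed in results:
    print(name, test, "ok" if passed else "FAILED")
for warning in warnings:
    print(warning)
```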
arnold

Check the systems where it does happen to make sure they do not use external name servers.
Josh Rogalski (ASKER)

One of the systems in question uses DC1 as its first DNS server, but then it uses external DNS (Google DNS) for its 2nd and 3rd DNS servers. Out of curiosity, why do you mention external name servers, Arnold?
arnold

Because DNS does not test the first server and only then use another; it is a random choice. In this scenario, the system has a 66% chance of querying an external server on bootup/user login.
At that point,
nslookup -q=srv _ldap._tcp.dc._msdcs.myaddomain.com
will return no DCs,
and the process will time out.

In an AD environment, no system joined to the domain should refer to any external DNS server.

This is your issue.
If you want to offload DNS lookups, do it on the DNS server by defining conditional or permanent forwarders.
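
One way to audit this across machines is to compare each system's configured DNS servers against the list of internal DNS servers. The following is only a sketch: the function name is hypothetical and the 10.0.0.x addresses are invented stand-ins for your DCs. The 8.8.8.8 and 8.8.4.4 entries are the Google public DNS addresses.

```python
import ipaddress


def external_dns_servers(configured, internal_dns):
    """Return the configured DNS servers that are not in the internal list.

    Addresses are parsed with ipaddress so that differently formatted
    strings for the same address still compare equal.
    """
    allowed = {ipaddress.ip_address(addr) for addr in internal_dns}
    return [addr for addr in configured
            if ipaddress.ip_address(addr) not in allowed]


# Hypothetical internal DNS servers (the DCs) and one client's settings.
internal = ["10.0.0.10", "10.0.0.11"]
client_dns = ["10.0.0.10", "8.8.8.8", "8.8.4.4"]

offenders = external_dns_servers(client_dns, internal)
if offenders:
    print("External DNS servers configured:", ", ".join(offenders))
else:
    print("Only internal DNS servers configured.")
```

An explicit allow-list is used here rather than a "private address range" test, since an internal but non-AD resolver would be just as unable to answer the SRV queries.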
Josh Rogalski (ASKER)

Oh, we do have forwarders on the internal DNS servers to the external servers. The machines themselves (apart from that special appliance) don't use any other DNS servers.
arnold

Does the computer system where you are experiencing this issue refer to external DNS?

Since you mentioned you have subdomains, do you have conditional forwarders so that requests for those domains go to the DCs responsible for those subdomains?
Or does the system send all non-local requests to Google DNS, where it gets no valid answer?
Josh Rogalski (ASKER)

It is happening randomly to many different PCs and to some machines that aren't even PCs (software appliances that authenticate against AD). They only refer to local DNS, and the local DNS forwards external requests upstream.

The subdomain is less of an issue since most of the problems are being seen in the primary "staff.lan" domain.
Scott Silva

I still say it is either a DNS issue or a network issue. If DNS is configured properly, then the machines are randomly unable to reach the DNS or AD servers.
Look for switch problems, and also for machines that might have been plugged in with a fixed address that duplicates one of the DCs' addresses.
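
For the duplicate-address check, one approach is to collect IP/MAC pairs (for example from switch or ARP tables) and look for any IP seen with more than one MAC address. A minimal sketch, with a hypothetical function name and made-up sample data:

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Map each IP seen with more than one MAC address to its sorted MACs.

    observations is an iterable of (ip, mac) pairs; MAC case is ignored.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Invented sample: 10.0.0.10 stands in for a DC address claimed by two devices.
seen = [
    ("10.0.0.10", "AA:BB:CC:00:00:01"),
    ("10.0.0.10", "aa:bb:cc:00:00:02"),
    ("10.0.0.25", "AA:BB:CC:00:00:03"),
    ("10.0.0.10", "AA:BB:CC:00:00:01"),
]
for ip, macs in find_ip_conflicts(seen).items():
    print(ip, "is claimed by", ", ".join(macs))
```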

ASKER CERTIFIED SOLUTION
Josh Rogalski
