Solved

Domain Controller authentication not working when one DC is down

Posted on 2007-03-19
Last Modified: 2010-04-18
We have two W2k3 domain controllers: the FSMO role holder and a GC. When we shut down the FSMO holder, say to apply patches, users cannot authenticate to the domain. Because a GC is still up, there should be no authentication problem, yet users cannot authenticate. The reverse is also true: if I shut down my GC while the FSMO holder is up, I cannot authenticate to the domain in Chicago.

Why? I've been trying to resolve this issue for literally months and have yet to find any problems with my DNS, stub zones, event logs, replication, anything...

When I use replmon to search the domain controllers for replication errors, none are reported. Both DCs are running AD-integrated DNS and are the primary and secondary name servers advertised via DHCP. Replication-wise, the data is consistent across both DCs. So why can't I log on to my workstation if one server is not available?
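One way to narrow this down is to check, while one DC is off, that the surviving DNS server still answers the SRV queries a client uses to locate a domain controller. The sketch below only builds the record names and the matching nslookup commands; it does not query anything itself. The domain name `domain.org` is taken from the GPO path quoted later in this thread, the site name and server names come from the topology described below, and the helper names are hypothetical.

```python
def dc_locator_srv_names(domain, site=None, forest=None):
    """Return SRV record names that clients commonly query to find a DC."""
    # Assumption: in a single-domain forest the forest root equals the domain.
    forest = forest or domain
    names = [
        f"_ldap._tcp.dc._msdcs.{domain}",      # any DC in the domain
        f"_kerberos._tcp.dc._msdcs.{domain}",  # any KDC in the domain
        f"_ldap._tcp.pdc._msdcs.{domain}",     # the PDC emulator
        f"_ldap._tcp.gc._msdcs.{forest}",      # any global catalog
    ]
    if site:
        # Site-specific record, listed first because clients prefer their own site.
        names.insert(0, f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


def nslookup_commands(names, dns_server):
    """Build one nslookup command per record, aimed at a specific DNS server."""
    return [f"nslookup -type=SRV {name} {dns_server}" for name in names]


if __name__ == "__main__":
    # Ask the DC that is still running (dc2 here) while dc1 is shut down.
    for command in nslookup_commands(dc_locator_srv_names("domain.org", site="Chicago"), "dc2"):
        print(command)
```

If the surviving server returns only the powered-off DC for these records, or nothing for the site-specific one, that would point at DNS registration rather than replication.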

Furthermore, I have a 3rd DC (a GC) in another state.  If both DCs fail locally here, I should be able to reach that remote GC.  Exchange seems to redirect itself to the GC in Washington correctly if I fail the GC in Chicago, but I still cannot log into a workstation or other machine on the local network if one or the other local DC servers is down.

It really makes no sense to me...



Question by:es-it
 
LVL 6

Accepted Solution

by:
shayneg earned 125 total points
ID: 18751565
Do you have site links to each site in Active Directory? If so, remove them and let AD sort out replication itself, as long as you have a fully meshed network.
 
LVL 48

Assisted Solution

by:Jay_Jay70
Jay_Jay70 earned 125 total points
ID: 18752131
 

Author Comment

by:es-it
ID: 18752349
Thanks guys. Here's some more info:
In sites and services I have Chicago and Washington.  
In Chicago site I have servers dc1 and dc2 (fsmo and gc).  
In Washington site I have server dc3 (gc).  

For dc1 I have an automatically generated connector to dc2, but a manually created connector to dc3.
For dc2 I have an automatically generated connector to dc1, but a manually created connector to dc3.
For dc3 I have two manually created connectors, to dc1 and dc2.

In inter-site transport, I have a single site-link that includes both sites (Chicago and Washington).  IP addresses are also linked correctly to their respective sites.

So, do I -
a) Delete the manually-created connectors and keep the site-link?
b) Delete the site-link and keep the manually created connectors?
c) Delete the site-link AND the manually created connectors?

We are a single forest, single domain, with plenty of bandwidth between Chicago and Washington.


 
LVL 48

Expert Comment

by:Jay_Jay70
ID: 18752461
Do you get any errors at all when the FSMO holder is off? JRNL_WRAP errors, etc.?
 

Author Comment

by:es-it
ID: 18752590
The only errors I see are on the GC in Chicago when coming up from reboot:
Application Error 1030:
Windows cannot query for the list of Group Policy objects. Check the event log for possible messages previously logged by the policy engine that describes the reason for this.
And before that:
Windows cannot access the file gpt.ini for GPO cn={84731FCC-ABA7-4A7A-B201-482C1C83B49C},cn=policies,cn=system,DC=domain,DC=org. The file must be present at the location <\\domain\SysVol\domain\Policies\{84731FCC-ABA7-4A7A-B201-482C1C83B49C}\gpt.ini>. (Access is denied. ). Group Policy processing aborted.

But a few minutes later, GPO processing is fine, and no more errors in the event log: "Security policy in the Group policy objects has been applied successfully."
 

Author Comment

by:es-it
ID: 18752911
Sorry, I didn't really answer your question, Jay_Jay. When the FSMO holder is powered off, I don't get errors on the other DCs, but I cannot log in to my workstation. There's nothing in the system or application event log on my workstation either.
 
LVL 11

Expert Comment

by:AnthonyP9618
ID: 18753181
Did you turn Logon Caching off for your domain? By default the value is set to 10, so there would have to have been a change in Group Policy to turn it off. With caching on, your users can still log in to their workstations without a DC actually having to authenticate the logon. However, it may not solve problems after they log on, when they need to reach other network resources.

Turn on Logon Caching:
http://technet2.microsoft.com/WindowsServer/en/library/35958fa8-2e47-4cf9-9f11-5095e5b5525e1033.mspx?mfr=true
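To see what a given workstation is actually set to, the cached logon count can be read from the registry. Here is a minimal sketch, assuming the setting lives in the usual `CachedLogonsCount` value under the Winlogon key; the function names are made up for illustration, and the registry read only works on Windows.

```python
import sys

# Registry location of the cached logon setting (stored as a string value).
WINLOGON_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"


def describe_cached_logons(raw_value):
    """Interpret a CachedLogonsCount value such as "10" or "0"."""
    count = int(str(raw_value).strip())
    if count == 0:
        return "disabled: every interactive logon needs a reachable DC"
    return f"enabled: credentials for up to {count} users are cached"


def read_cached_logons_count():
    """Read CachedLogonsCount from the local registry (Windows only)."""
    if sys.platform != "win32":
        raise OSError("the registry is only available on Windows")
    import winreg  # imported here so the module still loads on other platforms
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, WINLOGON_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CachedLogonsCount")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    print(describe_cached_logons(read_cached_logons_count()))
```

A value of 0 would explain why no one can log on when no DC answers; any other value means cached logons should work for users who have logged on to that machine before.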
 
LVL 6

Expert Comment

by:shayneg
ID: 18754481
Remove the manually created connections in the NTDS Settings for each office. Make sure you have a link between all offices so you have a properly meshed network. Then go to each server within each site, right-click it, and go to Properties. Where it says "This server is the bridgehead server for the following transports", remove all transports so the box is empty. Do this for all other servers and then leave AD to replicate. I know this because I have just had the same issue and we had to call Microsoft. In 2003 you shouldn't need to use bridgehead servers. Also make sure your DNS is functioning correctly. The process above will also help DNS.
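To pick out which connections to remove, it helps to know that each connection object carries an `options` attribute, and the lowest bit (`NTDSCONN_OPT_IS_GENERATED`, 0x1) is set when the KCC created it. The sketch below applies that test to the topology es-it described. The options values are assumed for illustration rather than read from a live directory, and the tuple layout and helper names are hypothetical.

```python
NTDSCONN_OPT_IS_GENERATED = 0x1  # bit set on connections created by the KCC


def is_kcc_generated(options):
    """True if the connection's options value marks it as KCC-generated."""
    return bool(options & NTDSCONN_OPT_IS_GENERATED)


# (server holding the connection object, replication partner, assumed options)
connections = [
    ("dc1", "dc2", 1),  # automatically generated
    ("dc1", "dc3", 0),  # manually created
    ("dc2", "dc1", 1),  # automatically generated
    ("dc2", "dc3", 0),  # manually created
    ("dc3", "dc1", 0),  # manually created
    ("dc3", "dc2", 0),  # manually created
]


def manual_connections(conns):
    """Return (server, partner) pairs for connections the KCC did not create."""
    return [
        (server, partner)
        for server, partner, options in conns
        if not is_kcc_generated(options)
    ]


if __name__ == "__main__":
    for server, partner in manual_connections(connections):
        print(f"manual connection on {server} with partner {partner}")
```

With the topology as posted, this flags the four inter-site connections involving dc3; once they are deleted the KCC should generate its own replacements over the existing site link.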
 

Author Comment

by:es-it
ID: 18756959
Thanks guys. I'm going to wait until this weekend to remove the manual connectors per shayneg's suggestion. I do not have any servers listed as bridgeheads. I'll let you know how it goes.
 
LVL 1

Expert Comment

by:Computer101
ID: 20286958
Forced accept.

Computer101
EE Admin
