Domain Controller authentication not working when one DC is down

We have two W2k3 domain controllers: the FSMO role holder and a GC.  When we shut down the FSMO holder, say to apply patches, users cannot authenticate to the domain.  Because a GC is still up, there should be no authentication problem, yet users cannot authenticate.  The reverse is also true: if I shut down my GC while the FSMO holder is up, I cannot authenticate to the domain in Chicago.

Why? I've been trying to resolve this issue for months, literally, and have yet to find any problems with my DNS, stub zones, event logs, replication, anything...

When I use replmon to search the domain controllers for replication errors, none are reported. Both DCs are running AD-integrated DNS and are the primary and secondary name servers advertised via DHCP.  Replication-wise, the data is consistent across both DCs. So why can't I log on to my workstation if one server is not available?
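One check that fits here is whether both DCs are registered under the SRV records that workstations look up to locate a domain controller. The following is a minimal Python sketch that only builds those record names. It assumes the DNS domain is `domain.org` (as in the GPO path quoted later in this thread), that the site is named `Chicago`, and that the forest root and the domain share a name; the function name `locator_srv_names` is hypothetical.

```python
def locator_srv_names(domain, site=None):
    """Build the DNS SRV record names a client looks up to find a DC.

    Assumes a single-domain forest, so the GC record uses the same
    DNS name as the domain itself.
    """
    names = [
        f"_ldap._tcp.dc._msdcs.{domain}",      # any DC in the domain
        f"_kerberos._tcp.dc._msdcs.{domain}",  # any KDC in the domain
        f"_ldap._tcp.gc._msdcs.{domain}",      # any GC in the forest
    ]
    if site:
        # The site-specific record goes first in the returned list.
        names.insert(0, f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names


if __name__ == "__main__":
    # "domain.org" and "Chicago" are assumptions taken from this thread.
    for name in locator_srv_names("domain.org", "Chicago"):
        print(name)
```

Each printed name can be queried with `nslookup -type=SRV <name>` against each of the two DNS servers in turn. If only one DC shows up in the answers, that would be a place to start looking.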

Furthermore, I have a third DC (a GC) in another state.  If both local DCs fail, I should be able to reach that remote GC.  Exchange seems to redirect itself to the GC in Washington correctly if I fail the GC in Chicago, but I still cannot log in to a workstation or other machine on the local network if either local DC is down.

It really makes no sense to me...



es-it Asked:
shayneg Commented:
Do you have site links to each site in Active Directory? If so, remove them and let AD sort out replication itself, as long as you have a fully meshed network.
es-it Author Commented:
Thanks guys. Here's some more info:
In sites and services I have Chicago and Washington.  
In Chicago site I have servers dc1 and dc2 (fsmo and gc).  
In Washington site I have server dc3 (gc).  

For dc1 I have an automatically generated connector to dc2, but a manually created connector to dc3.
For dc2 I have an automatically generated connector to dc1, but a manually created connector to dc3.
For dc3 I have two manually created connectors, one to dc1 and one to dc2.

In inter-site transport, I have a single site-link that includes both sites (Chicago and Washington).  IP addresses are also linked correctly to their respective sites.
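The topology described above can be written down as data, which makes it easier to see exactly which connectors are in question. This is only an illustrative Python sketch: the `Connection` class, the field names, and the helper `manual_intersite` are hypothetical, and the entries simply restate the connectors and sites listed in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    owner: str    # DC under whose NTDS Settings the connector appears
    partner: str  # DC at the other end of the connector
    manual: bool  # True if created by hand, False if auto-generated


# Site membership as described in the post.
SITE_OF = {"dc1": "Chicago", "dc2": "Chicago", "dc3": "Washington"}

# Connectors as described in the post.
CONNECTIONS = [
    Connection("dc1", "dc2", manual=False),
    Connection("dc1", "dc3", manual=True),
    Connection("dc2", "dc1", manual=False),
    Connection("dc2", "dc3", manual=True),
    Connection("dc3", "dc1", manual=True),
    Connection("dc3", "dc2", manual=True),
]


def manual_intersite(connections, site_of):
    """Return the manually created connectors that cross a site boundary."""
    return [
        c for c in connections
        if c.manual and site_of[c.owner] != site_of[c.partner]
    ]


if __name__ == "__main__":
    for c in manual_intersite(CONNECTIONS, SITE_OF):
        print(f"{c.owner} -> {c.partner} (manual, inter-site)")
```

Laid out this way, every manually created connector is an inter-site one, and both intra-site connectors in Chicago are automatically generated.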

So, do I -
a) Delete the manually-created connectors and keep the site-link?
b) Delete the site-link and keep the manually created connectors?
c) Delete the site-link AND the manually created connectors?

We are a single forest, single domain, with plenty of bandwidth between Chicago and Washington.

Jay_Jay70 Commented:
Do you get any errors at all when the FSMO holder is off? JRNL_WRAP errors, etc.?
es-it Author Commented:
The only errors I see are on the GC in Chicago when coming up from reboot:
Application Error 1030:
Windows cannot query for the list of Group Policy objects. Check the event log for possible messages previously logged by the policy engine that describes the reason for this.
And before that:
Windows cannot access the file gpt.ini for GPO cn={84731FCC-ABA7-4A7A-B201-482C1C83B49C},cn=policies,cn=system,DC=domain,DC=org. The file must be present at the location <\\domain\SysVol\domain\Policies\{84731FCC-ABA7-4A7A-B201-482C1C83B49C}\gpt.ini>. (Access is denied. ). Group Policy processing aborted.

But a few minutes later, GPO processing is fine, and no more errors in the event log: "Security policy in the Group policy objects has been applied successfully."
es-it Author Commented:
Sorry, I didn't really answer your question, Jay_Jay. When the FSMO holder is powered off, I don't get errors on the other DCs, but I cannot log in to my workstation.  There's nothing in the system or application event log on my workstation either.
AnthonyP9618 Commented:
Did you turn Logon Caching off for your domain?  By default the value is set to 10, so there would have to have been a change in Group Policy to turn it off.  With caching enabled, your users can still log on to their workstations without a DC actually having to authenticate the logon.  However, it may not solve problems once they are logged on and need to reach other network resources.

Turn on Logon Caching:
http://technet2.microsoft.com/WindowsServer/en/library/35958fa8-2e47-4cf9-9f11-5095e5b5525e1033.mspx?mfr=true
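To see what a given workstation is actually set to, the cached-logon count can be read from the registry. Below is a minimal Python sketch, assuming Python is available on the workstation. The value `CachedLogonsCount` under the Winlogon key is stored as a string, and 0 disables caching. The function names are hypothetical, and the registry read only works on Windows.

```python
import sys

WINLOGON_KEY = r"SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"


def caching_enabled(raw_value):
    """Return True if the CachedLogonsCount string allows cached logons."""
    return int(str(raw_value).strip()) > 0


def read_cached_logons_count():
    """Read CachedLogonsCount from the local registry (Windows only)."""
    import winreg  # standard library, available on Windows only

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, WINLOGON_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CachedLogonsCount")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    count = read_cached_logons_count()
    state = "enabled" if caching_enabled(count) else "disabled"
    print(f"CachedLogonsCount = {count} (logon caching {state})")
```

A result of 0 would mean a policy has turned caching off. Note that a cached logon only helps a user who has logged on to that workstation before.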
shayneg Commented:
Remove the manually created connections in the NTDS Settings for each office. Make sure you have a link between all offices so you have a properly meshed network. Then go to each server within each site, right-click it, and go to Properties. Where it says "This server is the bridgehead server for the following transports", remove all transports so the box is empty. Do this for all other servers and then leave AD to replicate. I know this because I have just had the same issue and we had to call Microsoft. In 2003 you shouldn't need to use bridgehead servers. Also make sure your DNS is functioning correctly. The process above will also help DNS.
es-it Author Commented:
Thanks guys. I'm going to wait until this weekend to remove the manual connectors per shayneg's suggestion.  I do not have any servers listed as bridgeheads. I'll let you know how it goes.
Computer101 Commented:
Forced accept.

Computer101
EE Admin