2003 DC issue over VPN

I have a small network with several branch offices.

The Network:
- Main office has the FSMO role holder plus another DC; the four branch offices each have a DC with local DNS.
- Each branch office has 5 PCs and a local SQL server.
- The branch offices were connected via RRAS VPN.
- That RRAS server died. (long story)
- The powers that be decided to do router-to-router VPNs via Astaro 120s with IPsec at 128 and auto filter. The VPN is wide open with no packet filtering.
- The branch offices were offline for 5 months
- Reconnected the branch office DCs to the main office.
- Some replication is occurring, such as accounts
- The branch office can not access the main office resources such as shares and SQL.
- Remote desktop works from branch to main office.
- All antivirus has been removed from the servers at the main office (FSMO) and the branch DCs.
- All ports are open across the VPN and no port or packet filtering is taking place. It is wide open.
- No local firewall software is running on any of the servers
- All anti-spyware software has been removed
- All Servers and routers are current with security updates
- DNS resolves by both IP and name (FQDN and NetBIOS name) at all sites
- All devices are pingable by IP and name on both sides of the VPN
- Active Directory replication is occurring, but with issues
- Devices from Main office CAN access ALL shared resources on the branch DC's and network via the VPN
- All netdiag tests PASSed on both ends of the VPN at each site
- All routing is routing correctly
- The local branch sites have the Kerberos tickets for the main office servers

Known Issues:
- Branches cannot access shared resources at the main office by name, in the GUI or at the command line (System error 5), BUT they can access all resources by IP address
- Branches: Kerberos errors occur when authentication is attempted
- Branches: errors occur when SQL authentication is attempted; the error returned is an SSPI error
- Branches report replication errors in Active Directory
- Main office server(s) appear to be missing the Kerberos tickets for authenticating the remote sites
- Kerberos error in the log at the main office: all fields in the log event are blank.

What I do not know:
- Is the VPN not forwarding the Kerberos tickets for authentication?
- Does Active Directory have some sort of unknown issue with Kerberos?
- Is SQL security not binding correctly with Active Directory?

Somehow I feel the delay in getting the branch office back online has caused something to get out of sync.

Pulling out my hair, would like some direction on a resolution and what logs I need to post here.

Mike Kline Commented:
Just something to try here: you said that you implemented a new router-to-router VPN. Maybe you only need to use TCP for Kerberos (or try it on a few machines).
A common problem is that routers will arbitrarily fragment UDP packets; when this happens the Kerberos ticket request packets are discarded by the KDC. Windows Vista and Windows Server 2008 now default to using TCP for Kerberos ticket requests. Typically you work around this issue by implementing the following KB article:

244474 How to force Kerberos to use TCP instead of UDP in Windows Server 2003, in Windows XP, and in Windows 2000 - http://support.microsoft.com/default.aspx?scid=kb;EN-US;244474
We are testing a new VPN and have had some issues, and may try to force TCP also.
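
For anyone who wants to see what KB 244474 boils down to: to my knowledge the article sets a single registry value, MaxPacketSize (REG_DWORD) = 1, under HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters, and the machine needs a restart afterwards. Below is a minimal Python sketch that only builds the matching reg.exe command line so it can be reviewed before anyone runs it. The function name is made up for illustration, and the key and value should be checked against the KB article itself.

```python
# Sketch only: builds (does not run) the reg.exe command for the
# MaxPacketSize value described in KB 244474.
# force_kerberos_tcp_command is a hypothetical helper name.

KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters"


def force_kerberos_tcp_command(max_packet_size=1):
    """Return the reg.exe arguments that set MaxPacketSize.

    With a value of 1, any Kerberos request larger than one byte goes
    over TCP, which in practice means all of them.
    """
    if max_packet_size < 1:
        raise ValueError("MaxPacketSize must be at least 1")
    return [
        "reg", "add", KEY,
        "/v", "MaxPacketSize",
        "/t", "REG_DWORD",
        "/d", str(max_packet_size),
        "/f",  # overwrite an existing value without prompting
    ]


if __name__ == "__main__":
    # Print the command so it can be pasted into a console on the server.
    print(" ".join(force_kerberos_tcp_command()))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; the actual change still has to be made on each affected machine.
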
Hmmm... This is pretty freaky... I wonder about the following:

You mentioned that the DCs of the branch offices and the FSMO DCs were not connected for 5 months? You mentioned that your DNS is good? I know that Windows will "invalidate" a computer account if that computer is not in contact with a DC and able to sync/update password changes within 30 days.

Can any DC at any branch office connect to the main DC by name? Can any client machine connect to the branch office DC by name? Can any client PC connect to the main DCs by name if you change the DNS settings of the client PC to the IP of the FSMO role holder (using the FSMO role holder as the DNS server)?

- Branches: Kerberos errors when authentication is attempted | Are you getting any PAC errors?
- Does Active Directory have some sort of unknown issue with Kerberos: No, AD loves Kerberos!

What client event log errors are you getting? If you're getting 1030 (etc.) errors, try rejoining the client PC to the domain using the same name (reset the client PC account, or delete it), but join it using the DNS of the FSMO role holder...

technutz (Author) Commented:
Ok freaky I can agree with...

No passwords have been changed for admin or user accounts, as they only have one account called staff... yeah, I know about multiple accounts, but... it is complicated with just a couple of accounts....

Q: You ask can any DC connect to main dc by name?
A: No, but yes by IP, really weird that is.

Q: Can a client connect to the main DC by name if DNS is changed?
A: I have not tried that. I have only focused on trying to get the DC to talk to the main DC. Clients only need to talk to the local DC. I will have to experiment with that. Not the better solution with that config.

Q: Am I getting PAC errors?
A: I would say no, because I do not know what that is and it does not appear in the logs.

Q: Does AD have issues with Kerberos?
A: I can only state that the tickets on the local DC do not match the tickets on the FSMO DC.

Q: Summary: event log on re-join.
A: I have not attempted a re-join, as I do not understand why the current join is not working or where to begin to troubleshoot.... :(

Freaky guy signing off...plz help if you have a clue...ho hummmm...sigh

Thanks for tasking!

Korbus Commented:
I'm going to bet it's a DNS issue.
Make sure your local DNS servers forward to your main office's DNS server (over the VPN), OR add your main office's DNS server as a secondary DNS server on client PCs at a branch office.
technutz (Author) Commented:
@Korbus: I changed the DNS to point to the main office and disabled the local DNS. Still no dice: System error 5, Access denied. I have put things back as they normally are.

@mkline71: I have made the required change to switch to TCP for Kerberos. I am waiting for another opportunity to reboot, and then we will find out.

technutz (Author) Commented:

Forcing Kerberos to use TCP on both the branch office and main office servers, followed by a reboot, did not work. Below is the error message logged when the branch office attempts to connect. Again, the ticket viewer utility shows no tickets at the main office that represent the branch office.

Event Type: Failure Audit
Event Source: Security
Event Category: Logon/Logoff
Event ID: 529
Date:  5/10/2009
Time:  9:23:19 PM
Computer: UCSVR01
Logon Failure:
   Reason:  Unknown user name or bad password
  User Name:
  Logon Type: 3
  Logon Process: Kerberos
  Authentication Package: Kerberos
  Workstation Name: -
  Caller User Name: -
  Caller Domain: -
   Caller Logon ID: -
  Caller Process ID: -
  Transited Services: -
  Source Network Address:
  Source Port: 1186
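
When comparing events like this across several servers, it can help to pull the fields out programmatically and list the ones that came back empty. The following is a rough Python sketch, not a tool anyone in this thread used: the function names are made up, the sample text is a shortened copy of the event above, and the logon type table covers only a few common values (type 3 is a network logon).

```python
# Sketch only: parse "Name: value" lines from a pasted security event
# and report which fields are blank. Helper names are hypothetical.

SAMPLE_EVENT = """\
Event Type: Failure Audit
Event ID: 529
Logon Failure:
   Reason:  Unknown user name or bad password
  User Name:
  Logon Type: 3
  Logon Process: Kerberos
  Workstation Name: -
  Source Network Address:
  Source Port: 1186
"""

# A few well-known Windows logon types; not a complete list.
LOGON_TYPES = {2: "Interactive", 3: "Network", 10: "RemoteInteractive"}


def parse_event(text):
    """Split each line at its first colon into a field name and value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def blank_fields(fields):
    """Return field names whose value is empty or just a dash.

    Section headers such as "Logon Failure:" also show up here,
    because they carry no value on their own line.
    """
    return [name for name, value in fields.items() if value in ("", "-")]


if __name__ == "__main__":
    event = parse_event(SAMPLE_EVENT)
    print("Logon type:", LOGON_TYPES.get(int(event["Logon Type"]), "unknown"))
    print("Blank fields:", ", ".join(blank_fields(event)))
```

In the event above the user name and the source network address are both empty, which matches the earlier note that the main office event shows blank fields.
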

Any other thoughts? How can I tell if the DC has tombstoned? Or how can I make kerby happy at the PDC at the main office?

Mike Kline Commented:
You can run dcdiag /v or repadmin /showreps to see the last time replication happened (help with the tombstoned question)
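
Once repadmin gives you the date of the last successful replication, the tombstone question is simple date arithmetic. Here is a small Python sketch of that check. The dates are illustrative assumptions (roughly five months before the 5/10/2009 event above), and the two lifetimes are the common defaults as far as I know: 60 days for older forests and 180 days for forests created on Windows Server 2003 SP1 or later. The real value is whatever the forest's tombstoneLifetime attribute says, so verify it before drawing conclusions.

```python
# Sketch only: compare time since last replication with a tombstone
# lifetime. Dates and the helper name are assumptions for illustration.
from datetime import date, timedelta


def past_tombstone_lifetime(last_replication, today, tombstone_days):
    """True if the DC has gone longer without replicating than the lifetime."""
    return (today - last_replication) > timedelta(days=tombstone_days)


if __name__ == "__main__":
    last_good = date(2008, 12, 10)   # assumed last successful replication
    checked_on = date(2009, 5, 10)   # date of the event posted above
    for lifetime in (60, 180):       # common defaults, verify in your forest
        expired = past_tombstone_lifetime(last_good, checked_on, lifetime)
        print("lifetime", lifetime, "days -> past lifetime:", expired)
```

With these assumed dates, a five-month outage is past a 60-day lifetime but still inside a 180-day one, which is why the forest's actual setting matters.
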
technutz (Author) Commented:
OK, this was deeper than I could troubleshoot alone. However, I do have a resolution now:

The router was not the problem; the vendor confirmed this by looking at captured packets and Kerberos traffic. They were being passed.

The root issue (per Microsoft Support):
- Because the VPN was down, the branch offices were offline and unable to talk to the Primary Domain Controller. The branch offices did not receive the encryption information update they needed to authenticate to the PDC. While the branch offices were not connected, the PDC updated itself with new security encryption information using tickets for Kerberos. Therefore the PDC was refusing to talk to all the branch offices, because the domain controllers at the branch offices were still using the old tickets. All this is deep-level stuff that I normally do not see, that is, until it breaks of course.
The fix:
- Purge the Kerberos tickets at the branch offices
- Request new ones from the PDC
- Reboot

This has fixed ALL resource sharing and SQL between the branch offices and the main office.

Here is all we had to do for anyone else that runs into this issue:

In Console:
- net stop kdc
- klist purge (Windows Resource Kit tools)
- netdom resetpwd /server:pdc /userd:administrator /passwordd:password

If it says successful, run the following command:
- net use \\pdc\ipc$

If this is successful, run:
- net start kdc

Reboot the server
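
If the same sequence has to be repeated on several branch DCs, the steps above could be wrapped so that a failure stops the run before the next command. This is a hedged Python sketch, not what Microsoft Support ran: the helper names are made up, "pdc" and "administrator" are the placeholders from the commands above, and /passwordd:* is used on the assumption that netdom prompts for the password when given an asterisk. Note that the sketch stops on any non-zero exit code, which is stricter than the manual steps, and it leaves the reboot to you.

```python
# Sketch only: run the recovery commands in order and stop at the first
# failure. build_steps and run_steps are hypothetical helper names.
import subprocess


def build_steps(pdc, admin):
    """Return the commands from the fix, in order, as argument lists."""
    return [
        ["net", "stop", "kdc"],
        ["klist", "purge"],  # klist from the Windows Resource Kit tools
        ["netdom", "resetpwd", "/server:" + pdc,
         "/userd:" + admin, "/passwordd:*"],  # assumed: * prompts for password
        ["net", "use", "\\\\" + pdc + "\\ipc$"],
        ["net", "start", "kdc"],
    ]


def run_steps(steps, runner=subprocess.call):
    """Run each step; stop when one returns a non-zero exit code.

    Returns the list of steps that completed successfully.
    """
    completed = []
    for step in steps:
        if runner(step) != 0:
            break
        completed.append(step)
    return completed


if __name__ == "__main__":
    steps = build_steps("pdc", "administrator")
    done = run_steps(steps)
    if len(done) == len(steps):
        print("All steps succeeded; reboot the server.")
    else:
        print("Stopped at:", " ".join(steps[len(done)]))
```

The runner argument exists so the sequence can be exercised with a stand-in function before it is pointed at a real server.
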

I hope that helps!
technutz (Author) Commented:
Even though I had to get Microsoft Active Directory support involved, we were close to an answer. I gave credit to those who posted links, commands, and references to research. I thank everyone who participated.
Mike Kline Commented:
Good fix man!!! Nicely done, I should have gotten the klist purge...I love learning :)
Question has a verified solution.
