andyhamm

asked on

Kerberos/Domain login problems after ADR of 2003 Server - very strange behavior

Hi Everyone,

Here is the history:

A RAID set holding the system disk of a 2003 Server (which also runs Exchange) failed, and the system could not be recovered by rebuilding the array. The only option available was to use the ADR and restore to a point 2 months earlier (the local admin only runs the ADR when he feels there has been a major change in the environment). The DATA array is OK, so it does not need to be recovered, and we had an up-to-date Exchange backup. With this being my best available option, I restored the server using the ADR method.

The server came back up as a spitting image of itself from 2 months earlier, which is fine. I performed the Exchange restore, which went off without a hitch. The event logs of the server look clean at this point, other than the fact that it is not connected to the network.

Here is the strange part:

We connect the server back onto the network. It connects to the internet, mail begins flowing in and out of the server again, and in general things look good.

Then, we try to bring the workstations back online, and they get errors stating that they cannot find the Domain Controller, and cannot log in. We fiddle a bit, and I find that if you disconnect the LAN cable from the (Win XP SP2) workstations and then log into the domain, then reconnect the cable you can access the server and Exchange.

That gets us connected, although I cannot have the users perform this ritual every time they reboot their workstations. There are a few machines on the network that are still running Windows 98, and they can connect to the domain with no problem.

But wait, there are other issues:

I cannot connect from the server to the workstations. I get various "no permission" type errors depending on how I attempt to connect. Some attempts result in Kerberos errors in the server's event log, while others log errors that equate to having two computer accounts with the same name on the domain (which I assure you is not the case). There are no other servers on the network that could be out of sync, because the server replicates with no one. DNS does not have multiple entries for the machines.

I can ping the workstations and resolve their names properly, and I have confirmed that DHCP and DNS are set up properly on the server. The workstations use the server as their only DNS server, and they can all resolve names without issue. I flushed DNS on the workstations, released and renewed their IPs, and made sure that the time was in sync with the server. On the surface, everything looks fine.

Furthermore, you get all of the problems associated with broken domain authentication, such as the machines being unable to connect to printers shared on other workstations.

The suggested solution:

The only solution that I can come up with for this problem is to remove all of the XP workstations' computer accounts from the domain and re-add them, but there is an issue with that. This is a relatively new installation (4 months up and running), and there were difficulties copying the users' profiles on the workstations. The users have a software set that takes quite some time to configure, and they have only recently ironed out all of the bugs from the last migration. Taking them down that path again has to be avoided at all costs.

Any other suggestions?

Thanks,

Andy

andyhamm

ASKER

Well, I have found the answer to part of the problem.

As we all know, computers on a Windows Domain have computer accounts (unless they are old 9x boxes). These computer accounts have passwords, and these passwords change every 30 days (by default). Since the backup that was used to restore the system is older than 30 days, the machine account passwords are out of sync.
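
To make the arithmetic concrete, here is a rough Python sketch of how many scheduled password changes a workstation could have made between the backup and the restore. The function name and the dates are made up for illustration, and it assumes the default 30-day interval and a workstation that was online the whole time.

```python
from datetime import date

# Default interval between machine account password changes, as noted above.
DEFAULT_MAX_AGE_DAYS = 30

def rotations_since(backup_date, restore_date, max_age_days=DEFAULT_MAX_AGE_DAYS):
    """Estimate how many scheduled machine account password changes fit
    between the backup and the restore (hypothetical helper)."""
    elapsed_days = (restore_date - backup_date).days
    return elapsed_days // max_age_days

# Hypothetical dates: a backup taken 60 days before the restore.
print(rotations_since(date(2005, 1, 1), date(2005, 3, 2)))
```

Anything above zero means the restored directory may hold an older password than the workstation does.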

Now all I need to do is find a way to re-sync all of my computer account passwords....

Any suggestions?

Andy
Can I get points for answering my own questions?

Here is the solution:

Get NETDOM.EXE from the support tools on the CD and create the following batch file:

NETDOM RESETPWD /Server:servername.domainname /UserD:NETBIOSDOMAINNAME\administrator /PasswordD:*

Replace 'servername.domainname' with the fully qualified domain name of your server (the internal FQDN, that is), and 'NETBIOSDOMAINNAME' with your own domain name. You can enter your administrator password instead of the *, but I wouldn't recommend it, as it would be easy to discover.

I put the script on the server in one of the common shares. Then you have to go around to every computer and run the script (you can't run it remotely, even if you could log onto the machines). It will prompt you for the administrator's password, and hopefully you will be in good shape. I run this script from a command prompt so I can see any error messages, which would otherwise be missed if it was launched from the GUI.

Now that everything is back in place, take a step back and beat yourself up for not backing up the system state on the server every day because your tape drive is not big enough. Start backing it up to a hard drive if you have to, because if we had had a current system state backup, we never would have gotten ourselves into this mess in the first place.
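
If Python happens to be available on the workstations (an assumption; the plain batch file is all that is actually needed), the same call could be wrapped so the exit code is reported along with any messages. This is only a sketch: the function names and the example server and domain values are hypothetical, and it assumes NETDOM follows the usual convention of returning a non-zero exit code on failure.

```python
import subprocess

def build_resetpwd_command(server_fqdn, netbios_domain, admin_user="administrator"):
    """Build the same NETDOM RESETPWD command line used in the batch file."""
    return [
        "NETDOM", "RESETPWD",
        "/Server:" + server_fqdn,
        "/UserD:" + netbios_domain + "\\" + admin_user,
        "/PasswordD:*",  # '*' makes NETDOM prompt instead of storing the password
    ]

def run_resetpwd(server_fqdn, netbios_domain):
    """Run NETDOM on the local workstation and report its exit code."""
    cmd = build_resetpwd_command(server_fqdn, netbios_domain)
    print("Running:", " ".join(cmd))
    exit_code = subprocess.call(cmd)  # output and password prompt stay on the console
    if exit_code != 0:
        print("NETDOM exited with code", exit_code)
    return exit_code
```

On each workstation it would be called as something like run_resetpwd("server1.example.local", "EXAMPLE"), with your own server FQDN and NetBIOS domain name in place of those placeholder values.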

Lesson re-learned from experience.

Andy