manic_andy

Time Sync Problem

Hey,

Recently one of my VMs (vCenter, actually) started reporting that its time sync was out, even though its clock matched my DCs and it was set to sync with them.  Today I noticed this in the event log:

The failure code from authentication protocol Kerberos was "The time at the Primary Domain Controller is different than the time at the Backup Domain Controller or member server by too large an amount.
(0xc0000133)".

It's referencing one of my DCs (DC2), which is the alternate DNS server but holds all of the FSMO roles.

Recently I replaced my preferred DNS server with another; well, I swapped the IP addresses round, as I was retiring the old one.  The FSMO roles remained on the secondary, DC2, the entire time.  This all seemed to go fine at the time and has been like this for 2-3 months now.  Originally I had my old DC1 getting time from an external source, so I have just run net time /querysntp on each DC and I get this:

OLDDC1 - syncs to ntp.massey.ac.nz
NEWDC1 - syncs to OLDDC1
DC2 (all FSMO roles) - syncs to OLDDC1

And the affected client is trying to sync to DC2.

The times all seem to agree between them, but any ideas what I can do to resolve this before I get any serious time issues?  Or any recommendations for time sync best practices?

All the DCs and the client VM affected are Windows Server 2003.

Thanks,

Andy
IanTh

Why don't you use the external time sync on your LAN, as that will never change?
manic_andy (ASKER)

Sorry, I don't get what you mean by time sync on the LAN?
ASKER CERTIFIED SOLUTION
172pilotSteve
This solution is only available to members.
OK, so I've had a bash at this and seem to have had mixed results.

To sum up my environment.

OLDDC1 - used to have FSMO, but its IP changed a while ago as it's due to be retired
DC1
DC2 - holds all FSMO roles

So I went onto DC2 and did
w32tm /config /manualpeerlist:"ntp.massey.ac.nz 130.123.2.98" /syncfromflags:manual /reliable:yes /update
and net stop w32time then net start w32time, and that seemed to work.  The DC2 event log shows it connecting to the external time server, and if I do a net time /querysntp from DC2 it shows the external server.  Great.

So, I go to OLDDC1 and do
w32tm /config /syncfromflags:domhier /reliable:no /update
and net stop w32time then net start w32time

BUT, I then get event ID 14 for source W32Time
The time provider NtpClient was unable to find a domain controller to use as a time source. NtpClient will try again in 15 minutes.

So I set it back to sync via the external source with reliable set to NO, and it synced OK.  Why wouldn't syncing from the domain hierarchy work?

From my desktop I ran net time /querysntp, and although my logon server is DC1, it shows the time as coming from OLDDC1??

Any ideas what's going on?
That is pretty funky..   To begin with, I would check that all of the DCs are "close enough", because Kerberos will fail if the clocks differ by more than its allowed skew (5 minutes by default), and that will stop the DCs from communicating properly, which will just make it much worse and cause strange things like this...
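
One way to check "close enough" without eyeballing clocks is a quick SNTP query against each DC. The following is only a rough sketch, assuming the DCs answer plain SNTP requests on UDP port 123 and that the machine running it has a trustworthy clock; the function names are made up for illustration, and the offset ignores network delay.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 (NTP) and 1970-01-01 (Unix)
MAX_KERBEROS_SKEW = 300        # Kerberos default clock tolerance: 5 minutes


def build_request():
    """48-byte SNTP client request: LI=0, version=3, mode=3 gives first byte 0x1B."""
    return b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet):
    """Return the server's transmit timestamp (bytes 40-47) as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("short NTP packet")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def query_offset(host, timeout=3.0):
    """Approximate offset (server minus local) in seconds; network delay is ignored."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (host, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet) - time.time()


def skew_report(offsets, limit=MAX_KERBEROS_SKEW):
    """Map each host to True if its offset is within the allowed skew."""
    return {host: abs(offset) <= limit for host, offset in offsets.items()}


def check_hosts(hosts):
    """Query each host and print its offset, or the reason it could not be reached."""
    for host in hosts:
        try:
            offset = query_offset(host)
        except OSError as exc:  # covers timeouts and name resolution failures
            print(f"{host}: no reply ({exc})")
            continue
        verdict = "OK" if abs(offset) <= MAX_KERBEROS_SKEW else "TOO FAR"
        print(f"{host}: {offset:+.3f}s {verdict}")
```

Calling check_hosts(["OLDDC1", "DC1", "DC2"]) with the DC names from this thread would print one line per DC. Anything marked TOO FAR is outside the default Kerberos tolerance.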

That being said, another thing I'd look at would be DNS...  Are all of your DCs also running DNS?    If you look at the Active Directory DNS zone, are the DCs all listed with the proper IP addresses?  Are there other DCs or IP addresses listed which are no longer accurate?
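
The DNS check can also be scripted. This is a minimal sketch, assuming you keep a small table of the addresses each DC should have; the host names and IPs in the usage note are hypothetical placeholders, not values from this environment.

```python
import socket


def find_dns_mismatches(expected, resolve=socket.gethostbyname_ex):
    """Compare resolved IPv4 addresses with the expected ones.

    `expected` maps host name -> set of addresses it should resolve to.
    Returns {host: (expected_set, resolved_set)} for every host that differs.
    """
    mismatches = {}
    for host, want in expected.items():
        try:
            _name, _aliases, addresses = resolve(host)
            got = set(addresses)
        except OSError:
            got = set()  # the name did not resolve at all
        if got != set(want):
            mismatches[host] = (set(want), got)
    return mismatches
```

For example, find_dns_mismatches({"dc1.example.local": {"10.0.0.1"}, "dc2.example.local": {"10.0.0.2"}}) returns an empty dict when every name resolves to exactly the expected addresses. A stale or extra record shows up as a differing resolved set. It only checks A records through the local resolver, so it is no substitute for looking at the zone itself.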

Each DC should be pointing to itself for DNS, but not until replication and time sync are working..  If you're still having problems, sometimes you can get it all to work by setting the DNS client (in the network configuration) on each DC to point to the DC that has the FSMOs, and then they'll replicate correctly and you can set each to point to itself again.

Are there any other AD / DNS errors listed in any Event logs?
Yeah all the DCs have time which looks the same, same as all my member servers.

Yep, in DNS all my DCs show up correctly with the correct IP address.  I do have an additional entry in the ForestDNSZones tree for the IP address of my old management server for some reason, but this isn't a DC or DNS server?

All the DCs/DNS point to DC2 (FSMO holder) for preferred DNS and DC1 for alternate DNS.  None of them point to OLDDC1.

No AD or DNS errors for months.
Hmm..  OK -  Re-reading what you said...  Maybe you need to be more patient after the "The time provider NtpClient was unable to find a domain controller to use as a time source. NtpClient will try again in 15 minutes" error...  I had that same error on a VM late last week, and it was just that the NTP was starting up before the AD services, etc, and on the next cycle, it did work, and all was well...

One factor that was making my situation worse was that the ESXi host had an invalid time server configured, and for some reason, this made the client tools sync the WRONG TIME to the VMs, until the NTP service was able to correct (after 15 minutes)..  Fixing the NTP config on the host helped a lot..

What version of ESX(i) or other VMware are you using?
Thanks, I'll give it another try tomorrow and wait the 15 mins for it to retry. I just panicked, I think, and set it back.

The DCs are all physical servers actually.  The original problem which alerted me to this was that my vCenter VM gave an error relating to time sync, and I've just checked and it's done it again just now, so I may give that VM a bounce tomorrow when I can to see if that helps.  I'm using VMware ESXi 4.1 and vSphere 4.1 for those.
Ok..  That's cool - One less variable (the ESXi forcing a host timesync).  

If the vcenter is in the domain, and you haven't changed the timesync config for that VM, and it's NOT set to sync to the hardware with the vmTools, then it should EVENTUALLY get a timesync from ONE of the DCs...

Worst case, if the time is REALLY off, then AD won't be available right away because Kerberos will fail to authenticate the machine account, but DNS should still work to find the DCs, and after 15 minutes the VM should be able to set its clock and all will be OK..  At least that's the plan!  :-)

Still not a bad idea to set the ESXi servers to the right time zone and set an external NTP host, because unlike a PHYSICAL machine, the virtuals don't have CMOS batteries and clocks. So each time you turn on those VMs, even if you don't want them syncing to the hardware, there's nowhere else to get the initial clock at first boot than the ESXi CMOS clock, so that's where it comes from...
Time syncing is working OK now.  Never got the original DC sorted but will be retiring it soon anyway.