honeywell2012

Time Drifting Forward

Hello
We have an issue with about 60 Windows 2003 R2 Dell R410 servers drifting ahead in time.
The remaining 40 servers don't drift; these are more Dell R410s and 710s, all running Windows 2003 R2.

We have a scheduled task on each server that syncs time every 5 minutes with the domain (Hosted on VMware ESX). Before the next sync, an affected server has gained approx. 10 seconds.
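To put that figure in perspective, here is a minimal Python sketch that converts "about 10 seconds gained per 5-minute sync interval" into a drift rate. The function names are made up for illustration, and the inputs are just the approximate numbers quoted above.

```python
def drift_ppm(gained_seconds, interval_seconds):
    """Drift expressed in parts per million of elapsed time."""
    return gained_seconds / interval_seconds * 1_000_000


def drift_seconds_per_day(gained_seconds, interval_seconds):
    """Seconds the clock would gain in 24 hours if it were never corrected."""
    return gained_seconds / interval_seconds * 86_400


if __name__ == "__main__":
    # Approximate figures from the post: 10 s gained every 300 s.
    print(f"{drift_ppm(10, 300):.0f} ppm")
    print(f"{drift_seconds_per_day(10, 300):.0f} s/day")
```

At this rate the clock runs roughly 3% fast, far more than ordinary crystal drift (usually tens of ppm), which suggests that something is actively speeding up the system clock rather than a marginal oscillator.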

We have tried updating the server firmware and BIOS, and even completely replacing the server, and we still have the same issue.

I was wondering if anyone knows of any tools that can help me identify what's causing the drift?

Any help is much appreciated.
sjklein42

I think this may apply:

Use of the /USEPMTIMER switch in boot.ini:

http://blogs.technet.com/b/perfguru/archive/2008/02/18/explanation-for-the-usepmtimer-switch-in-the-boot-ini.aspx

A Windows Server 2003-based server may experience time-stamp counter drift if the server uses dual-core AMD Opteron processors or multiprocessor AMD Opteron processors. When this problem occurs, operations that rely on the time-stamp counter may not function correctly. These operations include network communications and performance monitoring.

http://support.microsoft.com/kb/938448

lwalcher
Another option would be to use VMware Tools on these W2K3 VMs to sync with the ESX host. Ideally you should sync ALL your devices the same way, though.

It sounds like you are just using the Network Time Protocol (NTP) support built into Windows, is that correct? If so, all member servers will sync with the domain controller(s) in their AD site, which in turn will sync with the DC holding the PDC Emulator FSMO role. The PDC Emulator will need to be configured to point to an internal or external time source outside of the Windows domain. Is the time drifting on any of your domain controllers? Maybe there is one DC that is causing this problem for all its NTP "clients".
honeywell2012

ASKER

Hi sjklein42

This article refers to HP servers and AMD processors, and we have Dell servers with Intel processors.
I will give it a try and let you know.
Hi lwalcher:

Yes, we are using Windows NTP: member servers get time from the DCs, the DCs get time from the PDC, and the PDC gets time from a physical source.

The DCs are in sync and only these particular servers are drifting.

The domain with 2 DCs has over 100 workstations which also don't drift, so I don't think it's the DCs.

All devices are synced in the same way via the domain. I previously tried using VMware Tools to sync the DCs, but this was causing drift on the DCs, so I have gone with the recommended w32tm approach.
Hi,

When you create a VM that is a DC, configure it so that it keeps its time in sync with the PDC and does not sync with the physical host machine. Check http://www.vmware.com/files/pdf/Virtualizing_Windows_Active_Directory.pdf and go through page 5.

Hope this helps !!!
Hi Gurdeep
Thanks for the link. I have previously followed this article; the only difference in our environment is that the PDC does not go to a stratum 1 time source. It is currently taking its time from a physical server set up as a Windows NTP server.
Hi Honeywell,

If you want the PDC to be the authoritative time server (with an external time source), configure the time service registry values as follows:

1. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config --- Change the AnnounceFlags DWORD value to 5 (Decimal)

2. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters --- Change the Type string value to NTP

3. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Parameters --- Change the NtpServer string value to pool.ntp.org,0x1

4. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient --- Change the SpecialPollInterval DWORD value to 1800 (Decimal)

5. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\TimeProviders\NtpClient --- Change the SpecialPollTimeRemaining value to pool.ntp.org,7baa3bf

6. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config --- Change the MaxPosPhaseCorrection DWORD value to 1800 (Decimal)

7. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config --- Change the MaxNegPhaseCorrection DWORD value to 1800 (Decimal)
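The registry changes listed above could be scripted rather than typed by hand. The Python sketch below only generates `reg add` command lines for review; it does not touch the registry itself. Two assumptions to check before use: the key locations and value types follow Microsoft's published W32Time guidance as I understand it (AnnounceFlags and the phase-correction limits under Config, Type and NtpServer as string values under Parameters), and step 5 is left out because SpecialPollTimeRemaining is normally maintained by the service. The `SETTINGS` table and `reg_add_command` helper are illustrative names, and pool.ntp.org is simply the example server from the post.

```python
BASE = r"HKLM\SYSTEM\CurrentControlSet\Services\W32Time"

# (subkey, value name, registry type, data), using the values from the post.
SETTINGS = [
    ("Config", "AnnounceFlags", "REG_DWORD", "5"),
    ("Parameters", "Type", "REG_SZ", "NTP"),
    ("Parameters", "NtpServer", "REG_SZ", "pool.ntp.org,0x1"),
    (r"TimeProviders\NtpClient", "SpecialPollInterval", "REG_DWORD", "1800"),
    ("Config", "MaxPosPhaseCorrection", "REG_DWORD", "1800"),
    ("Config", "MaxNegPhaseCorrection", "REG_DWORD", "1800"),
]


def reg_add_command(subkey, name, reg_type, data):
    """Build one 'reg add' command line; /f overwrites without prompting."""
    return f'reg add "{BASE}\\{subkey}" /v {name} /t {reg_type} /d "{data}" /f'


if __name__ == "__main__":
    for setting in SETTINGS:
        print(reg_add_command(*setting))
```

The printed commands can be pasted into an elevated command prompt on the PDC emulator. The Windows Time service typically needs a restart (net stop w32time, then net start w32time) before it picks up the new values.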

Hope this helps !!!

You may have already read this, but lots of good info here:
http://www.vmware.com/files/pdf/Timekeeping-In-VirtualMachines.pdf

On page 4 it states, "In...ESX/ESXi 4.0 and earlier, VMwareTools does not correct errors in which the guest clock is ahead of real time, only those in which the guest clock is behind." So it actually would NOT solve your problem to use VMware Tools to sync time on your guests.

From page 12, this may be what's happening on your VMs:

"A daemon present in Windows NT-family systems (that is, Windows NT 4.0 and later) checks the system time of day against the CMOS TOD clock once per hour. If the system time is off by more than 60 seconds, it is reset to match the TOD clock...One possible (though rare) problem can occur if the daemon sets the clock ahead while the virtual machine is in the process of catching up on an interrupt backlog. Because the virtual machine is not aware that the guest operating system clock has been reset, it continues catching up, causing the clock to overshoot real time."

From page 23, here are some things to double-check (I can tell by your answers that you are pretty sharp and have been working this issue for awhile, so please don't take these the wrong way):

• If possible, use the most recent release of your VMware product, or at least the most recent minor release of the major version you are using. We are always working on improving timekeeping performance and fixing problems.
• If possible, use the most recent supported minor version of the guest operating system in each of your virtual machines. Updates and vendor patches sometimes fix timekeeping issues...Check the VMware knowledge base for articles about specific configuration options or workarounds for guest operating system bugs that might be needed for the operating system version each of your virtual machines is running....
• Check that your host system is configured for the correct time and time zone. Check that it is running suitable clock synchronization software, as described in “Host Clock Synchronization” on page 18.
• Check that your virtual machines are set to the correct time zone...
• Check that you have appropriate clock synchronization software installed and configured in your virtual machines, as described in “Synchronizing Virtual Machines and Hosts with Real Time” on page 17.
• Check that VMware Tools is installed in your virtual machines. Even if you are not using VMware Tools periodic clock synchronization, the one-time clock corrections discussed in “Using VMware Tools Clock Synchronization” on page 18 are important. In addition, the VMware Tools package includes specialized device drivers that improve overall performance of virtual machines, reducing CPU load and thereby indirectly helping timekeeping performance as well.

And last but DEFINITELY not least, pages 24-26 have some specific troubleshooting steps/tools, which is what you were asking for in your original post! :)
Here are some potential solutions from VMware also:

Time runs too fast in a Windows virtual machine when the Multimedia Timer interface is used
http://kb.vmware.com/kb/1005953
(If you are on ESX 3.5 or later this should not be an issue)

Timekeeping best practices for Windows, including NTP
http://kb.vmware.com/kb/1318
Hi sjklein42

I have configured the boot.ini file with USEPMTIMER and the time still drifts.

Other points I have noticed:

When the server is booted into the BIOS, the BIOS time does not drift. Also, if I disconnect the NIC from the server, the time stops drifting.

Regards
Hi Gurdeep

I have set up the PDC as per MS recommendations.

6 and 7 should be 3600 (1 hour)
4 should be 900 (every 15 mins)
3 and 5 point to a physical server, so they do not go out to the internet

I don't see how these settings can affect the issue we are facing?
Hi lwalcher

Thanks for the info.

None of our virtual servers are drifting. The servers that are drifting are some of the physical ones. We have over 100 workstations and over 40 servers that don't drift, so I don't think it has anything to do with VMware. A colleague has just told me these servers were drifting before they joined the domain.
honeywell2012: My apologies, I misunderstood your original post when you said, "We have a scheduled task on each server that syncs time every 5 minutes with the domain (Hosted on VMware ESX)." I assumed this meant your 60 servers having the problem were hosted on ESX. So you are saying it is your two DC's (i.e., "the domain") that are hosted on ESX?

If that is the case, and the DC's are consistently in sync, the next steps I would try would be:

* Check Event Logs for w32time warning/error/info events
* Try manually syncing on a problematic server and see if you get any errors: w32tm /resync
http://technet.microsoft.com/en-us/library/cc773263(WS.10).aspx
* Check w32time settings on a problematic server:
w32tm /query [/computer:<target>] {/source | /configuration | /peers | /status} [/verbose]
* Compare EVERYTHING related to w32time/network on one known bad and one known good Dell R410 server: drivers, firmware, DNS settings, patch levels, w32time registry entries, DNS/AD settings, firewall settings, time zone settings in the system tray clock, anti-virus, BIOS revision. There just HAS to be SOMETHING different that is causing this strangeness.
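The "compare everything" step is easier to do systematically than by eye. As a rough Python sketch, assuming you have exported the settings of a known-good and a known-bad server into simple "name: value" text (the sample values below are invented placeholders, not data from this thread), the following reports only the entries that differ:

```python
def parse_settings(text):
    """Parse 'name: value' lines into a dict, ignoring lines without a colon."""
    settings = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(good, bad):
    """Return {name: (good_value, bad_value)} for every entry that differs."""
    diffs = {}
    for key in sorted(set(good) | set(bad)):
        if good.get(key) != bad.get(key):
            diffs[key] = (good.get(key), bad.get(key))
    return diffs


if __name__ == "__main__":
    # Hypothetical exports; replace with real dumps from the two servers.
    good_dump = "BIOS: 1.0\nNIC driver: 5.1\nTime zone: GMT\n"
    bad_dump = "BIOS: 1.0\nNIC driver: 5.1\nTime zone: GMT\nExtra service: yes\n"
    for name, (g, b) in diff_settings(parse_settings(good_dump),
                                      parse_settings(bad_dump)).items():
        print(f"{name}: good={g!r} bad={b!r}")
```

Entries present on only one server show up with None on the other side, which is useful for spotting software installed on the drifting servers but not on the healthy ones.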
Hi lwalcher
I see no errors or warnings from w32tm. Manual sync and auto sync always work, but the clock starts to drift within 10 seconds of syncing.
All these servers were bought in the same batch, so they have the same firmware, and the servers were built using the same image, so they have the same patches and settings.

The only difference between the servers is their location. All servers that drift connect to the core switch via a local switch; the servers connected directly to the core switch don't drift.

Thanks
Are the servers that time drift on the other side of a WAN from the core switch? Or are these switches directly connected in the same LAN?

The reason I ask is that there is a classic AD sync issue over a WAN (related to MTU size). I would expect this to cause other AD-related issues on the remote side, and if this is the issue you should definitely see AD/time sync problems on your domain controllers in the same AD site as the servers that are drifting.
Hi lwalcher

Thanks for your help so far.

The switches are directly connected to the same LAN. The DC's are never out of sync. All servers are on the same LAN.

I have just found some servers that don't drift but are in the same location as other servers that do drift?

Very weird......

The time syncs with the domain and is correct, but within 10 seconds it has drifted ahead by 1 second.

Running out of ideas now!!
If you run the "w32tm /query /source" command from the command line on two identical servers, one that has drift and one that doesn't, do they show the same source DC? (Maybe one DC is somehow "off" and the other "on"?)

Stepping back a bit, Microsoft does state that w32time is not designed for highly accurate timekeeping:
http://support.microsoft.com/kb/939322

However, 10 seconds drift every 5 minutes seems beyond the pale, IMHO.

Maybe the DCs are actually the ones that are off by 10 seconds (lagging instead of drifting forward) and these physical boxes are actually right? Do you have an independent time source (e.g., atomic clock) you are using to verify?

Or maybe you are located in the Bermuda Triangle? ;) Running low on ideas here, also, as you can see! :)

Luke
Hi lwalcher

The DCs are synced to the exact second. I know w32tm does not support highly accurate time; a few seconds off is acceptable, but not approx. 10 seconds every 5 minutes.

The DCs are syncing to a physical NTP server, and the time on this server and the DCs is the same. We have no atomic clock and this network is air gapped.

See below the w32tm /stripchart output showing the offset between the host and the DC. At 16:27 I ran w32tm /resync, and the time starts to drift again within a few seconds.

The current time is 12/19/2011 4:25:54 PM (local time).
16:25:54 d:+00.0000000s o:-02.4439461s  [                    *      |                           ]
16:25:56 d:+00.0000000s o:-02.6465641s  [                    *      |                           ]
16:25:58 d:+00.0000000s o:-02.7554381s  [                   *       |                           ]
16:26:00 d:+00.0000000s o:-02.9580600s  [                   *       |                           ]
16:26:02 d:+00.0000000s o:-02.9731822s  [                   *       |                           ]
16:26:04 d:+00.0000000s o:-02.9883044s  [                   *       |                           ]
16:26:06 d:+00.0000000s o:-03.0346772s  [                   *       |                           ]
16:26:08 d:+00.0000000s o:-03.1748018s  [                  *        |                           ]
16:26:10 d:+00.0000000s o:-03.2992972s  [                  *        |                           ]
16:26:12 d:+00.0000000s o:-03.3769206s  [                  *        |                           ]
16:26:14 d:+00.0000000s o:-03.3764175s  [                  *        |                           ]
16:26:16 d:+00.0000000s o:-03.3759144s  [                  *        |                           ]
16:26:18 d:+00.0000000s o:-03.4222833s  [                  *        |                           ]
16:26:20 d:+00.0000000s o:-03.4530308s  [                  *        |                           ]
16:26:22 d:+00.0000000s o:-03.5462756s  [                 *         |                           ]
16:26:24 d:+00.0000000s o:-03.5613939s  [                 *         |                           ]
16:26:26 d:+00.0000000s o:-03.6077667s  [                 *         |                           ]
16:26:28 d:+00.0000000s o:-03.7635166s  [                 *         |                           ]
16:26:30 d:+00.0000000s o:-03.9817638s  [                *          |                           ]
16:26:33 d:+00.0000000s o:-04.1531312s  [                *          |                           ]
16:26:35 d:+00.0000000s o:-04.2307546s  [               *           |                           ]
16:26:37 d:+00.0000000s o:-04.2458729s  [               *           |                           ]
16:26:39 d:+00.0000000s o:-04.2922457s  [               *           |                           ]
16:26:41 d:+00.0000000s o:-04.5104968s  [               *           |                           ]
16:26:43 d:+00.0000000s o:-04.6349922s  [              *            |                           ]
16:26:45 d:+00.0000000s o:-04.7282409s  [              *            |                           ]
16:26:47 d:+00.0000000s o:-04.7277378s  [              *            |                           ]
16:26:49 d:+00.0000000s o:-04.7272347s  [              *            |                           ]
16:26:51 d:+00.0000000s o:-04.7892250s  [              *            |                           ]
16:26:53 d:+00.0000000s o:-04.8512192s  [              *            |                           ]
16:26:55 d:+00.0000000s o:-04.8819667s  [              *            |                           ]
16:26:57 d:+00.0000000s o:-04.8970889s  [              *            |                           ]
16:26:59 d:+00.0000000s o:-04.9903298s  [             *             |                           ]
16:27:01 d:+00.0000000s o:-05.1304505s  [             *             |                           ]
16:27:03 d:+00.0000000s o:-05.2861965s  [            *              |                           ]
16:27:05 d:+00.0000000s o:-05.4731892s  [            *              |                           ]
16:27:07 d:+00.0000000s o:-05.6758111s  [           *               |                           ]
16:27:09 d:+00.0000000s o:-05.6909333s  [           *               |                           ]
16:27:11 d:+00.0000000s o:-05.7529314s  [           *               |                           ]
16:27:13 d:+00.0000000s o:-05.9555533s  [           *               |                           ]
16:27:15 d:+00.0000000s o:-05.9862969s  [           *               |                           ]
16:27:17 d:+00.0000000s o:-05.9857938s  [           *               |                           ]
16:27:19 d:+00.0000000s o:-06.0634172s  [          *                |                           ]
16:27:21 d:+00.0000000s o:-06.0629141s  [          *                |                           ]
16:27:23 d:+00.0000000s o:-06.1092791s  [          *                |                           ]
16:27:25 d:+00.0000000s o:-06.3119010s  [          *                |                           ]
16:27:27 d:+00.0000000s o:-06.3113979s  [          *                |                           ]
16:27:29 d:+00.0000000s o:-06.3890213s  [         *                 |                           ]
16:27:31 d:+00.0000000s o:-06.4978992s  [         *                 |                           ]
16:27:33 d:+00.0000000s o:-06.5911479s  [         *                 |                           ]
16:27:35 d:+00.0156214s o:-06.7234579s  [         *                 |                           ]
16:27:37 d:+00.0000000s o:-06.8713893s  [        *                  |                           ]
16:27:39 d:+00.0000000s o:-06.9177621s  [        *                  |                           ]
16:27:41 d:+00.0000000s o:-07.1203879s  [       *                   |                           ]
16:27:43 d:+00.0000000s o:-07.2448833s  [       *                   |                           ]
16:27:45 d:+00.0000000s o:-07.4787558s  [      *                    |                           ]
16:27:47 d:+00.0000000s o:-07.6501232s  [      *                    |                           ]
16:27:49 d:+00.0000000s o:-07.6496201s  [      *                    |                           ]
16:27:51 d:+00.0000000s o:-07.6491170s  [      *                    |                           ]
16:27:53 d:+00.0000000s o:-07.7109706s  [      *                    |                           ]
16:27:48 d:+00.0000000s o:+00.0001244s  [                           *                           ]
16:27:50 d:+00.0000000s o:-00.1868761s  [                          *|                           ]
16:27:52 d:+00.0000000s o:-00.4207486s  [                          *|                           ]
16:27:54 d:+00.0000000s o:-00.4202455s  [                          *|                           ]
16:27:56 d:+00.0000000s o:-00.4978689s  [                          *|                           ]
16:27:58 d:+00.0000000s o:-00.6223604s  [                         * |                           ]
16:28:00 d:+00.0000000s o:-00.7156052s  [                         * |                           ]
16:28:02 d:+00.0000000s o:-00.9651030s  [                        *  |                           ]
16:28:04 d:+00.0000000s o:-00.9958505s  [                        *  |                           ]
16:28:06 d:+00.0000000s o:-01.0734700s  [                        *  |                           ]
16:28:08 d:+00.0000000s o:-01.1667187s  [                        *  |                           ]
16:28:10 d:+00.0000000s o:-01.3849620s  [                       *   |                           ]
16:28:12 d:+00.0000000s o:-01.4157056s  [                       *   |                           ]
16:28:14 d:+00.0000000s o:-01.4777037s  [                       *   |                           ]
16:28:16 d:+00.0000000s o:-01.5396940s  [                       *   |                           ]
16:28:18 d:+00.0000000s o:-01.5391909s  [                       *   |                           ]
16:28:20 d:+00.0000000s o:-01.5386878s  [                       *   |                           ]
16:28:22 d:+00.0000000s o:-01.7881895s  [                      *    |                           ]
16:28:24 d:+00.0000000s o:-01.9126810s  [                      *    |                           ]
16:28:26 d:+00.0000000s o:-01.9121779s  [                      *    |                           ]
16:28:28 d:+00.0000000s o:-02.1616757s  [                     *     |                           ]
16:28:30 d:+00.0000000s o:-02.2392952s  [                     *     |                           ]
16:28:32 d:+00.0000000s o:-02.3012933s  [                     *     |                           ]
16:28:34 d:+00.0000000s o:-02.5195327s  [                    *      |                           ]
16:28:36 d:+00.0000000s o:-02.5190296s  [                    *      |                           ]
16:28:38 d:+00.0000000s o:-02.5654024s  [                    *      |                           ]
16:28:40 d:+00.0000000s o:-02.7055270s  [                    *      |                           ]
16:28:42 d:+00.0000000s o:-02.7987718s  [                   *       |                           ]
16:28:44 d:+00.0000000s o:-02.9857684s  [                   *       |                           ]
16:28:47 d:+00.0000000s o:-02.9852614s  [                   *       |                           ]
16:28:49 d:+00.0000000s o:-03.0628809s  [                   *       |                           ]
16:28:51 d:+00.0000000s o:-03.0936284s  [                  *        |                           ]
16:28:53 d:+00.0000000s o:-03.0931253s  [                  *        |                           ]
16:28:55 d:+00.0000000s o:-03.1394981s  [                  *        |                           ]
16:28:57 d:+00.0000000s o:-03.1858670s  [                  *        |                           ]
16:28:59 d:+00.0000000s o:-03.1853639s  [                  *        |                           ]
16:29:01 d:+00.0000000s o:-03.3411060s  [                  *        |                           ]
16:29:03 d:+00.0000000s o:-03.3562282s  [                  *        |                           ]
16:29:05 d:+00.0000000s o:-03.5119742s  [                 *         |                           ]
16:29:07 d:+00.0000000s o:-03.7146000s  [                 *         |                           ]
16:29:09 d:+00.0000000s o:-03.7140969s  [                 *         |                           ]
16:29:11 d:+00.0000000s o:-03.7604697s  [                 *         |                           ]
16:29:13 d:+00.0000000s o:-03.9318410s  [                *          |                           ]
16:29:15 d:+00.0000000s o:-03.9625885s  [                *          |                           ]
16:29:17 d:+00.0000000s o:-04.0402041s  [                *          |                           ]
16:29:19 d:+00.0000000s o:-04.1178236s  [                *          |                           ]
16:29:21 d:+00.0000000s o:-04.1798217s  [                *          |                           ]
16:29:23 d:+00.0000000s o:-04.4293156s  [               *           |                           ]
16:29:25 d:+00.0000000s o:-04.4288125s  [               *           |                           ]
16:29:27 d:+00.0000000s o:-04.5220612s  [               *           |                           ]
16:29:29 d:+00.0000000s o:-04.5840593s  [              *            |                           ]
16:29:31 d:+00.0000000s o:-04.6148029s  [              *            |                           ]
16:29:33 d:+00.0156214s o:-04.8096141s  [              *            |                           ]
16:29:35 d:+00.0000000s o:-05.0513012s  [             *             |                           ]
16:29:37 d:+00.0000000s o:-05.1758005s  [             *             |                           ]
16:29:39 d:+00.0000000s o:-05.4252983s  [            *              |                           ]
16:29:41 d:+00.0000000s o:-05.4247952s  [            *              |                           ]
16:29:43 d:+00.0000000s o:-05.4242921s  [            *              |                           ]
16:29:45 d:+00.0000000s o:-05.5487914s  [            *              |                           ]
16:29:47 d:+00.0000000s o:-05.5795350s  [            *              |                           ]
16:29:49 d:+00.0000000s o:-05.6259039s  [            *              |                           ]
16:29:51 d:+00.0468642s o:-05.7738314s  [           *               |                           ]
16:29:53 d:+00.0000000s o:-05.8123818s  [           *               |                           ]
16:29:55 d:+00.0000000s o:-05.8118787s  [           *               |                           ]
16:29:57 d:+00.0000000s o:-05.8426262s  [           *               |                           ]
16:29:59 d:+00.0000000s o:-05.8889951s  [           *               |                           ]
16:30:01 d:+00.0000000s o:-06.0098512s  [          *                |                           ]
16:30:03 d:+00.0000000s o:-06.1398269s  [          *                |                           ]
16:30:05 d:+00.0000000s o:-06.2693952s  [          *                |                           ]
16:30:07 d:+00.0000000s o:-06.3681203s  [         *                 |                           ]
16:30:09 d:+00.0000000s o:-06.5608009s  [         *                 |                           ]
16:30:11 d:+00.0000000s o:-06.5657742s  [         *                 |                           ]
16:30:13 d:+00.0000000s o:-06.6017944s  [         *                 |                           ]
16:30:15 d:+00.0000000s o:-06.5755171s  [         *                 |                           ]
16:30:17 d:+00.0000000s o:-06.7367434s  [        *                  |                           ]
16:30:19 d:+00.0000000s o:-06.8040142s  [        *                  |                           ]
16:30:21 d:+00.0000000s o:-07.0121164s  [        *                  |                           ]
16:30:23 d:+00.0000000s o:-07.0793872s  [        *                  |                           ]
16:30:25 d:+00.0000000s o:-07.1312364s  [       *                   |                           ]
16:30:27 d:+00.0000000s o:-07.1047554s  [       *                   |                           ]
16:30:29 d:+00.0000000s o:-07.1251503s  [       *                   |                           ]
^C
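The stripchart output above can be turned into a drift rate with a small script. The Python sketch below is only an illustration: it assumes the line format shown in this thread (time, d: delay, o: offset), fits a least-squares line through the offsets, and uses a handful of samples copied from the output after the resync. The function names are hypothetical.

```python
import re

# Matches lines such as "16:27:48 d:+00.0000000s o:+00.0001244s  [ ... ]"
LINE_RE = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}) d:[+-]\d+\.\d+s o:([+-]\d+\.\d+)s"
)


def parse_stripchart(text):
    """Return a list of (seconds_since_midnight, offset_seconds) samples."""
    samples = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            h, m, s = (int(match.group(i)) for i in (1, 2, 3))
            samples.append((h * 3600 + m * 60 + s, float(match.group(4))))
    return samples


def drift_rate(samples):
    """Least-squares slope of offset against time, in seconds per second."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_o = sum(o for _, o in samples) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in samples)
    sxy = sum((t - mean_t) * (o - mean_o) for t, o in samples)
    return sxy / sxx


SAMPLE = """\
16:27:48 d:+00.0000000s o:+00.0001244s
16:28:06 d:+00.0000000s o:-01.0734700s
16:28:28 d:+00.0000000s o:-02.1616757s
16:29:01 d:+00.0000000s o:-03.3411060s
16:29:35 d:+00.0000000s o:-05.0513012s
16:30:01 d:+00.0000000s o:-06.0098512s
16:30:29 d:+00.0000000s o:-07.1251503s
"""

if __name__ == "__main__":
    rate = drift_rate(parse_stripchart(SAMPLE))
    print(f"drift: {rate:.4f} s/s, about {rate * 300:.1f} s per 5 minutes")
```

On these samples the fitted slope is roughly -0.044 s/s, or about 13 seconds per 5 minutes, in line with the "approx 10 seconds" reported at the start of the thread. The offset grows more negative as the local clock runs ahead of the DC. Only feed the script samples from a single stretch between resyncs, because a resync steps the clock and breaks the trend.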
Ensure that your time server is a physical server or a real NTP source.

Otherwise the following may occur:

Your ESX host may try to keep its time in sync with the BIOS, VMware Tools in your VMs may keep them in sync with the ESX host time, and then your time software will try to sync to an NTP server, so there are going to be some issues.
ASKER CERTIFIED SOLUTION
honeywell2012

Resolved: this was a site-specific issue. Others may face this issue; the first thing to look for is Java.