
NTP behavior on Solaris.

Hello,
We have 4 NTP servers (Solaris 9). All clients get their date/time from these servers, and the servers get their date/time from 3 GPS devices. Over the past weekend, the GPS license expired on 2 of the GPS devices and their date went back to 1999, while the time of day remained the same. One device was still fine. Because of this, all clients went back 19 years. We immediately added two new GPS devices and restarted the NTP daemon on all NTP servers and all clients. Below is the output with the updated IPs (after the issue was corrected).

time-serv1 # cat /etc/inet/ntp.conf
server 192.168.xx.xx
server 172.28.42.xx
server 172.28.34.yy

driftfile /var/ntp/ntp.drift
statsdir /var/ntp/ntpstats/
filegen peerstats file peerstats type day enable
filegen loopstats file loopstats type day enable
filegen clockstats file clockstats type day enable

time-serv1 #
time-serv1 # ntpq -p
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
*gps-clock3.	 .GPS.            1 u  715 1024  377     3.52    1.718    2.14
+172.28.42.xx   .GPS.            1 u  697 1024  377    44.37   -0.865    1.16
+172.28.34.yy   .GPS.            1 u  820 1024  377    70.02    0.865    1.01
time-serv1 #


Is there a setting that can be applied on the NTP servers or on individual clients to tell them not to sync with bad sources? A sudden 19-year drop doesn't make sense. Why did they pick the date from the bad devices instead of syncing with the good GPS device? One out of the 3 was good.
Any advice, please?

Thanks

Dr. Klahn

Are the clients Windows or Linux systems?

If they are Linux systems, are they using the ntp daemon or ntpdate to update the time?

There's a problem here that may not be obvious. If a system boots up and requests the time from the NTP server at boot time, it must get the correct time, every time, because if it does not then a problem follows. Consider if it gets the wrong time at boot and later on it is prohibited from changing to the correct time because the time difference is too large. It will be stuck with the wrong time forever.

Suppose there's a shell script that uses ntpdate -q to query multiple time servers and it gets multiple times back. In the example you've cited, two would have one date and two would have a different date. Which set of dates is correct?

It's not as easy as one might think and a better approach may be to use a reliable stratum 2 server such as the Naval Observatory or the NTP pool. Over the last 20 years I think I've seen one wrong response from the NTP pool.
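
To illustrate the "which set of dates is correct?" problem, here is a minimal Python sketch of a plain majority vote over the years reported by several sources. This is a deliberate simplification and not the selection algorithm ntpd actually uses; the function name `majority_year` and the sample values (1999 for the expired units, 2018 for the healthy one, i.e. 19 years apart) are assumptions for illustration only.

```python
from collections import Counter

def majority_year(reported_years):
    """Return the year reported by a strict majority of sources, or None if there is none."""
    if not reported_years:
        return None
    year, votes = Counter(reported_years).most_common(1)[0]
    # A strict majority needs more than half of all sources to agree.
    if votes * 2 > len(reported_years):
        return year
    return None

# Two expired GPS units agree with each other, one healthy unit disagrees.
print(majority_year([1999, 1999, 2018]))        # 1999
# Two against two: no majority, so nothing can be decided.
print(majority_year([1999, 1999, 2018, 2018]))  # None
```

With only three sources, two bad ones that agree with each other outvote the single good one, which is consistent with what the asker describes. A vote of this kind only protects against a minority of bad sources.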

Jay Pe

ASKER

Clients are Solaris as well as Linux. On the day of the incident, we started getting complaints that databases were out of sync. Upon checking, we found that the year had gone back to 1999. We then checked the NTP servers (Solaris 9) and found that the original GPS devices were the problem.
No client was rebooted, but they still picked up the wrong year.
In the above example, 192.168.xx.xx is the good GPS device. Both 172.28.x addresses are new devices that we added that day.

arnold

Dr. Klahn is right: only if your clock maintenance relies on ntpdate would the shift of 19 years...
Why are you using GPS devices (that...) when there are ntp.org public servers with which your systems could sync?

ntpd has a built-in mechanism to adjust for drift.

Usually there is a single master with subordinates; this way you do not have to contend with tracking down the errant NTP server providing incorrect data.

ntpq can be used to query peers to see the drift among them. The more sources your NTP server has, the more accurate it is.

In your case, using three internal sources, two went back in time, which is what ......... to have the wrong info.

You could also use this to monitor.
If memory serves, ntpd will not adjust for a change of more than thirty minutes (this takes into account shifts because of zone changes and daylight saving).
Ntpdate authoritatively changes the clock on bootup.
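
As a rough illustration of using ntpq for monitoring, the sketch below parses the text printed by `ntpq -p` and lists peers whose offset exceeds a limit. It assumes the ten-column layout shown in the question, with the tally character in the first column and the offset in milliseconds; the function names and the `limit_ms` default are hypothetical choices, not part of any NTP tool.

```python
def parse_ntpq_peers(text):
    """Parse `ntpq -p` output into a list of dicts, one per peer row."""
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Column 0 is the tally code ('*', '+', '-', 'x', ' ', ...); the rest are fields.
        tally, fields = line[0], line[1:].split()
        if len(fields) != 10 or fields[0] == "remote":
            continue  # header, ===== ruler, or an unexpected layout
        peers.append({
            "tally": tally,
            "remote": fields[0],
            "stratum": int(fields[2]),
            "offset_ms": float(fields[8]),
        })
    return peers

def suspicious_peers(peers, limit_ms=1000.0):
    """Return the names of peers whose absolute offset is above limit_ms."""
    return [p["remote"] for p in peers if abs(p["offset_ms"]) > limit_ms]

# Sample rows copied from the output posted in the question.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset    disp
==============================================================================
*gps-clock3.     .GPS.            1 u  715 1024  377     3.52    1.718    2.14
+172.28.42.xx    .GPS.            1 u  697 1024  377    44.37   -0.865    1.16
+172.28.34.yy    .GPS.            1 u  820 1024  377    70.02    0.865    1.01
"""

peers = parse_ntpq_peers(SAMPLE)
print(suspicious_peers(peers))                # []
print(suspicious_peers(peers, limit_ms=1.0))  # ['gps-clock3.']
```

A script like this could be run periodically against each NTP server to raise an alert when one source drifts far from the others, well before clients are affected.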

Jay Pe

ASKER

I am not sure whether the GPS devices are being synchronized with ntp.org. Otherwise, changing the setup would be a big change in my organization, and I can talk to management about that.

When you say "a change of more than thirty minutes ntpd will not adjust", do I need to set something in my config file to get this behavior? I want to change or set something here that can prevent any similar incident.

arnold

Dr. Klahn pointed out

On Solaris during bootup, the ntp service runs ntpdate to sync time.

Do your ntp systems have external access?
The ntpd.conf
Server identifies...

Depending on where you are, there are publicly available NTP servers based on cesium .....
Ntp.conf on the client side

....

Jay Pe

ASKER

I understand that during bootup the server will run ntpdate. But in this incident, none of the clients rebooted and they still took the date back.

Our NTP servers do not have external access. That is the reason they are connected to GPS devices, which are open to the internet. To adjust drift, do I need to put some parameter in ntp.conf on the 4 NTP servers? Below is the current file.

time-serv1 # cat /etc/inet/ntp.conf
server 192.168.xx.xx
server 172.28.42.xx
server 172.28.34.yy

driftfile /var/ntp/ntp.drift
statsdir /var/ntp/ntpstats/
filegen peerstats file peerstats type day enable
filegen loopstats file loopstats type day enable
filegen clockstats file clockstats type day enable

time-serv1 #


arnold

A reboot is a sure way, but it only requires a restart of the ntpd service, for example if ntpd crashed and was restarted.
I do not believe you can alter ntpd's behavior when dealing with adjustments beyond a certain limit.
That is the built-in protection.

If memory serves, when configuring a source you can assign reliability....

noci

ntpd normally should disregard any date/time shift greater than roughly 15 minutes.
So any clock that is more than 15 minutes off should be disregarded.

ntpdate is different, it is used to prime the clock once (without regard for any difference between local/remote time).
after which ntpd is responsible to keep it on pace.
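
To make that difference concrete, here is a minimal Python sketch of the two behaviors. It is not ntpd's code. The 1000-second limit is the default panic threshold documented for the reference ntpd (about 16.7 minutes, in line with the "roughly 15 minutes" above); whether Solaris 9 xntpd uses the same value is an assumption, and the function names are made up for illustration.

```python
PANIC_THRESHOLD_S = 1000.0  # assumed limit, about 16.7 minutes

def ntpd_style_accepts(offset_s, limit_s=PANIC_THRESHOLD_S):
    """Daemon-style check: refuse corrections larger than the limit."""
    return abs(offset_s) <= limit_s

def ntpdate_style_accepts(offset_s):
    """One-shot style: trusts the server, whatever the offset is."""
    return True

# Rough size of the jump in this incident (19 years, ignoring leap days).
NINETEEN_YEARS_S = 19 * 365 * 24 * 3600

print(ntpd_style_accepts(-NINETEEN_YEARS_S))     # False
print(ntpdate_style_accepts(-NINETEEN_YEARS_S))  # True
print(ntpd_style_accepts(0.001718))              # True
```

As far as I know, the reference ntpd logs a message and exits when the limit is exceeded, rather than quietly ignoring the source, so the check protects the local clock but still needs someone to look at the server afterwards.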

imho you could change the server lines to the following:

server <ipaddress>  minpoll 4 maxpoll 7 iburst


This means the polling interval should be between 16 and 128 seconds, and when first starting the NTP daemon it sends several queries in quick succession.
long poll times might lead to less stable times.
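
The minpoll and maxpoll values are exponents of two, in seconds. A small Python sketch of the arithmetic (the helper name `poll_seconds` is just for illustration):

```python
def poll_seconds(exponent):
    """Convert an NTP poll exponent (minpoll/maxpoll value) to seconds."""
    return 2 ** exponent

print(poll_seconds(4))   # 16
print(poll_seconds(7))   # 128
print(poll_seconds(10))  # 1024
```

The last value matches the 1024 shown in the poll column of the ntpq -p output in the question, so with maxpoll 7 each source would be queried eight times as often as it is now.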

Also add this to include the local system clock as a time source:

server 127.127.1.0 minpoll 4 maxpoll 7
fudge 127.127.1.0 stratum 10


Then there is a fallback that will provide some time in case the NTP servers are unreachable.
The driftfile will record the observed speed differences of the local system clock vs. the network time.

Jay Pe

ASKER

server 127.127.1.0 minpoll 4 maxpoll 7
fudge 127.127.1.0 stratum 10


I am trying to understand the above configuration. Do we need to define the second line (fudge) when none of our servers are connected to the internet? There are 4 NTP servers, i.e. 2 in Seattle and 2 in Phoenix. If fudge is not needed, should I write it as below?
Or is stratum needed?

server 127.127.1.0 minpoll 4 maxpoll 7
server 127.127.2.0 minpoll 4 maxpoll 7
server 127.188.1.0 minpoll 4 maxpoll 7
server 127.188.2.0 minpoll 4 maxpoll 7


arnold

this is a local loopback type ..
server 127.127.1.0 minpoll 4 maxpoll 7
fudge 127.127.1.0 stratum 10
Not sure what your second example is meant to show; do your servers use the 127. segment for their IPs?

Jay Pe

ASKER

Okay, sorry if I confused you. I thought the example was referring to the NTP servers' IPs. We have 4 NTP servers, 2 in Seattle and 2 in Phoenix. Assume that their IPs are:
10.32.11.11
10.32.11.12
10.62.11.11
10.62.11.12

arnold

If you have paths among them, you should sync one to the others.

Often, one picks one and has it sync to an external source as well.

ntp.org has a regional repository....

Depending on where you are, your government or a science organization may provide an NTP server based on an atomic clock, or based on cesium ....

ntpq
lpeers...

Jay Pe

ASKER

Our setup is like this. Below are the 4 NTP servers, located in 2 locations. All clients sync time with these servers.

10.32.11.11
10.32.11.12
10.62.11.11
10.62.11.12


If I log in to these servers, each of them has entries for 4 GPS devices, meaning the NTP servers sync time with these devices.

gps-clock1
gps-clock2
gps-clock3
gps-clock4


This is an old setup and is not going to change soon, i.e. management will not do anything to get it synced with an external source. On the day of the incident, gps-clock2 and gps-clock3 went bad and many clients picked up the year 1999.

So the current requirement is that I need to fix those 4 NTP servers (IPs above) so that, if such an incident happens again, clients do not pick the server with the large drift. What configuration would I need in ntp.conf on all 4 NTP master servers? Should it be like the example below, or do I need fudge as well?

server 10.32.11.11 minpoll 4 maxpoll 7
server 10.32.11.12 minpoll 4 maxpoll 7
server 10.62.11.11 minpoll 4 maxpoll 7
server 10.62.11.12 minpoll 4 maxpoll 7


arnold

Each server has its own drift and reliance on others.

As noted before, I've not heard of a situation like yours, where ntpd adjusted the clocks so drastically.

The only time that could happen is when ntpdate is called during bootup, because it presumes the system's clock might be way off if the system was powered off for a while. ntpdate does not include a sanity check along the lines of "if the change is more than 30 minutes, stay with what you have"; it presumes the peer is accurate.

Commonly, any ntpd service starts ntpdate first to get the system synced up, then runs the ntpd daemon, which manages the drift for self-correction while also contacting the member servers.....

Jay Pe

ASKER

Yes, that was strange. I am not sure if this could play a role, but our NTP master servers are old Solaris 9 servers. There may be bugs.
If I suggest that management change the model so that, instead of GPS devices, the NTP master servers sync with external sources, do you have any recommendation as to which atomic clock we should point to? We are in the US and the servers are located in 10 different locations in the US.

arnold

Please check whether your time management is based on cron
or ps -ef | grep ntp

cron likely uses ntpdate to sync with a source
looks like tick and tock are no longer available

NIST
https://tf.nist.gov/tf-cgi/servers.cgi
ntp.org

both sources are fine.

Jay Pe

ASKER

There is no entry for ntp in the cron of any user. This is from one of the NTP master servers.

bash-2.05$ ps -ef| grep ntp
  pete  6067  5988  0 12:41:12 pts/8    0:00 grep ntp
    root 15247     1  0   Oct 24 ?       73:31 /usr/lib/inet/xntpd
bash-2.05$


arnold

ok, so you are running the xntpd...

If memory serves, when we used Solaris, we did not use the Solaris-provided one, but had to use the sunfreeware GNU NTPd...

If you look at /etc/init.d/xntpd (or it might be ntpd),
you will see that before it starts the xntpd/ntpd daemon, it runs ntpdate.

Note you have two settings: one deals with NTP as the client and one with NTP as the server to others.

Jay Pe

ASKER

Yes, I can see this in xntpd:

# Wait until date is close before starting xntpd
(/usr/sbin/ntpdate $ARGS; sleep 2; /usr/lib/inet/xntpd) &


arnold

ntpd commonly would not shift by more than 30 minutes in one interval, or was it 60. Timezone calculations do not come into play.
I.e., if a system was set in one zone and then transferred to another, once /etc/localtime points to the correct timezone the response to date will reflect the correct time. This type of change does not affect ntpd.... I think it need not be restarted ...

Jay Pe

ASKER

In that case, "minpoll 4 maxpoll 7" and "stratum 10" are not required?

arnold

I'm not sure I understand the question.

ASKER CERTIFIED SOLUTION

noci


Jay Pe

ASKER

noci suggested making these entries in ntp.conf in an earlier post, so I want to know what my ntp.conf should look like for a master server.

Should it just have:

server xx.xx.xx.xx
server yy.yy.yy.yy
