On the Care and Feeding of Xntpd

Dennis Ferguson
University of Toronto

Introduction

This is a collection of notes concerning the use of xntpd and related programs, and on dealing successfully with the Network Time Protocol in general. Be careful to note that what follows is in no way canonical nor authoritative, but rather has a relatively high content of opinion and conjecture derived from an understanding of the protocol and some (often bad) experience with it. I might change my mind tomorrow, and would be happy to have my mind changed for me by popular dissent.

This is the advertising section. Xntpd is a complete implementation of the NTP Version 2 specification, as defined in RFC 1119. It also retains compatibility with NTP Version 1, as defined in RFC 1059, though this compatibility is sometimes only semiautomatic. Xntpd does no floating point arithmetic, and manipulates the 64-bit NTP timestamps as 64-bit integers where appropriate. Xntpd fully implements NTP Version 2 authentication and NTP mode 6 control messages. A complete, flexible address-and-mask restriction facility has been included. The server is fully reconfigurable at run time via NTP private mode 7 messages, and exposes a considerable amount of internal detail via that mechanism as well. The latter function allows xntpd to be extensively debugged via protocol messages, something which makes it an advantageous porting base for constructing servers intended to run on minimal platforms.

The code is biased towards the needs of a busy primary server. Tables are hashed to allow efficient handling of a large number of pollers, though at the expense of additional overhead when the number of pollers is small. Many fancy features have been included to permit efficient management and monitoring of a busy primary server, features which are simply excess baggage for a server on a high stratum client.
The code was written with near neurotic attention to details which can affect precision, and as a consequence should be able to make good use of high performance special purpose hardware (unfortunately, the latter does not really include Unix machines). Xntpd was designed with radio clock support in mind, and can provide support for even the most perverse, inconvenient radio clock designs without seriously impacting, or requiring consideration for, the performance of other parts of the server. The server has methodically avoided the use of Unix-specific library routines wherever possible by implementing local versions, this to aid in porting the code to non-Unix platforms.

There are, however, several drawbacks to all of this. Xntpd is very, very fat. The number of lines of C code which xntpd is made from exceeds the number of lines of PDP-11 assembler in the fuzzball NTP server by a factor of 10 or so. This is rotten if your intended platform for the daemon is memory-limited. Xntpd uses SIGIO for all input, a facility which appears not to have universal support and whose use seems to exercise the parts of your vendors' kernels which are most likely to have been done poorly. The code is unforgiving in the face of kernel problems which affect performance, and generally requires that you repair the problems in the kernel. The code has a distinctly experimental flavour and contains features which could charitably be termed failed experiments, but which the author has not gotten around to backing out of yet. There is code which has not been exercised (e.g. leap second support) due to the inconvenience of setting up tests. Much was learned from the addition of support for a variety of clock devices, with the result that this support could use some rewriting. Xntpd is far from being a finished product, and may never be. If this hasn't scared you away, read on.
How NTP Works

The approach used by NTP to achieve time synchronization among a group of clocks is somewhat different than that of other such protocols. In particular, NTP does not attempt to synchronize clocks to each other. Rather, each server attempts to synchronize to UTC (i.e. standard time) using the best available source. This is a fine point which is worth understanding. A group of NTP-synchronized clocks may be close to each other in time, but this is not a consequence of the clocks in the group having synchronized to each other, but rather because each clock has synchronized closely to UTC via the best source it has access to. As such, trying to synchronize a group of clocks to a set of servers whose time is not in mutual agreement may not result in any sort of useful synchronization of the clocks, even if you don't care about UTC. NTP operates on the premise that there is one true standard time, and that if several servers which claim synchronization to standard time disagree about what that time is then one or more of them must be broken. There is no attempt to resolve differences more gracefully since the premise is that substantial differences cannot exist. In essence, NTP expects that the time being distributed from the top of the synchronization subnet will be derived from some external source of UTC (e.g. a radio clock). This makes it somewhat inconvenient (though not impossible) to synchronize hosts together without a reliable source of UTC to synchronize them to. If your network is isolated and you cannot access other people's servers across the Internet, a radio clock may make a good investment.

Time is distributed through a hierarchy of NTP servers, with each server adopting a stratum which indicates how far away from an external source of UTC it is operating at.
Stratum 1 servers, which are at the top of the pile (or bottom, depending on your point of view), have access to some external time source, usually a radio clock synchronized to time signal broadcasts from radio stations which explicitly provide a standard time service. A stratum 2 server is one which is currently obtaining time from a stratum 1 server, a stratum 3 server gets its time from a stratum 2 server, and so on. To avoid long-lived synchronization loops the number of strata is limited to 15.

Each client in the synchronization subnet (which may also be a server for other, higher stratum clients) chooses exactly one of the available servers to synchronize to, usually from among the lowest stratum servers it has access to. It is thus possible to construct a synchronization subnet where each server has exactly one source of lower stratum time to synchronize to. This is, however, not an optimal configuration, for indeed NTP operates under another premise as well, that each server's time should be viewed with a certain amount of distrust. NTP really prefers to have access to several sources of lower stratum time (at least three) since it can then apply an agreement algorithm to detect insanity on the part of any one of these. Normally, when all servers are in agreement, NTP will choose the best of these, where "best" is defined in terms of lowest stratum, closest (in terms of network delay) and claimed precision, along with several other considerations. The implication is that, while one should aim to provide each client with three or more sources of lower stratum time, several of these will only be providing backup service and may be of lesser quality in terms of network delay and stratum (i.e. a same stratum peer which receives time from lower stratum sources the local server doesn't access directly can also provide good backup service).

Finally, there is the issue of association modes.
There are a number of "modes" in which NTP servers can associate with each other, with the mode of each server in the pair indicating the behaviour the other server can expect from it. In particular, when configuring a server to obtain time from other servers, there is a choice of two modes which may be alternatively used. Configuring an association in symmetric active mode (usually indicated by a peer declaration in configuration files) indicates to the remote server that one wishes to obtain time from the remote server and that one is also willing to supply time to the remote server if need be. Configuring an association in client mode (usually indicated by a server declaration in configuration files) indicates that one wishes to obtain time from the remote server, but that one is not willing to provide time to the remote. It is the author's opinion that symmetric active mode should be used for configuring all peer associations between NTP daemons (or stateful NTP entities). Client mode is for use by boot time date-setting programs and the like, which really have no time to provide and which don't retain state about associations over the longer term.

Configuring Xntpd

At start up time xntpd reads its initial configuration from a file, usually /etc/ntp.conf unless you have compiled in a different name. Putting something in this file which will enable the server to obtain time from somewhere is usually the first big hurdle after installation of the software. At its simplest (this is usually where you should start), what you need to do in the configuration file is mention the servers that the daemon should poll for time with peer declarations.
To jump right into this, a working configuration file might look like (do not copy this directly):

    #
    # Peer configuration for 128.100.100.7
    # (expected to operate at stratum 2)
    #
    peer 128.100.49.105    # suzuki.ccie.utoronto.ca
    peer 128.8.10.1        # umd1.umd.edu
    peer 192.35.82.50      # lilben.tn.cornell.edu

    driftfile /etc/ntp.drift

This particular host is expected to operate at stratum 2 by virtue of the fact that two of the three servers declared (the first two, actually) have radio clocks and typically run at stratum 1. The third server in the list has no radio clock, but is known to maintain associations with a number of stratum 1 peers and typically operates at stratum 2. Of particular importance with the latter host is that it maintains associations with peers besides the two stratum 1 peers mentioned.

Note that I also threw in a driftfile declaration, since this entry should appear in all configuration files. One of the things the NTP daemon does when it is first started is to compute the error in the speed of the clock on the computer it is running on. It usually takes about a day or so after the daemon is started to compute a good estimate of this (and it needs a good estimate to synchronize closely to its server). Once the initial value is computed it will change only by relatively small amounts during the course of continued operation. The driftfile entry indicates to the daemon the name of a file where it may store the current value of the frequency error (or clock drift) so that, if the daemon is stopped and restarted, it can reinitialize itself to the previous estimate and avoid the day's worth of time it will take to resynchronize. Since this is a desirable feature, a driftfile declaration should always be included in the configuration file.
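To get a feel for why the driftfile is worth keeping, consider the arithmetic involved. The sketch below (illustrative only, not code from xntpd; the 50 ppm figure is merely a plausible crystal error) shows how a clock's frequency error, expressed in parts per million, turns into accumulated time error over a day, which is roughly what the daemon would otherwise have to measure away from scratch.

```python
# Illustrative sketch: seconds of error accumulated by a clock whose
# frequency is off by a given number of parts per million.

def accumulated_error(freq_error_ppm, elapsed_seconds):
    """Seconds of error accumulated by a clock running fast or slow
    by freq_error_ppm parts per million."""
    return freq_error_ppm * 1e-6 * elapsed_seconds

# A typical crystal might be off by something like 50 ppm (assumed
# here for illustration); over a day that is several seconds of
# drift if left uncorrected.
day = 24 * 60 * 60
print(round(accumulated_error(50.0, day), 1))
```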
Returning to peer selection, the canonical source of information concerning the location of stratum 1 and good quality stratum 2 servers is the file pub/ntp/clock.txt available via anonymous ftp from louie.udel.edu. If you are setting up a single NTP host for testing you might obtain this file and locate three servers to obtain time from. Ideally one would pick two primary servers (i.e. stratum 1) and one secondary (stratum 2) from that list to use. It is also better to choose servers which are likely to be "close" to you in terms of network topology, though you shouldn't worry overly about this if you are unable to determine who is close and who isn't.

Hopefully, however, your intention is to synchronize a larger number of hosts via NTP. In this case you will want to synchronize a few of your hosts to more distant primary servers, and then synchronize (some of) the rest of your hosts to these. A good way to do this is the following scheme:

(1)  Choose three of your local hosts to operate as stratum 2 servers. Then choose six stratum 1 servers and configure each of your stratum 2 servers with a pair of these. Try to ensure that each stratum 2 server has at least one "close" stratum 1 peer, if you can. In addition, peer each of the stratum 2 servers to the other two (i.e. each configuration will list four peers, two remote and two local).

(2)  From the remaining hosts, choose those you would like to operate at stratum 3. This might be all of the rest if you're only concerned with a dozen or two hosts in total, or might be things like file servers and machines with good clocks if you have many hosts to synchronize. Provide each stratum 3 server with an identical configuration file pointing at the three stratum 2 hosts.

(3)  Synchronize anything which remains (which will operate at stratum 4) to three or four nearby stratum 3 servers.
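The shape of the scheme above can be sketched programmatically. All host names below are invented for illustration; the point is the structure of the peer lists, with each stratum 2 configuration listing four peers and every stratum 3 host receiving an identical list of the three stratum 2 servers.

```python
# Sketch of the peering scheme described above, using invented names.
# Each stratum 2 server peers with a pair of distant stratum 1
# servers plus the other two local stratum 2 servers; every stratum 3
# server gets an identical list of all three stratum 2 servers.

stratum1 = ["s1-a", "s1-b", "s1-c", "s1-d", "s1-e", "s1-f"]  # hypothetical
stratum2 = ["local-a", "local-b", "local-c"]                 # hypothetical

def stratum2_peers(i):
    remote = stratum1[2 * i : 2 * i + 2]               # a pair of stratum 1 servers
    local = [h for h in stratum2 if h != stratum2[i]]  # the other two locals
    return remote + local

def stratum3_peers():
    return list(stratum2)

for i, host in enumerate(stratum2):
    print(host, "->", " ".join(stratum2_peers(i)))
```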
The above arrangement should provide very good, robust time service with a minimum of traffic to distant servers and with manageable loads on all local servers. While it is theoretically possible to extend the synchronization subnet to even higher strata, this is seldom justified and can make the maintenance of configuration files unmanageable. Serving time to a higher stratum peer is very inexpensive in terms of load on the lower stratum server if the latter is located on the same concatenated LAN. When planning your network you might, beyond this, keep in mind a couple of generic "don'ts", in particular:

(1)  Don't synchronize a server to a same stratum peer unless the latter is receiving time from lower stratum sources the former doesn't talk to directly, and

(2)  Don't configure peer associations with higher stratum servers. Let the higher strata configure lower stratum servers, but not the inverse.

There are many useful exceptions to these rules. When in doubt, however, follow them.

Note that mention was made of machines with "good" clocks versus machines with "bad" ones. There are two things that make a clock "good": the precision of the clock (e.g. how many low order bits in a struct timeval are actually significant) and the frequency of occurrence (or lack thereof) of such things as lost clock interrupts. Among the most common computers I have observed a fairly simple algorithm for determining the goodness of a machine's clock. If the machine is a Vax it probably has a good clock (the low order bit in the time is in the microseconds and most of these seem to manage to get along without losing clock interrupts). If the machine is a Sun it probably doesn't (the low order clock bit is at the 10 or 20 millisecond mark and Suns like to lose clock interrupts, particularly if they have a screen and particularly if they run SunOS 4.0.x).
If you have IBM RTs running AOS 4.3, they have fair clocks (low order clock bit at about a millisecond and they don't lose clock interrupts, though they do have trouble with clock rollovers while reading the low order clock bits) but I recommend them as low stratum NTP servers anyway since they aren't much use as anything else. For other machines you are on your own since I don't have enough data points to venture an opinion. In any event, if at all possible you should try to use machines with "good" clocks for the lower strata.

A final item worth considering is what to do if you buy a radio clock. An effective way to deal with this is to not change the configuration above at all, but simply turn one of the stratum 2 servers into a stratum 1 by attaching the radio clock. Note that the remainder of your synchronization subnet will be promoted one stratum when you do this, but the remaining servers maintaining distant associations will continue to provide good backup service should the radio fail. Don't depend solely on your radio clock equipped time server unless circumstances demand it.

Xntpd versus Ntpd

There are several items of note when dealing with a mixture of xntpd and ntpd servers. Xntpd is an NTP Version 2 implementation. As such, by default when no additional information is available concerning the preferences of the peer, xntpd claims to be Version 2 in the packets that it sends. Ntpd, while implementing most of the Version 2 algorithms, still believes itself to be a Version 1 implementation. The sticky part here is that when ntpd receives a packet claiming to be from a Version 2 server, it throws it away early in the input section. Hence there is a danger that in some situations ntpd will ignore packets from xntpd.

Xntpd is aware of this problem.
In particular, when xntpd is polled by a host claiming to be a Version 1 implementation, xntpd claims to be a Version 1 implementation in the packets returned to the poller. This allows xntpd to serve ntpd clients transparently. The trouble occurs when an ntpd server is configured in an xntpd configuration file. With no further indication, xntpd will send packets claiming to be Version 2 when it polls, and these will be ignored by ntpd. To get around this, xntpd allows a qualifier to be added to configuration entries to indicate which version to use when polling. Hence the entry

    peer 130.43.2.2 version 1    # apple.com (running ntpd)

will cause Version 1 packets to be sent to the host address 130.43.2.2. If you are testing xntpd against existing ntpd servers you will need to be careful about this.

There are a few other items to watch when converting an ntpd configuration file for use with xntpd. The first is to remove the precision entry from the configuration file, if there is one. There was a time when the precision claimed by a server was mostly commentary, with no particularly useful purpose. This is no longer the case, however, and so changing the precision a server claims should only be done with some consideration as to how this alters the performance of the server. The default precision claimed by xntpd will be right for most situations. A section later on will deal with when and how it is appropriate to change a server's precision without doing things you don't intend.

Second, note that in the example configuration file above numeric addresses are used in the peer declarations. It is also possible to use names requiring resolution instead, but only if some additional configuration is done (xntpd doesn't include the resolver routines itself, and requires that a second program be used to do name resolution). If you find numeric addresses offensive, see below.
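The version behaviour described above amounts to a simple pair of rules, which might be sketched as follows. This is an illustration of the rules as described in the text, not xntpd's actual code:

```python
# Sketch of the version rules described above: replies mirror a
# Version 1 poller, and a "version 1" qualifier in the configuration
# forces Version 1 packets when polling that peer.

OUR_VERSION = 2

def reply_version(poller_version):
    """Version claimed in packets returned to a poller."""
    return 1 if poller_version == 1 else OUR_VERSION

def poll_version(configured_version=None):
    """Version used when polling a configured peer; configured_version
    comes from a 'version' qualifier on the peer entry, if present."""
    return configured_version if configured_version is not None else OUR_VERSION

print(reply_version(1), reply_version(2), poll_version(1), poll_version())
```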
Finally, passive and client entries in an ntpd configuration file have no useful semantics for xntpd and should be deleted. Xntpd won't reset the kernel variable tickadj when it starts, so you can remove anything dealing with this in the configuration file. The configuration of radio clock peers is done using different configuration language in xntpd configuration files, so you will need to delete these entries from your ntpd configuration file and see below for the equivalent language.

Traffic Monitoring

Xntpd handles peers whose stratum is higher than the stratum of the local server, and pollers using client mode, by a fast path which minimizes the work done in responding to their polls, and normally retains no memory of these pollers. Sometimes, however, it is interesting to be able to determine who is polling the server, and how often, as well as who has been sending other types of queries to the server.

To allow this, xntpd implements a traffic monitoring facility which records the source address and a minimal amount of other information from each packet which is received by the server. This can be enabled by adding the following line to the server's configuration file:

    monitor yes

The recorded information can be displayed using the xntpdc query program, described briefly below.

Address-and-Mask Restrictions

The address-and-mask configuration facility supported by xntpd is quite flexible and general. The major drawback is that while the internal implementation is very nice, the user interface sucks. For this reason it is probably worth doing an example here.

Briefly, the facility works as follows. There is an internal list, each entry of which holds an address, a mask and a set of flags.
On receipt of a packet, the source address of the packet is compared to each entry in the list, with a match being posted when the following is true:

    (source_addr & mask) == (address & mask)

A particular source address may match several list entries. In this case the entry with the most one bits in the mask is chosen. The flags associated with this entry are returned.

In the current implementation the flags always add restrictions. In effect, an entry with no flags set leaves matching hosts unrestricted. An entry can be added to the internal list using a restrict statement. The flags associated with the entry are specified textually. For example, the notrust flag indicates that hosts matching this entry, while treated normally in other respects, shouldn't be trusted for synchronization. The nomodify flag indicates that hosts matching this entry should not be allowed to do run time configuration. There are many more flags; see the man page.

Now the example. Suppose you are running the server on a host whose address is 128.100.100.7. You would like to ensure that run time reconfiguration requests can only be made from the local host and that the server only ever synchronizes to one of a pair of off campus servers or, failing that, a time source on net 128.100. The following entries in the configuration file would implement this policy:

    # By default, don't trust and don't allow modifications
    restrict default notrust nomodify

    # These guys are trusted for time, but no modifications allowed
    restrict 128.100.0.0 mask 255.255.0.0 nomodify
    restrict 128.8.10.1 nomodify
    restrict 192.35.82.50 nomodify

    # The local addresses are unrestricted
    restrict 128.100.100.7
    restrict 127.0.0.1

The first entry is the default entry, which all hosts match and hence which provides the default set of flags. The next three entries indicate that matching hosts will only have the nomodify flag set and hence will be trusted for time.
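The matching rule can be sketched in a few lines. This is an illustration of the algorithm as described, using the example entries above, not the actual implementation; addresses are treated as 32-bit integers.

```python
# Sketch of the restrict matching rule: every matching entry is found
# with (source & mask) == (address & mask), and the entry whose mask
# has the most one bits wins.

def ip(s):
    """Dotted quad string to 32-bit integer."""
    a, b, c, d = (int(x) for x in s.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

# (address, mask, flags) triples from the example configuration
restrict_list = [
    (ip("0.0.0.0"),       ip("0.0.0.0"),         {"notrust", "nomodify"}),
    (ip("128.100.0.0"),   ip("255.255.0.0"),     {"nomodify"}),
    (ip("128.8.10.1"),    ip("255.255.255.255"), {"nomodify"}),
    (ip("192.35.82.50"),  ip("255.255.255.255"), {"nomodify"}),
    (ip("128.100.100.7"), ip("255.255.255.255"), set()),
    (ip("127.0.0.1"),     ip("255.255.255.255"), set()),
]

def flags_for(source):
    matches = [(addr, mask, flags) for addr, mask, flags in restrict_list
               if (source & mask) == (addr & mask)]
    # the entry with the most one bits in its mask is chosen
    _, _, flags = max(matches, key=lambda e: bin(e[1]).count("1"))
    return flags

print(flags_for(ip("128.100.100.7")))  # the host itself: unrestricted
print(flags_for(ip("128.100.3.9")))    # some host on net 128.100
```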
If the mask isn't specified in the restrict statement it defaults to 255.255.255.255. Note that the address 128.100.100.7 matches three entries in the table: the default entry (mask 0.0.0.0), the entry for net 128.100 (mask 255.255.0.0) and the entry for the host itself (mask 255.255.255.255). As expected, the flags for the host are derived from the last entry since its mask has the most bits set.

The only other thing worth mentioning is that the restrict statements apply to packets from all hosts, including those that are configured elsewhere in the configuration file and even including your clock pseudopeer(s), if any. Hence, if you specify a default set of restrictions which you don't wish to be applied to your configured peers, you must remove those restrictions for the configured peers with additional restrict statements mentioning each peer.

Authentication

Xntpd supports the optional authentication procedure specified in the NTP Version 2 specification. Briefly, when an association runs in authenticated mode each packet transmitted has appended to it a 32-bit key ID and a 64-bit crypto checksum of the contents of the packet computed using the DES algorithm. The receiving peer recomputes the checksum and compares it with the one included in the packet. For this to work, the peers must share a DES encryption key and, further, must associate the shared key with the same key ID.

This facility requires some minor modifications to the basic packet processing procedures to actually implement the restrictions to be placed on unauthenticated associations. These modifications are enabled by the authenticate configuration statement.
In particular, in authenticated mode, peers which send unauthenticated packets, peers which send authenticated packets which the local server is unable to decrypt, and peers which send authenticated packets encrypted using a key we don't trust are all marked untrustworthy and unsuitable for synchronization. Note that while the server may know many keys (identified by many key IDs), it is possible to declare only a subset of these as trusted. This allows the server to share keys with a client which requires authenticated time and which trusts the server but which is not trusted by the server. Also, some additional configuration language is required to specify the key ID to be used to authenticate each configured peer association. Hence, for a server running in authenticated mode, the configuration file might look similar to the following:

    #
    # Peer configuration for 128.100.100.7
    # (expected to operate at stratum 2)
    # Fully authenticated this time.
    #
    peer 128.100.49.105 key 22    # suzuki.ccie.utoronto.ca
    peer 128.8.10.1 key 4         # umd1.umd.edu
    peer 192.35.82.50 key 6       # lilben.tn.cornell.edu

    authenticate yes
    trustedkey 4 6 22
    keys /etc/ntp.keys
    authdelay 0.000323

    driftfile /etc/ntp.drift

There are a couple of previously unmentioned things in here. The authdelay statement is an estimate of the amount of processing time taken between the freezing of a transmit timestamp and the actual transmission of the packet when authentication is enabled (i.e. more or less the time it takes for the DES routine to encrypt a single block), and is used as a correction for the transmit timestamp. This can be computed for your CPU by the authspeed program in the distribution's authstuff/ directory.
The usage is similar to the following:

    authspeed -n 30000 auth.samplekeys

The keys statement declares the location of a file containing the list of keys and associated key IDs the server knows about (for obvious reasons this file is better left unreadable by anyone except the server). The contents of this file might look like:

    #
    # Keys 4, 6 and 22 are actually identical
    #
    4       A   DonTTelL
    6       S   89dfdca8a8cbd998
    22      N   c4ef6e5454e5ec4c
    #
    # The following 3 are also identical
    #
    100     A   SeCReT
    10000   N   d3e54352e5548080
    1000000 S   a7cb86a4cba80101

A DES encryption key is 56 bits long, and is written as an 8 octet number with 7 bits of the key in each octet. The eighth bit in each octet is a parity bit and is set to maintain odd parity in each octet. In the keys file, the first token on each line indicates the key ID, the second token the format of the key and the third the key itself. There are three key formats. An A indicates that the key is a 1-to-8 character ASCII string and that the 7-bit ASCII representation of each character should be used as an octet of the key (like a Unix password). An S indicates that the key is written as a hex number in the DES standard format, with the low order bit of each octet being the parity bit. An N indicates that the key is again a hex number, but that it is written in NTP standard format, with the high order bit of each octet being the parity bit (confusing enough?). The daemon demands that, for the latter two formats, the parity bits be set correctly to maintain odd parity (there was a good reason for this which I have since forgotten...). The key IDs are arbitrary, unsigned 32-bit numbers.

The big trouble with the authentication facility is the keys file. It is a maintenance headache and a security problem. This should be fixed some day.

Query Programs

Two separate query programs are included with the xntp distribution, ntpq and xntpdc.
Ntpq is a rather nice program which sends queries and receives responses using NTP standard mode 6 control messages. Since it uses the standard query protocol it may be used to query the fuzzball NTP servers as well as xntpd.

Xntpdc is a horrid program which uses NTP private mode 7 messages to make requests to the server. The format and contents of these messages are specific to xntpd. The program does allow inspection of a wide variety of internal counters and other state data, and hence does make a pretty good debugging tool even if it is frustrating to use. The other thing of note about xntpdc is that it provides a user interface to the runtime reconfiguration facility.

Both of these programs are nonessential. The sole reason for this section is to point out an inconsistency which can be awfully annoying if it catches you, and which is worth keeping firmly in mind. Both xntpdc and xntpd demand that anything which has dimensions of time be specified in units of seconds, both in the configuration file and when doing runtime reconfiguration. Both programs also print the values in seconds. Ntpq, on the other hand, obeys the standard by printing all time values in milliseconds. This makes the process of looking at values with ntpq and then changing them in the configuration file or with xntpdc very prone to errors (by three orders of magnitude). I wish this problem didn't exist, but xntpd and its love of seconds predate the mode 6 protocol and the latter's (fuzzball-derived) millisecond orientation, making the inconsistency irresolvable without considerable work.

Runtime Reconfiguration

Xntpd was written specifically to allow its configuration to be fully modifiable at run time. Indeed, the only way to configure the server is at run time.
The configuration file is read only after the rest of the server has been initialized into a running, but default unconfigured, state. This facility was included not so much for the benefit of Unix, where it is handy but not strictly essential, but rather for limited platforms where the feature is more important for maintenance. Nevertheless, runtime configuration works very nicely for Unix servers as well.

Nearly all of the things it is possible to configure in the configuration file may be altered via NTP mode 7 messages using the xntpdc program. Mode 6 messages may also provide some limited configuration functionality (though I think the only thing you can currently do with mode 6 messages is set the leap second warning bits) and the ntpq program provides generic support for the latter.

Mode 6 and mode 7 messages which would modify the configuration of the server are required to be authenticated using standard NTP authentication. To enable these facilities one must, in addition to specifying the location of a keys file, indicate in the configuration file the key IDs to be used for authenticating reconfiguration requests. Hence the following fragment might be added to a configuration file to enable the mode 6 (ntpq) and mode 7 (xntpdc) facilities in the daemon:

    keys /etc/ntp.keys
    requestkey 65535    # for mode 7 requests
    controlkey 65534    # for mode 6 requests

If the requestkey and/or the controlkey configuration statements are omitted from the configuration file, the corresponding runtime reconfiguration facility is disabled.

The query programs require the user to specify a key ID and a key to use for authenticating requests to be sent. The key ID provided should be the same as the one mentioned in the configuration file, while the key should match that corresponding to the key ID in the keys file.
As the query programs prompt for the key as a password, it is useful to make the request and control authentication keys typable (in ASCII format) from the keyboard.

Name Resolution

A recent addition to xntpd is the ability to specify host names requiring resolution in peer and server statements in the configuration file. There are several reasons why this was not permitted in the past. Chief among these is the fact that name service is unreliable and the interface to the Unix resolver routines is synchronous. The hangs this combination can cause are unacceptable once the NTP server is running (and remember it is up and running before the configuration file is read).

Instead of running the resolver itself the daemon defers this task to a separate program, xntpres. When the daemon comes across a peer or server entry with a non-numeric host address it records the relevant information in a temporary file and continues on. When the end of the configuration file has been reached and one or more entries requiring the resolver have been found, the server runs an instance of xntpres with the temporary file as an argument. The server then continues on normally but with the offending peers/servers omitted from its configuration.

What xntpres does is attempt to resolve each name. When it successfully resolves one it configures the peer entry into the server using the same mode 7 runtime reconfiguration facility that xntpdc uses. If temporary resolver failures occur, xntpres will periodically retry the offending requests until a definite response is received. The program will continue to run until all entries have been resolved.

There are several configuration requirements if xntpres is to be used. The path to the xntpres program must be made known to the daemon via a resolver configuration entry, and mode 7 runtime reconfiguration must be enabled.
The following fragment might be used to accomplish this:

        resolver /local/etc/xntpres
        keys /etc/ntp.keys
        requestkey 65535

Note that _x_n_t_p_r_e_s sends packets to the server with a source address of 127.0.0.1. You should obviously avoid _r_e_s_t_r_i_c_ting modification requests from this address or _x_n_t_p_r_e_s will fail.

TTiicckkaaddjj aanndd FFrriieennddss

_X_n_t_p_d understands intimately the peculiarities of the kernel implementation of the _a_d_j_t_i_m_e(2) system call as it is done on most machines. Two variables are of interest, these being _t_i_c_k and _t_i_c_k_a_d_j.

The variable _t_i_c_k is expected to be the number of microseconds added to the system time on each clock interrupt. The variable _t_i_c_k_a_d_j is used by the time adjustment code as a slew rate. When the time is being slewed via a call to _a_d_j_t_i_m_e_(_), the kernel essentially increases or reduces _t_i_c_k by _t_i_c_k_a_d_j microseconds until the specified slew has been done. Unfortunately, the Berkeley code will vary the clock increment by exactly _t_i_c_k_a_d_j microseconds only, meaning that adjustments are truncated to be an integral multiple of _t_i_c_k_a_d_j (this latter behaviour is a misfeature, and is the only reason the code needs to concern itself with the internal implementation of _a_d_j_t_i_m_e at all).

Thus, to make very sure it avoids problems related to the roundoff, the daemon reads the values of _t_i_c_k and _t_i_c_k_a_d_j from _/_d_e_v_/_k_m_e_m when it starts. It then ensures that all adjustments given to _a_d_j_t_i_m_e_(_) are an even multiple of _t_i_c_k_a_d_j microseconds, and computes the biggest adjustment that can be made in the 4 second adjustment interval (using both the value of _t_i_c_k_a_d_j and the value of _t_i_c_k) so it can avoid exceeding this limit if possible.

Unfortunately, the value of _t_i_c_k_a_d_j set by default is almost always too large for _x_n_t_p_d. NTP operates by continuously making small adjustments to the clock.
If _t_i_c_k_a_d_j is set too large, the adjustments will disappear in the roundoff, while if _t_i_c_k_a_d_j is too small, NTP will have difficulty if it needs to make an occasional large adjustment. While the daemon itself will read the kernel's value of _t_i_c_k_a_d_j, it will not change the value even if it is unsuitable. You must do this yourself before the daemon is started, either with _a_d_b or, in the running kernel only, with the _t_i_c_k_a_d_j program in the _x_n_t_p distribution's _u_t_i_l_/ directory. Note that the latter program will also compute an optimal value of _t_i_c_k_a_d_j for NTP use based on the kernel's value of _t_i_c_k_.

The _t_i_c_k_a_d_j program will also reset several other kernel variables if asked. It will change the value of _t_i_c_k if you wish, this being necessary on a few machines with very broken clocks. It will also set the value of the _d_o_s_y_n_c_t_o_d_r variable to zero if it exists and if you want. This kernel variable tells some Suns to attempt to synchronize the system clock to the time-of-day clock, something you really don't want to happen when _x_n_t_p_d is trying to keep it under control.

All this stuff about diddling kernel variables so the NTP daemon will work is really silly. If vendors would ship machines with clocks that kept reasonable time and would make their _a_d_j_t_i_m_e_(_) system call apply the slew it is given exactly, independent of the value of _t_i_c_k_a_d_j_, all this could go away.

TTuunniinngg YYoouurr SSuubbnneett

There are several parameters available for tuning your server. These are set using the _p_r_e_c_i_s_i_o_n and _m_a_x_s_k_e_w configuration statements. A fragment which would simply reset these to their default values (i.e. do nothing useful) follows:

        precision -6
        maxskew 0.01

The _p_r_e_c_i_s_i_o_n statement sets the value of _s_y_s_._p_r_e_c_i_s_i_o_n while the _m_a_x_s_k_e_w statement sets the value of _N_T_P_._M_A_X_S_K_W (both of these variables are defined in RFC 1119).
Sys.precision is defined in the NTP specification to be the base 2 logarithm of the expected precision of the system clock. It used to be set by reading the kernel's clock interrupt period and computing a value of sys.precision which gave a precision close to this, in essence making the value rather unpredictable. This was unfortunate since, for NTP version 2, sys.precision acquired several quite important functions and is useful as a tuning parameter. The current behaviour is to default this value to -6 in all servers.

The NTP protocol makes use of sys.precision in several places. Sys.precision is included in packets sent to peers and is used by them as a sort of quality indicator. When faced with selecting one of several servers of the same stratum and about the same network path delay for synchronization purposes, clients will tend to prefer to synchronize to those claiming the smallest (most negative) sys.precision. The effect is particularly pronounced when all the servers are on the same LAN. Hence, if you run several stratum 1 servers, or 3 or 4 stratum 2 servers, but you would like the client machines to prefer one of these over the other(s) for synchronization, you can achieve this effect by decreasing sys.precision on the preferred server and/or increasing this value on the others.

The other tuning parameter is the antihop aperture, which is derived from sys.precision and NTP.MAXSKW using the following equation:

        2**sys.precision + NTP.MAXSKW

Making the antihop aperture larger will make the server less likely to hop from its current system peer (synchronization source) to another, while increasing the probability of the server remaining synchronized to a peer which has gone insane. Making the antihop aperture smaller allows the server to hop more freely from peer to peer, but this can also cause it to generate a fair bit more NTP packet traffic than necessary for no good purpose.
Given the agreement among current stratum 1 NTP servers and the performance typical of the Internet, it is recommended that the antihop aperture be maintained at a value between 0.020 and 0.030 (if you calculate the default value you will find it is about 0.026). You can change the antihop aperture by changing the value of NTP.MAXSKW via a _m_a_x_s_k_e_w configuration statement. Note, however, that if you wish to change sys.precision via a _p_r_e_c_i_s_i_o_n configuration statement, but _d_o_n_'_t wish to alter the antihop aperture, you must change NTP.MAXSKW to compensate. All this stuff is far too complicated.

LLeeaapp SSeeccoonndd SSuuppppoorrtt

_X_n_t_p_d understands leap seconds and will attempt to take appropriate action when one occurs. In principle, within four seconds of the insertion of a leap second, every host running _x_n_t_p_d will have had its clock stepped back one second to maintain time synchronization with UTC. Servers with active radio clocks will also take any clock-dependent action required to avoid being led astray by clocks which don't understand leap seconds. Unfortunately, it has proved awfully hard to devise a test for this feature. Because of this, what _x_n_t_p_d will actually do when a leap second occurs is a good question.

CClloocckk SSuuppppoorrtt OOvveerrvviieeww

_X_n_t_p_d was designed to support radio (and other external) clocks and does some parts of this function nicely. Clocks are treated by the protocol as favoured NTP peers, even to the point of referring to them with an (invalid) IP host address. Clock addresses are of the form 127.127._t._u, where _t specifies the particular type of clock (i.e. refers to a particular clock driver) and _u is a "unit" number whose interpretation is clock driver dependent. This is analogous to the use of major and minor device numbers by Unix.
Because clocks look much like peers, both configuration file syntax and runtime reconfiguration commands can be adopted to control clocks unchanged. Clocks are configured via _s_e_r_v_e_r declarations in the configuration file (_p_e_e_r declarations can also be used, but for hysterical reasons the _s_e_r_v_e_r form is preferred), can be started and stopped using _x_n_t_p_d_c, and are subject to address-and-mask restrictions much like a normal peer. As a concession to the need to sometimes transmit additional information to clock drivers, however, an additional configuration file statement was added, the _f_u_d_g_e statement. This enables one to specify the values of two time quantities, two integral values and two flags, the use of which is dependent on the particular clock driver the values are given to.

For example, to configure a PST radio clock which can be accessed through the serial device /dev/pst1, with propagation delays to WWV and WWVH of 7.5 and 26.5 milliseconds, respectively, on a machine with an imprecise system clock and with the driver set to disbelieve the radio clock once it has gone 30 minutes without an update, one might use the following configuration file entries:

        server 127.127.3.1
        fudge 127.127.3.1 time1 0.0075 time2 0.0265
        fudge 127.127.3.1 value2 30 flag1 1

Unfortunately, all _x_n_t_p_d radio clock support currently expects to use the Berkeley terminal driver. Worse, while most drivers don't absolutely demand it, all clock driver development to date has been done using the services of a line discipline which takes receive timestamps in the kernel, in the interrupt routine of the async hardware driver. Worse still, the algorithms used to process input data from the clocks assume that the terminal driver will be well behaved. This assumption may require modifications to individual async hardware drivers on some machines (Sun is bad for doing silly things with their terminal drivers).
The reason for this is that the clock support was designed to take advantage of the performance of a particular dedicated hardware platform, and is somewhat less than forgiving of the woes of lesser machines. If you can provide a suitable platform for _x_n_t_p_d, however, the clock support is really very good.

There are several undocumented programs which are useful if you are trying to set up a clock, to be found in the _c_l_o_c_k_s_t_u_f_f_/ directory of the distribution. The most useful of these is the _p_r_o_p_d_e_l_a_y program, which can compute high frequency radio propagation delays between any two points whose latitude and longitude are known. The program understands something about the phenomena which allow high frequency radio propagation to occur, and will generally provide a better estimate than a calculation based on the great circle distance. The other two programs in the directory are _c_l_k_t_e_s_t, which allows one to exercise the generic clock line discipline, and _c_h_u_t_e_s_t, which runs the basic reduction algorithms used by the daemon on data received from a serial port.

The authoritative source of information on particular clock drivers is the _x_n_t_p_d(8) manual page.