On Synchronizing DARTnet Clocks to the Millisecond - a Report David L. Mills University of Delaware 3 February 1991 1. Introduction This is a status report on the hunt toward synchronizing DARTnet clocks reliably to the millisecond. Included in this report are the results of a preliminary investigation into the timekeeping quality expected for DARTnet experiments. The investigation suggests that DARTnet routers can attain and sustain accuracies to the order of a millisecond and stabilities to the order of a millisecond per day. It also points out certain routing problems that can seriously undermine the confidence that this level of performance can be guaranteed for all experiments. This should be a high priority problem to resolve. Also in this report is a description of the fuzzball clock statistics mechanism together with file formats and data interpretation which can be used to calibrate experimental results after the fact in order to realize the ultimate timekeeping accuracy. All operating DARTnet routers are now running NTP Version 2 with each other and with two NTP stratum-1 (primary) time servers, wwvb.isi.edu at ISI and dcn5.udel.edu at UDel. The fuzzballs are running NTP Version 3 as described in the latest revision of the protocol specification found in the PostScript file pub/ntp/ntpv3.ps.Z on louie.udel.edu. While Version 3 is backwards-compatible with Version 2, some minor differences in status reporting exist. The result of this is that the Unix program ntpq, ordinarily used to probe NTP time servers for status, operates slightly differently with the Unix xntpd and fuzzball servers. (The main difference is that the tattletale indicator reported just after the peer ID is not always meaningful for the fuzzball.) The investigation also happened to reveal certain design problems with the fuzzball time servers used by DARTnet to synchronize local clocks to Coordinated Universal Time (UTC). These problems were due to the increasing demands placed upon the primary servers by a rapidly escalating cadre of users and involved relatively high timing jitter due to interrupt latencies, queueing delays and so forth. These problems have largely been alleviated by subtle and intricate jinks in the code; however, the problem remains that time service is suddenly getting very popular. To that end I have incorporated access controls in the UDel time servers designed to help hold down incidences of casual abuse, such as many users from the same network ganging up on the same server. I intend to update ISI and the remaining fuzzball time servers shortly. A message to that effect has been sent to the NTP interest list along with guidelines that should help reduce friendly fire casualties in future. 2. Retrospective Timekeeping Data I have configured a dedicated fuzzball dcn2.udel.edu (actually, one that is to be shipped to MIT when the NEARnet NSS is installed there). This machine is internally synchronized to the UDel master clocks and continuously watches every interface on every DARTnet router, as well as selected fuzzball time servers. The data are reported to a file which can be retrieved via FTP. For most purposes this statistical luxury can be ignored, since I intend to watch the clocks as part of my regular snoop rounds. However, for those experiments where timestamps reliable to the millisecond are required, experimenters may need to calibrate extracted timestamps using retrospective offset measurements available in this file. The binary data file is called stats.dat and is accessible via login tcp-test, password cerf (in the finest historical tradition). It consists of records of eight 16-bit words in little-endian, binary representation. The records are in the format: +------+------+------+------+------+------+------+------+ | Day | ToD | Code | Offset |Delay |Disper| +------+------+------+------+------+------+------+------+ word 0 1 2 3 4 5 6 7 Day NTP day, where day 0 is 1 January 1900; to obtain Modified Julian Day (MJD), add 41317 to this number ToD time of day relative to UTC midnight (milliseconds), high- order 32 bits followed by low-order 32 bits Code two-octet code, coded as follows 0-7 high-order octet of the peer status word described in Appendix B of RFC-1119 and ntpv3.ps, coded as follows 0 configured entry 1 peer authentication enabled 2 peer correctly authenticated (truechimer) 3 peer reachable 5-7 peer selection status, coded as follows (the character shown following the code is the tattletale indicator shown by the fuzzball monitoring program; however, the ntpq monitoring program may tattle different characters) 0 rejected 1 x passed sanity checks (tests 1 through 8 in Section 3.4.3) 2 . passed correctness checks (intersection algorithm in Section 4.2.1) 3 - passed candidate checks (only the first 10 peers on the candidate list are considered for clock selection in the fuzzball implementation) 4 + passed outlyer checks (clustering algorithm in Section 4.2.2) 5 # current synchronization source; max distance exceeded (in the fuzzball implementation this means the synchronization distance exceeds 1000 ms, normally considered real raunchy time) 6 * current synchronization source; max distance okay 7 reserved Note that these codes are cumulative; a status of n implies passing all tests at level n, together with all tests at levels less than n 8-15 peer ID Offset signed offset of peer clock relative to local clock (milliseconds), high-order 32 bits followed by low-order 32 bits Delay signed roundtrip delay via peer clock to the root of the synchronization subnet (milliseconds); note: under some conditions this number can be negative Disper unsigned dispersion via peer clock to the root of the synchronization subnet (milliseconds) Note that the file is fixed size, normally 512K bytes, and is zero- padded at the end. Ordinarily, this provides for about a month of data recording, following which I rename the old file and create a new one. At present, there is no facility for automatically creating a new file when the old one fills up. Trust me. 3. How Goes the Tick In order to assess the accuracy and robustness of NTP-synchronized clocks, I use a special-purpose simulator that processes the statistical data and closely mimics the operation of the NTP daemon on a real machine. The program reads a file such as produced by the handy-dandy ASCII conversion routine described in Appendix A and produces a file showing the resulting behavior of the filtering, selecting, combining and phase-lock-loop algorithms. An example of the data input to this program is shown below (all numbers are in milliseconds, except the day: MJD and the Code: hex). Day ToD Code Offset Delay Disper ------------------------------------------- 48289 79369 6115 -4 39 12 48289 106624 4204 -5 63 4 48289 119282 120a -13564 113 3623 48289 185707 120a -13129 115 3623 48289 213995 6115 -3 38 12 48289 217210 420e -6 116 27 48289 223560 6204 -5 62 3 48289 244875 2216 -6 192 27 48289 253305 130a 1182 153 2043 48289 276211 2208 -6 116 27 48289 277278 4209 -6 117 27 48289 279379 220b -6 118 27 48289 315930 3113 -86 262 11 48289 318826 4206 -4 65 3 48289 319932 130a 819 154 2043 48289 348936 3115 -3 38 11 48289 358368 6204 -5 63 3 48289 386798 130a 578 153 2043 48289 454050 130a 574 154 2043 48289 484507 3115 -3 38 11 48289 493956 6204 -5 63 3 48289 589423 130a 585 153 2043 48289 619696 3115 -3 37 11 48289 629130 6204 -5 63 3 48289 763976 6204 -5 63 3 48289 774498 4207 -6 63 3 48289 787098 2216 -7 195 42 48289 858592 1113 -33 159 12 48289 859525 120a -172 113 71 48289 889750 6115 -4 41 11 48289 898115 4204 -5 64 3 48289 911749 420f -5 138 13 48289 912807 3114 -10 48 2 48289 1016709 120d -28 132 16 48289 1023919 6115 -4 38 11 48289 1032084 4204 -5 62 3 48289 1056335 2216 -8 190 46 The last two hex digits of the Code indicate the peer ID, from which we learn that peer 0x0a is in awful shape, for what cause I do not know. Host 0x13 happens to be ISI fuzzball, shows a systematic offset of --33 ms due to a well known asymmetric path one way via DARTnet and the other way via NSFnet. Note that these are unfiltered data; things look much better after the NTP algorithms get to work. The following table is a snapshot taken from dcn2.udel.edu for every interface of every DARTnet router, along with fuzzball primary servers wwvb.isi.edu [128.9.2.129], dcn1.udel.edu [128.4.0.1] and dcn5.udel.edu [128.4.0.5], as well as a selected fuzzball secondary server clock.sura.net [192.80.214.42]. The data show the peer ID, IP address and (filtered) delay, offset and weight. The weight, which is used by the clock-combining algorithm (adapted from that used by NIST to produce the U.S. standard timescale from a bunch of cesium clocks), is computed as the quotient 65536 divided by the synchronization distance, itself computed as the total dispersion plus one-half the total delay to the root of the synchronization subnet, as described in the NTP Version 3 specification. For this purpose higher values of weight indicate higher quality of data, as determined by the algorithm. ID Address Dly Ofs Wgt -------------------------------------- 4 + [140.173.32.2] 25 -5 1638 5 - [140.173.16.1] 19 -6 2114 6 + [140.173.16.2] 25 -5 1985 7 + [140.173.64.1] 26 -4 1337 8 . [140.173.64.2] 78 -13 840 9 . [140.173.128.1] 78 -13 840 10 x [140.173.128.2] 80 -65 56 11 . [140.173.112.1] 78 -13 840 12 . [140.173.112.2] 88 -4 885 13 + [140.173.96.1] 87 -4 897 14 . [140.173.96.2] 77 -14 819 15 . [140.173.80.1] 90 -5 840 16 + [140.173.80.2] 87 -5 897 17 + [140.173.144.1] 88 -5 897 18 + [140.173.144.2] 90 -3 885 19 x [128.9.2.129] 169 -41 579 20 - [128.4.0.1] 39 -7 2340 21 * [128.4.0.5] 38 -3 3120 22 . [192.80.214.42] 55 -9 736 Following the peer ID is the tattletale indicator, which shows the status of that entry as determined by the NTP algorithms. The tattletale coding is as described above for the statistics file entry. The entry marked "*" is the current synchronization source as determined by NTP; however, in the case of this particular fuzzball this source is not used to synchronize the machine itself, since a primary reference source is available internally. The two entries marked "x" have been discarded by the intersection (modified Marzullo) algorithm as suspect falsetickers, while the entries marked "." have been discarded from the end of the surviving candidate list (the fuzzballs keep a maximum of ten of the lowest-distance candidates on that list). The entries marked "-" have been discarded by the clustering (Mills) algorithm, while the remaining entries marked "+" have survived all hazards and are therefore eligible for processed by the weighted-average (NIST) combining algorithm which produces the final local-clock correction (whew). For comparison and evaluation, the following two tables show the offset, delay and dispersion for the last eight measurements made from dcn2.udel.edu to wwvb.isi.edu and dcn5.udel.edu. Note that the dispersion (due primarily to maximum assumed local-clock frequency offset) indicates the approximate age of the sample, in minutes. wwvb.isi.edu Offset -68 -41 -42 -38 -66 -69 -38 -71 Delay 227 169 174 166 224 227 164 236 Disper 1 17 34 50 67 83 117 133 dcn5.udel.edu Offset -3 -4 -4 -5 -5 -6 -4 -4 Delay 38 39 38 39 39 44 38 39 Disper 1 5 9 13 17 22 26 30 The systematic offset of wwvb.isi.edu is clearly apparent, as is the typical NSFnet noise on that the leg of the roundtrip path. The dcn5.udel.edu data show low dispersions typical of the quiet DARTnet. However, dcn5.udel.edu shows a systematic offset of about -5 ms due to subtle delays in the internal clock-synchronization scheme used locally. When we get all our wires uncrossed here, that curiosity should disappear. Nevertheless, what experimenters usually need is a calibration of experiment host and router offsets between each other, in order to determine a common reference timescale. In principle, it doesn't matter what the systematic offsets are, just that they can be determined retrospectively from the statistics file. 4. How Tight the Tick The question yet remains on the accuracy and stability of the particular local clock on each experiment host and router. It is not possible to observe this directly with the available hardware for other than fuzzballs, and, even for fuzzballs requires a precision interval counter (such as constructed by one of my students). However, the NTP local- clock model is described as an adaptive-parameter, type-II, phase-lock loop (PLL). Fortunately, one of the parameters estimated, called the compliance (Greek tau in Appendix G of the NTP Version 3 specification), is a particularly useful indicator of local-oscillator stability. Large negative values indicate an oscillator with stabilities in the range better than 1 ms per day, while more positive values indicate other oscillators varying over the range up to several seconds per day. For fuzzballs (only) an indirect indicator of the compliance is the poll interval, which is included in all NTP packets. This value, expressed as a power of 2 in seconds, is shown in both the Unix and fuzzball monitoring displays. In version 3 of the protocol the poll interval is tied to the compliance, with effect that the interval varies from about a minute for the most unstable oscillator (the power grid) to about 17 minutes for most stable oscillators (LORAN-C receivers and cesium clocks). The displayed value can vary from 6 through 10, with the highest number indicating the highest stability. Ordinarily, this tidbit would be important only for experiments designed to run for some time (hours) while NTP is turned off. One of the goals of the NTP Version-3 models is to demonstrate accuracies to the millisecond, at least in DARTnet-like systems, as well as stabilities to the order of a millisecond per day. If these goals can be obtained in principle, a DARTnet platform could be synchronized, then NTP could be turned off during the experiment itself, if need be, with assurance the platform would not wander by more than fractional milliseconds during the experiment. In order to test whether these goals can be met, the simulator previously described was used to process the statistics data collected from dcn2.udel.edu for the last couple of weeks. A snatch of these data covering approximately the same interval used previously to illustrate the statistics format is shown below: Time Offset Freq Delay Dist Clks RefID --------------------------------------------------------- 20001.3 -5.711 -0.0023 41.0 33.8 8 20 20003.6 -5.708 -0.0023 40.0 31.5 6 20 20005.8 -5.704 -0.0023 41.0 29.5 6 20 20008.0 -5.699 -0.0022 40.0 28.5 6 20 20010.3 -5.698 -0.0022 39.0 29.9 6 20 20012.5 -5.698 -0.0022 39.0 29.5 6 20 20014.8 -5.697 -0.0022 42.0 30.8 6 20 20017.0 -5.692 -0.0022 39.0 31.5 10 20 20019.3 -5.689 -0.0022 39.0 28.4 10 20 20021.5 -5.686 -0.0022 39.0 28.6 10 20 20023.8 -5.683 -0.0022 40.0 28.8 10 20 20026.1 -5.680 -0.0021 38.0 27.7 10 20 20028.3 -5.676 -0.0021 38.0 29.4 10 20 20030.6 -5.673 -0.0021 39.0 28.1 10 20 20032.8 -5.670 -0.0021 39.0 30.3 10 20 20035.0 -5.662 -0.0021 42.0 32.2 10 20 20037.3 -5.657 -0.0020 40.0 28.6 10 20 20039.5 -5.652 -0.0020 40.0 30.6 10 20 20041.8 -5.649 -0.0020 39.0 29.9 10 20 20044.0 -5.645 -0.0020 39.0 31.6 10 20 20046.3 -5.642 -0.0020 39.0 29.5 10 20 20048.5 -5.637 -0.0019 40.0 29.7 10 20 20050.7 -5.630 -0.0019 40.0 30.4 9 20 20053.0 -5.623 -0.0019 40.0 33.0 9 20 20055.2 -5.615 -0.0018 39.0 29.6 9 20 20057.5 -5.607 -0.0018 39.0 28.7 9 20 20059.7 -5.600 -0.0018 39.0 30.4 9 20 20062.0 -5.593 -0.0017 40.0 27.4 9 20 20064.2 -5.589 -0.0017 43.0 33.5 9 20 20066.5 -5.580 -0.0017 40.0 29.9 10 20 20068.7 -5.572 -0.0017 41.0 30.7 10 20 20070.9 -5.567 -0.0016 38.0 27.2 7 20 20075.4 -5.552 -0.0016 38.0 32.7 9 20 20084.5 -5.528 -0.0014 40.0 36.0 9 20 20102.6 -5.470 -0.0011 38.0 40.1 9 20 20120.7 -5.415 -0.0007 37.0 42.4 9 20 20138.9 -5.377 -0.0005 38.0 40.0 9 20 20159.1 -5.340 -0.0003 38.0 43.4 8 21 20161.3 -5.338 -0.0003 39.0 38.8 8 21 20163.6 -5.335 -0.0002 38.0 41.2 8 21 20165.8 -5.331 -0.0002 38.0 39.6 8 21 20168.1 -5.327 -0.0002 38.0 39.5 8 21 20170.3 -5.323 -0.0002 37.0 38.9 8 21 20174.8 -5.316 -0.0002 37.0 42.4 8 21 20177.1 -5.313 -0.0001 38.0 37.1 8 21 In this table time is shown in minutes past a certain epoch, which is not important here, while the offset, delay and distance are shown in milliseconds. The frequency column shows the computed stability in parts-per-million. The sixth column shows the number of peers remaining on the candidate list after the intersection algorithm (10 max), while the last column shows the resulting synchronization source selection. The above data shows that, at least for periods of a couple of hours, the "real" time stays within a few hundred microseconds and the stability within a few hundred microseconds per day. However, while these data are typical of DARTnet behavior, at least for the present lightly loaded network and with the currently broken kernel clock code, performance is not always this good. Following is a summary of the network behavior over about the last couple of weeks. ID Samples Mean StdDev Max Min -------------------------------------------------- 0 5097 -4.122 1.719 0.983 -7.480 4 100 -4.930 7.879 73.000 -7.007 5 1608 -4.368 9.055 7.000 -103.000 6 1901 -3.967 2.810 19.000 -11.000 7 1929 -4.144 2.509 26.000 -11.000 8 558 -3.533 4.003 47.000 -26.000 9 547 -3.321 5.059 65.000 -25.000 10 968 -770.688 1025.602 1182.000-13564.000 11 549 -3.390 4.825 69.000 -25.000 12 951 -3.656 3.878 72.000 -17.000 13 445 -5.972 7.195 80.000 -30.000 14 549 -3.435 4.994 74.000 -25.000 15 945 -17.458 126.391 45.000 -1249.000 16 951 -3.561 4.118 71.000 -17.000 17 941 -3.604 3.735 72.000 -17.000 18 938 -6.643 29.347 56.000 -570.000 19 1017 -40.530 16.283 19.347 -234.000 20 2146 -5.368 4.084 27.000 -47.000 21 5419 -3.320 3.102 75.000 -24.000 22 499 -9.317 11.103 46.000 -76.000 In this table the peer ID corresponds to the fuzzball monitoring table described previously, while the statistics are computed from the offset data in the statistics file described previously. However, peer ID zero corresponds to the output of the simulation program, so represents the computed offset of the local platform (dcn2.udel.edu) relative to the mean timescale represented by all the DARTnet players. This timescale is provably the best clock in the house, with maximum error over the entire interval less than about +-3.5 ms relative to the mean. Most of the residual error is due to transients created when the machine was rebooted. Obviously, something was certainly broken at entry 10 at some time over the period of observation and probably at entry 15. The systematic offset previously mentioned with entry 19 (wwvb.isi.edu) is clearly apparent, but entry 15 doesn't look to good, either. 5. How Timely the Tick There is some concern about the time it takes to tune the NTP contraption exactly on frequency. The long answer is that it takes days. The short answer is the long answer stashed in a file that is read upon reboot. The xntpd program presently does this, which considerably shortens the "training" interval while the initial transient in setting the clock damps out. While the PLL is mildly underdamped and stabilizes offset within a few percent in about an hour, large offset surges can produce frequency errors that may not completely settle down until a day or two have elapsed. Ordinarily, these niceties should not be of concern, unless accuracies to the millisecond are required. Rarely do the local-clock offsets exceed ten milliseconds even right after reboot. Since reboot frequencies may be somewhat larger than usual with DARTnet activities, some consideration of the initial synchronization procedure may help explain what may at first seem odd behavior. When an NTP client comes up it starts sending packets to each of its configured buddies. Upon receiving a validated reply (eight separate sanity checks plus crypto-authentication between fuzzballs - optional with otherballs), the offset/delay/dispersion sample is stashed in an eight-stage shift register, one for each peer. As succeeding samples arrive, the filtering, selecting and combining engines puff away to digest the data. As the shift registers fill up the overall synchronization distance computed from the dispersion data ordinarily decreases below a preset threshold, in the case of fuzzballs 1000 ms. During all this time the algorithms are doing their best to select the best out of possibly very noisy peers and the tattletale indicators wiggle as advertised. However, the local clock is not updated until the overall distance falls below the preset threshold. At this point the PLL kicks in and the local clock begins to hum. During the initial interval after reboot one peer may tattle "#" and many (up to 10) tattle "+". Ordinarily, after about ten minutesthe bumps and grinds should eventually converge with the "#" be replaced by "*" and only a few "+". It is important to note that, during the initial acquisition interval and others where the timing noise is exceptionally high, the number of peers apparently eligible for synchronization may seem bogusly large. The clustering algorithm used to select the best subset from a bunch of clocks compares the dispersion of each clock relative to the ensemble of all available clocks and tosses outlyers until the ensemble dispersion is less than any clock dispersion or the number of clocks hits a lower bound, currently 3. Details of this intricate procedure can be found in the NTP Version 3 specification, Section 4.2.2. 6. How Will the Tick There are several projects that need to be completed before the clock assemblage can be considered completely stable. First and most urgently, Van Jacobson's fixes for the SunOS timekeeping code must be installed in all routers and experiment hosts. Second, the NTP configuration at dcn1.udel.edu needs to be augmented to include the experiment hosts as the addresses are determined. Note that the watchkeeper and UDel experiment host pogo.udel.edu have routes to DARTnet and associated local nets, but not the other DCnet hosts or time servers. A problem remains on how to synchronize the experiment hosts and possibly other hosts on the various local nets associated with DARTnet. While those hosts that know about DARTnet and its associated nets, the ISI and UDel time servers don't know about them. It is probably not a good idea to teach these servers about the DARTnet topology, since there are numerous other clients of these servers on those local nets and the preferred routes to them should not transit DARTnet. My suggested solution is to configure each of the experiment hosts to chime time exclusively from DARTnet routers. While this results the most accurate and stable time synchronization for DARTnet experiments, there may be a certain reluctance to surrender the clock to an experimental contraption in the first place. I don't have trouble doing that here and would welcome discussion on the issue. Finally, some way must be found for the time server at ISI and the watchkeeper at UDel to schmooze with each other via DARTnet. This is critical for calibrating the ISI timescale against the UDel timescale when some DARTers chime with one and others chime with the other. Assuming a second Ethernet interface can be spliced to this beast, individual host routes can be set up so that traffic only for these hosts will ride the DARTnet rails, while traffic for other hosts on the same net will wander the NSFnet swamps. It is probably not a good idea to avoid the problem by forcing all platforms to synchronize to only one of the two servers, but not the other, since they occasionally suffer radio timewarps, are rebooted, lose power, loose connectors and whatnot. In fact, the present situation is somewhat dangerous in that only two servers with two radios are available. The preferred way to avoid the timewarp problem is to splice a third radio to the UDel watchkeeper, so that it in effect becomes a dedicated primary server for DARTnet. I can do that if Steve Casner sends me his second WWVB clock as promised, which I can then (fix?) and install on dcn2.udel.edu. I am hoping Spectracom will give me one of their new WWVB clocks for me to evaluate, which also works. Eventually, the clock and the watchkeeper machine itself are to move to MIT; but, in that case, I assume that some way can be found to splice it to DARTnet from there. Appendix A. Handy Programs I do most of my data-reduction chores on a souped-up i386 with Microsoft QuickC. Among the handy-dandy programs used are two gems, one to convert the statistics file from binary format to ASCII, and the other to simulate the NTP algorithms as driven by the ASCII file. I offer them for what it's worth and not necessarily as examples of programming virtuosity. The first program below reads an input file in the format previously described and writes its ASCII representation on the output file. It also does some error checking to cast out junk that can happen due to various kinds of server calamities, like updating software versions. The command line consists of the input file name, the output file name and a selector. If the selector matches the ID of a peer, statistics for only that peer are produced. If the selector is zero, statistics for all peers are produced. The second program is the NTP simulator. It prompts for the name of an ASCII statistics file and then cranks the NTP algorithms to produce the data shown in the main body of this report. Output goes the control terminal or can be redirected to a file. This is the same program from which the sample C programs given in Appendix I of the NTP Version 3 specification were extracted. /* this program reads a statistics file and converts to ascii */ #include #include #define UTCDAY 41317 /* MJD at 1 Jan 1972 */ long int n1 = 0, n2 = 0; FILE *fopen(), *fp_in, *fp_out; main(int argc, char *argv[]) { int n; if (argc < 4) { printf ("usage: infile outfile peer\n"); exit (1); } if ((fp_in = fopen (argv[1], "rb")) == NULL) { printf ("input file does not exist\n"); exit (1); } fp_out = fopen (argv[2], "w"); if (sscanf(argv[3], "%i", &n) != 1) { printf ("invalid argument\n"); exit (1); } convert_file (fp_in, fp_out, n); fclose (fp_in); fclose (fp_out); return (0); } convert_file (FILE *input, FILE *output, int sel) { unsigned int buf[8], i, c, date, daychk, hour, minute; unsigned long time; float second; struct rec { /* record produced by fuzzball */ unsigned int date; unsigned long time; unsigned int code; long offset; int delay; int disp; } *p; n1 = 0; n2 = 0; daychk = 0; while (!feof(input)) { c = fread(buf, 2, 8, input); c = buf[1]; buf[1] = buf[2]; buf[2] = c; /* endian */ c = buf[4]; buf[4] = buf[5]; buf[5] = c; p = buf; date = (p->date & 0x3fff)+UTCDAY; time = p->time; if (daychk == 0) daychk = date; if ((date > 50000) || (date < daychk) || (time > 86400000) || (p->code == 0)) continue; n1++; if ((sel > 0) && (sel != (p->code & 0xff))) continue; n2++; hour = time/3600000; minute = (time%3600000)/60000; second = (time%60000)/1000.; /* fprintf(output, "%6u%3i:%02i:%06.3f", date, hour, minute, second); */ fprintf(output, "%6u%9lu", date, time); fprintf(output, " %04x%12li%6i%6i\n", p->code, p->offset, p->delay, p-> disp); } printf("input %li output %li\n", n1, n2); } /* NTP simulator This program reads sample data from a history file and simulates the behavior of the NTP filter, selection and local-clock algorithms. */ #include #include #include /* Parameters */ #define C_SYNC 0x8000 /* synch okay */ #define NMAX 40 /* max clocks */ #define FMAX 8 /* max filter size */ #define DAY 86400. /* one day of seconds */ #define MAXREACH 8192. /* reachability timeout */ #define HZ 1000. /* clock rate */ #define FILTER .5 /* filter weight */ #define SELECT .75 /* select weight */ #define MAXSTRAT 15. /* max stratum */ #define MAXSKEW 1. /* max skew error per MAXAGE */ #define MAXAGE 86400. /* max clock age */ #define MAXDISP 16. /* max dispersion */ #define MAXCLOCK 10 /* max candidate clocks */ #define MINCLOCK 3 /* min survivor clocks */ /* Function declarations */ void filter(int, double, float, float); /* filter algorithm */ int dts(void); /* intersection algorithm */ int select(void); /* selection algorithm */ float combine(void); /* combining algorithm */ float distance(int); /* compute synch distance */ void stat(int, float); /* update statistics */ void display(void); /* display statistics */ void reset(void); /* reset statistics */ void exit(int); /* off the bus */ /* Peer state variables */ float filtp[NMAX][FMAX]; /* filter offset samples */ float fildp[NMAX][FMAX]; /* filter delay samples */ float filep[NMAX][FMAX]; /* filter dispersion samples */ float tp[NMAX]; /* offset */ float dp[NMAX]; /* delay */ float ep[NMAX]; /* dispersion */ float rp[NMAX]; /* last offset */ double utc[NMAX]; /* last timestamp */ int cd[NMAX]; /* code */ int st[NMAX]; /* stratum */ /* System state constants/variables */ float rho = 1./HZ; /* max reading error */ float phi = MAXSKEW/MAXAGE; /* max skew rate */ float maxdist = MAXDISP/16.; /* max distance */ float bot, top; /* confidence interval limits */ float theta; /* clock offset */ float delta; /* roundtrip delay */ float epsil; /* dispersion */ int nclock; /* survivor clocks */ int source; /* clock source */ int n1, n2; /* min/max clock ids */ /* Local clock constants */ double Kf = 4194394.; /* frequency weight */ double Kg = 256.; /* phase weight */ double Kh = 8192.; /* compliance weight */ double Ks = 16.; /* compliance maximum */ double Kt = 8192.; /* compliance multiplier */ double sigma = 4.; /* adjustment interval */ /* Local clock variables */ double Vs; /* clock filter output */ double tau; /* PLL time constant */ double mu; /* update interval */ double freq; /* frequency error */ double phase; /* phase error */ double comp; /* error estimate */ double tstamp; /* at the tone */ /* Statistics vector */ double d0[NMAX]; /* sample count */ double d1[NMAX]; /* sum of samples */ double d2[NMAX]; /* sum of squares of samples */ double d3[NMAX]; /* sample maximum */ double d4[NMAX]; /* sample minimum */ /* Input file format */ unsigned int date; /* day number (MJD) */ unsigned long timex; /* time of day (ms past midnight) */ unsigned int code; /* entry code */ float offset; /* clock offset (ms) */ float delay; /* total delay (ms) */ float disp; /* total dispersion (ms) */ /* Temporary lists */ float list[3*NMAX]; /* temporary list */ int index[3*NMAX]; /* index list */ /* Stuff */ unsigned int datey; /* starting date */ FILE *fopen(), *fp_in; /* input file handle */ char fn_in[50]; /* input file name */ float t, u, v, w, x; /* float temps */ int i, j, k, m, n; /* int temps */ main() { printf ("input file: "); scanf ("%s", fn_in); fp_in = fopen (fn_in, "r"); if (fp_in == NULL) { printf ("input file %s does not exist\n", fn_in); exit (1); } n1 = 1; n2 = 25; reset(); printf(" Time Offset Freq Delay Disp Dist Clk Src\n"); /* Read next sample */ while (fscanf(fp_in, "%i%lu%x%f%f%f", &date, &timex, &code, &offset, &delay, &disp) != EOF) { timex = timex/1e3; offset = offset/1e3; delay = delay/1e3; disp = disp/1e3; if (datey <= 0) datey = date; tstamp = (date-datey)*DAY+timex; if (utc[0] <= 0) utc[0] = tstamp; for (n = n1; n <= n2; n++) { if (utc[n] <= 0) utc[n] = tstamp; if ((cd[n] != 0) && ((tstamp-utc[n]) > MAXREACH)) { for (i = FMAX-1; i >= 0; i--) { filtp[n][i] = 0.; fildp[n][i] = 0.; filep[n][i] = MAXDISP; } tp[n] = 0.; rp[n] = 0.; ep[n] = MAXDISP; utc[n] = 0.; cd[n] = 0; } } n = code & 0xff; if ((n < n1) || (n > n2)) continue; cd[n] = 0; if (code&C_SYNC == 0) continue; cd[n] = code; st[n] = (code&0x0f00)/256.; /* Update clock filter and statistics n = peer id, offset = clock offset, delay = roundtrip delay to root, disp = dispersion to root */ t = tstamp-utc[n]; epsil = rho+phi*t+disp; filter(n, offset, delay, epsil); if (t > 0.) rp[n] = (tp[n]-u)/t; stat(n, tp[n]*1e3); /* Determine survivors and combine nclock = number of survivor clocks */ i = dts(); if (i < nclock/2) continue; i = select(); if (i < 1) continue; nclock = i; theta = combine(); if ((source != n) || (epsil+delta/2. >= maxdist)) continue; /* Update local clock and statistics */ dp[0] = delta; ep[0] = epsil; Vs = theta; mu = tstamp-utc[0]; utc[0] = tstamp; u = tp[0]; x = mu; if (x > 1024.) x = 1024.; freq = freq+Vs*x/(tau*tau); phase = Vs/tau; tau = Ks-fabs(comp); if (tau < 1.) tau = 1.; comp = comp+(Kt*Vs-comp)*tau/Kh; i = mu/sigma; x = 1.-1./Kg; x = 1.-pow(x, i); tp[0] = tp[0]+x*phase+i/Kf*freq; printf("%8.1f%11.3f%10.4f%8.1f%8.1f%8.1f%5i%5i\n", tstamp/60., tp[0]*1e3, 1./(sigma*Kf)*freq*1e6, dp[0]*1e3, ep[0]*1e3, distance(0)*1e3, nclock, cd[source]&0xff); if (mu > 0.) rp[0] = (tp[0]-u)/mu; stat(0, tp[0]*1e3); } fclose(fp_in); printf("Summary\n"); display(); exit(0); } /* Subroutines Clock filter algorithm n = peer id, offset = sample offset, delay = sample delay, disp = sample dispersion; computes tp[n] = peer offset, dp[n] = peer delay, ep[n] = peer dispersion */ void filter(int n, double offset, float delay, float disp) { int i, j, k, m; /* int temps */ float x; /* float temps */ for (i = FMAX-1; i > 0; i--) { /* update/shift filter */ filtp[n][i] = filtp[n][i-1]; fildp[n][i] = fildp[n][i-1]; filep[n][i] = filep[n][i-1]+phi*(tstamp-utc[n]); } utc[n] = tstamp; filtp[n][0] = offset-tp[0]; fildp[n][0] = delay; filep[n][0] = disp; m = 0; /* construct/sort temp list */ for (i = 0; i < FMAX; i++) { if (filep[n][i] >= MAXDISP) continue; list[m] = filep[n][i]+fildp[n][i]/2.; index[m] = i; for (j = 0; j < m; j++) { if (list[j] > list[m]) { x = list[j]; k = index[j]; list[j] = list[m]; index[j] = index[m]; list[m] = x; index[m] = k; } } m = m+1; } if (m <= 0) /* compute filter dispersion */ ep[n] = MAXDISP; else { ep[n] = 0; for (i = FMAX-1; i >= 0; i--) { if (i < m) { x = fabs(filtp[n][index[0]]- filtp[n][index[i]]); } else x = MAXDISP; ep[n] = FILTER*(ep[n]+x); } i = index[0]; ep[n] = ep[n]+filep[n][i]; tp[n] = filtp[n][i]; dp[n] = fildp[n][i]; } return; } /* Intersection algorithm computes nclock = candidate clocks, bot = lowpoint, top = highpoint; returns number of survivor clocks */ int dts() { int f; /* intersection ceiling */ int end; /* endpoint counter */ int clk; /* falseticker counter */ int i, j, k, n; /* int temps */ float x, y; /* float temps */ nclock = 0; i = 0; for (n = n1; n <= n2; n++) { /* construct endpoint list */ if (ep[n] >= MAXDISP) continue; nclock++; list[i] = tp[n]-distance(n); index[i] = -1; /* lowpoint */ for (j = 0; j < i; j++) { if ((list[j] > list[i]) || ((list[j] == list[i]) && (index[j] > index[i]))) { x = list[j]; k = index[j]; list[j] = list[i]; index[j] = index[i]; list[i] = x; index[i] = k; } } i = i+1; list[i] = tp[n]; index[i] = 0; /* midpoint */ for (j = 0; j < i; j++) { if ((list[j] > list[i]) || ((list[j] == list[i]) && (index[j] > index[i]))) { x = list[j]; k = index[j]; list[j] = list[i]; index[j] = index[i]; list[i] = x; index[i] = k; } } i = i+1; list[i] = tp[n]+distance(n); index[i] = 1; /* highpoint */ for (j = 0; j < i; j++) { if ((list[j] > list[i]) || ((list[j] == list[i]) && (index[j] > index[i]))) { x = list[j]; k = index[j]; list[j] = list[i]; index[j] = index[i]; list[i] = x; index[i] = k; } } i = i+1; } if (nclock <= 0) return 0; for (f = 0; ; f++) { /* find low endpoint */ clk = 0; end = 0; for (j = 0; j < i; j++) { end = end-index[j]; bot = list[j]; if (end >= (nclock-f)) break; if (index[j] == 0) clk = clk+1; } end = 0; for (j = i-1; j >= 0; j--) { end = end+index[j]; top = list[j]; if (end >= (nclock-f)) break; if (index[j] == 0) clk = clk+1; } if (clk <= f) break; } return nclock-clk; } /* Selection algorithm bot = lowpoint, top = highpoint; computes nclock = number of candidate clocks, index = candidate index list, source = clock source, theta = clock offset, delta = roundtrip delay, epsil = dispersion; returns number of survivor clocks */ int select() { float xi; /* max select dispersion */ float eps; /* min peer dispersion */ int i, j, k, m, n; /* int temps */ float x, y, z; /* float temps */ m = 0; for (n = n1; n <= n2; n++) { /* construct/sort candidate list */ if ((st[n] > 0) && (st[n] < MAXSTRAT) && (tp[n] >= bot) && (tp[n] <= top)) { list[m] = MAXDISP*st[n]+distance(n); index[m] = n; for (j = 0; j < m; j++) { if (list[j] > list[m]) { x = list[j]; k = index[j]; list[j] = list[m]; index[j] = index[m]; list[m] = x; index[m] = k; } } m = m+1; } } nclock = m; if (m == 0) { source = 0; return m; } if (m > MAXCLOCK) m = MAXCLOCK; while (1) { /* cast out falsetickers */ xi = 0.; eps = MAXDISP; for (j = 0; j < m; j++) { x = 0.; for (k = m-1; k >= 0; k--) x = SELECT*(x+fabs(tp[index[j]]- tp[index[k]])); if (x > xi) { /* max(xi) */ xi = x; i = j; } x = ep[index[j]]+phi*(tstamp-utc[index[j]]); if (x < eps) eps = x; /* min(eps) */ } if ((xi <= eps) || (m <= MINCLOCK)) break; if (index[i] == source) source = 0; for (j = i; j < m-1; j++) index[j] = index[j+1]; m = m-1; } i = index[0]; /* declare winner */ if (source != i) if (source == 0) source = i; else if (st[i] < st[source]) source = i; theta = tp[i]; delta = dp[i]; epsil = ep[i]+phi*(tstamp-utc[i])+xi+fabs(tp[i]); return m; } /* Combining algorithm index = candidate index list, m = number of candidates; returns combined clock offset */ float combine() { int i; /* int temps */ float x, y, z; /* float temps */ z = 0. ; y = 0.; for (i = 0; i < nclock; i++) { /* compute weighted offset */ j = index[i]; x = distance(j); z = z+tp[j]/x; y = y+1./x; } return z/y; /* normalize */ } /* Compute synch distance n = peer id; returns synchronization distance */ float distance(int n) { return ep[n]+phi*(tstamp-utc[n])+dp[n]/2.; } /* Update statistics */ void stat(int n, float u) { d0[n] = d0[n]+1; d1[n] = d1[n]+u; d2[n] = d2[n]+u*u; if (u > d3[n]) d3[n] = u; if (u < d4[n]) d4[n] = u; return; } /* Display statistics */ void display() { int i; /* int temps */ printf(" ID Samp Mean StdDev Max Min\n"); for (i = 0; i < NMAX; i++) { if (d0[i] <= 1.) continue; v = sqrt(d2[i]/d0[i]-d1[i]/d0[i]*d1[i]/d0[i]); /* v = sqrt(d2[i]/(2*(d0[i]-1.))); */ printf("%3i%8.0f%10.3f%10.3f%10.3f%10.3f\n", i, d0[i], d1[i]/d0[i], v, d3[i], d4[i]); /* fprintf(fp_out, "%3i%8.0f%10.3f%10.3f%12.3f%12.3f%10.3f\n", i, d0[i], d1[i]/d0[i], v, d3[i], d4[i]); */ } } /* Reset statistics */ void reset() { int i, j; /* int temps */ freq = 0.; phase = 0.; comp = 0.; tau = 1.; source = 0; for (i = 0; i < NMAX; i++) { for (j = FMAX-1; j >= 0; j--) { filtp[i][j] = 0.; fildp[i][j] = 0.; filep[i][j] = MAXDISP; } tp[i] = 0.; rp[i] = 0.; ep[i] = MAXDISP; utc[i] = 0.; cd[i] = 0; st[i] = 0; d0[i] = 0.; d1[i] = 0.; d2[i] = 0.; d3[i] = -1e10; d4[i] = 1e10; } }