Purpose: Kevin Kenny is contemplating starting a project to rework the clock command. The purpose of this page is to present some of the issues with the current clock command, and collect feedback about how best to fix it.
I wish I had time to more than merely applaud Kevin's work. This is long overdue! -- CLN
The ideas explored in this page have been brought one step closer to appearing in the Tcl core; TIP #173 [L1 ] is now (2004-03-12) being discussed.
The current (Tcl 8.4a3) clock command has served us well since Tcl 7.5, but it's starting to show cracks at all levels. It appears that nothing less than a complete rework may suffice to fix all the issues. (In that case, the existing [clock] command will be preserved, but deprecated - see Free-format clock scan)
1. The measurement of time.
Mickey's little hand is on the six, and his big hand is on the nine.
At present, Tcl's concept of absolute time is the count of seconds from a fixed epoch or zero point. The count is expressed as a signed 32-bit integer. This representation has a number of drawbacks.
Proposal: Add to [clock] a [clock milliseconds] command that returns the nominal time from the Julian epoch expressed in milliseconds as a double-precision floating-point number.
Double-precision floating point is chosen as the representation because it is available on all platforms, unlike 64-bit integers, which are the most obvious alternative. The granularity of milliseconds, rather than seconds or days, ensures that any integer number of milliseconds will be represented exactly (a double holds every integer up to 2^53, which at one count per millisecond spans roughly 285,000 years).
The Julian epoch is chosen because many existing calendar algorithms, such as those published in the second edition of Reingold [L2 ], use Julian day number as their internal representation. The Julian day can be obtained from the millisecond count simply by dividing by 86400000 and casting to an integer.
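As a rough illustration of the divide-and-truncate step, here is a minimal sketch. It assumes the millisecond count is derived from a Posix millisecond count by adding the offset of the Posix epoch (JD 2440587.5); the proc names julianMillis and julianDay are made up for this example and are not part of any proposed API.

```tcl
# Offset of the Posix epoch (1970-01-01 00:00 UTC) from the Julian epoch:
# JD 2440587.5, expressed in milliseconds (2440587.5 * 86400000).
set posixEpochJulianMs 210866760000000.0

# Hypothetical helper: milliseconds from the Julian epoch, as a double,
# given milliseconds from the Posix epoch.
proc julianMillis {posixMs} {
    global posixEpochJulianMs
    return [expr {$posixMs + $posixEpochJulianMs}]
}

# Hypothetical helper: divide by the length of a day and cast to integer.
proc julianDay {julianMs} {
    return [expr {int($julianMs / 86400000)}]
}

puts [julianDay [julianMillis 0]]          ;# midnight, 1970-01-01 UTC
puts [julianDay [julianMillis 43200000]]   ;# noon, 1970-01-01 UTC
```

The two calls give different day numbers (2440587 and 2440588) because the Julian Day Number changes at noon, not at midnight.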
Since leap seconds cannot be predicted in advance, it is useful to assume that the day has a fixed length of 86400 seconds. For this reason, I propose that the Tcl clock be Smoothed Universal Time (UTS) [L3 ]. The code in TclpGetTime on the Unix platform (where the kernel clock is corrected with adjtime) and on the Windows platform (where the Tcl clock is derived from a separate 'performance counter' that is disciplined with a phase-locked loop to the system clock) already comes close to the desired behavior.
Anyone wishing to see the pain involved in combining the range of Tcl's per-second clock with the accuracy of its -millisecond support can check out: http://expect.nist.gov/stopwatch - Don Libes
Couldn't we, during initialization, just capture the millisecond value at the moment the seconds value changes, and use that as an offset to extract the millisecond values? Like this:

    set curtime [clock seconds]
    while {$curtime == [clock seconds]} {
        # Wait here until the seconds count changes
    }
    set milli_offset [clock clicks -milliseconds]
    set milli_offset [expr {$milli_offset % 1000}]
Then, every time we want seconds and milliseconds, we could use this:
    set insec [clock seconds]
    set inclicks [clock clicks -milliseconds]
    set intime [format "%d.%03d" $insec [expr {($inclicks - $milli_offset) % 1000}]]
This would set intime to a time in seconds.milliseconds. Even the wraparound case (where $inclicks is less than $milli_offset) works because the modulo result is positive. This would work assuming that the execution time in the first block between the while and the following set is roughly equivalent to the execution time in the second block between the two set statements.
2. Calendar
I've been on a calendar, but never on time. - Marilyn Monroe
The handling of the calendar is lacking some points that various users have requested.
For what it's worth, there's information on various calendars at http://www.copi.org/craig/events/calendar.html . -- CLN
3. Time zone
At the back of the Daylight Saving scheme I detect the bony, blue-fingered hand of Puritanism, eager to push people into bed earlier, and get them up earlier, to make them healthy, wealthy and wise in spite of themselves. - Robertson Davies
Conversion of times between local and UTC (mislabelled, 'GMT,' in the Tcl documentation) depends on the calculations of the underlying system.
(Or consider time clocks on Hoover Dam: one end of the dam is in Arizona which never observes DST, the other is in Nevada which does observe DST! -- CLN)
4. Localization and input/output issues
...ad calendas græcas reponere.
Tcl's handling of time ignores the locale. In many cases, it simply cannot be localized easily. Moreover, Tcl's input conversion for times suffers from an attempt to be excessively general; it succeeds only in having peculiar bugs.
set nextMonth [clock scan {+1 month} -base $today]
AK: See http://www.purl.org/net/akupries/soft/pool/f_base_date.tcl.html
5. Related work.
Non est, crede mihi, sapientis dicere, Vivam: Sera nimis vita est crastina: vive hodie. - Martial
There are a number of related date-and-time packages out there. Alas, none of them seems both to do the whole job and to be suitable for incorporating wholesale into the Tcl core, for various reasons.
Time zone manipulations: The Olson codes at [L7 ] and [L8 ] are comprehensive and widely used. The chief difficulty with them is that they are based on the 32-bit Posix clock and therefore die after 2037. Certainly, the time zone data sets should be considered for any implementation that we do of time zone conversions.
Java and ICU: The Java classes, Calendar, DateFormat, and so on [L9 ], and the corresponding classes in ICU [L10 ] and [L11 ] are comprehensive. They use a format scheme that is incompatible with strftime and strptime, but the two are fairly easily interconverted (such a conversion would be necessary in any case, in light of the fact that the system locale is specified with one scheme in Unix and the other in Windows). The key drawbacks to adopting these codes wholesale are their deficiencies in time zone handling and the fact that no C implementation exists. The C API provided in ICU is a wrapper layer around a C++ implementation; the Tcl core does not presume the existence of a C++ compiler on the target platform.
The Reingold/Dershowitz codes: The standard references on computer calendrical calculations are a series of books and papers by Reingold and Dershowitz [L12 ]. Their Web site has a number of interesting reference implementations that could be good starting points for some of the needed calculations. Again, there is a programming-language barrier: the provided codes are in Common Lisp and C++, not in C or Tcl.
The Pool library: Andreas Kupries provides one fine starting point for many date calculations in Tcl at [L13 ].
The Hall codes: Mike Hall has codes [L14 ] for calculations with Julian Day Number; again, these may prove a fine starting point.
6. Some contemplations
Tempora mutantur et nos in illis
Arjen Markus I have a number of remarks about the previous sections:
Indeed, [clock seconds -milliseconds] might well be an alternative syntax. I just didn't like it that much because it seemed to be contradictory: seconds are not milliseconds. --KBK
We might use "-keepmillis" instead. --AM
I would like to introduce a -fmt option: the default being %d, i.e. truncated to int, the alternative being %f or %.nf, e.g. %.3f for millisecond resolution. --UK
I'm pretty sure that I have scripts that would break if presented with something that's not an integer. --KBK
That would be ideal: alas, that would require 64-bit integers, or else another format for [clock seconds]. The difficulty here is that seconds from the Julian epoch overflows a 32-bit word. --KBK
I'd like to make the necessary documentation changes to indicate that [file stat] and [clock seconds] track the Posix epoch on all platforms (incidentally, fixing any that don't -- but I don't think there are any left). --KBK
The underlying system provides TclpGetTime (soon to be renamed Tcl_GetTime) which returns time in a structure comprising a 32-bit count of seconds from the Posix epoch and a count of microseconds within the second. The time is ignorant of leap seconds; inserting or deleting a second results in the clock's being sped up or slowed down by a factor of 1.001 until it once again agrees with its external reference. The smoothed time is truncated to give the count of seconds. (This is how it's implemented on Unix, Windows and MacOSX, just not explicitly documented.) Note that TclpGetTime on Windows provides significantly better precision than the system clock. --KBK
You have a good point there, and I'd consider using the Posix epoch instead. In any case, there are going to be codes that want Posix, ones that want Reingold's 'absolute day', and ones that want JD or MJD; adding or subtracting an offset or multiplying by a constant factor isn't too much of a burden if the data are provided. --KBK
Indeed. L10n of [clock scan] is the issue on which [clock scan] founders. What I want to do is break it into layers:
+ a low-level layer like the Calendar class in Java or ICU, which accepts dates and times in numeric format with the fields identified.
+ an intermediate layer like SimpleDateFormat in Java or ICU, or like strptime in C99, which accepts string dates in a format specified by the caller and takes them apart into the numeric fields.
+ a high-level layer that accepts a looser definition such as 'this is an RFC2822 date' or 'this is an ISO8601 date' and identifies the precise format in use to pass to the intermediate layer. This layer would include a simple-minded scanner generator so that l10n could happen fairly easily.
I've been playing with some implementation ideas here. --KBK
I've started a section on 'Related Work' above. --KBK
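The upper two layers that KBK describes can be sketched roughly as follows. This is only an illustration, not his implementation: it assumes a Tcl (8.5 or later) in which [clock scan] accepts a -format option, and the names scanWithFormat, scanNamed and namedFormats are hypothetical. The low-level layer (numeric fields to a time value) is left to [clock scan] itself.

```tcl
# Intermediate layer: the caller states the exact format of the string.
proc scanWithFormat {text fmt} {
    return [clock scan $text -format $fmt -gmt 1]
}

# High-level layer: map a loose description ("this is an RFC 2822 date")
# to a precise format string. The table below is a hypothetical example.
array set namedFormats {
    iso8601 {%Y-%m-%dT%H:%M:%S}
    rfc2822 {%a, %d %b %Y %H:%M:%S %z}
}

proc scanNamed {name text} {
    global namedFormats
    if {![info exists namedFormats($name)]} {
        error "unknown date format \"$name\""
    }
    return [scanWithFormat $text $namedFormats($name)]
}

# The same instant written two ways should scan to the same value.
set a [scanNamed iso8601 2004-03-12T00:00:00]
set b [scanNamed rfc2822 "Fri, 12 Mar 2004 00:00:00 +0000"]
puts [expr {$a == $b}]
```

Keeping the format table separate from the scanning code is what would let a localized table be substituted without touching the lower layers.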
Arjen Markus In answer to questions by Brett Schwartz:
You are probably doing that already, but manipulating dates internally via julian dates (just doubles in fact expressing date/time as the number of days and fractions thereof) makes life quite a bit easier.
As for further suggestions: why not make the parser "programmable", that is add a feature that allows dates to be passed in in ways the user prefers, rather than try to do all of this yourself (something akin to Java's DateFormat class).
That way the burden of understanding all the formats rests on the shoulders of the users rather than on you. This also resolves ambiguities like:
1-2-2002
which in most European countries would be interpreted as 1 February 2002, but in others, and in the United States, as January 2, 2002.
I have no clear idea on how to do that in a simple way (or perhaps, yes, I do: use the format codes that clock already provides and reuse them in a parser/scanner).
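To make the ambiguity concrete, here is a minimal sketch of scanning the same string with two caller-supplied formats. It assumes a Tcl (8.5 or later) in which [clock scan] accepts -format; the variable names are only for illustration.

```tcl
set text 1-2-2002

# The caller, not the parser, decides the field order.
set dayFirst   [clock scan $text -format %d-%m-%Y -gmt 1]
set monthFirst [clock scan $text -format %m-%d-%Y -gmt 1]

puts [clock format $dayFirst   -format %Y-%m-%d -gmt 1]
puts [clock format $monthFirst -format %Y-%m-%d -gmt 1]
```

The first line printed is 2002-02-01 and the second 2002-01-02: the same input yields two different dates, and only the explicit format tells them apart.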
ambiguities like: 1-2-2002
I have a feeling that the simplest way would be to infer the order of the date fields from the date separators used. Have a look at http://www.postgresql.org/docs/7.3/interactive/datatype-datetime.html (Table 5-10. Date Input). In short:
Consider the date 01 February 2003. Such a date can be written as:
...and without "year"-abbreviation:
Note that the year can in fact be abbreviated as in the earlier examples (like "1-II-03" and "030201"), and there is still no ambiguity about which field is which.
I'm not sure why that table carries a remark about "ambiguities" when a slash is used as the date separator. A slash clearly suggests the order MM/DD/YY (or YYYY).
If the rules given above are followed, no ambiguity arises.
adavis (22nd August 2007): In the UK the common date format is DD/MM/YY or DD/MM/YYYY. And I would also say - We are the "original" Anglo-Saxons!!
ZB OK, so just one ambiguity remains, and can be easily resolved looking at system locale - either it's en_US (MM/DD/YY) or "rest of the world" (DD/MM/YY).
LV I seem to recall that this package has a LOT of nice time/date/calendar functionality, most of which is in C. However, perhaps there are ideas in it that could be borrowed:
What: Remind
Where: http://www.roaringpenguin.com/products/remind/
    http://www.roaringpenguin.com/products/remind/remind-03.00.22.tar.gz
Description: Remind is an alarm/calendar program which handles Roman and Hebrew calendars, sunrise, sunset and moon phases, is multilingual, does complicated date calculations (handling holidays properly), alarms, includes a WWW calendar server, and produces PostScript output. Uses Tk for an X front end. Available for UNIX, MS-DOS, OS/2 and other platforms.
Updated: 03/2001
Contact: mailto:[email protected] (David F. Skoll)
Related work, of potential interest: Pythonic normalDate [L15 ], astrolabe [L16 ], and the widely-lauded mxDateTime [L17 ].
nl More related work, of potential interest: Hebrew/Jewish calendar algorithm can be found as "Chelm.org's algorithms of the Jewish calendar" [L18 ].
escargo 12 Mar 2004 - Isn't there really an issue of Julian dates that use noon as the time for a new day, versus midnight (customary calendar)? For that matter, some calendars use sunrise or sunset to determine the start of a new day. How would those issues be factored in?
You're right that the Julian Day Number technically changes at noon. Tcl's "Julian Day" is actually the Julian Day Number beginning at noon on the given date. (It's convenient for astronomers to change the date at noon, they're asleep then anyway!) I could foresee the Hebraic calendar either being implemented with l10n for latitude and longitude or with the nominal date being that beginning at sundown on the previous day. The Hijri calendar needs solar time in any case, so there has to be some astronomy in any implementation of it. (It also has to include a disclaimer, since in many jurisdictions, the month does not begin until a mullah has actually observed the Moon.)
The TLA TAI was used above. Is that Temps Atomique International?
Yes. Tcl's time model, however, is UTS [http://www.cl.cam.ac.uk/~mgk25/uts.txt].
jmn 2006-06-24 So presumably this means that Tcl is adjusting the output of [clock seconds], [clock milliseconds] etc with a slight offset for a period of 1000 seconds prior to each leap second?
Is this then on top of a separate skewing that may already be occurring as an NTP client adjusts the system clock to keep in sync with the leap second? I've heard, for example, that some Windows NTP clients will start skewing the clock an hour before the leap second.
Do we really have two separate adjustments being made around this time? (I assume the NTP one to the actual system clock, the Tcl 'adjustment' merely being to reported output?)
How on earth then would one compare timestamps generated on a system that uses TAI?
This is a complicated issue, and I think my understanding of it is pretty limited, but what are the possibilities, say, of extending [clock seconds] etc. so that we know exactly what timestamps we're actually dealing with?
e.g.
    [clock seconds UTS] - presumably the Tcl skewed implementation that exists now?
    [clock seconds UTC] - tracks the system's notion of UTC, including ambiguities around leap seconds
    [clock seconds TAI] - I guess it would require leap-second lookup tables if the system clock isn't already directly in TAI
Anyone care to comment on the feasibility and/or desirability of this?
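For what the lookup-table idea might look like, here is a rough sketch only. The proc utcToTai and the variable taiTable are hypothetical, the table is a deliberately short excerpt of the published TAI-UTC offsets (a real one would need every entry and regular updates), and it assumes a Tcl (8.5 or later) with [clock scan -format]. It also ignores what happens during the leap second itself and during any smoothing interval, which is exactly where the hard questions above lie.

```tcl
# Excerpt of TAI-UTC: the UTC date from which each offset applies,
# followed by the offset in seconds. Not a complete table.
set taiTable {
    1999-01-01 32
    2006-01-01 33
    2009-01-01 34
}

# Hypothetical helper: shift a UTC-based seconds count onto a TAI-based one.
proc utcToTai {utcSeconds} {
    global taiTable
    set offset {}
    foreach {date delta} $taiTable {
        set start [clock scan $date -format %Y-%m-%d -gmt 1]
        if {$utcSeconds >= $start} {
            set offset $delta
        }
    }
    if {$offset eq ""} {
        error "timestamp predates the table excerpt"
    }
    return [expr {$utcSeconds + $offset}]
}

set t [clock scan 2006-06-24 -format %Y-%m-%d -gmt 1]
puts [expr {[utcToTai $t] - $t}]
```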
dzach 2007-8-15 While I was trying to solve a timing issue, MJ (Tcl Chatroom nick: mjanssen) suggested using tcl::clock::milliseconds as the fastest command available (in tcl8.5) to retrieve the time with millisecond granularity. So on my system, running tcl 8.5a6:
    % time {tcl::clock::milliseconds} 1000
    0.962 microseconds per iteration
which is much better than
    % time {clock milliseconds} 1000
    1.523 microseconds per iteration
but when trying to acquire a fractional unix epoch
    % time {expr {[clock milliseconds]/1000.0}} 1000
    1.618 microseconds per iteration
its performance becomes slower. This little C extension, providing the tcl command fraclock, restores performance to the .9 usec range:
    #include <tcl.h>

    static int
    fraclock_Cmd(ClientData cdata, Tcl_Interp *interp, int objc, Tcl_Obj * CONST objv[])
    {
        Tcl_Time t;

        Tcl_GetTime(&t);
        Tcl_SetObjResult(interp, Tcl_NewDoubleObj(t.sec + t.usec / 1000000.0));
        return TCL_OK;
    }

    int DLLEXPORT
    Fraclock_Init(Tcl_Interp *interp)
    {
        if (Tcl_InitStubs(interp, TCL_VERSION, 0) == NULL) {
            return TCL_ERROR;
        }
        Tcl_CreateObjCommand(interp, "fraclock", fraclock_Cmd, NULL, NULL);
        Tcl_PkgProvide(interp, "fraclock", "1.0");
        return TCL_OK;
    }
Example use:
    % fraclock
    1187180620.302878
    % time {fraclock} 1000
    0.935 microseconds per iteration
Kevin Kenny's original proposal was for a double precision floating point epoch value. The current tcl8.5 clock milliseconds implementation returns an integer value, not a floating point one. Proposal: Although it might be late for that to be in tcl8.5, wouldn't a [clock] (no arguments) format, with the minimum possible overhead, be a possible solution, without breaking existing code, like:
    % clock
    1187180620.302878
KBK Wouldn't our time be more profitably spent on bytecoding ensemble dispatch (which would achieve the same performance gain on all ensembles, not just [clock])? Moreover, wouldn't it be more profitably spent addressing other performance "hot spots," some of which are even hotter?