Purpose: Kevin Kenny is contemplating starting a project to rework the clock command. The purpose of this page is to present some of the issues with the current clock command, and collect feedback about how best to fix it.
I wish I had time to do more than merely applaud Kevin's work. This is long overdue! -- CLN
The current (Tcl 8.4a3) clock command has served us well since Tcl 7.5, but it's starting to show cracks at all levels. It appears that nothing less than a complete rework may suffice to fix all the issues. (In that case, the existing [clock] command will be preserved, but deprecated.)
1. The measurement of time.
Mickey's little hand is on the six, and his big hand is on the nine.
At present, Tcl's concept of absolute time is the count of seconds from a fixed epoch or zero point. The count is expressed as a signed 32-bit integer. This representation has a number of drawbacks.
Proposal: Add to [clock] a [clock milliseconds] command that returns the nominal time from the Julian epoch expressed in milliseconds as a double-precision floating-point number.
Double-precision floating point is chosen as a representation because it is available on all the platforms, unlike 64-bit integers, which are the most obvious alternative. The granularity of milliseconds, rather than seconds or days, ensures that any integer number of milliseconds will be represented exactly: a double holds every integer up to 2^53 without rounding, which at millisecond granularity spans roughly 285,000 years.
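To illustrate the exactness argument, here is a minimal sketch. It assumes Tcl 8.5 or later and IEEE 754 doubles; the variable names are illustrative only. Fractional seconds pick up rounding error, while whole milliseconds held in a double do not.

```tcl
# Fractional seconds: 0.1 and 0.2 have no exact binary representation,
# so their sum does not compare equal to 0.3.
set fractionalOk [expr {0.1 + 0.2 == 0.3}]

# Whole milliseconds: these integers are exact in a double,
# so the same sum compares equal.
set millisOk [expr {100.0 + 200.0 == 300.0}]

# Integers stay exact in a double up to 2**53; expressed in
# milliseconds, that is the usable range in years.
set exactRangeYears [expr {pow(2, 53) / 86400000.0 / 365.25}]
```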
The Julian epoch is chosen because many existing calendar algorithms, such as those published in the second edition of Reingold [L1 ], use Julian day number as their internal representation. The Julian day can be obtained from the millisecond count simply by dividing by 86400000 and casting to an integer.
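The division described above might look as follows. This is only a sketch: julianDayFromMillis is a hypothetical helper, not an existing Tcl command, and it assumes the millisecond count starts at Julian Date 0 (which falls at noon).

```tcl
# Hypothetical helper: Julian day number from a count of milliseconds
# measured from the Julian epoch (JD 0).
proc julianDayFromMillis {ms} {
    return [expr {int(floor($ms / 86400000.0))}]
}

# 1970-01-01 00:00 UTC lies at Julian Date 2440587.5.
set posixEpochMillis [expr {2440587.5 * 86400000.0}]

# Julian days begin at noon, so midnight still belongs to day 2440587.
set jdn [julianDayFromMillis $posixEpochMillis]
```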
Since leap seconds cannot be predicted in advance, it is useful to assume that the day has a fixed length of 86400 seconds. For this reason, I propose that the Tcl clock be Smoothed Universal Time (UTS) [L2 ]. The code in TclpGetTime on the Unix platform (where the kernel clock is corrected with adjtime) and on the Windows platform (where the Tcl clock is derived from a separate 'performance counter' that is disciplined with a phase-locked loop to the system clock) already comes close to the desired behavior.
Anyone wishing to see the pain involved in combining the range of Tcl's per-second clock with the accuracy of its -millisecond support can check out: [L3 ] - DEL
2. Calendar
I've been on a calendar, but never on time. - Marilyn Monroe
The calendar handling lacks several features that various users have requested.
For what it's worth, there's information on various calendars at http://www.copi.org/craig/events/calendar.html . -- CLN
3. Time zone
At the back of the Daylight Saving scheme I detect the bony, blue-fingered hand of Puritanism, eager to push people into bed earlier, and get them up earlier, to make them healthy, wealthy and wise in spite of themselves. - Robertson Davies
Conversion of times between local and UTC (mislabelled 'GMT' in the Tcl documentation) depends on the calculations of the underlying system.
(Or consider time clocks on Hoover Dam: one end of the dam is in Arizona which never observes DST, the other is in Nevada which does observe DST! -- CLN)
4. Localization and input/output issues
...ad calendas græcas reponere.
Tcl's handling of time ignores the locale. In many cases, it simply cannot be localized easily. Moreover, Tcl's input conversion for times suffers from an attempt to be excessively general; it succeeds only in having peculiar bugs.
set nextMonth [clock scan {+1 month} -base $today]
AK: See http://www.purl.org/net/akupries/soft/pool/f_base_date.tcl.html
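A self-contained variant of the one-liner above, as a sketch. The base date is fixed so that the result is reproducible, and -gmt keeps the local time zone out of the picture. The chosen date and the value given to today are illustrative assumptions.

```tcl
# 2002-01-01 00:00:00 UTC, in seconds from the Posix epoch.
set today 1009843200

# Relative-date arithmetic through the free-form scanner.
set nextMonth [clock scan {+1 month} -base $today -gmt 1]

# Show the result as a calendar date in UTC.
puts [clock format $nextMonth -format {%Y-%m-%d} -gmt 1]
```

A base date at the end of a month (January 31, say) is the more interesting case to try, since the scanner then has to decide what "one month later" means.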
5. Related work.
Non est, crede mihi, sapientis dicere, Vivam: Sera nimis vita est crastina: vive hodie. - Martial
There are a number of related date-and-time packages out there. Alas, none of them seems both to do the whole job and to be suitable for incorporating wholesale into the Tcl core, for various reasons.
Time zone manipulations: The Olson codes at [L7 ] and [L8 ] are comprehensive and widely used. The chief difficulty with them is that they are based on the 32-bit Posix clock and therefore die after 2037. Certainly, the time zone data sets should be considered for any implementation that we do of time zone conversions.
Java and ICU: The Java classes, Calendar, DateFormat, and so on [L9 ], and the corresponding classes in ICU [L10 ] and [L11 ] are comprehensive. They use a format scheme that is incompatible with strftime and strptime, but the two are fairly easily interconverted (such a conversion would be necessary in any case, in light of the fact that the system locale is specified with one scheme in Unix and the other in Windows). The key drawbacks to adopting these codes wholesale are their deficiencies in time zone handling and the fact that no C implementation exists. The C API provided in ICU is a wrapper layer around a C++ implementation; the Tcl core does not presume the existence of a C++ compiler on the target platform.
The Reingold/Dershowitz codes: The standard references on computer calendrical calculations are a series of books and papers by Reingold and Dershowitz [L12 ]. Their Web site has a number of interesting reference implementations that could be good starting points for some of the needed calculations. Again, there is a programming-language barrier: the provided codes are in Common Lisp and C++, not in C or Tcl.
The Pool library: Andreas Kupries provides one fine starting point for many date calculations in Tcl at [L13 ].
The Hall codes: Mike Hall has codes [L14 ] for calculations with Julian Day Number; again, these may prove a fine starting point.
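The 2037 limit mentioned for the Olson codes follows directly from a signed 32-bit count of seconds. A small sketch, assuming a Tcl whose [clock format] accepts the largest 32-bit value (any release can do this); the variable names are illustrative.

```tcl
# Largest value of a signed 32-bit count of seconds.
set lastSecond 2147483647

# The final instant such a clock can represent, shown in UTC.
set lastInstant [clock format $lastSecond -format {%Y-%m-%d %H:%M:%S} -gmt 1]
puts $lastInstant   ;# 2038-01-19 03:14:07
```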
6. Some contemplations
Tempora mutantur et nos in illis
Arjen Markus I have a number of remarks about the previous sections:
Indeed, [clock seconds -milliseconds] might well be an alternative syntax. I just didn't like it that much because it seemed to be contradictory: seconds are not milliseconds. --KBK We might use "-keepmillis" instead --AM
I'm pretty sure that I have scripts that would break if presented with something that's not an integer. --KBK
That would be ideal: alas, that would require 64-bit integers, or else another format for [clock seconds]. The difficulty here is that seconds from the Julian epoch overflows a 32-bit word. --KBK
I'd like to make the necessary documentation changes to indicate that [file stat] and [clock seconds] track the Posix epoch on all platforms (incidentally, fixing any that don't -- but I don't think there are any left). --KBK
The underlying system provides TclpGetTime (soon to be renamed Tcl_GetTime) which returns time in a structure comprising a 32-bit count of seconds from the Posix epoch and a count of microseconds within the second. The time is ignorant of leap seconds; inserting or deleting a second results in the clock's being sped up or slowed down by a factor of 1.001 until it once again agrees with its external reference. The smoothed time is truncated to give the count of seconds. (This is how it's implemented on Unix, Windows and MacOSX, just not explicitly documented.) Note that TclpGetTime on Windows provides significantly better precision than the system clock. --KBK
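The arithmetic behind this can be made concrete. A rough sketch, assuming the factor of 1.001 means the clock runs 0.1% fast or slow relative to real time; both procedure names are hypothetical and not part of Tcl.

```tcl
# Hypothetical helper: combine the two fields of the time structure
# (seconds and microseconds) into a count of whole milliseconds.
proc timeToMillis {sec usec} {
    return [expr {$sec * 1000 + $usec / 1000}]
}

# Hypothetical helper: real seconds needed to absorb an offset when the
# clock is run fast or slow by the given rate factor.
proc slewDuration {offset rateFactor} {
    return [expr {abs($offset) / abs($rateFactor - 1.0)}]
}

# Absorbing one leap second at a factor of 1.001 takes about 1000
# seconds, a little under 17 minutes.
set secondsToAbsorb [slewDuration 1.0 1.001]
```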
You have a good point there, and I'd consider using the Posix epoch instead. In any case, there are going to be codes that want Posix, ones that want Reingold's 'absolute day', and ones that want JD or MJD; adding or subtracting an offset or multiplying by a constant factor isn't too much of a burden if the data are provided. --KBK
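For what it's worth, the offsets in question are small enough to write down. A sketch with hypothetical procedure names; the constants are the day counts of 1970-01-01 00:00 UTC in each system (Julian Date 2440587.5, Modified Julian Date 40587, and fixed day 719163 in Reingold and Dershowitz's numbering).

```tcl
# Posix seconds -> Julian Date (days and fractions from the Julian epoch).
proc posixToJD {seconds} {
    return [expr {$seconds / 86400.0 + 2440587.5}]
}

# Posix seconds -> Modified Julian Date (JD - 2400000.5).
proc posixToMJD {seconds} {
    return [expr {$seconds / 86400.0 + 40587.0}]
}

# Posix seconds -> Reingold/Dershowitz 'absolute' (fixed) day number.
# Tcl's integer division rounds toward minus infinity, so dates before
# 1970 also land on the correct day.
proc posixToAbsoluteDay {seconds} {
    return [expr {$seconds / 86400 + 719163}]
}
```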
Indeed. L10n of [clock scan] is the issue on which [clock scan] founders. What I want to do is break it into layers:
+ a low-level layer like the Calendar class in Java or ICU, which accepts dates and times in numeric format with the fields identified.
+ an intermediate layer like SimpleDateFormat in Java or ICU, or like strptime in POSIX, which accepts string dates in a format specified by the caller and takes them apart into the numeric fields.
+ a high-level layer that accepts a looser definition such as 'this is an RFC2822 date' or 'this is an ISO8601 date' and identifies the precise format in use to pass to the intermediate layer. This layer would include a simple-minded scanner generator so that l10n could happen fairly easily.
I've been playing with some implementation ideas here. --KBK
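One way to picture the three layers is the toy sketch below. It is not KBK's implementation. Every name in it is hypothetical, it handles dates only (no time of day, UTC only), it assumes Tcl 8.5 or later, and the high-level layer is reduced to a lookup table rather than a scanner generator.

```tcl
# Low-level layer: numeric fields (Gregorian calendar, UTC) to seconds
# from the Posix epoch, by way of the Julian day number.
proc fieldsToSeconds {fields} {
    set year  [dict get $fields year]
    set month [dict get $fields month]
    set day   [dict get $fields day]
    set a [expr {(14 - $month) / 12}]
    set y [expr {$year + 4800 - $a}]
    set m [expr {$month + 12 * $a - 3}]
    set jdn [expr {$day + (153 * $m + 2) / 5 + 365 * $y
                   + $y / 4 - $y / 100 + $y / 400 - 32045}]
    # Julian day number 2440588 is 1970-01-01.
    return [expr {($jdn - 2440588) * 86400}]
}

# Intermediate layer: a caller-supplied pattern takes the string apart
# into named numeric fields.
proc stringToFields {string pattern names} {
    set values [scan $string $pattern]
    if {[llength $values] != [llength $names] || "" in $values} {
        error "\"$string\" does not match \"$pattern\""
    }
    set fields [dict create]
    foreach name $names value $values {
        dict set fields $name $value
    }
    return $fields
}

# High-level layer: a named convention selects the pattern and field order.
set dateStyles {
    european {%d-%d-%d {day month year}}
    american {%d-%d-%d {month day year}}
    iso      {%d-%d-%d {year month day}}
}
proc scanDate {string style} {
    lassign [dict get $::dateStyles $style] pattern names
    return [fieldsToSeconds [stringToFields $string $pattern $names]]
}
```

With these definitions, [scanDate 1-2-2002 european] and [scanDate 2002-02-01 iso] name the same day, while [scanDate 1-2-2002 american] falls thirty days earlier.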
I've started a section on 'Related Work' above. --KBK
Arjen Markus In answer to questions by Brett Schwartz:
You are probably doing that already, but manipulating dates internally via Julian dates (just doubles in fact expressing date/time as the number of days and fractions thereof) makes life quite a bit easier.
As for further suggestions: why not make the parser "programmable", that is, add a feature that allows dates to be passed in whatever form the user prefers, rather than try to do all of this yourself (something akin to Java's DateFormat class)?
That way you shift the burden of understanding every format from yourself onto the users. This also solves ambiguities like:
1-2-2002
which in most European countries would be interpreted as 1 February 2002, but in others and in the United States, as January 2, 2002.
I have no clear idea on how to do that in a simple way (or perhaps, yes, I do: use the format codes that clock already provides and reuse them in a parser/scanner).
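For comparison, later Tcl releases (8.5 and up) let [clock scan] take a -format option built from the same codes as [clock format], which is close to what is suggested here. A sketch under that assumption, using a zero-padded form of the ambiguous date; the variable names are illustrative.

```tcl
set text 01-02-2002

# Day first, as in most European countries: 1 February 2002.
set european [clock scan $text -format %d-%m-%Y -gmt 1]

# Month first, as in the United States: January 2, 2002.
set american [clock scan $text -format %m-%d-%Y -gmt 1]

puts [clock format $european -format {%Y-%m-%d} -gmt 1]
puts [clock format $american -format {%Y-%m-%d} -gmt 1]
```

The caller states the convention explicitly, so the scanner never has to guess.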
LV I seem to recall that this package has a LOT of nice time/date/calendar functionality - most of which is in C. However, perhaps there are ideas in it that could be reused:
What: Remind
Where: <URL: http://www.roaringpenguin.com/remind.html > <URL: http://www.roaringpenguin.com/remind-03.00.22.tar.gz >
Description: Remind is an alarm/calendar program which handles Roman and Hebrew calendars, sunrise, sunset and moon phases, is multilingual, does complicated date calculations (handling holidays properly), alarms, includes a WWW calendar server, and produces PostScript output. Uses Tk for an X front end. Available for UNIX, MS-DOS, OS/2 and other platforms.
Updated: 03/2001
Contact: <URL: mailto:[email protected] > (David F. Skoll)
Related work, of potential interest: Pythonic normalDate [L15 ], astrolabe [L16 ], and the widely-lauded mxDateTime [L17 ].