Version 6 of shimmering

Updated 2003-12-18 16:12:36

Purpose: define what shimmering is, what causes it, why one wants it to occur or wants to avoid it, and how to cause it or avoid it.


"Shimmering" is a Tcl quirk or strength, depending on where you're coming from. It refers to the fact that internally, Tcl can keep two representations for each value (dare one say "object"?). One of them is the string; the other is usually a fast equivalent of it, e.g. an int, a float, or a list.

When you do "set x {a b c}", you store a 5-character string in x.

When you then do "lappend x d", you will end up with a 4-item list. Which, as far as you're normally concerned, is simply "a b c d".

And it is! - but Tcl will play a clever trick, and convert the string to an efficient internal representation of a list of things.

It's all hidden. When you do "puts $x", you get the result you expect. What happens is that puts wants a string, Tcl sees it has none any more, and then creates a proper string for you, on-the-fly.

At this time, you may decide to do "lappend x e". Tcl will happily detect that a list is there and quickly append. As a crucial side-effect, it will also discard the 7-character string, which is no longer appropriate. Keep in mind that no new string gets created at this point.
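One way to watch these steps happen, assuming Tcl 8.6 or later, is the tcl::unsupported::representation command. It is a debugging aid, explicitly unsupported, and the wording of its output may differ between releases, so treat this as a sketch and not as something to build on. The helper name "rep" and the variable names are made up for illustration.

```tcl
# Assumes Tcl 8.6+. tcl::unsupported::representation is a debugging aid;
# its output format is not guaranteed to stay the same.
proc rep {value} {
    # Describe the internal state of a value without changing it.
    return [::tcl::unsupported::representation $value]
}

set x {a b c}
set asString [rep $x]          ;# only a string so far

lappend x d
set afterFirstLappend [rep $x] ;# now a list, string rep not yet regenerated

puts $x                        ;# puts needs a string, so one is created
set afterPuts [rep $x]         ;# list and string side by side

lappend x e
set afterSecondLappend [rep $x] ;# still a list, stale string discarded

foreach r [list $asString $afterFirstLappend $afterPuts $afterSecondLappend] {
    puts $r
}
```

Each line of output names the internal type, if any, and says whether a string representation is currently present.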

The point of all this is speed (lots of it). But conceptually, scripts can be built without caring one shred about this duality.

Until "shimmering" sets in... The term describes what happens when Tcl keeps alternating: it creates a string representation and discards the internal one, then rebuilds the internal one and discards the string, over and over.

Here's a non-shimmering loop (no string operations at all):

  for {set i 0} {$i < 10000} {incr i} { lappend x $i }

Here's one which shimmers a bit (the list is never lost):

  for {set i 0} {$i < 10000} {incr i} { lappend x [string length $x] }

This one shimmers badly (a list, a string, a list, ...):

  for {set i 0} {$i < 10000} {incr i} { lappend x $i; append x . }

Timing comparison left as exercise for the reader...
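For those who would rather not do the exercise by hand, here is a rough sketch using the built-in time command. The proc names, the iteration count of 2000, and the repeat count of 3 are arbitrary choices for illustration (the badly shimmering loop gets slow quickly as the count grows); actual numbers depend on the Tcl version and the machine.

```tcl
# Each proc runs one of the three loops and returns the final list length.
proc noShimmer {n} {
    set x {}
    for {set i 0} {$i < $n} {incr i} { lappend x $i }
    return [llength $x]
}

proc mildShimmer {n} {
    set x {}
    # string length forces a string rep, but the list rep is kept
    for {set i 0} {$i < $n} {incr i} { lappend x [string length $x] }
    return [llength $x]
}

proc badShimmer {n} {
    set x {}
    # append turns x into a plain string; lappend must re-parse it as a list
    for {set i 0} {$i < $n} {incr i} { lappend x $i; append x . }
    return [llength $x]
}

set n 2000
foreach p {noShimmer mildShimmer badShimmer} {
    # time reports the average microseconds per run
    puts [format "%-12s %s" $p [time [list $p $n] 3]]
}
```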


All conversions in Tcl currently go through the string representation. Consider:

        set i [expr {$n+1}]
        puts [llength $i]

Here, "i" will be an int first, which then needs to be turned into a list. To do so, a string is constructed, parsed, and converted to a one-item list.

We know this is more work than necessary: seen as a list, an int cannot be anything but a one-item list. This was a contrived case, but now compare the two below:

        uplevel 1 [list myproc $arg]
        uplevel [list myproc $arg]

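The second form is trouble (a bit: only in terms of trying to achieve top performance). With no explicit level, uplevel has to check whether its first argument is a level, and that check can cost the list its internal representation.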
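A small sketch of the two call styles side by side. The body of myproc and the wrapper names callExplicit and callImplicit are made up for illustration. Both forms give the same result; the difference discussed here is only in how much work uplevel has to do on its first argument.

```tcl
proc myproc {arg} {
    return "myproc got: $arg"
}

proc callExplicit {arg} {
    # The level is given, so the list is used as the command right away.
    uplevel 1 [list myproc $arg]
}

proc callImplicit {arg} {
    # No level given: uplevel must first decide whether the list is a level.
    uplevel [list myproc $arg]
}

puts [callExplicit "hello world"]
puts [callImplicit "hello world"]
```

Writing the level explicitly also avoids surprises when the command itself happens to start with a digit or a "#".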

How about introducing a "typecast matrix"? With a growing, perhaps dynamically extensible, set of smart conversions for special cases? If a converter is present it gets called. If it fails or there is none, then conversion progresses as usual - through an intermediate string rep.

Note that the uplevel example above requires yet more smarts. It's a list of two items, hence it cannot possibly be an int - no need to convert, fail, and have lost the list in the process.

A thought for Tcl9 perhaps? -jcw

As to the specific case jcw cites - a list of more than one element can never be successfully converted to an int ... would it not suffice to build that peephole optimisation into the C function which converts arbitrary objects to ints? In other words, rather than an extensible table of smart convertors (which seems to me would be pretty sparse), use the C code itself to implement short circuits for failure. CMCc

DGP Along these lines, see Tcl Patch 738900.