Version 12 of double substitution

Updated 2013-02-20 16:50:51 by pooryorick

Tcl is a language of values. By the time a command gets to see its arguments, Tcl has already performed its various substitutions on those arguments. Many commands perform their own substitutions on their arguments, and this is what "double substitution" usually refers to. The quintessential example is the [expr] command, but it is not the only one. Here are some other examples:

[expr]
interprets $ as variable substitution, [ as command substitution, etc.
[set]
interprets parentheses as array element access
[list]
interprets \ as backslash substitution, and " and { as grouping operators. Actually, this is also true of all the list commands
[eval]
interprets its arguments as entire Tcl scripts
[regexp] and [regsub]
interpret regular expression syntax in all its glory
[string match]
interprets *, ?, [, and \
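
As a small illustration of a command applying its own rules after Tcl's pass, here is a sketch using [string match]. The variable name pattern is only for illustration. Braces keep Tcl from treating the brackets as command substitution, so the command receives them and interprets them as a character set:

```tcl
# Tcl leaves the braced word alone, so [string match] receives a[bc]
# and applies its own glob-style rules to it.
set pattern {a[bc]}
set m1 [string match $pattern ab]        ;# b is in the set [bc]
set m2 [string match $pattern {a[bc]}]   ;# the pattern does not match itself literally
# To match a literal bracket, the backslash must survive Tcl's pass
# so that [string match] gets to see it.
set m3 [string match {a\[bc\]} {a[bc]}]
puts "$m1 $m2 $m3"   ;# prints: 1 0 1
```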

Anywhere a command is performing some sort of interpretation on some of the characters in its arguments, there are two layers of interpretation happening: Tcl performs its substitutions on command arguments first, and then the command may perform its own substitutions. Those just learning Tcl may be caught off-guard by this double substitution, but it is not considered a bug, and with practice, it comes to feel like a natural part of the design of Tcl. To avoid injection attacks, it is important to understand how each command you use interprets its arguments.

The reason [expr] arguments should always be braced is that bracing avoids the first layer of substitution by the Tcl script interpreter, leaving only the substitution that [expr] itself performs. The same is true for the first argument to if and while. This is mentioned on the Tcl Style Guide page and is discussed a bit on A Question of Style. Use TclPro's syntax checker (named procheck) to check for mistakes.
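
To make the injection risk concrete, here is a minimal sketch. The variable userInput stands in for some untrusted value, and hit is a hypothetical flag used only to show whether the embedded command ran:

```tcl
set ::hit 0
set userInput {[set ::hit 1]}   ;# hypothetical untrusted value

# Unbraced: Tcl substitutes $userInput first, then [expr] performs
# command substitution on the result and runs the embedded command.
expr $userInput
set unbracedHit $::hit

set ::hit 0
# Braced: [expr] substitutes $userInput itself and treats the value as data.
set bracedResult [expr {$userInput}]
set bracedHit $::hit

puts "unbraced ran the command: $unbracedHit"   ;# 1
puts "braced ran the command: $bracedHit"       ;# 0
```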

As Cameron Laird says, "Tcl's syntax is small enough to fit in the working memory of a typical human." However, the main work of understanding double substitution lies in understanding how each command you choose to use operates, and how it interprets its arguments. The standard rules of Tcl describe the syntax of Tcl, and the documentation for commands such as [expr], [if], and [while] describes how each of those commands interprets, and possibly performs substitutions on, its arguments.

Rule 2 says that a Tcl command is evaluated in two steps. First, the Tcl interpreter breaks the command into words and performs substitutions. Then the interpreter uses the first word as a command name, calls the command, and passes the rest of the words to the command as arguments.

Rule 5 says that substitutions are not performed on words surrounded by braces. That's a good thing because the expression parser (described on the expr man page) will perform its own substitutions. When an expression is left unbraced, both layers of substitution are applied to it, and that is where people get bitten.
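
The two rules just described can be observed with a small sketch. countArgs is a hypothetical helper, not a built-in command; it only reports how many words it was called with:

```tcl
# Hypothetical helper: returns the number of arguments it received.
proc countArgs {args} {
    return [llength $args]
}

set s "hello there"
set n1 [countArgs $s]            ;# substituted value stays a single word
set n2 [countArgs hello there]   ;# two words in the source
set n3 [countArgs {$s}]          ;# braces suppress substitution: one word, literally $s
puts "$n1 $n2 $n3"   ;# prints: 1 2 1
```

Note that substitution never re-splits a word: the space inside the value of s does not create a second argument.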

Here is an example of a common mistake:

#warning:  bad code ahead!
set myString "hello"
if $myString=={} {puts "empty string"}

By the time the [if] command sees its first argument, it looks like this:

hello=={}

In expressions, strings must be quoted, and now hello is not, so an error occurs. hello was quoted for Tcl when it performed its substitutions, but it was then not quoted for [expr] (via [if]), which requires that strings be enclosed in double quotes or braces.

#warning:  bad code ahead!
set myString "hello there"
if $myString=={} {puts "empty string"}

In this case, [expr] gets the following value for its argument:

hello there=={}

which is even more of an error because now [expr] sees two unquoted words in a row and can't make sense of the number of operands.
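
Both failures can be observed safely with [catch]. This is only a sketch; the results list is a hypothetical name used to record the outcomes, and the exact error text depends on the Tcl version, so none is shown here:

```tcl
set results {}
foreach myString {hello {hello there}} {
    # catch returns 1 when the script raises an error
    set failed [catch {
        if $myString=={} {puts "empty string"}
    } message]
    lappend results $failed
    puts "myString=<$myString> failed=$failed message=<$message>"
}
```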

a complex example

In the following example, there are many issues:

#warning: bad code ahead!
set myString "This is a string with \[special characters\}"
if $myString=={} {puts "empty string"}

[expr] (once again, via [if]), sees the following value:

This is a string with [special characters}=={}

This makes no sense to [expr], for several reasons:

  • a left square bracket signaling command substitution, but no corresponding closing bracket
  • a right curly bracket signaling the end of a grouping operation, but no corresponding left curly bracket
  • seven words with only one operator to be found, which is not a syntactically valid expression

The moral of the story, of course, is to prevent the Tcl substitutions with curly braces:

if {$myString==""} {puts "empty string"}

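Now, [expr] sees something more reasonable:

$myString==""

Tcl passes the braced word through untouched, and [expr] performs the variable substitution itself, treating the whole value of myString as a single string operand no matter which characters it contains.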
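
The braced form can be tried with the same awkward string. This is a minimal sketch; the variable result is introduced only to record which branch ran, and the eq operator (available since Tcl 8.4) is shown as an alternative when a pure string comparison is intended:

```tcl
set myString "This is a string with \[special characters\}"

# Braced: [expr] (via [if]) substitutes $myString itself, as one operand.
if {$myString == ""} {
    set result "empty string"
} else {
    set result "not empty"
}
puts $result   ;# prints: not empty

# eq always compares as strings, never as numbers.
set isEmpty [expr {$myString eq ""}]
puts $isEmpty  ;# prints: 0
```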

In summary, brace the expressions passed to if, while, and expr. Use ProCheck to make sure you haven't missed any.


KBK - Another point that should be made is that braced expressions on [if], [while] and [expr] aren't just safer, they're also much, much faster. Unbraced ones have to be parsed at run time; braced ones can be compiled down to very tight bytecode sequences.
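
A rough way to check the speed difference on your own interpreter is the [time] command. This is only a sketch: the proc names unbraced and braced are hypothetical, and the timings depend on the machine and Tcl version, so no figures are shown:

```tcl
# Same arithmetic, once unbraced and once braced.
proc unbraced {a b} { expr $a + $b }
proc braced   {a b} { expr {$a + $b} }

set r1 [unbraced 2 3]
set r2 [braced 2 3]
puts "results: $r1 $r2"   ;# prints: results: 5 5

# time reports the average microseconds per iteration.
puts "unbraced: [time {unbraced 2 3} 10000]"
puts "braced:   [time {braced 2 3} 10000]"
```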