Version 8 of double substitution

Updated 2005-12-18 00:57:09

Always brace your expressions. Do this for the expr command, and the first argument to if and while. This is mentioned on the Tcl Style Guide page and is discussed a bit on A Question of Style. Use TclPro's syntax checker (named procheck) to check for mistakes.

What is double substitution, and why is it bad? Double substitution is when the Tcl interpreter performs more than one round of substitution on a command or its arguments. It usually results in program bugs that casual programmers attribute to special characters in their data, or to some sort of problem that can only be resolved by delving into Quoting hell.

Fortunately, Tcl is really very simple and easy to understand. As Cameron Laird says, "Tcl's syntax is small enough to fit in the working memory of a typical human."

Tcl syntax is documented on the Tcl command page as a set of rules. Read and understand those eleven rules, and you've got Tcl syntax down pat. Then read the expr man page to understand how expressions work, and you'll understand how the if and while commands are interpreting their arguments.

Rule 2 says that a Tcl command is evaluated in two steps. First, the Tcl interpreter breaks the command into words and performs substitutions. Then the interpreter uses the first word as a command name, calls the command, and passes the rest of the words to the command as arguments.
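The two steps can be seen in a short sketch. The proc countArgs is a hypothetical helper introduced only for this illustration; it reports how many words the interpreter passed to it.

```tcl
# Hypothetical helper: returns the number of argument words it received.
proc countArgs {args} {
    return [llength $args]
}

set s "a b c"

# Step 1: the command is split into two words (countArgs and $s) and
# $s is substituted. Spaces inside the value do not create new words.
# Step 2: countArgs is called with a single argument.
set n [countArgs $s]
puts $n   ;# 1

# Three separate words in the source give three arguments.
set m [countArgs a b c]
puts $m   ;# 3
```

Word boundaries are fixed before substitution, which is why a substituted value stays one word no matter what characters it contains.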

Rule 5 says that substitutions are not performed on words surrounded by braces. That's a good thing because the expression parser (on the expr man page) will perform its own substitutions. If you leave the braces off, both rounds happen, and you get bitten by double substitution.
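As a minimal sketch of the difference between quoting and bracing (the variable names here are made up for the example):

```tcl
set x 5

# Double quotes: the interpreter substitutes $x before the command runs.
set quoted "$x + 1"

# Braces: no substitution; the word is passed through literally.
set braced {$x + 1}

puts $quoted   ;# 5 + 1
puts $braced   ;# $x + 1

# expr performs its own substitution on the braced text.
set result [expr {$x + 1}]
puts $result   ;# 6
```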

Here is an example:

  set myString "This is a string with \[special characters\}"
  if $myString=="" {puts "empty string"}

will generate a syntax error. The first line assigns a string with special characters to the variable myString, which is fine. The problem is on the second line. First, the Tcl interpreter breaks the command into three words and performs substitutions. In Tcl syntax, you could think of this as

  $word1 $word2 $word3

where these words are (using >> and << to quote the strings)

  word1 is >>if<<
  word2 is >>This is a string with [special characters}==""<<
  word3 is >>puts "empty string"<<

Note that the second and third words contain embedded spaces, quotes, and other special characters, but there are still just three words.

Now the interpreter assumes that the first word is a command (if), and it calls that command, passing in the other words as arguments. The if command treats its first argument as an expression. What it receives is no longer $myString=="" but the already-substituted text: seven whitespace-separated pieces (This is ...). If you read the expr man page, you'll see that the expression parser will try to perform its own set of substitutions on that text. This causes an error, because the text is not a valid expression: it is made of bare words and contains an unbalanced bracket and brace. This is the double substitution that I mentioned earlier. The Tcl interpreter performs one substitution, and expr tries to do another.
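One way to watch this failure without stopping the script is to wrap the unbraced if in catch. This is only a sketch; the exact wording of the error message differs between Tcl versions, so it is printed rather than quoted here.

```tcl
set myString "This is a string with \[special characters\}"

# Unbraced condition: the interpreter substitutes $myString first, then
# if hands the resulting text to the expression parser, which rejects it.
set failed [catch {
    if $myString=="" {puts "empty string"}
} message]

puts "catch returned $failed"
puts "error message: $message"
```

catch returns 1 here, meaning the if command raised an error instead of evaluating the comparison.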

The answer, of course, is to surround your if expression with braces. Like this:

  if {$myString==""} {puts "empty string"}

Tcl performs its round of substitutions, which leaves the braced words untouched, and $word1 $word2 $word3 become:

  word1 is >>if<<
  word2 is >>$myString==""<<
  word3 is >>puts "empty string"<<

Now when the interpreter calls the if command, the first argument is interpreted as an expression, and the expression parser performs the only round of substitution on it, expanding $myString into a single string operand. Then the expression works perfectly.
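The sketch below runs the braced form on the same string, and then uses a made-up value (the variables input and sideEffect are hypothetical, chosen for this illustration) to show that the second round of substitution can do more than raise errors: it can run a command embedded in the data.

```tcl
set myString "This is a string with \[special characters\}"

# Braced condition: only expr substitutes $myString, and the value is
# treated as data, never re-parsed as part of the expression.
if {$myString == ""} {
    set verdict "empty string"
} else {
    set verdict "not empty"
}
puts $verdict   ;# not empty

# Hypothetical input containing a bracketed command.
set input {[set sideEffect "ran"]}
set sideEffect "did not run"

# Braced: the bracketed text stays data.
set isEmpty [expr {$input eq ""}]
puts "$isEmpty $sideEffect"   ;# 0 did not run

# Unbraced: the interpreter substitutes $input, then expr substitutes the
# bracketed command inside it, so the embedded set command runs.
catch {expr $input}
puts $sideEffect   ;# ran
```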

In summary, brace the expressions you pass to if, while, and expr. Use procheck to make sure you haven't missed any.


KBK - Another point that should be made is that braced expressions on [if], [while] and [expr] aren't just safer, they're also much, much faster. Unbraced ones have to be parsed at run time; braced ones can be compiled down to very tight bytecode sequences.
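A rough way to check the speed claim on your own interpreter is the time command. This is a sketch only: the variables and the iteration count are arbitrary, and the numbers depend on the Tcl version and the machine, so no figures are given here.

```tcl
set a 3
set b 4

# Unbraced: the expression text is rebuilt from the substituted words
# and parsed again on each evaluation.
set slow [time {expr $a * $b + 1} 10000]

# Braced: the expression is a constant string that can be compiled once.
set fast [time {expr {$a * $b + 1}} 10000]

puts "unbraced: $slow"
puts "braced:   $fast"
```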


Category String Processing