Version 18 of double substitution

Updated 2013-04-29 17:22:48 by pooryorick

Summary

Numerous commands hand values back to the interpreter for an additional round of interpretation and/or evaluation. Some of the usual suspects are [eval], [subst], [expr], [if] and [while]. It pays to understand the implications of additional layers of interpretation.

Description

Rule 2 says that a Tcl command is evaluated in two steps. First, the Tcl interpreter breaks the command into words and performs substitutions. Then the interpreter uses the first word as a command name, calls the command, and passes the rest of the words to the command as arguments.
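
The two steps can be observed with a small sketch. The helper showArgs is hypothetical (it is not part of Tcl); it simply returns the words it was called with, so the substitutions performed in the first step become visible.

```tcl
# Hypothetical helper: returns the words it actually received.
proc showArgs {args} {
    return $args
}

set name "World"

# Step 1: Tcl splits the command into words and substitutes $name and
#         [string length $name]. The braced word {$name} is left alone.
# Step 2: showArgs is called with the already-substituted words.
set received [showArgs "Hello, $name" [string length $name] {$name}]

puts "first word:  [lindex $received 0]"
puts "second word: [lindex $received 1]"
puts "third word:  [lindex $received 2]"
```

The command never sees the dollar sign or the brackets of the first two words, only their results. The third word arrives with its dollar sign intact because braces suppressed the substitution.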

By the time a command gets to see its arguments, Tcl has already performed its various substitutions on those arguments. Many commands in turn perform their own substitutions on their arguments and/or pass those arguments back to the interpreter to be evaluated as scripts in their own right. The quintessential example is the [expr] command, but it is not the only one. Here are some other examples:

[after]
argument is a script to be evaluated later
[expr]
interprets $ as variable substitution, [ as command substitution, etc.
[if]
first argument is interpreted according to [expr]
[set]
interprets parentheses in a variable name as array element access
[list]
interprets \ as backslash substitution, and " and { as grouping operators. Actually, this is also true of all the list commands
[eval]
interprets its arguments as entire Tcl scripts
[regexp] and [regsub]
interpret regular expression syntax in all its glory
[string match]
interprets *, ?, [, and \
[trace]
argument is a script to be evaluated later
[while]
first argument is interpreted according to [expr]
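
As a small illustration of one such mini-language, here is a sketch using [string match]. The variable names are made up for the example. The patterns are braced so that Tcl itself leaves the brackets alone and only [string match] interprets them.

```tcl
set text {a[bc]*}

# [bc] is a character class and * a wildcard in the glob mini-language.
set m1 [string match {a[bc]*} abz]

# The same pattern does not match the literal text a[bc]*, because the
# second character of the text is "[" and not "b" or "c".
set m2 [string match {a[bc]*} $text]

# Backslashes tell [string match] to take the next character literally.
set m3 [string match {a\[bc\]\*} $text]

puts "m1=$m1 m2=$m2 m3=$m3"
```

Had the patterns been written in double quotes, Tcl would have attempted command substitution on [bc] before [string match] ever saw it.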

Each of these commands can be thought of as a separate interpreter that implements its own mini-language. In the case of [eval] and some others, the mini-language is just Tcl again, but wherever a command performs some sort of interpretation on the characters in its arguments, there are two layers of interpretation happening: Tcl performs its substitutions on the command's arguments first, and then the command may perform its own substitutions. Essentially, a script is being composed and then evaluated at runtime. This is a natural part of the design of Tcl, but if done incorrectly, it can leave one vulnerable to injection attacks, so it is important to understand how each command you use operates and how it interprets its arguments. The standard rules of Tcl describe the syntax of Tcl, and each additional command documents its own parsing and interpretation behaviour.
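
When a script really must be composed at runtime, one common approach (offered here as a sketch, not as the only way) is to build it with [list], which quotes each word so that the second round of interpretation sees the same word boundaries as the first. The variable names are illustrative.

```tcl
set message "two words"

# Composed by string interpolation: [eval] re-parses the text and sees
# "string length two words", which has one argument too many.
set failed [catch {eval "string length $message"} err]
puts "interpolated script failed: $failed"

# Composed with [list]: the value stays a single word the second time.
set length [eval [list string length $message]]
puts "list-built script returned: $length"
```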

Arguments to [expr] should always be braced, because bracing avoids the first layer of substitution by the Tcl interpreter. The same is true for the first argument to [if] and [while]. This is mentioned on the Tcl Style Guide page and is discussed a bit on A Question of Style. Use TclPro's syntax checker (named procheck) to check for mistakes.

Here is an example of a common mistake:

#warning:  bad code ahead!
set myString "hello"
if $myString=={} {puts "empty string"}

By the time the [if] command sees its first argument, it looks like this:

hello=={}

In expressions, strings must be quoted, and now hello is not, so an error occurs. hello was quoted for Tcl when it performed its substitutions, but it was then no longer quoted for [expr] (via [if]), which requires that strings be enclosed in double quotes or braces.

Note also that Tcl did not substitute away the curly brackets after ==, since brace and double-quote grouping only happens when a brace or double quote occurs at the beginning of a word.
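
A quick sketch of that grouping rule, using throwaway variable names:

```tcl
# Braces and double quotes in the middle of a word are ordinary characters.
set a x{y}
set c x"y"

# At the beginning of a word they group, and are removed from the value.
set b {x y}

puts "a=<$a> b=<$b> c=<$c>"
```

Only b loses its delimiters. The values of a and c keep their braces and quotes, just as the {} after == survived in the example above.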

#warning:  bad code ahead!
set myString "hello there"
if $myString=={} {puts "empty string"}

In this case, [expr] gets the following value for its argument:

hello there=={}

which is even more of an error, because [expr] now sees two unquoted words, hello and there, with no operator between them.
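
Both failures can be observed without aborting a script by wrapping the bad code in [catch]. This is only a sketch: the outcomes list is an illustrative name, and the exact wording of the error message differs between Tcl versions, so it is printed rather than assumed.

```tcl
set outcomes {}
foreach value {hello {hello there}} {
    set myString $value
    # warning: the unbraced expression below is the bad code under test
    set failed [catch {if $myString=={} {puts "empty string"}} message]
    lappend outcomes $failed
    puts "value=<$value> failed=$failed message=$message"
}
```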

A Complex Example

In the following example, there are many issues:

#warning: bad code ahead!
set myString "This is a string with \[special characters\}"
if $myString=={} {puts "empty string"}

[expr] (once again, via [if]) sees the following value:

This is a string with [special characters}=={}

This makes no sense to [expr]:

  • a left square bracket signaling command substitution, but no corresponding closing bracket
  • a right curly bracket signaling the end of a grouping operation, but no corresponding left curly bracket
  • seven words with only one operator to be found, which syntactically makes no sense to [expr]

The moral of the story, of course, is to prevent the Tcl substitutions with curly braces:

if {$myString==""} {puts "empty string"}

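Now [expr] receives the expression untouched and substitutes the variable itself, so the value is treated as a single string operand, as if every special character in it had been quoted:

This\ is\ a\ string\ with\ \[special\ characters\}==""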

In summary, brace the expressions passed to [if], [while] and [expr]. Use procheck to make sure you haven't missed any.
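
Putting the braced form to work on the strings used above might look like the following sketch. The proc name describe is made up for the example.

```tcl
# Hypothetical helper wrapping the braced test from above.
proc describe {myString} {
    if {$myString == ""} {
        return "empty string"
    }
    return "not empty"
}

puts [describe "This is a string with \[special characters\}"]
puts [describe "hello there"]
puts [describe ""]
```

Spaces, brackets and braces in the value no longer matter, because [expr] substitutes the variable itself and treats the result as one operand.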


KBK - Another point that should be made is that braced expressions on [if], [while] and [expr] aren't just safer, they're also much, much faster. Unbraced ones have to be parsed at run time; braced ones can be compiled down to very tight bytecode sequences.
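
A rough way to see the difference on your own machine is the [time] command. This is only a sketch: the proc names are invented, the iteration count is arbitrary, and the figures reported will vary with the Tcl version and hardware.

```tcl
# Unbraced: the expression text is assembled and parsed on every call.
proc addUnbraced {a b} {expr $a + $b}

# Braced: the expression can be compiled together with the proc body.
proc addBraced {a b} {expr {$a + $b}}

puts "unbraced: [time {addUnbraced 3 4} 100000]"
puts "braced:   [time {addBraced 3 4} 100000]"
```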


AMG: Double substitution leaves you vulnerable to injection attacks.
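
A harmless sketch of such an injection follows. The proc sideEffect and the variable names are made up for the demonstration. It stands in for whatever command an attacker might smuggle in through a value.

```tcl
set hits 0

# Hypothetical stand-in for a command an attacker wants to run.
proc sideEffect {} {
    incr ::hits
    return 1
}

# Pretend this value came from an untrusted source.
set userInput {[sideEffect]}

# Unbraced: Tcl substitutes $userInput, then [expr] performs command
# substitution on the resulting text and runs sideEffect.
set r1 [expr $userInput + 1]
puts "unbraced result: $r1, hits: $hits"

# Braced: [expr] substitutes the variable itself and treats the value as
# plain data. It is not a number, so the addition fails, but nothing is run.
set failed [catch {expr {$userInput + 1}} message]
puts "braced failed: $failed, hits: $hits"
```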