double substitution

In Tcl the words in a command are always exposed to two layers of interpretation: First, Tcl interprets them and prepares them as arguments to a routine. Second, the routine interprets those arguments according to its own principles. Double substitution occurs when Tcl performs substitutions on the original command and a routine then performs substitutions on the resulting arguments. Care is required in these cases to avoid errors, including vulnerability to injection attacks.

Description

Tcl interprets each command in the following way: It parses the command, performs substitutions on each word in the command, expands any words marked for expansion, and then calls the routine named by the first word of this prepared command, passing the remaining words to the routine as arguments. The routine in turn might subject those arguments to further interpretation. For example, routines like eval and subst concatenate their arguments into a script to be evaluated and then pass that script back to Tcl for another round of interpretation and evaluation. expr, if, and while interpret their arguments according to the grammar of expr, resulting in another round of variable and script substitution. Double substitution refers to the substitution that happens in the course of this further interpretation.
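The two rounds described above can be seen in a minimal sketch with eval. The variable names inner, outer, single, and result are hypothetical and chosen only for illustration.

```tcl
# Hypothetical values, for illustration only.
set inner {the final value}
set outer {$inner}   ;# braces: outer holds the literal text $inner

# One round: Tcl substitutes $outer once and set stores the text as-is.
set single $outer

# Two rounds: Tcl substitutes $outer, so eval receives the words
# set, result, and $inner. eval joins them into the script
#     set result $inner
# and hands it back to Tcl, which now substitutes $inner as well.
eval set result $outer

puts "one round:  $single"
puts "two rounds: $result"
```

After one round the value is still the text $inner; after the second round it has been replaced by the value of the variable inner.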

Double substitution can be useful but is more often the result of inadvertently neglecting to quote the arguments to expr, if, and while. Static syntax analysis tools can be used to identify double substitution in a script.

If the expression is passed to if, while, or expr as a single quoted argument (typically a braced one), those routines can be much more efficient because the expression is parsed and compiled once, and the resulting bytecode is cached in the value for reuse.
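One way to check the efficiency claim on a particular interpreter is to time both forms. This is only a rough sketch: the variables x and y and the iteration count are arbitrary, and the timings printed depend on the Tcl version and the machine, so no figures are given here.

```tcl
# Arbitrary operands, for illustration only.
set x 3
set y 4

# Both forms compute the same value for these operands.
set braced   [expr {$x * $y + 1}]
set unbraced [expr $x * $y + 1]

# Braced: expr receives one constant string that can be compiled once.
puts "braced:   [time {expr {$x * $y + 1}} 100000]"

# Unbraced: Tcl substitutes first, then expr must concatenate its
# arguments and parse the resulting string on every call.
puts "unbraced: [time {expr $x * $y + 1} 100000]"
```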

Routines that interpret their arguments as scripts or expressions:

after: The second argument is a script to be evaluated later.

apply: The second item of the first argument is a script.

eval: Concatenates its arguments into a script and evaluates it.

expr: Concatenates its arguments into an expression, in which $ indicates variable substitution, and [ indicates script substitution. expr is an example of a routine whose arguments represent a distinct language embedded into a Tcl script, and that language in turn allows a Tcl script to be embedded into it. It's Linguaception.

for: The first, third, and last arguments are evaluated as scripts, and the second argument is evaluated as an expression.

if: The condition arguments are expressions and the body arguments are scripts.

proc: The last argument is a script.

trace: The last argument is a script to be evaluated.

uplevel: Concatenates its arguments into a script and evaluates it.

while: The second argument is an expression and the third argument is a script.

Routines that perform their own particular interpretation of their arguments:

set: In a variable name, ( accesses a variable in an array.

list: list itself doesn't treat its arguments as anything in particular, but it does format them into items in a list. In a list, \ " and { operate the same way they do in Tcl. Tcl is designed this way so that a list is a properly-formatted command.

regexp and regsub: Some arguments are regular expressions.

string match: *, ?, [, and \ have special meaning.

To use these routines correctly it is necessary to understand how they interpret their arguments, and then quote arguments to that routine accordingly. The standard rules of Tcl describe the syntax of Tcl, and each additional routine documents its own interpretation and treatment of its arguments.
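As a sketch of quoting according to each routine's own rules, the following uses list to build a script for eval and backslashes to make a string match pattern literal. The variable names and sample values are hypothetical.

```tcl
# Hypothetical value containing characters that are special in a script.
set value {two words; and [brackets]}

# Naive: eval joins its arguments into the script
#     set naive two words; and [brackets]
# so set is called with too many arguments and raises an error.
set status [catch {eval set naive $value} message]
puts "naive eval failed ($status): $message"

# list formats its arguments as list items, and a list is a properly
# formatted command, so the value survives the second round intact.
eval [list set copy $value]
puts "copy: $copy"

# string match has its own rules: [1] is a character class unless the
# brackets are escaped in the pattern.
set asClass   [string match {a[1]} {a[1]}]
set asLiteral [string match {a\[1\]} {a[1]}]
puts "as class: $asClass, as literal: $asLiteral"
```

The same list technique applies to the other routines that accept a script, such as after and uplevel.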

Arguments to expr should almost always be quoted somehow so that Tcl doesn't perform any substitutions on them. Typically braces are the most convenient but any form of quoting that prevents substitution is sufficient. The same is true for the first argument to if and while, and for arguments of other routines as listed above. This is mentioned on the Tcl Style Guide page and is discussed a bit on A Question of Style.
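The injection risk of an unquoted expression can be sketched as follows. The routine recordHit, the counter hits, and the variable userInput are hypothetical stand-ins; in a real attack the embedded command could be anything the interpreter is able to run.

```tcl
# Hypothetical routine standing in for a command an attacker wants to run.
set hits 0
proc recordHit {} {
    incr ::hits
    return 1
}

# Hypothetical untrusted input: the literal text [recordHit].
set userInput {[recordHit]}

# Unbraced: Tcl substitutes $userInput, and expr then sees the text
#     [recordHit] + 1
# and performs script substitution on it, calling recordHit.
set unsafe [expr $userInput + 1]
puts "unbraced result: $unsafe, hits: $hits"

# Braced: expr substitutes $userInput itself and treats the value as a
# plain string operand. It is not scanned again, so recordHit is not
# called; the addition fails because the operand is not a number.
set status [catch {expr {$userInput + 1}} message]
puts "braced status: $status, hits: $hits"
```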

One common mistake is to let Tcl substitute a variable:

set myString hello

#This doesn't work:
if "$myString eq {}" {puts {empty string}}

Tcl substitutes $myString, so the expression looks like this:

hello eq {}

hello is not quoted, which is an error because an expression requires strings to be quoted. The expression evaluator should perform the substitution instead:

set myString hello
if {$myString eq {}} {puts {empty string}}
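To see the failure without stopping the script, the faulty command can be wrapped in catch. This is a small sketch; the variable outcome is hypothetical, and the exact error message depends on the Tcl version, so it is printed rather than quoted here.

```tcl
set myString hello

# Tcl substitutes $myString, so if receives the expression: hello eq {}
set status [catch {if "$myString eq {}" {set outcome empty}} message]
puts "quoted with double quotes: status $status, message: $message"

# Braced: the expression evaluator performs the substitution itself.
set outcome {not empty}
if {$myString eq {}} {set outcome empty}
puts "braced: $outcome"
```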

A complex example

In the following example, there are many issues:

#warning: bad code ahead!
set myString "This is a string with \[special characters\}"
if $myString eq {} {puts {empty string}}

if takes only its first argument as the expression, so the value passed as an expression is

This is a string with [special characters}

, which makes no sense as an expression. Additionally, there is no corresponding right square bracket for the left square bracket that signals script substitution, and no corresponding left curly bracket for the right curly bracket. The words eq and {} never reach the expression evaluator at all: they are separate arguments to if.

The best course of action is to prevent the Tcl substitutions with curly braces:

if {$myString eq {}} {puts {empty string}}

Now the value passed as an expression is:

$myString eq {}

, which is syntactically correct.
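Because the expression itself no longer contains the value, the braced form works for any string. A minimal sketch, assuming a hypothetical helper named isEmpty:

```tcl
# Hypothetical helper: the braced expression is passed to expr unchanged,
# and expr substitutes $value itself.
proc isEmpty {value} {
    expr {$value eq {}}
}

set myString "This is a string with \[special characters\}"

# The special characters are part of the value, not of the expression.
set special [isEmpty $myString]
set blank   [isEmpty {}]
puts "special string empty? $special"
puts "empty string empty?   $blank"
```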

Page Authors

KBK

PYK

Category String Processing