Version 29 of double substitution

Updated 2014-03-12 20:44:05 by AMG

Summary

An understanding of double substitution is necessary to avoid injection attacks.

Numerous commands interpret their arguments as scripts to be executed. [eval] and [subst] execute Tcl scripts. [expr] (and therefore [if] and [while]) executes [expr] scripts. Each of these commands parses its arguments according to the syntax rules of the target language. Because these arguments were already parsed once prior to the invocation of the command, any variable or command substitution that occurs during this new round of parsing and interpretation is referred to as double substitution.

Double substitution can be useful, but is often the result of inadvertently failing to brace the arguments to [expr] or to the [expr] components of [if] and [while]. Static syntax analysis tools can be used to locate these occurrences in a script.
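As a minimal sketch of why this matters, the following assumes a hypothetical variable userInput holding text from an untrusted source. The variable names are illustrative only.

```tcl
# Hypothetical untrusted input: it looks like data but contains a command.
set userInput {[set hacked 1]}
set hacked 0

# Unbraced: Tcl substitutes $userInput first, then expr parses the
# resulting text and performs the command substitution it finds there.
expr $userInput
set afterUnbraced $hacked

# Braced: expr itself substitutes $userInput and treats the value as a
# plain string operand, so the embedded command never runs.
set hacked 0
set value [expr {$userInput}]
set afterBraced $hacked
```

In the unbraced call the embedded [set] runs and changes hacked; in the braced call it does not, and expr simply returns the string.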

KBK: Braced expressions on [if], [while] and [expr] aren't just safer, they're also much, much faster. Unbraced ones have to be parsed at run time; braced ones can be compiled down to very tight bytecode sequences.
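One rough way to observe the difference is with [time]. This is only a sketch: the procedure names unbraced and braced are made up for illustration, and the measured figures depend on the Tcl version and machine, so no numbers are given here.

```tcl
# Two hypothetical procedures that compute the same sum.
proc unbraced {a b} {expr $a + $b}      ;# expression text built at run time
proc braced   {a b} {expr {$a + $b}}    ;# expression is a constant, compilable

# Report the average time per call over many iterations.
puts "unbraced: [time {unbraced 3 4} 10000]"
puts "braced:   [time {braced 3 4} 10000]"
```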

See Also

Brace your expr-essions
avoiding unintended double substitution
static syntax analysis
analysing script content prior to its execution

Description

Rule 2 says that a Tcl command is evaluated in two steps. First, the Tcl interpreter breaks the command into words and performs substitutions. Then the interpreter uses the first word as a command name, calls the command, and passes the rest of the words to the command as arguments.
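The two steps can be seen in a small sketch. The procedure countArgs is hypothetical and exists only to count the words it receives. Note that the result of a substitution is not rescanned for word boundaries, so a value containing a space is still passed as one argument; the {*} prefix, which asks Tcl to expand a value into separate words, assumes Tcl 8.5 or later.

```tcl
# Hypothetical helper: reports how many arguments it was called with.
proc countArgs {args} {llength $args}

set name "hello there"

# Step 1 substitutes $name into a single word; step 2 calls countArgs
# with that one word.
set plain [countArgs $name]

# With {*}, the value is expanded as a list into separate words before
# the command is called.
set expanded [countArgs {*}$name]
```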

By the time a command gets to see its arguments, Tcl has already performed its various substitutions on those arguments. Many commands in turn perform their own substitutions on their arguments and/or pass those arguments back to the interpreter to be evaluated as scripts in their own right. The quintessential example is the [expr] command, but it is not the only one. Here are some other examples:

[after]
argument is a script to be evaluated later
[expr]
In [expr] scripts, $ causes variable substitution and [ causes command substitution.
[if]
first argument is an [expr]ession
[set]
interprets ( as array element access
[list]
interprets \ as backslash substitution, and " and { as grouping operators. This is true for all commands that interpret their arguments as lists
[eval]
[concat]enates its arguments, interprets the result as a single Tcl script, and executes it.
[regexp] and [regsub]
certain arguments are parsed as regular expressions
[string match]
interprets *, ?, [, and \
[trace]
argument is a script to be evaluated later
[while]
first argument is interpreted as an [expr]ession
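To illustrate one entry from the list above, here is a sketch using [string match]. The variable key and the helper globEscape are hypothetical; the helper simply backslash-escapes the characters that [string match] treats specially.

```tcl
# A value that is meant literally but contains glob metacharacters.
set key {a[1]}

# string match reads [1] as a character class, so the pattern matches
# "a1" rather than the literal text "a[1]".
set matchesLiteral [string match $key {a[1]}]
set matchesA1      [string match $key a1]

# Hypothetical helper: escape \ * ? [ ] so the text is matched literally.
proc globEscape {text} {
    string map [list \\ \\\\ * \\* ? \\? \[ \\\[ \] \\\]] $text
}
set matchesEscaped [string match [globEscape $key] {a[1]}]
```

When the value really is meant literally and no wildcards are wanted, [string equal] avoids the second layer of interpretation altogether.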

Each of these commands can be thought of as a separate interpreter that implements its own mini-language. In the case of [eval] and some others, the mini-language is just Tcl again, but anywhere a command is performing some sort of interpretation on some of the characters in its arguments, there are two layers of interpretation happening: Tcl performs its substitutions on command arguments first, and then the command may perform its own substitutions. Essentially, a script is being composed and then evaluated at runtime. This is a natural part of the design of Tcl, but if done incorrectly, it can leave one vulnerable to injection attacks, so it is important to understand and be aware of when double substitution occurs. This means understanding how each command in use operates, and how it interprets its arguments. The standard rules of Tcl describe the syntax of Tcl, and each additional command documents its own parsing and interpretation behaviour.
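The following sketch shows a script being composed and then evaluated by [eval], first carelessly and then with [list] to keep the data as a single word. The procedure record and the variables log and userText are hypothetical names used only for this illustration.

```tcl
# Hypothetical command that records each message it is given.
set log {}
proc record {msg} {lappend ::log $msg}

# Hypothetical untrusted text containing a command separator.
set userText {hi; record INJECTED}

# Careless: eval concatenates its arguments into the script
# "record hi; record INJECTED", which is two commands.
eval record $userText
set carelessLog $log

# Careful: list quotes the text so it stays one argument to one command.
set log {}
eval [list record $userText]
set carefulLog $log
```

After the careless call the log holds two entries, one of them supplied by the "data"; after the careful call it holds a single entry containing the whole text.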

Arguments to [expr] should almost always be braced because bracing avoids the first layer of substitution by the Tcl interpreter. The same is true for the first argument to [if] and [while]. This is mentioned on the Tcl Style Guide page and is discussed a bit on A Question of Style.

Here is an example of a common mistake:

#warning:  bad code ahead!
set myString "hello"
if "$myString eq {}" {puts "empty string"}

By the time [if] sees its first argument, it looks like this:

hello eq {}

hello was quoted for Tcl when it performed its substitutions, but it then was not quoted for [expr] (via [if]), which currently requires that literal strings be enclosed in double quotes or braces, so an error occurs.
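The failure can be observed without aborting the script by wrapping the command in [catch]. This is a sketch; the variables failed, msg, and result are illustrative, and the exact wording of the error message may differ between Tcl versions.

```tcl
set myString "hello"

# The unbraced condition becomes the expression text: hello eq {}
# expr rejects the unquoted word, so catch reports an error.
set failed [catch {if "$myString eq {}" {puts "empty string"}} msg]

# Braced, expr performs the substitution itself and sees a proper operand.
if {$myString eq {}} {set result "empty string"} else {set result "not empty"}
```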

#warning:  bad code ahead!
set myString "hello there"
if $myString=={} {puts "empty string"}

In this case, [expr] gets the following value for its argument:

hello there=={}

which is even more of an error: [expr] now sees two unquoted words, hello and there, followed by =={}, and cannot parse the result as an expression.

Note that Tcl did not substitute away the curly brackets after ==, since the Braces and Double quotes rules only apply when a brace or double quote occurs at the beginning of a word.
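A short sketch of this rule, using illustrative variable names:

```tcl
# Braces at the start of a word group the word and are removed.
set atStart {}

# Braces in the middle of a word are ordinary characters and are kept.
set inMiddle x{}

set lenAtStart  [string length $atStart]
set lenInMiddle [string length $inMiddle]
```

Here atStart is the empty string, while inMiddle holds the three characters x, { and }.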

A complex example

In the following example, there are many issues:

#warning: bad code ahead!
set myString "This is a string with \[special characters\}"
if $myString=={} {puts "empty string"}

[expr], via [if], receives the following value,

This is a string with [special characters}=={}

tries to parse it as a script, and errors at This, which is not quoted, violating the syntax rules of [expr]essions. Additionally, there is no corresponding right square bracket for the left square bracket that signals command substitution, and no corresponding left curly bracket for the right curly bracket. Even if those problems were fixed, the sequence of words is still nonsense to [expr].

The best course of action is to prevent the Tcl substitutions with curly braces:

if {$myString eq {}} {puts "empty string"}

Now [expr] receives the following argument, which is a well-formed [expr]ession:

$myString eq {}
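Putting the two forms side by side, as a sketch with illustrative variable names (unbracedFailed and verdict are not from the original examples):

```tcl
set myString "This is a string with \[special characters\}"

# Unbraced: the value is pasted into the expression text, which expr
# cannot parse, so catch reports an error.
set unbracedFailed [catch {if $myString=={} {puts "empty string"}}]

# Braced: expr substitutes $myString itself, so the special characters
# in the value are never interpreted as expression syntax.
if {$myString eq {}} {
    set verdict "empty string"
} else {
    set verdict "not empty"
}
```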