Tracing inappropriate variable access

This page is intended eventually to become a TIP proposing an additional syntax for [trace] to allow for application-level callbacks when an array element access is performed on a scalar variable or vice versa.

Rationale

Several developers have requested the ability, at either the Tcl or the C level, to establish callbacks that trap access to a variable of inappropriate type. In this context, "type" refers only to whether the variable is an array or a scalar. This is Tcl's only notion of the type of a variable; in all other respects, "everything is a string."

The intent of this capability is to allow early experimentation with ways to represent collections other than Tcl arrays. It would allow handles to keyed lists, tree maps, vectors, database tables, and other similar structures to be stored in scalar variables within a Tcl script and still accessed as arrays. It would, however, not represent a great departure from Tcl's current mode of operation; scalars are still scalars, and arrays are still arrays. Rather, it allows user code to intercept the error that would otherwise result when a variable is accessed inappropriately (that is, an array is accessed as a scalar or vice versa). It allows the user callback to determine the results of the operation. The default is to throw the same error that is thrown today.
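For reference, the errors that such a callback would intercept can be reproduced in any current Tcl. The variable names handle and table below are illustrative only:

```tcl
# A scalar accessed as if it were an array.
set handle tree42
set rc1 [catch {set handle(root)} msg1]
# msg1 is: can't read "handle(root)": variable isn't array

# An array accessed as if it were a scalar.
array set table {k v}
set rc2 [catch {set table} msg2]
# msg2 is: can't read "table": variable is array
```

Under the proposal, these two cases correspond to the bad-array and bad-scalar operations respectively.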

Today's [trace] command does not support this kind of access flexibly. Read and write traces can present objects with array semantics at the Tcl level, but those objects must be backed by real Tcl arrays. This limitation can consume significant memory, and it introduces redundant representations of the data: the array ::env, for instance, exists both as the environment variables of the native system and as a Tcl array.
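The limitation can be seen in a minimal sketch of the technique available today. It assumes a dict named store standing in for some external structure, and an array named view that presents it; both names, and the procs viewRead and viewWrite, are invented for this illustration. Unset handling is omitted.

```tcl
# Backing store; in practice this might be a C-level structure.
set store [dict create alpha 1 beta 2]

# The variable must already be a real Tcl array for element access to work.
array set view {}

proc viewRead {name1 name2 op} {
    global store
    upvar 1 $name1 arr
    if {$name2 ne "" && [dict exists $store $name2]} {
        # Copy the value into the array so that the pending read succeeds.
        set arr($name2) [dict get $store $name2]
    }
}

proc viewWrite {name1 name2 op} {
    global store
    upvar 1 $name1 arr
    # Mirror the newly written element back into the store.
    dict set store $name2 $arr($name2)
}

trace add variable view read viewRead
trace add variable view write viewWrite

set got $view(alpha)    ;# fetched from the store on demand
set view(gamma) 3       ;# mirrored into the store
```

After these two accesses, alpha and gamma each exist twice, once in store and once in view. That duplication is the redundancy the proposal aims to avoid.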

Syntax

To the [trace] command, we propose adding the syntax:

   trace add variable name bad-array callback
   trace add variable name bad-scalar callback

The callback function is extended from other variable traces in several ways.

Its syntax has more arguments, to wit:

        callback name1 name2 op1 op2 args
name1
is the name of the scalar or array variable being traced.
name2
is the subscript used for an attempted array access, or the empty string for an attempted scalar access.
op1
is the trace operation for which the callback is being executed: bad-array for an attempted array access to a scalar variable or bad-scalar for an attempted scalar access to an array variable.
op2
indicates the type of access that has been requested. The possible types of access are given below.
args
The interpretation of additional arguments to the callback depends on the value of op2.

The possible values of op2 include:

read
An attempt was made to read a variable inappropriately. In this case, the callback receives no additional arguments.
write
An attempt was made to write a variable inappropriately. The callback receives one additional argument, which contains the value being written.
unset
An attempt was made to unset an array element of a scalar variable, or vice versa. The callback receives no additional arguments.
array
An attempt was made to execute the [array] command against a scalar variable. The [array] command, after all substitutions, appears as an additional argument to the callback.
trace
An attempt was made to establish a trace on an array element, but the variable in question is a scalar variable. The callback receives three additional arguments:
  1. One of the words add, remove or info, indicating whether the trace is being created, removed, or queried.
  2. A list containing some subset of the keywords read, write, unset, and array, indicating which operations are to be traced.
  3. A Tcl command that is to be invoked when the requested operation takes place. Note that this Tcl command may be a handle representing a C callback, if code establishes a trace at the C level.
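A callback with this argument list might look like the following sketch, which lets a scalar holding a dict answer element accesses. The proc name dictAsArray and the variable cfg are hypothetical. Because the bad-array operation exists only in this proposal, the registration is shown as a comment and the calls the interpreter would make are simulated by hand.

```tcl
# Hypothetical callback: serve array-style access from a dict held in a scalar.
proc dictAsArray {name1 name2 op1 op2 args} {
    upvar 1 $name1 var
    switch -- $op2 {
        read  { return [dict get $var $name2] }
        write { dict set var $name2 [lindex $args 0]; return }
        unset { dict unset var $name2; return }
        default {
            # The array and trace accesses are left unhandled in this sketch.
            return -code error "unsupported access \"$op2\" on \"$name1\""
        }
    }
}

# Under the proposal, registration would read (not valid in current Tcl):
#   trace add variable cfg bad-array dictAsArray

set cfg [dict create host localhost]
set r [dictAsArray cfg host bad-array read]     ;# as if: set cfg(host)
dictAsArray cfg port bad-array write 8080       ;# as if: set cfg(port) 8080
dictAsArray cfg host bad-array unset            ;# as if: unset cfg(host)
```

Throughout, cfg remains an ordinary scalar; only the callback gives it array-like behaviour.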

The callback established by this form of the trace command is invoked whenever the requested operation takes place on the given variable. For read traces, the callback is expected to return the value that the given element should take on. For write and unset traces, the return value is ignored. For array traces, the return value of the callback becomes the return value of the array command that caused it to be invoked. For traces upon the trace command itself, the return value is expected to be empty when adding or removing a trace, or to contain the result that is to be returned by trace info variable.

If multiple commands establish tracing on the same variable, they are invoked in the order in which they are established.

There are several special returns that can be made from the command:

  1. Returning with -code error indicates that the requested operation failed; the error is reported to the caller in the same way as errors from other variable traces.
  2. Returning with -code return causes the return value to be reported immediately as the value of the trace operation, and later traces on the same element to be ignored.
  3. break and continue are inappropriate return codes and are treated as errors.
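The ordering and return-code rules above can be modelled in pure Tcl. The dispatcher below is only a sketch of how an implementation might behave, not part of the proposal; runBadAccessTraces and the sample callbacks are hypothetical names. It also assumes that, when several callbacks return normally, the value of the last one wins, a point the text above leaves open.

```tcl
# Invoke callbacks in the order they were established and apply the
# return-code rules. Assumption: on normal returns, the last value wins.
proc runBadAccessTraces {callbacks name1 name2 op1 op2 args} {
    set result ""
    foreach cb $callbacks {
        set code [catch {
            uplevel 1 [list {*}$cb $name1 $name2 $op1 $op2 {*}$args]
        } value]
        switch -- $code {
            0 { set result $value }
            1 { return -code error $value }
            2 { return $value }
            default {
                # break (3), continue (4) and anything else are errors.
                return -code error "invalid return code $code from trace callback"
            }
        }
    }
    return $result
}

proc first  {n1 n2 op1 op2 args} { return "from first" }
proc stop   {n1 n2 op1 op2 args} { return -code return "from stop" }
proc never  {n1 n2 op1 op2 args} { error "should not run" }
proc broken {n1 n2 op1 op2 args} { return -code break }

# stop ends the chain, so never is not invoked.
set v1 [runBadAccessTraces {first stop never} x k bad-array read]
# broken returns break, which is reported as an error.
set rc [catch {runBadAccessTraces {first broken} x k bad-array read} v2]
```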

[Still need to specify the C level usage for this]


AMG: I came here hoping to find tips on locating where a variable is being set. My experiments with printing all the [info frame]s inside a write trace haven't borne fruit. In GDB I'd just set a watchpoint.