started by TV
I'm making an interactive command composer, and happened to be wondering about general usability: preferably it shouldn't choke on special cases such as quoted names/arguments, and it should still generate command lines which can be part of a nested eval.
Bwise contains a lot of nested eval stuff, which is probably rightfully unpopular for decent, readable, commercially documented (...) code, though to me it is quite reasonable and does a lot of things which some day will probably be better understood as lisp-like functional programming (I learned lisp on the Acorn Electron, a BBC computer derivative, from an old book I stumbled on in the electrical engineering library, having picked up the programming language's properties somewhere). Much of it does work as intended, and I promise I'll somehow get to the point of explaining the functional decomposition stuff graphically. To me that is like decent bookkeeping, but it is also an efficient, mathematically sound way of talking to a computer about a certain problem, worthy of consideration ... Straying from the topic.
A procedure definition all tclers know:
    proc name {{arg1 default_1} {arg2 default_2} ....} {
        body
        return $ret_value
    }
This is like most structured programming languages with argument passing.
In Tcl, not just everything is a string but most likely everything is a list, which to my mind has a submeaning that everything can be dealt with at that level, symbolically, so we can automatically generate procedure definitions.
What can easily get in the way, and is usually a drag and at least a break of programming language consistency (read: start getting out the pointers and the hexdumps), is allowing general definitions and then having to quote them for practical use. Here I look at just that subject, preferably staying out of quoting hell: suppose we want to use function arguments with non-trivial names or defaults.
Examples:
    (Theo) 9 % proc test {{{a\ a} content\ a}} {puts ${a\ a}}
    (Theo) 10 % test
    content a
    (Theo) 11 % proc test {{{a\ a} {content\ a}}} {puts ${a\ a}}
    (Theo) 12 % test
    content\ a
    (Theo) 13 % proc test {{{a\\a} {content\ a}}} {puts ${a\\a}}
    (Theo) 14 % test
    content\ a
    (Theo) 15 %
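One way to stay out of hand-written backslashes is to let list do the quoting when a procedure definition is generated. The following is only a sketch: procCommand is a hypothetical helper (not part of bwise or the command composer), and the argument name "a a" and default "content a" are made up to mirror the session above.

```tcl
# Hypothetical helper: build a proc definition as a well-formed list.
# argspec is a flat list of name/default pairs; names and defaults may
# contain spaces or other special characters.
proc procCommand {name argspec body} {
    set arglist {}
    foreach {arg default} $argspec {
        # list keeps each name/default pair together as one argument spec
        lappend arglist [list $arg $default]
    }
    # The result is a list, so each element becomes exactly one word
    # of the proc command whenever it is evaluated.
    return [list proc $name $arglist $body]
}

set cmd [procCommand test {{a a} {content a}} {return ${a a}}]

eval $cmd               ;# one eval pass defines the procedure
eval [list eval $cmd]   ;# the same command survives a nested eval

puts [test]                 ;# uses the default value
puts [test {other value}]   ;# overrides the default
puts [info args test]       ;# shows the generated argument name
```

Because the command is built with list rather than with double quotes, no manual escaping of the space in the argument name or in the default is needed, however deeply the command is nested in eval.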
Let's not forget:
THE BASIC MODEL (courtesy of John O.)
Almost all problems can be explained with three simple rules:
...
DKF: This was extracted recently from the recesses of the internet (well, the mirror of the old procplace ftp server) [L1 ] as part of a discussion about eval and list for academic citation purposes. Unfortunately, its exact provenance and other metadata are lost, but it is apparently by Brent Welch on comp.lang.tcl…
<filed in /project/tcl/doc/README.programmer>

This is a short note to describe a deep "gotcha" with TCL and the standard way to handle it. Up front, TCL seems pretty straight-forward and easy to use. However, trying out some complex things will expose you to the gotcha, which is referred to as "quoting hell", "unexpected evaluation", or "just what is a TCL list?". These problems, which many very smart people have had, are indications that the programmer's mental model of the TCL evaluator is incorrect. The point of this note is to sketch out the basic model, the gotcha, and the right way to think (and program) around it.

THE BASIC MODEL (courtesy of John O.)

Almost all problems can be explained with three simple rules:

 1. Exactly one level of substitution and/or evaluation occurs in each pass through the Tcl interpreter, no more and no less.
 2. Each character is scanned exactly once in each pass through the interpreter.
 3. Any well-formed list is also a well-formed command; if evaluated, each element of the list will become exactly one word of the command with no further substitutions.

For example, consider the following four one-line scripts:

    set a $b
    eval {set a $b}
    eval "set a $b"
    eval [list set a $b]

In the first script the set command passes through the interpreter once. It is chopped into three words, "set", "a", and the value of variable "b". No further substitutions are performed on the value of b: spaces inside b are not treated as word breaks in the "set" command, dollar-signs in the value of b don't cause variable substitution, etc.

In the second script the "set" command passes through the interpreter twice: once while parsing the "eval" command and again when "eval" passes its argument back to the Tcl interpreter for evaluation. However, the braces around the set command prevent the dollar-sign from inducing variable substitution: the argument to eval is "set a $b". So, when this command is evaluated it produces exactly the same effect as the first script.
In the third script double quotes are used instead of braces, so variable substitution occurs in the argument to eval, and this could cause unwanted effects when eval evaluates its argument. For example, if b contains the string "x y z" then the argument to eval will be "set a x y z"; when this is evaluated as a Tcl script it results in a "set" command with five words, which causes an error. The problem occurs because $b is first substituted and then re-evaluated. This double-evaluation can sometimes be used to produce interesting effects. For example, if the value of $b were "$c", then the script would set variable a to the value of variable c (i.e. indirection).

The fourth script is safe again. While parsing the "eval" command, command substitution occurs, which causes the result of the "list" command to be the second word of the "eval" command. The result of the list command will be a proper Tcl list with three elements: "set", "a", and the contents of variable b (all as one element). For example, if $b is "x y z" then the result of the "list" command will be "set a {x y z}". This is passed to "eval" as its argument, and when eval re-evaluates it the "set" command will be well-formed: by rule #3 above each element of the list becomes exactly one word of the command. Thus the fourth script produces the same effect as the first and second ones.

THE GOTCHA (observations by Brent Welch)

The basic theme to the problem is that you have an arbitrary string and want to protect it from evaluation while passing it around through scripts and perhaps in and out of C code you write. The short answer is that you must use the list command to protect the string if it originates in a TCL script, or you must use the Tcl_Merge library procedure if the string originates in your C code. Also, avoid double quotes and use list instead so you can keep a grip on things.

Now, let's rewind and start with a simple example to give some context.
We want to create a TK button that has a command associated with it. The command will just print out the label on the button, and we'll define a procedure to create this kind of button. There are two opportunities for evaluation here, one when the button is created and the command string is parsed, and again later on when the button is clicked. Here is our TCL proc:

    proc mybutton1 { parent self label } {
        if {$parent == "."} {
            set myname $parent$self
        } else {
            set myname $parent.$self
        }
        button $myname -text $label -command "puts stdout $label"
        pack append $parent $myname {left fill}
    }

The intent here is that the command associated with the button is

    puts stdout $label

Now, label is only defined when creating the button, not later on when the button is clicked. Thus we use double-quoting to group the words in the command and to allow substitution of $label so that the button will print the right value. However, this version will only work if the value for label is a single list element. This is because the double quotes around "puts stdout $label" allow variable substitution before grouping the words into a list. If label had a value like "a b c", then the command string defined for the button would be

    puts stdout a b c

and pass too many arguments to the puts procedure, which would complain.

THE SOLUTION

The right solution is to compose the command using the list operator. list will preserve the list structure and protect the value that was in $label so it will survive correctly until the button is clicked:

    proc mybutton2 { parent self label } {
        if {$parent == "."} {
            set myname $parent$self
        } else {
            set myname $parent.$self
        }
        button $myname -text $label -command [list puts stdout $label]
        pack append $parent $myname {left fill}
    }

In this case, list will "do the right thing" and massage the value of $label so that it appears as a single list element with respect to the invocation of puts.
The command string for the button will be:

    puts stdout {a b c}

The second place you experience this problem is when composing commands to be evaluated from inside C code. If the example is at all complex, you'll want to use Tcl_Merge to build up the command string before passing it into Tcl_Eval. Tcl_Merge takes an argc, argv parameter set and converts it to a string while preserving the list structure. That is, if you pass the result to Tcl_Eval, argv[0] will be interpreted as the command name, and argv[1] up through argv[argc-1] will be passed as the parameters to the command. Note that Tcl_VarEval *does not* make this guarantee. Instead, it behaves more like double-quotes by concatenating all its arguments together and then reparsing to determine list structure.

ANOTHER GOTCHA

Now, let's extend this example with another feature that I've found thorny. Suppose I want the caller of mybutton2 to be able to pass in more arguments that will be passed to the button primitive. Say they want to fiddle with the colors of the button. Now I can add the special parameter "args" to the end of the parameter list. When mybutton3 is called, the variable args will be a list of all the remaining arguments. The naive, and wrong, approach is:

    proc mybutton3 { parent self label args } {
        if {$parent == "."} {
            set myname $parent$self
        } else {
            set myname $parent.$self
        }
        button $myname -text $label -command [list puts stdout $label] $args
        pack append $parent $myname {left fill}
    }

This is wrong because button doesn't want a sublist of more arguments, it wants many arguments. So, how am I gonna stick the value of $args onto my button command? Or, said another way, how am I going to create the proper list structure? It is tempting to do the following:

    eval "button $myname -text $label -command [list puts stdout $label] $args"

However, this construct causes things to go through the evaluator twice, which will lead to unexpected results.
The double quotes will allow substitution, so, again, if $label has spaces, then the button command will not like its argument list. Another (ugly) try:

    eval "button \$myname -text \$label -command \[list puts stdout \$label\] $args"

Now $args is the only variable that is evaluated twice, once to remove its outermost list structure, and the second time as individual arguments to the button command. I think a better approach is the following:

    eval [concat {button $myname -text $label -command [list puts stdout $label]} $args]

In this case, $args is evaluated twice, once before the call to concat, and a second time explicitly by calling eval. The stuff between the curly braces is protected against substitution on the first pass, however, (which is good), and so all concat ends up doing is stripping off the outermost list structure (the curly braces) from its two arguments and putting a space between them. Another, perhaps clearer way of writing this is:

    set cmd {button $myname -text $label -command [list puts stdout $label]}
    eval [concat $cmd $args]

Now, with this form it is fairly clear(?) that the items in the button command and the $args list will only be evaluated one time. Finally, it turns out you can eliminate the explicit call to concat because eval will do that for us if it is given multiple arguments:

    set cmd {button $myname -text $label -command [list puts stdout $label]}
    eval $cmd $args

Which leads us back to:

    eval {button $myname -text $label -command [list puts stdout $label]} $args
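The four one-line scripts from the basic model can be tried in a plain tclsh, without Tk. What follows is only a sketch: the value "x y z" is the one used in the note, while the result variables r1, r2, r4, failed3 and rind are made up here so the outcomes can be compared afterwards.

```tcl
set b {x y z}

set a $b                  ;# script 1: one pass, b stays a single word
set r1 $a

eval {set a $b}           ;# script 2: braces defer $b to eval's pass
set r2 $a

# Script 3: $b is substituted first, then the result is re-parsed,
# so set receives too many words and raises an error.
set failed3 [catch {eval "set a $b"} msg3]

eval [list set a $b]      ;# script 4: list keeps b as one element
set r4 $a

# The double evaluation of script 3 used on purpose (indirection):
set c {value of c}
set b {$c}
eval "set a $b"
set rind $a

puts "script 1: $r1"
puts "script 2: $r2"
puts "script 3 failed: $failed3 ($msg3)"
puts "script 4: $r4"
puts "indirection: $rind"
```

Scripts 1, 2 and 4 should all leave the same value in a, while script 3 is the only one reported as failed.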
Note that the text above was written long before the introduction of the {*} syntax, which offers a better solution to the problem:
button $myname -text $label -command [list puts stdout $label] {*}$args
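The two styles can be compared without Tk. This is a sketch under stated assumptions: it needs Tcl 8.5 or later for {*}, and fakebutton, mybutton_eval and mybutton_expand are hypothetical names introduced here, with fakebutton standing in for the real button command by simply returning the words it was called with.

```tcl
# Hypothetical stand-in for Tk's button: returns its words unchanged.
proc fakebutton {args} {
    return $args
}

# Old style: eval concatenates the braced command and $args,
# then evaluates the result in the scope of this procedure.
proc mybutton_eval {name label args} {
    eval {fakebutton $name -text $label -command [list puts stdout $label]} $args
}

# Tcl 8.5+ style: {*} expands $args into separate words in a single pass.
proc mybutton_expand {name label args} {
    fakebutton $name -text $label -command [list puts stdout $label] {*}$args
}

set old [mybutton_eval   .b {a b c} -fg red -bg blue]
set new [mybutton_expand .b {a b c} -fg red -bg blue]

puts [llength $new]    ;# number of words the stand-in received
puts [lindex $new 4]   ;# the -command value, still one word
puts [expr {$old eq $new}]
```

In both variants the label stays a single word and the extra options arrive as separate words; the {*} form just gets there without a second trip through the evaluator.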
PYK: There is also a copy of the quoted note at [L2 ]